Error handling

Database management queries, such as CREATE DATABASE, can fail with errors.

Observing errors

Because database management operations are performed asynchronously, these errors may not be returned immediately upon query execution. Instead, monitor the output of the SHOW DATABASE command, particularly the statusMessage and currentStatus columns.

Example 1. Fail to create a database
neo4j@system> CREATE DATABASE foo;
0 rows available after 108 ms, consumed after another 0 ms
neo4j@system> SHOW DATABASE foo;

In standalone mode:

+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| name   | type       | aliases | access       | address          | role      | writer | requestedStatus | currentStatus | statusMessage             | default | home  | constituents |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "foo"  | "standard" | []      | "read-write" | "localhost:7687" | "primary" | TRUE   | "online"        | "dirty"       | "File system permissions" | FALSE   | FALSE | []           |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

1 row available after 4 ms, consumed after another 1 ms

In a cluster:

+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| name   | type       | aliases | access       | address          | role      | writer | requestedStatus | currentStatus | statusMessage             | default | home  | constituents |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "foo"  | "standard" | []      | "read-write" | "localhost:7687" | "primary" | TRUE   | "online"        | "online"      | ""                        | FALSE   | FALSE | []           |
| "foo"  | "standard" | []      | "read-write" | "localhost:7688" | "primary" | FALSE  | "online"        | "online"      | ""                        | FALSE   | FALSE | []           |
| "foo"  | "standard" | []      | "read-write" | "localhost:7689" | "primary" | FALSE  | "online"        | "dirty"       | "File system permissions" | FALSE   | FALSE | []           |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

3 rows available after 100 ms, consumed after another 6 ms
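
On a system with many databases, scanning the full SHOW DATABASES output for failures can be tedious. The following is a minimal sketch of a filter that lists only the database copies whose current state differs from the requested one. It uses only the column names shown in the output above. Note that a database that is still transitioning (for example, starting) also appears in this result, so an entry here does not necessarily indicate an error.

```cypher
// List every database copy that has not reached its requested state,
// together with the message explaining why.
SHOW DATABASES
YIELD name, address, requestedStatus, currentStatus, statusMessage
WHERE currentStatus <> requestedStatus
RETURN name, address, requestedStatus, currentStatus, statusMessage;
```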

Database states

A database management operation may fail for a number of reasons, for example, if the file system has incorrect permissions or Neo4j itself is misconfigured. As a result, the contents of the statusMessage column in the SHOW DATABASE query results may vary significantly.

However, databases may only be in one of a select number of states:

  • online

  • offline

  • starting

  • stopping

  • store copying

  • initial

  • deallocating

  • dirty

  • quarantined

  • unknown

For more details about the various states, see Database states. Most often, when a database management operation fails, Neo4j attempts to transition the database in question to the offline state. If the system is certain that no store files have yet been created, it transitions the database to initial instead. Similarly, if the system suspects that the store files underlying the database are invalid (incomplete, partially deleted, or corrupt), then it transitions the database to dirty.
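
In a cluster, it can help to see at a glance how many copies of each database are in each state. The following is a sketch, assuming that aggregation in a RETURN clause after SHOW DATABASES ... YIELD behaves as in regular Cypher. The copies alias is chosen here for illustration only.

```cypher
// Count the copies of each database per state.
// Anything other than the requested state deserves a closer look.
SHOW DATABASES
YIELD name, currentStatus
RETURN name, currentStatus, count(*) AS copies
ORDER BY name, currentStatus;
```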

Retrying failed operations

Database management operations may be safely retried in the event of failure. However, these retries are not guaranteed to succeed, and errors may persist through several attempts.

If a database is in the quarantined state, retrying the last operation will not work.

Example 2. Retry starting a database
neo4j@system> START DATABASE foo;
0 rows available after 108 ms, consumed after another 0 ms
neo4j@system> SHOW DATABASE foo;
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| name   | type       | aliases | access       | address          | role      | writer | requestedStatus | currentStatus | statusMessage             | default | home  | constituents |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "foo"  | "standard" | []      | "read-write" | "localhost:7687" | "primary" | TRUE   | "online"        | "offline"     | "File system permissions" | FALSE   | FALSE | []           |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

1 row available after 4 ms, consumed after another 1 ms

After investigating and addressing the underlying issue, you can start the database again and verify that it is running properly:

neo4j@system> START DATABASE foo;
0 rows available after 108 ms, consumed after another 0 ms
neo4j@system> SHOW DATABASE foo;
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| name     | type       | aliases | access       | address          | role      | writer | requestedStatus | currentStatus | statusMessage | default | home  | constituents |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "foo"    | "standard" | []      | "read-write" | "localhost:7687" | "primary" | TRUE   | "online"        | "online"      | ""            | FALSE   | FALSE | []           |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

1 row available after 4 ms, consumed after another 1 ms

If repeated retries of a command have no effect, or if a database is in a dirty state, you may drop and recreate the database, as detailed in Create database.

When running DROP DATABASE as part of an error handling operation, you can also append DUMP DATA to the command. This produces a database dump that can be examined further and potentially repaired.
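
As an illustration, the drop-and-recreate sequence for the foo database from the examples above might look like the following sketch. The WAIT clause is an addition not discussed on this page; it is used here on the assumption that you want the drop to complete before the database is recreated.

```cypher
// Drop the failed database, but keep a dump of its store for later inspection.
DROP DATABASE foo DUMP DATA WAIT;

// Recreate an empty database with the same name.
CREATE DATABASE foo;

// Confirm that the new database reaches the requested state.
SHOW DATABASE foo;
```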

Quarantined databases

When a database encounters a severe error during normal operation that prevents it from continuing, Neo4j stops that database and puts it into a quarantined state. This means that it is not possible to restart the database with a simple START DATABASE command. Instead, you have to run CALL dbms.unquarantineDatabase(server, database, operation) to lift the quarantine, specifying as server the instance with the failing database.

The dbms.unquarantineDatabase() procedure was introduced in Neo4j 2025.01 to replace the now-deprecated dbms.quarantineDatabase().

After lifting the quarantine, the instance will automatically try to bring the database to the desired state.

Syntax:

CALL dbms.unquarantineDatabase(server, database, operation)

Input arguments:

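+---------------------------------------------------------------------------------------------------------+
| Name      | Type   | Description                                                                        |
+---------------------------------------------------------------------------------------------------------+
| server    | String | The identifier of the server where the quarantine for the database will be lifted. |
| database  | String | The name of the database that will be removed from quarantine.                     |
| operation | String | Optional operation to apply while lifting the quarantine.                          |
+---------------------------------------------------------------------------------------------------------+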

The possible values for the optional operation are:

  • keepStateKeepStore — do nothing; leave store and cluster state as they are.

  • replaceStateKeepStore — join as a new member, clearing the current cluster state but keeping the store.

  • replaceStateReplaceStore — join as a new member, clearing both the current cluster state and the store.

If you choose to clear the current cluster state, the server tries to join as a new member. Joining can succeed only if a majority of the existing members is available to let the new member in. For example, assume a cluster with three primaries. If only one server is in QUARANTINED mode, it is safe to choose replaceStateKeepStore or replaceStateReplaceStore. If two servers are in QUARANTINED mode, do not use replaceStateKeepStore or replaceStateReplaceStore on both servers concurrently, because there would be no majority to let them in.

Return arguments:

The procedure doesn’t return any value.
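
A call to the procedure might look like the following sketch. The value '<server-id>' is a placeholder, not a real identifier, and the SHOW SERVERS column names used to look it up (serverId, name, address) are an assumption that you should verify against your Neo4j version. The operation keepStateKeepStore is used here only because it leaves both the store and the cluster state untouched.

```cypher
// Find the identifier of the server that hosts the quarantined copy,
// for example by matching the address reported by SHOW DATABASE.
SHOW SERVERS YIELD serverId, name, address;

// Lift the quarantine for database foo on that server.
// Replace '<server-id>' with the identifier found above.
CALL dbms.unquarantineDatabase('<server-id>', 'foo', 'keepStateKeepStore');

// Check that the database is moving towards its requested state.
SHOW DATABASE foo;
```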

Check if a database is quarantined
neo4j@system> SHOW DATABASE foo;
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| name  | type       | aliases | access       | address          | role      | writer | requestedStatus | currentStatus | statusMessage                                           | default | home  | constituents |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "foo" | "standard" | []      | "read-write" | "localhost:7688" | "unknown" | FALSE  | "online"        | "quarantined" | "By neo4j at 2020-10-15T15:10:41.348Z: No reason given" | FALSE   | FALSE | []           |
| "foo" | "standard" | []      | "read-write" | "localhost:7689" | "primary" | FALSE  | "online"        | "online"      | ""                                                      | FALSE   | FALSE | []           |
| "foo" | "standard" | []      | "read-write" | "localhost:7687" | "primary" | TRUE   | "online"        | "online"      | ""                                                      | FALSE   | FALSE | []           |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

3 rows available after 100 ms, consumed after another 6 ms

A quarantined state is persisted for user databases. This means that if a database is quarantined, it will remain so even if that Neo4j instance is restarted. You can remove it only by running the dbms.unquarantineDatabase() procedure.

The one exception to this rule is the built-in system database. Any quarantine for that database is removed automatically after the instance restarts.