CALL subqueries in transactions
CALL
subqueries can be made to execute in separate, inner transactions, producing intermediate commits.
This can be useful when doing large write operations, like batch updates, imports, and deletes.
To execute a CALL
subquery in separate transactions, add the modifier IN TRANSACTIONS
after the subquery.
An outer transaction is opened to report back the accumulated statistics for the inner transactions
(created and deleted nodes, relationships, etc.) and it will succeed or fail depending on the results
of those inner transactions.
By default, inner transactions group together batches of 1000 rows.
Cancelling the outer transaction will cancel the inner ones as well.
CALL { … } IN TRANSACTIONS is only allowed in implicit transactions.If you are using Neo4j Browser, you must prepend any queries using CALL { … } IN TRANSACTIONS with :auto .
|
The examples on this page use a variable scope clause (introduced in Neo4j 5.23) to import variables into the CALL subquery.
If you are using an older version of Neo4j, use an importing WITH clause instead.
|
Syntax
CALL {
subQuery
} IN [[concurrency] CONCURRENT] TRANSACTIONS
[OF batchSize ROW[S]]
[REPORT STATUS AS statusVar]
[ON ERROR {CONTINUE | BREAK | FAIL}];
Loading CSV data
This example uses a CSV file and the LOAD CSV
clause to import data into the database.
It creates nodes in separate transactions using CALL { … } IN TRANSACTIONS
:
1,Bill,26
2,Max,27
3,Anna,22
4,Gladys,29
5,Summer,24
LOAD CSV FROM 'file:///friends.csv' AS line
CALL (line) {
CREATE (:Person {name: line[1], age: toInteger(line[2])})
} IN TRANSACTIONS
|
Rows: 0 |
As the size of the CSV file in this example is small, only a single separate transaction is started and committed.
Deleting a large volume of data
Using CALL { … } IN TRANSACTIONS
is the recommended way of deleting a large volume of data.
MATCH (n)
CALL (n) {
DETACH DELETE n
} IN TRANSACTIONS
|
Rows: 0 |
The CALL { … } IN TRANSACTIONS
subquery should not be modified.
Any necessary filtering can be done before the subquery.
MATCH (n:Label) WHERE n.prop > 100
CALL (n) {
DETACH DELETE n
} IN TRANSACTIONS
|
Rows: 0 |
The batching is performed on the input rows fed into |
Batching
The amount of work to do in each separate transaction can be specified in terms of how many input rows
to process before committing the current transaction and starting a new one.
The number of input rows is set with the modifier OF n ROWS
(or OF n ROW
).
If omitted, the default batch size is 1000
rows.
The number of rows can be expressed using any expression that evaluates to a positive integer and does not refer to nodes or relationships.
This example loads a CSV file with one transaction for every 2
input rows:
1,Bill,26
2,Max,27
3,Anna,22
4,Gladys,29
5,Summer,24
LOAD CSV FROM 'file:///friends.csv' AS line
CALL (line) {
CREATE (:Person {name: line[1], age: toInteger(line[2])})
} IN TRANSACTIONS OF 2 ROWS
|
Rows: 0 |
The query now starts and commits three separate transactions:
-
The first two executions of the subquery (for the first two input rows from
LOAD CSV
) take place in the first transaction. -
The first transaction is then committed before proceeding.
-
The next two executions of the subquery (for the next two input rows) take place in a second transaction.
-
The second transaction is committed.
-
The last execution of the subquery (for the last input row) takes place in a third transaction.
-
The third transaction is committed.
You can also use CALL { … } IN TRANSACTIONS OF n ROWS
to delete all your data in batches in order to avoid a huge garbage collection or an OutOfMemory
exception.
For example:
MATCH (n)
CALL (n) {
DETACH DELETE n
} IN TRANSACTIONS OF 2 ROWS
|
Rows: 0 |
Up to a point, using a larger batch size will be more performant.
The batch size of |
Composite databases
As of Neo4j 5.18, CALL { … } IN TRANSACTIONS
can be used with composite databases.
Even though composite databases allow accessing multiple graphs in a single query, only one graph can be modified in a single transaction.
CALL { … } IN TRANSACTIONS
offers a way of constructing queries which modify multiple graphs.
While the previous examples are generally valid for composite databases, there’s a few extra factors that come into play when working with composite databases in subqueries.
The following examples show how you can use CALL { … } IN TRANSACTIONS
on a composite database.
1,Bill,26
2,Max,27
3,Anna,22
4,Gladys,29
5,Summer,24
Person
nodes on all constituents, drawing data from friends.csvUNWIND graph.names() AS graphName
LOAD CSV FROM 'file:///friends.csv' AS line
CALL (*) {
USE graph.byName( graphName )
CREATE (:Person {name: line[1], age: toInteger(line[2])})
} IN TRANSACTIONS
UNWIND graph.names() AS graphName
CALL {
USE graph.byName( graphName )
MATCH (n)
RETURN elementId(n) AS id
}
CALL {
USE graph.byName( graphName )
WITH id
MATCH (n)
WHERE elementId(n) = id
DETACH DELETE n
} IN TRANSACTIONS
Since the batching is performed on the input rows fed into CALL { … } IN TRANSACTIONS , the data must be supplied from outside the subquery in order for the batching to have an effect.
That is why the nodes are matched in a subquery preceding the one that actually deletes the data.
If the MATCH clause were inside the second subquery, the data deletion would run as one single transaction.
|
There is currently a known issue.
When an error occurs during |
Batch size in composite databases
Because CALL { … } IN TRANSACTIONS
subqueries targeting different graphs can’t be interleaved, if a USE
clause evaluates to a different target than the current one, the current batch is committed and the next batch is created.
The batch size declared with IN TRANSACTIONS OF … ROWS
represents an upper limit of the batch size, but the real batch size depends on how many input rows target one database in sequence.
Every time the target database changes, the batch is committed.
IN TRANSACTIONS OF ROWS
on composite databasesThe next example assumes the existence of two constituents remoteGraph1
and remoteGraph2
for the composite database composite
.
While the declared batch size is 3, only the first 2 rows act on composite.remoteGraph1
, so the batch size for the first transaction is 2. That is followed by 3 rows on composite.remoteGraph2
, 1 on composite.remoteGraph2
and finally 2 on composite.remoteGraph1
.
WITH ['composite.remoteGraph1', 'composite.remoteGraph2'] AS graphs
UNWIND [0, 0, 1, 1, 1, 1, 0, 0] AS i
WITH graphs[i] AS g
CALL (g) {
USE graph.byName( g )
CREATE ()
} IN TRANSACTIONS OF 3 ROWS
Error behavior
Users can choose one of three different option flags to control the behavior
in case of an error occurring in any of the inner transactions of CALL { … } IN TRANSACTIONS
:
-
ON ERROR CONTINUE
to ignore a recoverable error and continue the execution of subsequent inner transactions. The outer transaction succeeds. It will cause the expected variables from the failed inner query to be bound as null for that specific transaction. -
ON ERROR BREAK
to ignore a recoverable error and stop the execution of subsequent inner transactions. The outer transaction succeeds. It will cause expected variables from the failed inner query to be bound as null for all onward transactions (including the failed one). -
ON ERROR FAIL
to acknowledge a recoverable error and stop the execution of subsequent inner transactions. The outer transaction fails. This is the default behavior if no flag is explicitly specified.
On error, any previously committed inner transactions remain committed, and are not rolled back. Any failed inner transactions are rolled back. |
In the following example, the last subquery execution in the second inner transaction fails due to division by zero.
UNWIND [4, 2, 1, 0] AS i
CALL (i) {
CREATE (:Person {num: 100/i})
} IN TRANSACTIONS OF 2 ROWS
RETURN i
/ by zero (Transactions committed: 1)
When the failure occurred, the first transaction had already been committed, so the database contains two example nodes.
MATCH (e:Person)
RETURN e.num
e.num |
---|
|
|
Rows: 2 |
In the following example, ON ERROR CONTINUE
is used after a failed inner transaction to execute the remaining inner transactions and not fail the outer transaction:
UNWIND [1, 0, 2, 4] AS i
CALL (i) {
CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
RETURN n
} IN TRANSACTIONS
OF 1 ROW
ON ERROR CONTINUE
RETURN n.num;
n.num |
---|
|
|
|
|
Rows: 4 |
Note the difference in results when batching in transactions of 2 rows:
UNWIND [1, 0, 2, 4] AS i
CALL (i) {
CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
RETURN n
} IN TRANSACTIONS
OF 2 ROWS
ON ERROR CONTINUE
RETURN n.num;
n.num |
---|
|
|
|
|
Rows: 4 |
This happens because an inner transaction with the two first i
elements (1 and 0)
was created, and it fails for 0.
This causes it to be rolled back and the return
variable is filled with nulls for those two elements.
In the following example, ON ERROR BREAK
is used after a failed inner transaction to not execute the remaining inner transaction and not fail the outer transaction:
UNWIND [1, 0, 2, 4] AS i
CALL (i) {
CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
RETURN n
} IN TRANSACTIONS
OF 1 ROW
ON ERROR BREAK
RETURN n.num;
n.num |
---|
|
|
|
|
Rows: 4 |
Note the difference in results when batching in transactions of 2 rows:
UNWIND [1, 0, 2, 4] AS i
CALL (i) {
CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
RETURN n
} IN TRANSACTIONS
OF 2 ROWS
ON ERROR BREAK
RETURN n.num;
n.num |
---|
|
|
|
|
Rows: 4 |
In the following example, ON ERROR FAIL
is used after the failed inner transaction, to not execute the remaining inner transactions and to fail the outer transaction:
UNWIND [1, 0, 2, 4] AS i
CALL (i) {
CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
RETURN n
} IN TRANSACTIONS
OF 1 ROW
ON ERROR FAIL
RETURN n.num;
/ by zero (Transactions committed: 1)
Status report
Users can also report the execution status of the inner transactions by using REPORT STATUS AS var
.
This flag is disallowed for ON ERROR FAIL
. For more information, see Error behavior.
After each execution of the inner query finishes (successfully or not), a status value is created that records information about the execution and the transaction that executed it:
-
If the inner execution produces one or more rows as output, then a binding to this status value is added to each row, under the selected variable name.
-
If the inner execution fails then a single row is produced containing a binding to this status value under the selected variable, and null bindings for all variables that should have been returned by the inner query (if any).
The status value is a map value with the following fields:
-
started
:true
when the inner transaction was started,false
otherwise. -
committed
,true
when the inner transaction changes were successfully committed,false
otherwise. -
transactionId
: the inner transaction id, ornull
if the transaction was not started. -
errorMessage
, the inner transaction error message, ornull
in case of no error.
Example of reporting status with ON ERROR CONTINUE
:
UNWIND [1, 0, 2, 4] AS i
CALL (i) {
CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
RETURN n
} IN TRANSACTIONS
OF 1 ROW
ON ERROR CONTINUE
REPORT STATUS AS s
RETURN n.num, s;
n.num | s |
---|---|
|
|
|
|
|
|
|
|
Rows: 4 |
Example of reporting status with ON ERROR BREAK
:
UNWIND [1, 0, 2, 4] AS i
CALL (i) {
CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
RETURN n
} IN TRANSACTIONS
OF 1 ROW
ON ERROR BREAK
REPORT STATUS AS s
RETURN n.num, s.started, s.committed, s.errorMessage;
n.num | s.started | s.committed | s.errorMessage |
---|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Rows: 4 |
Reporting status with ON ERROR FAIL
is disallowed:
UNWIND [1, 0, 2, 4] AS i
CALL (i) {
CREATE (n:Person {num: 100/i}) // Note, fails when i = 0
RETURN n
} IN TRANSACTIONS
OF 1 ROW
ON ERROR FAIL
REPORT STATUS AS s
RETURN n.num, s.errorMessage;
REPORT STATUS can only be used when specifying ON ERROR CONTINUE or ON ERROR BREAK
Concurrent transactions
By default, CALL { … } IN TRANSACTIONS
is single-threaded; one CPU core is used to sequentially execute batches.
However, CALL
subqueries can also execute batches in parallel by appending IN [n] CONCURRENT TRANSACTIONS
, where n
is a concurrency value used to set the maximum number of transactions that can be executed in parallel.
This allows CALL
subqueries to utilize multiple CPU cores simultaneously, which can significantly reduce the time required to execute a large, outer transaction.
The concurrency value is optional. If not specified, a default value based on the amount of available CPU cores will be chosen. If a negative number is specified (which can only be done through a parameter), the concurrency will be the number of available CPU cores reduced by the absolute value of that number. |
CALL { … } IN CONCURRENT TRANSACTIONS
is particularly suitable for importing data without dependencies.
This example creates Person
nodes from a unique tmdbId
value assigned to each person row in the CSV file (444 in total) in 3 concurrent transactions.
LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/persons.csv' AS row
CALL (row) {
CREATE (p:Person {tmdbId: row.person_tmdbId})
SET p.name = row.name, p.born = row.born
} IN 3 CONCURRENT TRANSACTIONS OF 10 ROWS
RETURN count(*) AS personNodes
personNodes |
---|
|
Rows: 1 |
Concurrency and non-deterministic results
CALL { … } IN TRANSACTIONS
uses ordered semantics by default, where batches are committed in a sequential row-by-row order.
For example, in CALL { <I> } IN TRANSACTIONS
, any writes done in the execution of <I1>
must be observed by <I2>
, and so on.
In contrast, CALL { … } IN CONCURRENT TRANSACTIONS
uses concurrent semantics, where both the number of rows committed by a particular batch and the order of committed batches is undefined.
That is, in CALL { <I> } IN CONCURRENT TRANSACTIONS
, writes committed in the execution of <I1>
may or may not be observed by <I2>
, and so on.
The results of CALL
subqueries executed in concurrent transactions may, therefore, not be deterministic.
To guarantee deterministic results, ensure that the results of committed batches are not dependent on each other.
Using CALL { … } IN CONCURRENT TRANSACTIONS can impact error behavior.
Specifically, when using ON ERROR BREAK or ON ERROR FAIL and one transaction fails, then any concurrent transactions may not be interrupted and rolled back (though subsequent ones would).
This is because no timing guarantees can be given for concurrent transactions.
That is, an ongoing transaction may or may not commit successfully in the time window when the error is being handled.
Use the status report to determine which batches were committed and which failed or did not start.
|
Deadlocks
When a write transaction occurs, Neo4j takes locks to preserve data consistency while updating. For example, when creating or deleting a relationship, a write lock is taken on both the specific relationship and its connected nodes.
A deadlock happens when two transactions are blocked by each other because they are attempting to concurrently modify a node or a relationship that is locked by the other transaction (for more information about locks and deadlocks in Neo4j, see Operations Manual → Concurrent data access.
A deadlock may occur when using CALL { … } IN CONCURRENT TRANSACTIONS
if the transactions for two or more batches try to take the same locks in an order that results in a circular dependency between them.
If so, the impacted transactions are always rolled back, and an error is thrown unless the query is appended with ON ERROR CONTINUE
or ON ERROR BREAK
.
The following query tries to create Movie
and Year
nodes connected by a RELEASED_IN
relationship.
Note that there are only three different years in the CSV file, meaning that only three Year
nodes should be created.
LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/movies.csv' AS row
CALL (row) {
MERGE (m:Movie {movieId: row.movieId})
MERGE (y:Year {year: row.year})
MERGE (m)-[r:RELEASED_IN]->(y)
} IN 2 CONCURRENT TRANSACTIONS OF 10 ROWS
The deadlock occurs because the two transactions are simultaneously trying to lock and merge the same Year
.
ForsetiClient[transactionId=64, clientId=12] can't acquire ExclusiveLock{owner=ForsetiClient[transactionId=63, clientId=9]} on NODE_RELATIONSHIP_GROUP_DELETE(98) because holders of that lock are waiting for ForsetiClient[transactionId=64, clientId=12].
Wait list:ExclusiveLock[
Client[63] waits for [ForsetiClient[transactionId=64, clientId=12]]]
The following query uses ON ERROR CONTINUE
to bypass any deadlocks and continue with the execution of subsequent inner transactions.
It returns the transactionID
, commitStatus
and errorMessage
of the failed transactions.
ON ERROR CONTINUE
to ignore deadlocks and complete outer transactionLOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/movies.csv' AS row
CALL (row) {
MERGE (m:Movie {movieId: row.movieId})
MERGE (y:Year {year: row.year})
MERGE (m)-[r:RELEASED_IN]->(y)
} IN 2 CONCURRENT TRANSACTIONS OF 10 ROWS ON ERROR CONTINUE REPORT STATUS as status
WITH status
WHERE status.errorMessage IS NOT NULL
RETURN status.transactionId AS transaction, status.committed AS commitStatus, status.errorMessage AS errorMessage
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| transaction | commitStatus | errorMessage |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| "neo4j-transaction-169" | FALSE | "ForsetiClient[transactionId=169, clientId=11] can't acquire ExclusiveLock{owner=ForsetiClient[transactionId=168, clientId=9]} on NODE_RELATIONSHIP_GROUP_DELETE(46) because holders of that lock are waiting for ForsetiClient[transactionId=169, clientId=11]. |
| | \ Wait list:ExclusiveLock[ |
| | \ Client[168] waits for [ForsetiClient[transactionId=169, clientId=11]]]" |
| "neo4j-transaction-169" | FALSE | "ForsetiClient[transactionId=169, clientId=11] can't acquire ExclusiveLock{owner=ForsetiClient[transactionId=168, clientId=9]} on NODE_RELATIONSHIP_GROUP_DELETE(46) because holders of that lock are waiting for ForsetiClient[transactionId=169, clientId=11]. |
| | \ Wait list:ExclusiveLock[ |
| | \ Client[168] waits for [ForsetiClient[transactionId=169, clientId=11]]]" |
| "neo4j-transaction-169" | FALSE | "ForsetiClient[transactionId=169, clientId=11] can't acquire ExclusiveLock{owner=ForsetiClient[transactionId=168, clientId=9]} on NODE_RELATIONSHIP_GROUP_DELETE(46) because holders of that lock are waiting for ForsetiClient[transactionId=169, clientId=11]. |
| | \ Wait list:ExclusiveLock[ |
| | \ Client[168] waits for [ForsetiClient[transactionId=169, clientId=11]]]" |
| "neo4j-transaction-169" | FALSE | "ForsetiClient[transactionId=169, clientId=11] can't acquire ExclusiveLock{owner=ForsetiClient[transactionId=168, clientId=9]} on NODE_RELATIONSHIP_GROUP_DELETE(46) because holders of that lock are waiting for ForsetiClient[transactionId=169, clientId=11]. |
| | \ Wait list:ExclusiveLock[ |
| | \ Client[168] waits for [ForsetiClient[transactionId=169, clientId=11]]]" |
| "neo4j-transaction-169" | FALSE | "ForsetiClient[transactionId=169, clientId=11] can't acquire ExclusiveLock{owner=ForsetiClient[transactionId=168, clientId=9]} on NODE_RELATIONSHIP_GROUP_DELETE(46) because holders of that lock are waiting for ForsetiClient[transactionId=169, clientId=11]. |
| | \ Wait list:ExclusiveLock[ |
| | \ Client[168] waits for [ForsetiClient[transactionId=169, clientId=11]]]" |
| "neo4j-transaction-169" | FALSE | "ForsetiClient[transactionId=169, clientId=11] can't acquire ExclusiveLock{owner=ForsetiClient[transactionId=168, clientId=9]} on NODE_RELATIONSHIP_GROUP_DELETE(46) because holders of that lock are waiting for ForsetiClient[transactionId=169, clientId=11]. |
| | \ Wait list:ExclusiveLock[ |
| | \ Client[168] waits for [ForsetiClient[transactionId=169, clientId=11]]]" |
| "neo4j-transaction-169" | FALSE | "ForsetiClient[transactionId=169, clientId=11] can't acquire ExclusiveLock{owner=ForsetiClient[transactionId=168, clientId=9]} on NODE_RELATIONSHIP_GROUP_DELETE(46) because holders of that lock are waiting for ForsetiClient[transactionId=169, clientId=11]. |
| | \ Wait list:ExclusiveLock[ |
| | \ Client[168] waits for [ForsetiClient[transactionId=169, clientId=11]]]" |
| "neo4j-transaction-169" | FALSE | "ForsetiClient[transactionId=169, clientId=11] can't acquire ExclusiveLock{owner=ForsetiClient[transactionId=168, clientId=9]} on NODE_RELATIONSHIP_GROUP_DELETE(46) because holders of that lock are waiting for ForsetiClient[transactionId=169, clientId=11]. |
| | \ Wait list:ExclusiveLock[ |
| | \ Client[168] waits for [ForsetiClient[transactionId=169, clientId=11]]]" |
| "neo4j-transaction-169" | FALSE | "ForsetiClient[transactionId=169, clientId=11] can't acquire ExclusiveLock{owner=ForsetiClient[transactionId=168, clientId=9]} on NODE_RELATIONSHIP_GROUP_DELETE(46) because holders of that lock are waiting for ForsetiClient[transactionId=169, clientId=11]. |
| | \ Wait list:ExclusiveLock[ |
| | \ Client[168] waits for [ForsetiClient[transactionId=169, clientId=11]]]" |
| "neo4j-transaction-169" | FALSE | "ForsetiClient[transactionId=169, clientId=11] can't acquire ExclusiveLock{owner=ForsetiClient[transactionId=168, clientId=9]} on NODE_RELATIONSHIP_GROUP_DELETE(46) because holders of that lock are waiting for ForsetiClient[transactionId=169, clientId=11]. |
| | \ Wait list:ExclusiveLock[ |
| | \ Client[168] waits for [ForsetiClient[transactionId=169, clientId=11]]]" |
+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Click to see an example of failed transactions being retried using Cypher
While failed transactions may be more efficiently retried using a driver, below is an example how failed transactions can be retried within the same Cypher® query:
LOAD CSV WITH HEADERS FROM 'https://data.neo4j.com/importing-cypher/movies.csv' AS row
CALL (row) {
MERGE (m:Movie {movieId: row.movieId})
MERGE (y:Year {year: row.year})
MERGE (m)-[r:RELEASED_IN]->(y)
} IN 2 CONCURRENT TRANSACTIONS OF 10 ROWS ON ERROR CONTINUE REPORT STATUS as status
WITH *
WHERE status.committed = false
CALL (row) {
MERGE (m:Movie {movieId: row.movieId})
MERGE (y:Year {year: row.year})
MERGE (m)-[r:RELEASED_IN]->(y)
} IN 2 CONCURRENT TRANSACTIONS OF 10 ROWS ON ERROR FAIL
Restrictions
These are the restrictions on queries that use CALL { … } IN TRANSACTIONS
:
-
A nested
CALL { … } IN TRANSACTIONS
inside aCALL { … }
clause is not supported. -
A
CALL { … } IN TRANSACTIONS
in aUNION
is not supported. -
A
CALL { … } IN TRANSACTIONS
after a write clause is not supported, unless that write clause is inside aCALL { … } IN TRANSACTIONS
.