Coordinate parallel transactions
When working with a Neo4j cluster, causal consistency is enforced by default in most cases, which guarantees that a query is able to read changes made by previous queries. The same does not happen by default for multiple transactions running in parallel though. In that case, you can use bookmarks to have one transaction wait for the result of another to be propagated across the cluster before running its own work. This is not a requirement: you should only use bookmarks if you need causal consistency across different transactions, as waiting for bookmarks can have a negative performance impact.
A bookmark is a token that represents some state of the database. By passing one or multiple bookmarks along with a query, the server will make sure that the query does not get executed before the represented state(s) have been established.
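The mechanism can be pictured with a toy model (illustrative only, not driver code — the class and token format below are invented for the sketch): a write produces a token identifying a point in the transaction history, and a cluster member holding that token refuses to serve the query until it has caught up to that point.

```javascript
// Conceptual sketch: a bookmark is an opaque token naming a point in the
// database's transaction history. A server receiving a bookmark delays the
// query until it has applied at least that much state.
class ToyServer {
  constructor() {
    this.appliedTx = 0 // highest transaction this member has applied
  }
  write() {
    this.appliedTx += 1
    return `bookmark:${this.appliedTx}` // token for the state just created
  }
  canRead(bookmark) {
    // the query may only run once the represented state is established
    const required = Number(bookmark.split(':')[1])
    return this.appliedTx >= required
  }
}

const leader = new ToyServer()
const follower = new ToyServer() // lags behind the leader

const bm = leader.write() // returns 'bookmark:1'
console.log(follower.canRead(bm)) // false: state not yet propagated
follower.appliedTx = leader.appliedTx // replication catches up
console.log(follower.canRead(bm)) // true: safe to read
```

In the real driver the token's content is opaque and the waiting happens server-side; the only contract is "collect bookmarks after writes, present them before dependent reads", which is what the APIs below automate.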
Bookmarks with .executeQuery()
When querying the database with .executeQuery(), the driver manages bookmarks for you. In this case, you have the guarantee that subsequent queries can read previous changes without taking further action.
await driver.executeQuery('<QUERY 1>')
// subsequent executeQuery calls will be causally chained
await driver.executeQuery('<QUERY 2>') // can read result of <QUERY 1>
await driver.executeQuery('<QUERY 3>') // can read result of <QUERY 2>
To disable bookmark management and causal consistency, set the bookmarkManager option to null in .executeQuery() calls.
await driver.executeQuery(
'<QUERY>',
{},
{
bookmarkManager: null
}
)
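Instead of null, the bookmarkManager option can also take a custom manager. As a sketch of what such an object looks like, here is a minimal in-memory implementation; the method names mirror the driver's BookmarkManager interface as of driver 5.x (updateBookmarks, getBookmarks), but treat that as an assumption and check your driver version's API reference (the driver also ships a neo4j.bookmarkManager() factory so you rarely need to hand-roll one).

```javascript
// Illustrative sketch of the bookmark-manager contract assumed by the
// bookmarkManager option. Not production code: the real driver provides
// driver.executeQueryBookmarkManager and neo4j.bookmarkManager().
function inMemoryBookmarkManager(initialBookmarks = []) {
  const bookmarks = new Set(initialBookmarks)
  return {
    // called by the driver after a transaction commits
    async updateBookmarks(previousBookmarks, newBookmarks) {
      for (const b of previousBookmarks) bookmarks.delete(b) // superseded
      for (const b of newBookmarks) bookmarks.add(b)
    },
    // called by the driver before a transaction starts
    async getBookmarks() {
      return [...bookmarks]
    }
  }
}

const manager = inMemoryBookmarkManager()
;(async () => {
  await manager.updateBookmarks([], ['bm1'])
  await manager.updateBookmarks(['bm1'], ['bm2'])
  console.log(await manager.getBookmarks()) // the only remaining bookmark is 'bm2'
})()
```

Sharing one such manager between several queries is exactly how the driver chains them causally: each commit feeds new bookmarks in, and each new query reads the accumulated set out.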
Bookmarks within a single session
Bookmark management happens automatically for queries run within a single session, so that you can trust that queries inside one session are causally chained.
let session = driver.session({database: 'neo4j'})
try {
await session.executeWrite(async tx => {
await tx.run("<QUERY 1>")
})
await session.executeWrite(async tx => {
await tx.run("<QUERY 2>") // can read result of QUERY 1
})
await session.executeWrite(async tx => {
await tx.run("<QUERY 3>") // can read result of QUERY 1, 2
})
} finally {
await session.close()
}
Bookmarks across multiple sessions
If your application uses multiple sessions, you may need to ensure that one session has completed all its transactions before another session is allowed to run its queries.
In the example below, sessionA and sessionB are allowed to run concurrently, while sessionC waits until their results have been propagated. This guarantees that the Person nodes sessionC wants to act on actually exist.
const neo4j = require('neo4j-driver');
(async () => {
const URI = '<URI to Neo4j database>'
const USER = '<Username>'
const PASSWORD = '<Password>'
let driver
try {
driver = neo4j.driver(URI, neo4j.auth.basic(USER, PASSWORD))
await driver.verifyConnectivity()
} catch(err) {
console.log(`-- Connection error --\n${err}\n-- Cause --\n${err.cause}`)
return
}
await createFriends(driver)
})()
async function createFriends(driver) {
let savedBookmarks = [] // To collect the sessions' bookmarks
// Create the first person and employment relationship.
const sessionA = driver.session({database: 'neo4j'})
try {
await createPerson(sessionA, 'Alice')
await employPerson(sessionA, 'Alice', 'Wayne Enterprises')
savedBookmarks = savedBookmarks.concat(sessionA.lastBookmarks()) // (1)
} finally {
await sessionA.close()
}
// Create the second person and employment relationship.
const sessionB = driver.session({database: 'neo4j'})
try {
await createPerson(sessionB, 'Bob')
await employPerson(sessionB, 'Bob', 'LexCorp')
savedBookmarks = savedBookmarks.concat(sessionB.lastBookmarks()) // (1)
} finally {
await sessionB.close()
}
// Create (and show) a friendship between the two people created above.
const sessionC = driver.session({
database: 'neo4j',
bookmarks: savedBookmarks // (2)
})
try {
await createFriendship(sessionC, 'Alice', 'Bob')
await printFriendships(sessionC)
} finally {
await sessionC.close()
}
}
// Create a person node.
async function createPerson(session, name) {
await session.executeWrite(async tx => {
await tx.run('CREATE (:Person {name: $name})', { name: name })
})
}
// Create an employment relationship to a pre-existing company node.
// This relies on the person first having been created.
async function employPerson(session, personName, companyName) {
await session.executeWrite(async tx => {
await tx.run(`
MATCH (person:Person {name: $personName})
MATCH (company:Company {name: $companyName})
CREATE (person)-[:WORKS_FOR]->(company)`,
{ personName: personName, companyName: companyName }
)
})
}
// Create a friendship between two people.
async function createFriendship(session, nameA, nameB) {
await session.executeWrite(async tx => {
await tx.run(`
MATCH (a:Person {name: $nameA})
MATCH (b:Person {name: $nameB})
MERGE (a)-[:KNOWS]->(b)
`, { nameA: nameA, nameB: nameB }
)
})
}
// Retrieve and display all friendships.
async function printFriendships(session) {
const result = await session.executeRead(async tx => {
return await tx.run('MATCH (a)-[:KNOWS]->(b) RETURN a.name, b.name')
})
for (const record of result.records) {
console.log(`${record.get('a.name')} knows ${record.get('b.name')}`)
}
}
(1) Collect and combine bookmarks from different sessions using the method Session.lastBookmarks(). Note that Array.concat() returns a new array, so the result must be reassigned.
(2) Use them to initialize another session with the bookmarks parameter.
The use of bookmarks can negatively impact performance, since all queries are forced to wait for the latest changes to be propagated across the cluster. For simple use cases, try to group queries within a single transaction, or within a single session.
Mix .executeQuery() and sessions
To ensure causal consistency among transactions executed partly with .executeQuery() and partly with sessions, set the bookmarkManager option upon session creation to driver.executeQueryBookmarkManager. Since that is the default bookmark manager for .executeQuery() calls, all work is executed under the same bookmark manager and is thus causally consistent.
await driver.executeQuery('<QUERY 1>')
const session = driver.session({
bookmarkManager: driver.executeQueryBookmarkManager
})
try {
// every query inside this session will be causally chained
// (i.e., can read what was written by <QUERY 1>)
await session.executeWrite(async tx => tx.run('<QUERY 2>'))
} finally {
await session.close()
}
// subsequent executeQuery calls will be causally chained
// (i.e., can read what was written by <QUERY 2>)
await driver.executeQuery('<QUERY 3>')
Glossary
- LTS
-
A Long Term Support release is one guaranteed to be supported for a number of years. Neo4j 4.4 is LTS, and Neo4j 5 will also have an LTS version.
- Aura
-
Aura is Neo4j’s fully managed cloud service. It comes with both free and paid plans.
- Cypher
-
Cypher is Neo4j’s graph query language that lets you retrieve data from the database. It is like SQL, but for graphs.
- APOC
-
Awesome Procedures On Cypher (APOC) is a library of (many) functions that can not be easily expressed in Cypher itself.
- Bolt
-
Bolt is the protocol used for interaction between Neo4j instances and drivers. It listens on port 7687 by default.
- ACID
-
Atomicity, Consistency, Isolation, Durability (ACID) are properties guaranteeing that database transactions are processed reliably. An ACID-compliant DBMS ensures that the data in the database remains accurate and consistent despite failures.
- eventual consistency
-
A database is eventually consistent if it provides the guarantee that all cluster members will, at some point in time, store the latest version of the data.
- causal consistency
-
A database is causally consistent if read and write queries are seen by every member of the cluster in the same order. This is stronger than eventual consistency.
- NULL
-
The null marker is not a type but a placeholder for absence of value. For more information, see Cypher → Working with null.
- transaction
-
A transaction is a unit of work that is either committed in its entirety or rolled back on failure. An example is a bank transfer: it involves multiple steps, but they must all succeed or be reverted, to avoid money being subtracted from one account but not added to the other.
- backpressure
-
Backpressure is a force opposing the flow of data. It ensures that the client is not being overwhelmed by data faster than it can handle.