Further query mechanisms

Implicit (or auto-commit) transactions

This is the most basic and limited way to run a Cypher query. Unlike queries run with execute_query() or within managed transactions, implicit transactions are not automatically retried by the driver. Use them only when the other driver query interfaces do not fit the purpose, or for quick prototyping.

You run an implicit transaction with the Session.run() method, which returns a Result object that you then need to process.

with driver.session(database="neo4j") as session:
    session.run("CREATE (a:Person {name: $name})", name="Licia")

An implicit transaction gets committed at the latest when the session is destroyed, or before another transaction is executed within the same session. Other than that, there is no clear guarantee on when exactly an implicit transaction will be committed during the lifetime of a session. To ensure an implicit transaction is committed, you can call the .consume() method on its result.
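
A minimal sketch of that pattern: the helper below runs an implicit transaction and immediately consumes its result, so the write is committed before the function returns. The name create_person is hypothetical, and session is assumed to be an open session obtained from driver.session().

```python
def create_person(session, name):
    """Run an implicit transaction and force it to complete.

    `session` is assumed to be an open neo4j Session; `create_person`
    is a hypothetical helper name, not part of the driver API.
    """
    result = session.run("CREATE (a:Person {name: $name})", name=name)
    # consume() discards any remaining records and returns the result
    # summary; once it returns, the implicit transaction is committed.
    summary = result.consume()
    return summary.counters.nodes_created


# Usage (requires a reachable database):
# with driver.session(database="neo4j") as session:
#     created = create_person(session, "Licia")
```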

Since the driver cannot figure out whether the query in a session.run() call requires a read or write session with the database, it defaults to write. If your implicit transaction contains read queries only, there is a performance gain in making the driver aware by setting the keyword argument default_access_mode=neo4j.READ_ACCESS when creating the session.

Implicit transactions are the only ones that can be used for CALL { ... } IN TRANSACTIONS queries.

Import CSV files

The most common use case for Session.run() is importing large CSV files into the database with the LOAD CSV Cypher clause, while avoiding the timeout errors that a single oversized transaction can cause.

Import CSV data into a Neo4j database
with driver.session(database="neo4j") as session:
    result = session.run("""
        LOAD CSV FROM 'https://data.neo4j.com/bands/artists.csv' AS line
        CALL {
            WITH line
            MERGE (:Artist {name: line[1], age: toInteger(line[2])})
        } IN TRANSACTIONS OF 2 ROWS
    """)
    print(result.consume().counters)
While LOAD CSV can be convenient, there is nothing wrong with parsing the CSV file in your Python application and avoiding LOAD CSV altogether. In fact, moving the parsing logic to the application can give you more control over the import process. For efficient bulk data insertion, see Performance → Batch data creation.
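
As a rough sketch of application-side parsing, the code below reads CSV text with Python's csv module, groups the rows into batches, and sends each batch as a list parameter to an UNWIND query. The column layout (id, name, age, no header row), the batch size, and the helper names batched_rows and import_artists are assumptions made for illustration.

```python
import csv
import io


def batched_rows(csv_text, batch_size=1000):
    """Yield lists of {"name", "age"} dicts, at most batch_size per list.

    Assumes each row is laid out as: id, name, age (no header row).
    """
    batch = []
    for line in csv.reader(io.StringIO(csv_text)):
        if not line:
            continue  # csv.reader yields [] for blank lines
        batch.append({"name": line[1], "age": int(line[2])})
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller, batch


def import_artists(driver, csv_text, batch_size=1000):
    """Send one query per batch; `driver` is assumed to be an open Driver."""
    for rows in batched_rows(csv_text, batch_size):
        driver.execute_query(
            """
            UNWIND $rows AS row
            MERGE (:Artist {name: row.name, age: row.age})
            """,
            rows=rows,
            database_="neo4j",
        )
```

Because each batch goes through execute_query(), it runs in its own transaction, which keeps individual transactions small in the same spirit as IN TRANSACTIONS OF n ROWS.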

For more information, see Cypher → Clauses → Load CSV.

Transaction configuration

The Query object lets you specify a query timeout and attach metadata to the transaction. The metadata is visible in the server logs (as described for the unit_of_work decorator).

from neo4j import Query

with driver.session(database="neo4j") as session:
    query = Query("CREATE (a:Person {name: $name})",
                  timeout=1.0,
                  metadata={"app_name": "people"})
    result = session.run(query, name="John")

Dynamic values in property keys, relationship types, and labels

In general, you should not concatenate parameters directly into a query, but rather use query parameters. There can however be circumstances where your query structure prevents the usage of parameters in all its parts. In fact, although parameters can be used for literals and expressions as well as node and relationship ids, they cannot be used for the following constructs:

  • property keys, so MATCH (n) WHERE n.$param = 'something' is invalid;

  • relationship types, so MATCH (n)-[:$param]->(m) is invalid;

  • labels, so MATCH (n:$param) is invalid.

For those queries, you are forced to use string concatenation. To protect against Cypher injections, you should enclose the dynamic values in backticks and escape them yourself. Notice that Cypher processes Unicode, so take care of the Unicode literal \u0060 as well.

Manually escaping dynamic labels before concatenation.
label = "Person\\u0060n"
# convert \u0060 to literal backtick and then escape backticks
escaped_label = label.replace("\\u0060", "`").replace("`", "``")

driver.execute_query(
    f"MATCH (p:`{escaped_label}` {{name: $name}}) RETURN p.name",
    name="Alice",
    database_="neo4j"
)
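
The same escaping can be wrapped in a small helper so that every dynamic label, relationship type, or property key goes through one code path. This is a sketch only: escape_identifier is a hypothetical name, not a driver function, and it handles just the two cases mentioned above (literal backticks and the \u0060 escape sequence).

```python
def escape_identifier(name):
    """Return `name` wrapped in backticks, ready to splice into Cypher.

    Hypothetical helper: converts the literal text \\u0060 to a backtick,
    then doubles every backtick so none can close the quoted identifier.
    """
    escaped = name.replace("\\u0060", "`").replace("`", "``")
    return f"`{escaped}`"


# Dynamic values, e.g. coming from user input (illustrative only)
label = "Person\\u0060n"
rel_type = "ACTED_IN"

query = (
    f"MATCH (p:{escape_identifier(label)})"
    f"-[:{escape_identifier(rel_type)}]->(m) "
    "WHERE p.name = $name RETURN m"
)
# Ordinary values such as $name still travel as query parameters.
```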

Another workaround, which avoids string concatenation, is using APOC procedures, such as apoc.merge.node, which supports dynamic labels and property keys.

Using apoc.merge.node to create a node with dynamic labels/property keys.
property_key = "name"
label = "Person"

driver.execute_query(
    "CALL apoc.merge.node($labels, $properties)",
    labels=[label], properties={property_key: "Alice"},
    database_="neo4j"
)
If you are running Neo4j in Docker, APOC needs to be enabled when starting the container. See APOC → Installation → Docker.

Logging

The driver logs messages through the native logging library to a logger named neo4j. To redirect log messages to standard output, use the watch function:

import sys
from neo4j.debug import watch

watch("neo4j", out=sys.stdout)
Example of log output upon driver connection
[DEBUG   ] [Thread 139807941394432] [Task None           ] 2023-03-31 09:31:39,616  [#0000]  _: <POOL> created, routing address IPv4Address(('localhost', 7687))
[DEBUG   ] [Thread 139807941394432] [Task None           ] 2023-03-31 09:31:39,616  [#0000]  _: <POOL> acquire routing connection, access_mode='WRITE', database='neo4j'
[DEBUG   ] [Thread 139807941394432] [Task None           ] 2023-03-31 09:31:39,616  [#0000]  _: <ROUTING> checking table freshness (readonly=False): table expired=True, has_server_for_mode=False, table routers={IPv4Address(('localhost', 7687))} => False
[DEBUG   ] [Thread 139807941394432] [Task None           ] 2023-03-31 09:31:39,616  [#0000]  _: <POOL> attempting to update routing table from IPv4Address(('localhost', 7687))
[DEBUG   ] [Thread 139807941394432] [Task None           ] 2023-03-31 09:31:39,616  [#0000]  _: <RESOLVE> in: localhost:7687
[DEBUG   ] [Thread 139807941394432] [Task None           ] 2023-03-31 09:31:39,617  [#0000]  _: <RESOLVE> dns resolver out: 127.0.0.1:7687
[DEBUG   ] [Thread 139807941394432] [Task None           ] 2023-03-31 09:31:39,617  [#0000]  _: <POOL> _acquire router connection, database='neo4j', address=ResolvedIPv4Address(('127.0.0.1', 7687))
[DEBUG   ] [Thread 139807941394432] [Task None           ] 2023-03-31 09:31:39,617  [#0000]  _: <POOL> trying to hand out new connection
[DEBUG   ] [Thread 139807941394432] [Task None           ] 2023-03-31 09:31:39,617  [#0000]  C: <OPEN> 127.0.0.1:7687
[DEBUG   ] [Thread 139807941394432] [Task None           ] 2023-03-31 09:31:39,619  [#AF18]  C: <MAGIC> 0x6060B017
[DEBUG   ] [Thread 139807941394432] [Task None           ] 2023-03-31 09:31:39,619  [#AF18]  C: <HANDSHAKE> 0x00000005 0x00020404 0x00000104 0x00000003
[DEBUG   ] [Thread 139807941394432] [Task None           ] 2023-03-31 09:31:39,620  [#AF18]  S: <HANDSHAKE> 0x00000005
[DEBUG   ] [Thread 139807941394432] [Task None           ] 2023-03-31 09:31:39,620  [#AF18]  C: HELLO {'user_agent': 'neo4j-python/5.6.0 Python/3.10.6-final-0 (linux)', 'routing': {'address': 'localhost:7687'}, 'scheme': 'basic', 'principal': 'neo4j', 'credentials': '*******'}
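
Because the driver logs through the standard logging library, the neo4j logger can also be configured directly with logging, for instance to choose the level or message format yourself. A minimal sketch, assuming you want DEBUG-level messages on standard output; the helper name enable_driver_logging and the format string are illustrative choices, not part of the driver.

```python
import logging
import sys


def enable_driver_logging(level=logging.DEBUG, stream=sys.stdout):
    """Attach a stream handler to the logger named "neo4j"."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s [%(levelname)s] %(name)s: %(message)s")
    )
    logger = logging.getLogger("neo4j")
    logger.setLevel(level)
    logger.addHandler(handler)
    return handler  # returned so the caller can remove it later


handler = enable_driver_logging()
# ... run queries here; driver messages are written to stdout ...
logging.getLogger("neo4j").removeHandler(handler)
```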

Glossary

LTS

A Long Term Support release is one guaranteed to be supported for a number of years. Neo4j 4.4 is LTS, and Neo4j 5 will also have an LTS version.

Aura

Aura is Neo4j’s fully managed cloud service. It comes with both free and paid plans.

Cypher

Cypher is Neo4j’s graph query language that lets you retrieve data from the database. It is like SQL, but for graphs.

APOC

Awesome Procedures On Cypher (APOC) is a library of (many) functions that cannot be easily expressed in Cypher itself.

Bolt

Bolt is the protocol used for interaction between Neo4j instances and drivers. It listens on port 7687 by default.

ACID

Atomicity, Consistency, Isolation, Durability (ACID) are properties guaranteeing that database transactions are processed reliably. An ACID-compliant DBMS ensures that the data in the database remains accurate and consistent despite failures.

eventual consistency

A database is eventually consistent if it provides the guarantee that all cluster members will, at some point in time, store the latest version of the data.

causal consistency

A database is causally consistent if read and write queries are seen by every member of the cluster in the same order. This is stronger than eventual consistency.

NULL

The null marker is not a type but a placeholder for absence of value. For more information, see Cypher → Working with null.

transaction

A transaction is a unit of work that is either committed in its entirety or rolled back on failure. An example is a bank transfer: it involves multiple steps, but they must all succeed or be reverted, to avoid money being subtracted from one account but not added to the other.

backpressure

Backpressure is a force opposing the flow of data. It ensures that the client is not overwhelmed by data arriving faster than it can handle.

transaction function

A transaction function is a callback executed by an execute_read or execute_write call. The driver automatically re-executes the callback in case of server failure.

Driver

A Driver object holds the details required to establish connections with a Neo4j database.