Knowledge Base

A note on OPTIONAL MATCHes

An OPTIONAL MATCH matches patterns against your graph database, just like a MATCH does. The difference is that if no matches are found, OPTIONAL MATCH will use a null for missing parts of the pattern. OPTIONAL MATCH could be considered the Cypher equivalent of the outer join in SQL.

Knowing this, one could write a cypher query as so:

OPTIONAL MATCH (actor:Actor)-[:ACTED_IN]->(:Movie)<-[:DIRECTED]-(director:Director {uuid:'fd787f45-df1a-4f8b-b4a9-ab7b90fefae4'})
OPTIONAL MATCH (director)-[:HAS_STUDIO*2]->(:Studio)-[:HAS_ACTOR]->(actor)-[:HAS_PROFILE]->(profile:Profile)
WHERE profile.type IN ["Drama", "Action"]
MERGE (director)-[:HAS]->(actor)
ON CREATE SET profile.uuid = apoc.create.uuid(), profile.lastTimeReported = '2018-12-15T23:13:22.274', profile.reportingState = 'Active'
ON MATCH SET profile.lastTimeReported = '2018-07-15T23:30:00.000', profile.reportingState = 'Active'
RETURN COUNT(profile) as updatedCount

However, this query may return an error such as:

Neo.DatabaseError.General.UnknownError: org.neo4j.values.storable.NoValue cannot be cast to org.neo4j.values.virtual.VirtualNodeValue

(pre 3.4.5)

or

Neo.DatabaseError.Statement.ExecutionFailed: Expected to find a node at ref slot 0 but found instead: null

(post 3.4.5)

This is because starting a query with an OPTIONAL MATCH doesn’t always play well with Neo4j, namely when your OPTIONAL MATCH returns no value and affects the rest of the query plan. The simplest form of query to trigger this is the following, when run on an empty database:

OPTIONAL MATCH (a)
MERGE (a)-[:X]->()

Messages thrown in these cases are the following:

Expected to find a node at $name but found instead: null
Expected to find a node at $name but found instead: $x
Expected to find a node at $name but found instead: null

When faced with one of these messages, the problem is related to the optional matches and can be solved by starting the query with a normal MATCH such as:

MATCH (actor:Actor)-[:ACTED_IN]->(:Movie)<-[:DIRECTED]-(director:Director {uuid:'fd787f45-df1a-4f8b-b4a9-ab7b90fefae4'})
OPTIONAL MATCH (director)-[:HAS_STUDIO*2]->(:Studio)-[:HAS_ACTOR]->(actor)-[:HAS_PROFILE]->(profile:Profile)
WHERE profile.type IN ["Drama", "Action"]
MERGE (director)-[:HAS]->(actor)
ON CREATE SET profile.uuid = apoc.create.uuid(), profile.lastTimeReported = '2018-12-15T23:13:22.274', profile.reportingState = 'Active'
ON MATCH SET profile.lastTimeReported = '2018-07-15T23:30:00.000', profile.reportingState = 'Active'
RETURN COUNT(profile) as updatedCount

If you have a long OPTIONAL MATCH we suggest you break it down so that you turn it into (at least) one MATCH followed by the OPTIONAL MATCH(es). You should MATCH the portions that must be present, and then the OPTIONAL MATCH(es) on the pieces of the pattern that may not be present.

For the example above, the full pattern in the first line is needed, since we want the director and its actors at a minimum. In other cases, maybe only a single starting node is needed at a minimum. Rule of thumb is starting with a MATCH to attempt to capture the minimum part of the graph needed, then use OPTIONAL MATCH(es) on anything else that is optional.