Migration from Legacy to new Cypher projection
Who should read this guide
This guide is intended for users who have been using the Legacy Cypher projection gds.graph.project.cypher.
Cypher projections are now done using the gds.graph.project aggregation function.
We assume that most of the mentioned operations and concepts can be understood with little explanation.
Thus we are intentionally brief in the examples and comparisons.
Please see the documentation for the Cypher projection for more details.
Structural Changes
The Legacy Cypher projection is a standalone procedure call where Cypher queries are passed as string arguments and executed by GDS. The new Cypher projection is an aggregation function that is called as part of a Cypher query. GDS is no longer responsible for, or in control of, the execution of the Cypher queries. Migrating to the new Cypher projection requires changing how the Cypher query is written as a whole.
There are no longer separate queries for nodes and relationships.
Instead, write one query that produces the source- and target node pairs and use gds.graph.project to aggregate them into the graph catalog.
Since the relationship query from the Legacy Cypher projection already required you to return the source- and target node pairs, it is a good starting point for the new query.
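Roughly speaking, the query has to be rewritten as follows: keep the matching part of the relationship query, and replace its final RETURN of source and target ids with a call to gds.graph.project that receives the graph name, the source node, and the target node.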
The query no longer needs to adhere to a certain structure and you can use any Cypher query that produces the source- and target node pairs.
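As a rough sketch of this rewrite, the two call shapes can be put side by side. The parameters $graphName, $nodeQuery, $relationshipQuery and $configuration, and the variables source and target, are placeholders assumed for illustration and do not come from a specific query.

```cypher
// Legacy: both queries are passed as strings and executed by GDS.
CALL gds.graph.project.cypher(
  $graphName,
  $nodeQuery,
  $relationshipQuery,
  $configuration
);

// New: the matching part of the former relationship query runs as ordinary
// Cypher, and the aggregation function consumes the resulting node pairs.
// The MATCH pattern below stands in for whatever your query matches.
MATCH (source)-[r]->(target)
RETURN gds.graph.project($graphName, source, target);
```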
Semantic Changes
The Legacy Cypher projection has separate queries for nodes and relationships. The nodes query is executed first and defines all the nodes in the graph. The relationships query is executed second and the previously imported nodes act as a filter for the relationships. Only relationships between the previously imported nodes are imported into the graph. Any node that was imported as part of the node query, but does not appear in any of the relationships, results in a disconnected node in the graph. By default, all nodes are disconnected unless they also appear in a relationship.
The new Cypher projection does not have separate queries for nodes and relationships.
The node query is no longer needed and nodes are implicitly created from the source- and target node pairs.
Disconnected nodes have to be explicitly created in the query by providing NULL in place of the target node.
By default, all nodes are connected unless they are explicitly disconnected.
Since the new Cypher projection is no longer in charge of executing the Cypher queries, the graph configuration can no longer return the node- and relationship queries.
Examples
The following examples are based on the examples listed in the documentation for the Legacy Cypher projection and the new Cypher projection.
Simple graph
Legacy | New |
---|---|
Simple graph projection with potentially disconnected nodes | |
Simple graph projection without disconnected nodes | |
Not applicable, Legacy Cypher projection cannot guarantee connected nodes. | |
The direct translation requires the use of an OPTIONAL MATCH clause to create disconnected nodes in order to create an identical graph.
This may not have been what you wanted originally, but was required since the Legacy Cypher projection could not guarantee connected nodes.
By using what is equivalent to the $relationshipQuery, we now also get only connected nodes in the new Cypher projection.
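The following sketch illustrates the three variants discussed here. The Person label and the KNOWS relationship type are assumed for illustration; only the graph name 'persons' appears in this guide. The statements are alternatives, so run only one of them, since a graph name can exist only once in the graph catalog.

```cypher
// Legacy: node query and relationship query are passed as strings.
CALL gds.graph.project.cypher(
  'persons',
  'MATCH (n:Person) RETURN id(n) AS id',
  'MATCH (n:Person)-[:KNOWS]->(m:Person) RETURN id(n) AS source, id(m) AS target'
)
YIELD graphName, nodeCount, relationshipCount;

// New, direct translation: OPTIONAL MATCH yields m = NULL for persons
// without an outgoing KNOWS relationship, which keeps them as disconnected nodes.
MATCH (n:Person)
OPTIONAL MATCH (n)-[:KNOWS]->(m:Person)
WITH gds.graph.project('persons', n, m) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels;

// New, connected nodes only: equivalent to using just the relationship query.
MATCH (n:Person)-[:KNOWS]->(m:Person)
WITH gds.graph.project('persons', n, m) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels;
```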
Another difference is that we pass the nodes directly to the new Cypher projection.
The Legacy Cypher projection required us to pass the node ids.
By passing the nodes directly, the Cypher projection knows that the source for the projection is a Neo4j database, which enables the use of .write procedures.
It is also possible to pass node ids instead of nodes … gds.graph.project('persons', id(n), id(m)), but this is only recommended if the source for the projection is not a Neo4j database.
See Arbitrary source and target id values for more details.
Multi-graph
Legacy | New |
---|---|
Multi-graph projection | |
Similar to the previous example, we have to use an OPTIONAL MATCH clause to create disconnected nodes in order to create an identical graph.
The query can also look different depending on the actual graph schema and whether disconnected nodes are desired.
Node labels and relationship types are passed as an additional configuration map to the new Cypher projection.
Node labels need to be passed as sourceNodeLabels and targetNodeLabels, and relationship types need to be passed as relationshipType.
See Multi-graph for more details.
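A minimal sketch of such a multi-graph projection is shown below. The labels Person and Book, the relationship types KNOWS and READ, and the graph name 'personsAndBooks' are assumptions made for illustration; adapt them to your schema.

```cypher
// Match every Person or Book node; OPTIONAL MATCH keeps nodes without
// outgoing relationships by producing NULL for r and m.
MATCH (n)
WHERE n:Person OR n:Book
OPTIONAL MATCH (n)-[r:KNOWS|READ]->(m)
WHERE m:Person OR m:Book
WITH gds.graph.project(
  'personsAndBooks',
  n,
  m,
  {
    sourceNodeLabels: labels(n),  // labels of the source node
    targetNodeLabels: labels(m),  // labels of the target node
    relationshipType: type(r)     // type of the relationship between them
  }
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels;
```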
Node properties
Legacy | New |
---|---|
Graph projection with node properties | |
Graph projection with optional node properties | |
Similar to the previous example, we pass the labels and properties in an additional map. We can use map projections as well as any other Cypher expression to create the properties. See Node properties for more details.
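As an illustration, the sketch below passes labels and properties in the additional map. The Person label, the KNOWS type, the age property, the default value 18, and the graph name 'graphWithProperties' are all assumed for this example. The coalesce call shows one way to supply a default for a property that may be missing.

```cypher
MATCH (n:Person)
OPTIONAL MATCH (n)-[:KNOWS]->(m:Person)
WITH gds.graph.project(
  'graphWithProperties',
  n,
  m,
  {
    sourceNodeLabels: labels(n),
    targetNodeLabels: labels(m),
    // Map projections with an explicit key and a Cypher expression as value;
    // coalesce falls back to 18 when the age property is absent.
    sourceNodeProperties: n { age: coalesce(n.age, 18) },
    targetNodeProperties: m { age: coalesce(m.age, 18) }
  }
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels;
```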
Relationship properties
Legacy | New |
---|---|
Graph projection with relationship properties | |
Similar to the previous example, we pass properties in an additional map, here using the relationshipProperties key.
We can use map projections as well as any other Cypher expression to create the properties.
See Relationship properties for more details.
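A short sketch of a projection with a relationship property follows. The labels Person and Book, the READ type, the numberOfPages property, and the graph name 'readWithProperties' are assumptions chosen for illustration.

```cypher
MATCH (n:Person)-[r:READ]->(m:Book)
WITH gds.graph.project(
  'readWithProperties',
  n,
  m,
  {
    relationshipType: type(r),
    // Map projection that copies the numberOfPages property of r.
    relationshipProperties: r { .numberOfPages }
  }
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels;
```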
Parallel Relationship
Legacy | New |
---|---|
Graph projection with parallel relationships | |
Graph projection with parallel relationship and relationship properties | |
Similar to Legacy Cypher projections, there is no mechanism to let GDS aggregate parallel relationships. Aggregations over parallel relationships are done in the query by any means that are appropriate for the graph schema and data. See Parallel relationship for more details.
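One possible way to aggregate parallel relationships in the query is sketched below. It collapses all READ relationships between the same pair of nodes into one relationship and stores their count as a property. The labels, the READ type, the numberOfReads property name, and the graph name 'readCount' are assumed for illustration.

```cypher
MATCH (n:Person)-[r:READ]->(m:Book)
// Group by node pair so each pair reaches the projection exactly once.
WITH n, m, count(r) AS numberOfReads
WITH gds.graph.project(
  'readCount',
  n,
  m,
  {
    relationshipProperties: { numberOfReads: numberOfReads }
  }
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels;
```

Other aggregations, such as sum or max over a relationship property, follow the same pattern by changing the expression in the first WITH clause.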
Projecting filtered graphs
Legacy | New |
---|---|
Graph projection with filtered graphs | |
Similar to Legacy Cypher projections, we can apply any Cypher method of filtering the data before passing it on to the Cypher projection. See Projecting filtered Neo4j graphs for more details.
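For example, a plain WHERE clause can restrict the data before it reaches the projection. In this sketch the Person label, the KNOWS type, the age property, the threshold 35, and the graph name 'youngPersons' are hypothetical.

```cypher
// Only persons younger than 35, and only relationships between them,
// are passed on to the projection.
MATCH (n:Person)-[:KNOWS]->(m:Person)
WHERE n.age < 35 AND m.age < 35
WITH gds.graph.project('youngPersons', n, m) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels;
```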
Projecting undirected graphs
Legacy | New |
---|---|
Graph projection with undirected graphs | |
Not applicable, Legacy Cypher projection cannot project undirected graphs. | |
The new Cypher projection can project undirected graphs. See Undirected relationships for more details.
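A minimal sketch of an undirected projection is given below, assuming the Person label, the KNOWS type, and the graph name 'undirectedPersons'. The undirectedRelationshipTypes key goes into the fifth argument, the projection configuration, rather than into the data configuration map.

```cypher
MATCH (n:Person)-[:KNOWS]->(m:Person)
WITH gds.graph.project(
  'undirectedPersons',
  n,
  m,
  {},                                    // data configuration, empty here
  { undirectedRelationshipTypes: ['*'] } // treat all relationship types as undirected
) AS g
RETURN g.graphName AS graph, g.nodeCount AS nodes, g.relationshipCount AS rels;
```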
Memory estimation
Legacy | New |
---|---|
Memory estimation of projected graphs | |
Since the new Cypher projection is no longer a procedure, there is also no .estimate method.
Instead, we can use the gds.graph.project.estimate procedure to estimate the memory requirements of the graph projection.
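One way to do this, sketched under the assumption of a Person label and a KNOWS relationship type, is to count the nodes and relationships that the projection query would produce and pass those counts to the estimation procedure.

```cypher
MATCH (n:Person)
OPTIONAL MATCH (n)-[r:KNOWS]->(m:Person)
// count(DISTINCT n) avoids counting a node once per outgoing relationship;
// count(r) ignores the NULL rows produced by OPTIONAL MATCH.
WITH count(DISTINCT n) AS nodeCount, count(r) AS relationshipCount
CALL gds.graph.project.estimate('*', '*', {
  nodeCount: nodeCount,
  relationshipCount: relationshipCount
})
YIELD requiredMemory, bytesMin, bytesMax
RETURN requiredMemory, bytesMin, bytesMax;
```

The estimate is based on the counts alone, so any node or relationship properties that the real projection stores would need to be added to the configuration map to be reflected in the result.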