Projecting graphs and using the graph catalog
This example shows how to:
- load Neo4j on-disk data into in-memory projected graphs;
- use the graph catalog to manage projected graphs.
Setup
For more information on how to get started using Python, refer to the Connecting with Python tutorial.
pip install graphdatascience
# Import the client
from graphdatascience import GraphDataScience
# Replace with the actual URI, username, and password
AURA_CONNECTION_URI = "neo4j+s://xxxxxxxx.databases.neo4j.io"
AURA_USERNAME = "neo4j"
AURA_PASSWORD = ""
# Configure the client with AuraDS-recommended settings
gds = GraphDataScience(
    AURA_CONNECTION_URI,
    auth=(AURA_USERNAME, AURA_PASSWORD),
    aura_ds=True
)
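Instead of editing the credentials into the script, they can be read from the environment. The following is a minimal sketch, assuming the same variable names that the Cypher Shell setup in this tutorial exports; the helper name aura_credentials is hypothetical and not part of the client.

```python
import os


def aura_credentials(env=os.environ):
    """Return (uri, (username, password)) read from the environment.

    Hypothetical helper: the variable names are the ones used for the
    Cypher Shell in this tutorial, not something the client requires.
    """
    uri = env["AURA_CONNECTION_URI"]  # KeyError if the variable is missing
    username = env.get("AURA_USERNAME", "neo4j")
    password = env["AURA_PASSWORD"]
    return uri, (username, password)


# Usage (assumes the variables are set and graphdatascience is installed):
# uri, auth = aura_credentials()
# gds = GraphDataScience(uri, auth=auth, aura_ds=True)
```

Failing with a KeyError on a missing variable is a deliberate choice here: it surfaces a configuration problem before any connection attempt is made.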
In the following code examples we use the print function to print Pandas DataFrame and Series objects. You can try different ways to print a Pandas object, for instance via the to_string and to_json methods; if you use a JSON representation, in some cases you may need to include a default handler to handle Neo4j DateTime objects. Check the Python connection section for some examples.
For more information on how to get started using the Cypher Shell, refer to the Neo4j Cypher Shell tutorial.
Run the following commands from the directory where the Cypher shell is installed.
export AURA_CONNECTION_URI="neo4j+s://xxxxxxxx.databases.neo4j.io"
export AURA_USERNAME="neo4j"
export AURA_PASSWORD=""
./cypher-shell -a $AURA_CONNECTION_URI -u $AURA_USERNAME -p $AURA_PASSWORD
For more information on how to get started using Python, refer to the Connecting with Python tutorial.
pip install neo4j
# Import the driver
from neo4j import GraphDatabase
# Replace with the actual URI, username, and password
AURA_CONNECTION_URI = "neo4j+s://xxxxxxxx.databases.neo4j.io"
AURA_USERNAME = "neo4j"
AURA_PASSWORD = ""
# Instantiate the driver
driver = GraphDatabase.driver(
    AURA_CONNECTION_URI,
    auth=(AURA_USERNAME, AURA_PASSWORD)
)
# Import to prettify results
import json
# Import for the JSON helper function
from neo4j.time import DateTime
# Helper function for serializing Neo4j DateTime in JSON dumps
def default(o):
    if isinstance(o, DateTime):
        return o.isoformat()
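To illustrate how json.dumps uses such a handler, here is a self-contained sketch. It substitutes the standard library datetime for neo4j.time.DateTime (an assumption made only so the example runs without the driver installed) and, unlike the helper above, raises TypeError for unsupported types rather than returning None, which json.dumps would write as null.

```python
import json
from datetime import datetime, timezone


def default(o):
    """JSON fallback: serialize datetime-like objects as ISO 8601 strings."""
    if hasattr(o, "isoformat"):
        return o.isoformat()
    # Fail loudly on anything else instead of silently emitting null
    raise TypeError(f"Object of type {type(o).__name__} is not JSON serializable")


# Illustrative record; the stdlib datetime stands in for a Neo4j DateTime
record = {
    "graphName": "example",
    "creationTime": datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
}
encoded = json.dumps(record, default=default, sort_keys=True)
print(encoded)
# {"creationTime": "2024-01-02T03:04:05+00:00", "graphName": "example"}
```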
Load data from Neo4j with native projections
Native projections load a graph stored on disk into memory.
The gds.graph.project procedure projects a graph by selecting the node labels, relationship types and properties to include.
The gds.graph.project procedure accepts either a "shorthand syntax", where the node and relationship projections are passed as single values or arrays, or an "extended syntax", where each node or relationship projection has its own configuration.
The extended syntax is especially useful if additional transformations of the data or the graph structure are needed.
Both methods are shown in this section, using the following graph as an example.
# Cypher query to create an example graph on disk
gds.run_cypher("""
MERGE (a:EngineeringManagement {name: 'Alistair'})
MERGE (j:EngineeringManagement {name: 'Jennifer'})
MERGE (d:Developer {name: 'Leila'})
MERGE (a)-[:MANAGES {start_date: 987654321}]->(d)
MERGE (j)-[:MANAGES {start_date: 123456789, end_date: 987654321}]->(d)
""")
MERGE (a:EngineeringManagement {name: 'Alistair'})
MERGE (j:EngineeringManagement {name: 'Jennifer'})
MERGE (d:Developer {name: 'Leila'})
MERGE (a)-[:MANAGES {start_date: 987654321}]->(d)
MERGE (j)-[:MANAGES {start_date: 123456789, end_date: 987654321}]->(d)
# Cypher query to create an example graph on disk
write_example_graph_query = """
MERGE (a:EngineeringManagement {name: 'Alistair'})
MERGE (j:EngineeringManagement {name: 'Jennifer'})
MERGE (d:Developer {name: 'Leila'})
MERGE (a)-[:MANAGES {start_date: 987654321}]->(d)
MERGE (j)-[:MANAGES {start_date: 123456789, end_date: 987654321}]->(d)
"""
# Create the driver session
with driver.session() as session:
    session.run(write_example_graph_query)
Project using the shorthand syntax
In this example we use the shorthand syntax to simply project all node labels and relationship types.
# Project a graph using the shorthand syntax
shorthand_graph, result = gds.graph.project(
    "shorthand-example-graph",
    ["EngineeringManagement", "Developer"],
    ["MANAGES"]
)
print(result)
CALL gds.graph.project(
  'shorthand-example-graph',
  ['EngineeringManagement', 'Developer'],
  ['MANAGES']
)
YIELD graphName, nodeCount, relationshipCount
RETURN *
shorthand_graph_create_call = """
CALL gds.graph.project(
  'shorthand-example-graph',
  ['EngineeringManagement', 'Developer'],
  ['MANAGES']
)
YIELD graphName, nodeCount, relationshipCount
RETURN *
"""
# Create the driver session
with driver.session() as session:
    # Call to project a graph using the shorthand syntax
    result = session.run(shorthand_graph_create_call).data()
    # Prettify the result
    print(json.dumps(result, indent=2, sort_keys=True))
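The relationship between the two syntaxes can be sketched in plain Python: a shorthand projection corresponds to an extended projection in which every key maps to a label or type of the same name. The helpers below are illustrative only and not part of the GDS client; the library performs its own normalisation, and the wildcard '*' is not handled here.

```python
def expand_node_projection(labels):
    """Expand shorthand node labels into the extended dictionary form."""
    if isinstance(labels, str):
        labels = [labels]
    return {label: {"label": label} for label in labels}


def expand_relationship_projection(types):
    """Expand shorthand relationship types into the extended dictionary form."""
    if isinstance(types, str):
        types = [types]
    return {rel_type: {"type": rel_type} for rel_type in types}


nodes = expand_node_projection(["EngineeringManagement", "Developer"])
relationships = expand_relationship_projection("MANAGES")
print(nodes)
print(relationships)
```

Writing the projection in the extended form is what makes it possible to rename a label or type (by changing the key) or to attach extra configuration to it.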
Project using the extended syntax
In this example we use the extended syntax for node and relationship projections to:
- transform the EngineeringManagement and Developer labels to PersonEM and PersonD respectively;
- transform the directed MANAGES relationship into the KNOWS undirected relationship;
- keep the start_date and end_date relationship properties, adding a default value of 999999999 to end_date.
The projected graph becomes the following:
(:PersonEM {first_name: 'Alistair'})-
[:KNOWS {start_date: 987654321, end_date: 999999999}]-
(:PersonD {first_name: 'Leila'})-
[:KNOWS {start_date: 123456789, end_date: 987654321}]-
(:PersonEM {first_name: 'Jennifer'})
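The effect of the relationship type rename and of the default value can be traced with a small plain-Python model of the two example relationships. This is a sketch for illustration only: the function project_relationship is hypothetical, it does not call GDS, and it does not model the undirected orientation.

```python
# The two MANAGES relationships from the example graph on disk
ON_DISK = [
    {"source": "Alistair", "target": "Leila", "start_date": 987654321},
    {"source": "Jennifer", "target": "Leila",
     "start_date": 123456789, "end_date": 987654321},
]

# Default used when a relationship has no end_date property
PROPERTY_DEFAULTS = {"end_date": 999999999}


def project_relationship(rel, new_type, defaults):
    """Rename the type and fill in missing properties; stored values win."""
    return {**defaults, **rel, "type": new_type}


projected = [project_relationship(r, "KNOWS", PROPERTY_DEFAULTS) for r in ON_DISK]
for rel in projected:
    print(rel["source"], rel["type"], rel["target"],
          rel["start_date"], rel["end_date"])
# Alistair KNOWS Leila 987654321 999999999
# Jennifer KNOWS Leila 123456789 987654321
```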
# Project a graph using the extended syntax
extended_form_graph, result = gds.graph.project(
    "extended-form-example-graph",
    {
        "PersonEM": {
            "label": "EngineeringManagement"
        },
        "PersonD": {
            "label": "Developer"
        }
    },
    {
        "KNOWS": {
            "type": "MANAGES",
            "orientation": "UNDIRECTED",
            "properties": {
                "start_date": {
                    "property": "start_date"
                },
                "end_date": {
                    "property": "end_date",
                    "defaultValue": 999999999
                }
            }
        }
    }
)
print(result)
CALL gds.graph.project(
  'extended-form-example-graph',
  {
    PersonEM: {
      label: 'EngineeringManagement'
    },
    PersonD: {
      label: 'Developer'
    }
  },
  {
    KNOWS: {
      type: 'MANAGES',
      orientation: 'UNDIRECTED',
      properties: {
        start_date: {
          property: 'start_date'
        },
        end_date: {
          property: 'end_date',
          defaultValue: 999999999
        }
      }
    }
  }
)
YIELD graphName, nodeCount, relationshipCount
RETURN *
extended_form_graph_create_call = """
CALL gds.graph.project(
  'extended-form-example-graph',
  {
    PersonEM: {
      label: 'EngineeringManagement'
    },
    PersonD: {
      label: 'Developer'
    }
  },
  {
    KNOWS: {
      type: 'MANAGES',
      orientation: 'UNDIRECTED',
      properties: {
        start_date: {
          property: 'start_date'
        },
        end_date: {
          property: 'end_date',
          defaultValue: 999999999
        }
      }
    }
  }
)
YIELD graphName, nodeCount, relationshipCount
RETURN *
"""
# Create the driver session
with driver.session() as session:
    # Call to project a graph using the extended syntax
    result = session.run(extended_form_graph_create_call).data()
    # Prettify the results
    print(json.dumps(result, indent=2, sort_keys=True))
Use the graph catalog
The graph catalog can be used to retrieve information on and manage the projected graphs.
List all the graphs
The gds.graph.list procedure can be used to list all the graphs currently stored in memory.
# List all in-memory graphs
all_graphs = gds.graph.list()
print(all_graphs)
CALL gds.graph.list()
show_in_memory_graphs_call = """
CALL gds.graph.list()
"""
# Create the driver session
with driver.session() as session:
    # Run the Cypher procedure
    results = session.run(show_in_memory_graphs_call).data()
    # Prettify the results
    print(json.dumps(results, indent=2, sort_keys=True, default=default))
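The full listing can be verbose, so it is sometimes convenient to reduce it to a few columns. The sketch below works on a list of dictionaries shaped like the driver output and keeps only the graphName, nodeCount and relationshipCount fields; the function name summarize_graphs is hypothetical, and the sample records are illustrative values, not captured output.

```python
def summarize_graphs(records):
    """Return (graphName, nodeCount, relationshipCount) tuples sorted by name."""
    return sorted(
        (r["graphName"], r["nodeCount"], r["relationshipCount"])
        for r in records
    )


# Illustrative records only; a real listing contains more fields per graph
sample = [
    {"graphName": "shorthand-example-graph", "nodeCount": 3, "relationshipCount": 2},
    {"graphName": "extended-form-example-graph", "nodeCount": 3, "relationshipCount": 4},
]

for name, node_count, relationship_count in summarize_graphs(sample):
    print(f"{name}: {node_count} nodes, {relationship_count} relationships")
```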
Check that a graph exists
The gds.graph.exists procedure can be called to check for the existence of a graph by its name.
# Check whether the "shorthand-example-graph" graph exists in memory
graph_exists = gds.graph.exists("shorthand-example-graph")
print(graph_exists)
CALL gds.graph.exists('shorthand-example-graph')
check_graph_exists_call = """
CALL gds.graph.exists('shorthand-example-graph')
"""
# Create the driver session
with driver.session() as session:
    # Run the Cypher procedure and print the result
    print(session.run(check_graph_exists_call).data())
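A common use of the existence check is to avoid projecting a graph whose name is already taken in the catalog. The following is a minimal sketch for the GDS Python client, assuming that gds.graph.exists returns a result with an "exists" field and that gds.graph.get returns the graph object for an existing name; the helper name get_or_project is hypothetical.

```python
def get_or_project(gds, name, node_spec, relationship_spec):
    """Return the graph object for `name`, projecting it only if absent.

    `gds` is expected to be a connected GraphDataScience client.
    """
    if gds.graph.exists(name)["exists"]:
        # Reuse the graph that is already in the catalog
        return gds.graph.get(name)
    graph, _ = gds.graph.project(name, node_spec, relationship_spec)
    return graph


# Usage (requires a live connection):
# G = get_or_project(gds, "shorthand-example-graph",
#                    ["EngineeringManagement", "Developer"], ["MANAGES"])
```

Note that an existing graph is returned as it is, even if it was projected with a different configuration than the one passed in.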
Drop a graph
When a graph is no longer needed, it can be dropped to free up memory using the gds.graph.drop procedure.
# Drop a graph object and keep the result of the call
result = gds.graph.drop(shorthand_graph)
# Print the result
print(result)
# Drop a graph object without keeping the result of the call
gds.graph.drop(extended_form_graph)
CALL gds.graph.drop('shorthand-example-graph');
CALL gds.graph.drop('extended-form-example-graph');
delete_shorthand_graph_call = """
CALL gds.graph.drop('shorthand-example-graph')
"""
delete_extended_form_graph_call = """
CALL gds.graph.drop('extended-form-example-graph')
"""
# Create the driver session
with driver.session() as session:
    # Drop a graph and keep the result of the call
    result = session.run(delete_shorthand_graph_call).data()
    # Prettify the result
    print(json.dumps(result, indent=2, sort_keys=True, default=default))
    # Drop a graph discarding the result of the call
    session.run(delete_extended_form_graph_call).data()
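To make sure a projected graph is always dropped, even when the code using it fails, the projection and the drop can be paired in a context manager. This is a sketch for the GDS Python client built only on gds.graph.project and gds.graph.drop; the name projected_graph is hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def projected_graph(gds, name, node_spec, relationship_spec):
    """Project a graph, yield its graph object, and drop it on exit."""
    graph, _ = gds.graph.project(name, node_spec, relationship_spec)
    try:
        yield graph
    finally:
        # Runs on normal exit and when the body raises an exception
        gds.graph.drop(graph)


# Usage (requires a live connection):
# with projected_graph(gds, "tmp-graph", ["Developer"], ["MANAGES"]) as G:
#     print(G.name())
```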
Cleanup
When the projected graphs are dropped, the underlying data on the disk are not deleted. If such data are no longer needed, they need to be deleted manually via a Cypher query.
# Delete on-disk data
gds.run_cypher("""
MATCH (example)
WHERE example:EngineeringManagement OR example:Developer
DETACH DELETE example
""")
MATCH (example)
WHERE example:EngineeringManagement OR example:Developer
DETACH DELETE example;
delete_example_graph_query = """
MATCH (example)
WHERE example:EngineeringManagement OR example:Developer
DETACH DELETE example
"""
# Create the driver session
with driver.session() as session:
    # Run Cypher call
    print(session.run(delete_example_graph_query).data())
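After the deletion, one way to confirm that no example nodes are left is to count them. The sketch below assumes a Python driver session as created in the examples above; the query constant and the function name count_remaining are introduced here for illustration.

```python
# Counts the nodes that the cleanup query is meant to delete
REMAINING_QUERY = """
MATCH (example)
WHERE example:EngineeringManagement OR example:Developer
RETURN count(example) AS remaining
"""


def count_remaining(session):
    """Return the number of example nodes still stored on disk."""
    record = session.run(REMAINING_QUERY).single()
    return record["remaining"]


# Usage (requires a live connection):
# with driver.session() as session:
#     print(count_remaining(session))
```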
Closing the connection
The connection should always be closed when no longer needed.
Although the GDS client automatically closes the connection when the object is deleted, it is good practice to close it explicitly.
# Close the client connection
gds.close()
# Close the driver connection
driver.close()
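Closing explicitly is easy to forget when an exception interrupts the script. Since both the GDS client and the driver expose a close method, the standard library contextlib.closing can guarantee the call. The helper run_and_close below is a hypothetical sketch, not part of either library.

```python
from contextlib import closing


def run_and_close(client, work):
    """Run work(client) and close the client afterwards, even on error."""
    with closing(client):
        return work(client)


# Usage (requires a live connection):
# graphs = run_and_close(gds, lambda client: client.graph.list())
```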