Core concepts
Fundamentally, a Neo4j graph database consists of three core entities: nodes, relationships, and paths. Cypher® queries are constructed to either match or create these entities in a graph. Having a basic understanding of what nodes, relationships, and paths are in a graph database is therefore crucial in order to construct Cypher queries.
Nodes
The data entities in a Neo4j graph database are called nodes.
Nodes are referred to in Cypher using parentheses ().
MATCH (n:Person {name:'Anna'})
RETURN n.born AS birthYear
In the above example, the node includes the following:
- A Person label. Labels are like tags, and are used to query the database for specific nodes. A node may have multiple labels, for example Person and Actor.
- A name property set to Anna. Properties are defined within curly braces, {}, and are used to provide nodes with specific information, which can also be queried for and further improve the ability to pinpoint data.
- A variable, n. Variables allow referencing specified nodes in subsequent clauses.
In this example, the MATCH clause finds all Person nodes in the graph with the name property set to Anna, and binds them to the variable n.
The variable n is then passed along to the subsequent RETURN clause, which returns the value of a different property (born) belonging to the same node.
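To try the node concepts above, the following is a minimal sketch meant for a scratch database. The sample node, its Actor label, and the born value of 1990 are assumptions made up for illustration; they are not part of any dataset referenced here. The statements are separated by semicolons and are meant to be run one after the other.

```cypher
// Hypothetical sample data: one node with two labels and two properties.
CREATE (:Person:Actor {name: 'Anna', born: 1990});

// Match on both labels and one property, then read back the labels
// and another property through the variable n.
MATCH (n:Person:Actor {name: 'Anna'})
RETURN n.name AS name, labels(n) AS nodeLabels, n.born AS birthYear;
```

The labels() function returns all labels of a node as a list, which makes it easy to check that a single node can carry several labels.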
Relationships
Nodes in a graph can be connected with relationships.
A relationship must have a start node, an end node, and exactly one type.
Relationships are represented in Cypher with arrows (e.g. -->) indicating the direction of a relationship.
MATCH (:Person {name: 'Anna'})-[r:KNOWS WHERE r.since < 2020]->(friend:Person)
RETURN count(r) AS numberOfFriends
Unlike nodes, information within a relationship pattern must be enclosed by square brackets.
The query example above matches relationships of type KNOWS whose since property is less than 2020.
The query also requires the relationships to go from a Person node named Anna to any other Person node, referred to as friend.
The count() function is used in the RETURN clause to count all the relationships bound by the r variable in the preceding MATCH clause (i.e. how many friends Anna has known since before 2020).
Note that while nodes can have several labels, relationships can only have one type.
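As an illustration of the relationship concepts above, the sketch below creates a small, hypothetical dataset (the names Bob and Cleo and the since values are assumptions) and then inspects the matched relationships instead of counting them.

```cypher
// Hypothetical sample data: Anna knows two people, one of them since before 2020.
CREATE (anna:Person {name: 'Anna'}),
       (bob:Person {name: 'Bob'}),
       (cleo:Person {name: 'Cleo'}),
       (anna)-[:KNOWS {since: 2015}]->(bob),
       (anna)-[:KNOWS {since: 2022}]->(cleo);

// Same pattern as the counting query, but returning each relationship's
// single type and its since property.
MATCH (:Person {name: 'Anna'})-[r:KNOWS WHERE r.since < 2020]->(friend:Person)
RETURN friend.name AS friend, type(r) AS relationshipType, r.since AS since;
```

On an otherwise empty database, only the relationship to Bob should satisfy the predicate. The type() function returns the one type of a relationship as a string, in contrast to labels(), which returns a list for nodes.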
Paths
Paths in a graph consist of connected nodes and relationships. Exploring these paths sits at the very core of Cypher.
MATCH (n:Person {name: 'Anna'})-[:KNOWS]-{1,5}(friend:Person WHERE friend.born < n.born)
RETURN DISTINCT friend.name AS olderConnections
This example uses a quantified relationship to find all paths between 1 and 5 hops long, traversing only relationships of type KNOWS from the start node Anna to other older Person nodes (as defined by the WHERE clause).
The DISTINCT operator is used to ensure that the RETURN clause returns each name only once, even if the same person is reachable via several paths.
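To see how the {1,5} quantifier expands, the sketch below binds each matched path to a variable and reports its length. The three-person chain is hypothetical sample data, and the variable name p is chosen only for this illustration.

```cypher
// Hypothetical sample data: a chain Anna -> Bob -> Cleo.
CREATE (anna:Person {name: 'Anna'}),
       (bob:Person {name: 'Bob'}),
       (cleo:Person {name: 'Cleo'}),
       (anna)-[:KNOWS]->(bob),
       (bob)-[:KNOWS]->(cleo);

// The undirected, quantified relationship matches paths of 1 to 5 KNOWS hops.
// length(p) gives the number of relationships in each matched path.
MATCH p = (:Person {name: 'Anna'})-[:KNOWS]-{1,5}(friend:Person)
RETURN friend.name AS connection, length(p) AS hops
ORDER BY hops;
```

On an otherwise empty database this should return Bob at 1 hop and Cleo at 2 hops. No longer paths exist in this data, because a relationship cannot be traversed twice within the same match.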
Paths can also be assigned variables.
For example, the query below binds a whole path to the variable p. The pattern matches the SHORTEST path from Anna to another Person node in the graph with a nationality property set to Canadian.
In this case, the RETURN clause returns the full path between the two nodes.
MATCH p = SHORTEST 1 (:Person {name: 'Anna'})-[:KNOWS]-+(:Person {nationality: 'Canadian'})
RETURN p
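A path variable can also be unpacked rather than returned whole. The following is a sketch under assumptions: the sample people are hypothetical, and the query needs a Neo4j version that supports the SHORTEST keyword.

```cypher
// Hypothetical sample data: Dana is the only Canadian, two hops from Anna.
CREATE (anna:Person {name: 'Anna'}),
       (bob:Person {name: 'Bob'}),
       (dana:Person {name: 'Dana', nationality: 'Canadian'}),
       (anna)-[:KNOWS]->(bob),
       (bob)-[:KNOWS]->(dana);

// nodes(p) lists the nodes along the path in order;
// the list comprehension extracts each node's name.
MATCH p = SHORTEST 1 (:Person {name: 'Anna'})-[:KNOWS]-+(:Person {nationality: 'Canadian'})
RETURN [person IN nodes(p) | person.name] AS names, length(p) AS hops;
```

On an otherwise empty database, the single shortest path should run from Anna through Bob to Dana, with a length of 2.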
For more information about graph pattern matching, see Patterns.