What is Cypher
This page covers the basics of Cypher®. For the complete documentation, refer to the Cypher manual.
Cypher is Neo4j’s declarative, GQL-conformant query language. Available as open source via the openCypher project, Cypher is similar to SQL, but optimized for graphs.
Intuitive and close to natural language, Cypher provides a visual way of matching patterns and relationships through a syntax based on ASCII art:
(:nodes)-[:ARE_CONNECTED_TO]->(:otherNodes)
Round brackets are used to represent (:Nodes), and -[:ARROWS]-> to represent a relationship between the (:Nodes).
With this query syntax, you can perform create, read, update, or delete (CRUD) operations on your graph.
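As a rough sketch of what the four CRUD operations can look like, the statements below create, read, update, and delete a single node. The Person label and the name property follow the examples later on this page; the city property and its value are hypothetical and used only for illustration. Separating statements with semicolons assumes a client that accepts several statements at once, such as Cypher Shell.

```cypher
// Create: add a node with a label and a property
CREATE (p:Person {name: 'Sally'});

// Read: find the node and return it
MATCH (p:Person {name: 'Sally'})
RETURN p;

// Update: set a (hypothetical) property on the node
MATCH (p:Person {name: 'Sally'})
SET p.city = 'London';

// Delete: remove the node together with any relationships attached to it
MATCH (p:Person {name: 'Sally'})
DETACH DELETE p;
```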
For a quick look with no installation required, get a free Aura instance. Use the graduation cap icon in the top right section to access the interactive guides. The "Query fundamentals" guide gives you a hands-on introduction to Cypher.
How does Cypher work?
Neo4j’s graph model is composed of nodes and relationships, both of which may also have properties assigned to them. With nodes and relationships, you can build patterns that express anything from simple to complex structures in your data.
Pattern recognition is a fundamental cognitive process, which makes Cypher, with its use of pattern matching, intuitive and easy to learn.
Cypher syntax
Cypher’s constructs are based on English prose and iconography. This makes queries easy both to write and to read.
If you were to represent the data in this graph in English, it might read as something like: "Sally likes Graphs. Sally is friends with John. Sally works for Neo4j."
Now, if you were to write this same information in Cypher, then it would look like this:
(:Sally)-[:LIKES]->(:Graphs)
(:Sally)-[:IS_FRIENDS_WITH]->(:John)
(:Sally)-[:WORKS_FOR]->(:Neo4j)
However, in order to have this information in the graph, first you need to represent it as nodes and relationships.
Nodes
In a property graph model, the main components are nodes and relationships.
Nodes are often used to represent nouns or objects in your data model.
In the previous example, Sally, John, Graphs, and Neo4j are the nodes.
In Cypher, you can depict a node by surrounding it with parentheses, e.g. (node).
The parentheses are a representation of the circles that compose the nodes in the visualization.
Node labels
Nodes can be grouped together through a label. Labels work like tags and allow you to specify certain types of entities to look for or to create. Labels also help Cypher distinguish between entities and optimize execution for your queries.
In the example, both Sally and John can be grouped under a Person label, Graphs can receive a Technology label, and Neo4j can be labeled as Company.
Sally, John, Graphs, and Neo4j are now properties instead.
In a relational database context, specifying a label would be the same as telling SQL which table to search for a particular row.
The same way you can tell SQL to query a person’s information from a Person table, you can also tell Cypher to only check the Person label for that information.
If you do not specify a label for Cypher to filter out non-matching node categories, the query will check all of the nodes in the database. This can affect performance in very large graphs.
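A minimal sketch of the difference, assuming the graph contains Person nodes with a name property (as introduced in the Properties section of this page):

```cypher
// With a label: only nodes labeled Person are considered
MATCH (p:Person)
WHERE p.name = 'Sally'
RETURN p;

// Without a label: every node in the database is a candidate
MATCH (n)
WHERE n.name = 'Sally'
RETURN n;
```

Both queries can return the same node, but the second one gives Cypher no label to narrow the search with.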
Node variables
Though not mandatory, variables are particularly useful when querying a database, as they allow referencing specified nodes in subsequent clauses without writing their label in full.
Variables can be single letters or words, and should be written in lower-case.
For example, if you want to bind all nodes labeled Person to the variable p, you write (p:Person).
Likewise, if you want to use a full word, then you can write (person:Person).
In a MATCH query to retrieve all nodes labeled Person, this is what it looks like:
Without variable | With variable |
---|---|
MATCH (:Person) | MATCH (p:Person) |
Note that in the example without a variable, the label Person is preceded by a colon (:).
This is how you prevent a type or label from being interpreted as a variable.
In case you forget to add a colon and write the query like this:
MATCH (Person)
RETURN Person
Then Person would be a variable, not a type or label.
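The following sketch illustrates the three cases side by side. It assumes Person nodes with a name property, as used later on this page; the queries are illustrative and not taken from the original examples.

```cypher
// No variable: the matched nodes cannot be referenced later, only counted
MATCH (:Person)
RETURN count(*);

// With a variable: p can be used in subsequent clauses
MATCH (p:Person)
WHERE p.name = 'Sally'
RETURN p.name;

// Missing colon: Person is a variable here, so nodes with any label match;
// labels() shows which labels each matched node actually has
MATCH (Person)
RETURN labels(Person);
```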
Relationships
One of the benefits of graph databases is that you can store information about how elements (nodes) are related to each other in the form of relationships.
In Cypher, relationships are represented as square brackets and an arrow connecting two nodes, e.g. (Node1)-[]->(Node2).
In the example, the lines containing :LIKES, :IS_FRIENDS_WITH, and :WORKS_FOR represent the relationships between the nodes.
Remember to always put a colon in front of a relationship type. If you omit it, the name is read as a relationship variable rather than a relationship type.
Relationship directions
Relationships always have a direction which is indicated by an arrow.
They can go from left to right:
(p:Person)-[:LIKES]->(t:Technology)
From right to left:
(p:Person)<-[:LIKES]-(t:Technology)
Or be undirected (where the direction is not specified):
MATCH (p:Person)-[:LIKES]-(t:Technology)
Undirected relationships
An undirected relationship does not mean that it doesn’t have a direction, but that it can be traversed in either direction.
While you can’t create relationships without a direction, you can query them undirected (in the example, using the MATCH clause).
Using undirected relationships in queries is particularly useful when you don’t know the direction, since Cypher won’t return anything if you write a query with the wrong direction. With an undirected pattern, Cypher retrieves all nodes connected by the specified relationship type, regardless of direction.
Because undirected relationships in queries are traversed twice (once for each direction), the same pattern will be returned twice. This may impact the performance of the query.
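A short sketch of how direction affects matching, assuming the graph holds a single IS_FRIENDS_WITH relationship that points from a Person node named Sally to a Person node named John, and that the names are stored in a name property:

```cypher
// Directed, same as the stored direction: matches (Sally, John)
MATCH (a:Person)-[:IS_FRIENDS_WITH]->(b:Person)
RETURN a.name, b.name;

// Directed, opposite to the stored direction: no rows for this data
MATCH (a:Person {name: 'Sally'})<-[:IS_FRIENDS_WITH]-(b:Person)
RETURN a.name, b.name;

// Undirected: the single relationship is matched once per direction,
// so both (Sally, John) and (John, Sally) are returned
MATCH (a:Person)-[:IS_FRIENDS_WITH]-(b:Person)
RETURN a.name, b.name;
```

Anchoring one end of the pattern, for example with {name: 'Sally'}, is one way to avoid the mirrored row in the undirected case.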
Relationship types
Relationship types categorize and add meaning to a relationship, similar to how labels group nodes together. It is considered best practice to use verbs or derivatives for the relationship type. The type describes how the nodes relate to each other. This way, Cypher is almost like natural language, where nodes are the subjects and objects (nouns), and the relationships (verbs) are the action words that relate them.
In the previous example, the relationship types are:
- [:LIKES]: communicates that Sally (a node) likes graphs (another node).
- [:IS_FRIENDS_WITH]: communicates that Sally is friends with John.
- [:WORKS_FOR]: communicates that Sally works for Neo4j.
Relationship variables
Variables can be used for relationships in the same way as for nodes. Once you specify a variable, you can use it later in the query to reference the relationship.
Take this example:
MATCH (p:Person)-[r:LIKES]->(t:Technology)
RETURN p,r,t
This query specifies variables for both the nodes (p for the node labeled Person and t for the node labeled Technology) and the relationship (r for the relationship of type :LIKES).
In the RETURN clause, you can then use the variables (i.e. p, r, and t) to return the bound entities.
This would be your result:
p | r | t |
---|---|---|
Person node | LIKES relationship | Technology node |
Rows: 1
Remember to always put a colon in front of a relationship type. If you happen to forget it, and write the query like this:
(Person)-[LIKES]->(Technology)
[LIKES] will represent a relationship variable, not a relationship type.
In this case, since no relationship type is declared, Cypher will search for all types of relationships in order to retrieve a result for your query.
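As an illustration of the difference, the sketch below uses the type() function to show what a variable named LIKES is actually bound to. The node variable n is a hypothetical name chosen for this example.

```cypher
// LIKES is a variable here, so relationships of any type are matched;
// type() reveals the actual type of each matched relationship
MATCH (p:Person)-[LIKES]->(n)
RETURN type(LIKES), n;

// With the colon, only relationships of type LIKES are matched
MATCH (p:Person)-[:LIKES]->(n)
RETURN n;
```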
Properties
Properties can be added to both nodes and relationships, and their values can be of a variety of data types. For a full list of values and types, see Cypher manual → Values and types.
Another way to organize the data in the previous example would be to add a property, name, and Sally and John as property values on Person-labeled nodes:
CREATE (p:Person {name:'Sally'})-[r:IS_FRIENDS_WITH]->(j:Person {name:'John'})
RETURN p, r
Properties are enclosed by curly brackets ({}), the key is followed by a colon, and string values are enclosed by single or double quotation marks.
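Properties are not limited to strings or to nodes. The sketch below, which is only an illustration, stores a number on a node and a number on a relationship; the age and since properties and their values are hypothetical.

```cypher
// String values are quoted, numeric values are not;
// the relationship carries a property of its own
CREATE (p:Person {name: 'Sally', age: 32})-[r:WORKS_FOR {since: 2020}]->(c:Company {name: 'Neo4j'})
RETURN p.name, p.age, r.since, c.name;
```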
In case you have already added Sally and John as node labels, but want to change them into node properties, you need to refactor your graph. Refactoring is a strategy in data modeling that you can learn more about in this tutorial.
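One possible way to perform such a refactoring by hand is sketched below. It assumes a node was created with Sally as its label, as in the first example on this page, and it is not necessarily the approach taken in the tutorial.

```cypher
// Find the node that carries the name as a label
MATCH (n:Sally)
// Add the Person label and store the name as a property instead
SET n:Person, n.name = 'Sally'
// Drop the old label
REMOVE n:Sally
RETURN n;
```

The same steps would need to be repeated for each name that was stored as a label.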
Patterns in Cypher
Graph pattern matching sits at the very core of Cypher. It is the mechanism used to navigate, describe, and extract data from a graph by applying a declarative pattern.
Consider this example:
(p:Person {name: "Sally"})-[r:LIKES]->(g:Technology {type: "Graphs"})
This bit of Cypher represents a pattern, but it is not a query.
It only expresses that a Person node with Sally as its name property has a LIKES relationship to the Technology node with Graphs as its type property.
In order to do something with this pattern, such as adding it to or retrieving it from the graph, you need to query the database.
For example, you can add this information to the database using the CREATE clause:
CREATE (p:Person {name: "Sally"})-[r:LIKES]->(t:Technology {type: "Graphs"})
And once this data is written to the database, you can retrieve it with this pattern:
MATCH (p:Person {name: "Sally"})-[r:LIKES]->(t:Technology {type: "Graphs"})
RETURN p,r,t
Pattern variables
In the same way as nodes and relationships, you can also use variables for patterns. For more information, refer to Cypher manual → Patterns → Syntax and Semantics.
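A minimal sketch of a pattern bound to a variable, assuming the Sally and Graphs data created earlier on this page; the variable name path is an arbitrary choice:

```cypher
// Bind the whole matched pattern to the variable path
MATCH path = (p:Person {name: 'Sally'})-[:LIKES]->(t:Technology)
// length() returns the number of relationships in the path
RETURN path, length(path);
```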
Keep learning
Now that the basic Cypher concepts have been introduced, you can take the tutorial on how to Get started with Cypher to learn how to write your own queries. In the Cypher manual, you can find more information on:
- How to write basic queries and what clauses you can use to read data from the database.
- How patterns work and how you can use them to navigate, describe, and extract data from a graph.
- What values, types, and functions are available in Cypher.
From SQL to Cypher
In case you have a background in SQL and are new to graph databases, these are some resources for more information on the key differences and the transition to graphs:
From NoSQL to Graphs
If you are familiar with NoSQL ("Not only SQL") systems, you can also learn more about how to make the transition to a graph database.
GraphAcademy
With the Cypher Fundamentals course, you can learn Cypher in 60 minutes and practice using a sandbox.
Other resources
For more suggestions on how to expand your knowledge about Cypher, refer to Resources.
Glossary
- label: A label marks a node as a member of a named and indexed subset. A node may be assigned zero or more labels.
- node: A node represents an entity or discrete object in your graph data model. Nodes can be connected by relationships, hold data in properties, and are classified by labels.
- relationship: A relationship represents a connection between nodes in your graph data model. Relationships connect a source node to a target node, hold data in properties, and are classified by type.
- property: Properties are key-value pairs that are used for storing data on nodes and relationships.
- cluster: A Neo4j DBMS that spans multiple servers working together to increase fault tolerance and/or read scalability. Databases on a cluster may be configured to replicate across servers in the cluster, thus achieving read scalability or high availability.
- graph: A logical representation of a set of nodes where some pairs are connected by relationships.
- schema (database schema): The prescribed property existence and datatypes for nodes and relationships.
- indexes: Data structure that improves read performance of a database. Read more about supported categories of indexes.
- constraints: Constraints are sets of data modeling rules that ensure the data is consistent and reliable. See what constraints are available in Cypher.
- data model: A data model defines how information is organized in a database. A good data model will make querying and understanding your data easier. In Neo4j, the data models have a graph structure.