Parsing
This page provides a general overview of how Cypher® parses an input STRING.
The Cypher parser takes an arbitrary input STRING.
While the syntax of Cypher is described in subsequent chapters, the following details the general rules on which characters are considered valid input.
Using unicodes in Cypher
Unicode characters can generally be escaped as \uxxxx, where xxxx is the four-digit hexadecimal code point of the character.
Additional documentation on escaping rules for STRING literals, names, and regular expressions is available in the corresponding sections of the Cypher Manual.
The following example escapes the Unicode character A (\u0041) in the keyword MATCH:
M\u0041TCH (m) RETURN m;
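To illustrate what the escape above means, the following Python sketch replaces each \uxxxx sequence with the character it denotes. It is not Cypher's parser: the helper name `decode_unicode_escapes` is hypothetical, and the sketch ignores details such as escaped backslashes that a real parser must handle.

```python
import re

# Matches a backslash, the letter u, and exactly four hexadecimal digits.
_UNICODE_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})")


def decode_unicode_escapes(query: str) -> str:
    """Replace each \\uXXXX escape with the character it denotes (illustrative only)."""
    return _UNICODE_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), query)


# The raw string keeps the backslash literal, as it would appear in query text.
print(decode_unicode_escapes(r"M\u0041TCH (m) RETURN m;"))
# prints: MATCH (m) RETURN m;
```

After this substitution, the query text reads as an ordinary MATCH clause, which is why the escaped form is accepted.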
The Unicode version used by Cypher depends on the running JVM version.
Neo4j version | JVM compliancy | Unicode version |
---|---|---|
3.x | Java SE 8 Platform Specification | Unicode 6.2 |
4.x | Java SE 11 Platform Specification | Unicode 10.0 |
5.x | Java SE 17 Platform Specification | Unicode 13.0 |
5.14 | Java SE 17 and Java SE 21 Platform Specification | Unicode 13.0 and Unicode 15.0 |
Supported whitespace
Whitespace can be used as a separator between keywords and has no semantic meaning. The following Unicode characters are treated as whitespace:
Description | List of included Unicode characters |
---|---|
Unicode general category Zp | \u2029 |
Unicode general category Zs | |
Unicode general category Zl | \u2028 |
Horizontal tabulation | \u0009 |
Line feed | \u000A |
Vertical tabulation | \u000B |
Form feed | \u000C |
Carriage return | \u000D |
File separator | \u001C |
Group separator | \u001D |
Record separator | \u001E |
Unit separator | \u001F |
Multiple whitespace characters can appear in a row; they have the same effect as a single whitespace character.
The following example query uses vertical tabulation (\u000B) as whitespace between the RETURN keyword and the variable m:
MATCH (m) RETURN\u000Bm;
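The whitespace rules above can be sketched in Python as a membership check plus a simple splitter. This is an illustration under stated assumptions, not Cypher's tokenizer: the names `is_listed_whitespace` and `split_on_whitespace` are hypothetical, the splitter only separates whitespace-delimited chunks, and category membership comes from Python's bundled Unicode database, which may differ from the Unicode version of the JVM running Neo4j.

```python
import unicodedata
from typing import List

# Control characters named in the whitespace table: tab, line feed,
# vertical tab, form feed, carriage return, and the four separators.
_CONTROL_WHITESPACE = set("\u0009\u000A\u000B\u000C\u000D\u001C\u001D\u001E\u001F")
_SEPARATOR_CATEGORIES = {"Zs", "Zl", "Zp"}


def is_listed_whitespace(ch: str) -> bool:
    """Return True if ch is a listed control character or in category Zs, Zl, or Zp."""
    return ch in _CONTROL_WHITESPACE or unicodedata.category(ch) in _SEPARATOR_CATEGORIES


def split_on_whitespace(query: str) -> List[str]:
    """Split query text on runs of whitespace; a run of any length acts as one separator."""
    tokens, current = [], []
    for ch in query:
        if is_listed_whitespace(ch):
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


# "\u000B" here is an actual vertical tabulation character in the Python string.
print(split_on_whitespace("MATCH (m) RETURN\u000Bm;"))
# prints: ['MATCH', '(m)', 'RETURN', 'm;']
```

Because runs of whitespace collapse into one separator, `"MATCH  \t (m) RETURN m;"` yields the same chunks as the single-spaced form.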
Supported newline characters
A newline character identifies a new line in the query and is also considered whitespace. The supported newline characters in Cypher are:
Description | List of included Unicode characters |
---|---|
Line feed | \u000A |
Carriage return | \u000D |
Carriage return + line feed | \u000D\u000A |
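As a rough illustration of the three newline forms, the Python sketch below splits query text into lines. The function name `split_query_lines` is hypothetical and the code is not taken from Cypher's parser; the one design point it shows is that the two-character sequence is tried first, so a carriage return followed by a line feed counts as a single line break.

```python
import re
from typing import List

# Try CR+LF before the single characters so the pair is one line break, not two.
_NEWLINE = re.compile(r"\r\n|\r|\n")


def split_query_lines(query: str) -> List[str]:
    """Split query text into lines on CR+LF, CR, or LF."""
    return _NEWLINE.split(query)


print(split_query_lines("MATCH (m)\r\nWHERE m.x = 1\rRETURN m\nLIMIT 1"))
# prints: ['MATCH (m)', 'WHERE m.x = 1', 'RETURN m', 'LIMIT 1']
```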