After several years of work, the GQL standard has been published. The path started by ISO in September 2019 hit its last major milestone in March 2024, with unanimous approval of the Final Draft International Standard (FDIS) ballot for GQL. This is a very exciting moment for everyone involved with graph databases. SQL is no longer the only ISO standard for database query languages: it now has a younger (and better-looking) sibling.
Two initial questions will likely spring to mind for any graph database practitioner: is GQL related to GraphQL, and what will happen to my Cypher queries?
The first question is easy to answer: no, GQL has nothing to do with GraphQL, in the same way that GraphQL has nothing to do with graphs. The unfortunate/cheeky/clever (pick one) name clash is just the first hurdle to overcome.
The second question requires a longer answer. Neo4j is fully committed to making the GQL standard a resounding success for the graph database industry. To do that, we want to make the transition as painless as possible for our customers. "What will happen to my Cypher queries?" is an important question to answer, and it is the subject of this blog post.
How GQL Impacts Cypher
Cypher is the property graph query language created by Neo4j. The intent at the time was to emulate SQL where possible but innovate where necessary. Its specification was open-sourced via the openCypher project around 2015 and implemented by several other graph products. It is undoubtedly the current de facto standard for property graph query languages. The overwhelming majority of graph database users write queries in Cypher. This blog post focuses on Neo4j’s proprietary Cypher implementation. A future post will discuss the future of the openCypher project in the age of GQL.
The GQL ISO standard drew inspiration from several existing languages (see the GQL Manifesto), with Cypher/openCypher being significant influences. As a result, Cypher and GQL are quite similar. Over the years, Cypher has proven its power as a graph query language through real-world usage. GQL has built upon Cypher’s strengths, incorporating tweaks to better align with SQL and ensure its long-term viability as a database language. Some of these improvements are exciting, and we’ll discuss them further below.
To provide users with the smoothest possible transition, Neo4j has decided to organically evolve the Cypher language toward GQL compliance: Neo4j will not offer a separate alternative query language to Cypher, but will make Cypher a GQL-compliant implementation.
The GQL standard, like the SQL standard, does not prevent language extensions. Anything not covered by the standard is fair game for implementers, much like it is for SQL. Keep in mind that this is the first version of the GQL standard, which, from a standing start, had to cover a lot of ground. Not every desired language feature made the cut. Even so, the GQL standard runs to more than 600 pages and is supported by over 430 technical papers (to put the numbers in context, the GQL:2024 standard has approximately the same number of pages as SQL-92). Big standards are hard to implement, and the wide availability of good implementations is an essential measure of success.
There are features in Cypher that did not make it into the standard and might come up in a future standard release, or not. Still, those Cypher features will remain available to Neo4j users and continue to be fully supported as part of our overall commitment to supporting Cypher. Their use has no negative impact on the GQL compliance of Neo4j. More Cypher extensions will likely be developed over time.
The GQL standard includes both mandatory and optional features. For an implementation to be considered conformant, it must support all mandatory features. However, the long-term expectation is that most GQL implementations will support not only the mandatory features but also most of the optional ones.
If this or future versions of the GQL standard specify features that aren’t implemented in Neo4j Cypher, we will consider adding those features based on customer priorities, just as we’ve always done with our products.
In summary, Cypher GQL compliance will not stop any existing Cypher query from working and will allow Cypher to keep evolving to satisfy users’ demands.
You Say Potayto, I Say Potahto [1]
GQL and Cypher are like different pronunciations of the same language.
GQL shares with Cypher the query execution model based on linear composition. It also shares the pattern-matching syntax that is at the heart of Cypher, as well as many of the Cypher keywords (which, as pointed out earlier, were originally sourced from SQL). Variable bindings are passed between statements to enable the chaining of multiple data fetching and updating operations. And since most of the statements are the same, many Cypher queries are also GQL queries. As a simple example, the well-trodden Cypher below is also GQL:
MATCH (a:Actor)-[:ACTED_IN]->(m:Movie) WHERE a.name = 'Tom Hanks' RETURN m.title
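To illustrate how variable bindings flow from one statement to the next, here is a minimal Cypher sketch that chains a fetch, an update, and a projection. The `reviewed` property is a hypothetical name introduced only for this example, and the sketch is shown as Cypher without any claim about which GQL conformance features it relies on.

```cypher
// Fetch: bind a and m for every matching path.
MATCH (a:Actor)-[:ACTED_IN]->(m:Movie)
WHERE a.name = 'Tom Hanks'
// Update: the m bound above is passed on to SET.
// `reviewed` is a hypothetical property used for illustration.
SET m.reviewed = true
// Project: the same binding is still available to RETURN.
RETURN m.title
```

Each clause consumes the bindings produced by the clause before it, which is what linear composition means in practice.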
While working on the ISO standard draft, we started implementing some of the early and exciting GQL features not yet in Cypher. Some of these features were already on our to-do list, and the standard gave us the opportunity to implement them in a GQL-compliant way. We did not call out GQL explicitly when they were released, but several recent improvements to graph pattern matching are GQL features that have already been added to Neo4j: WHERE clauses inside node expressions (Neo4j 4.4) and inside relationship expressions (Neo4j 5.0), richer label expressions (Neo4j 5.0), and more sophisticated repetition of patterns with quantified path patterns (Neo4j 5.9).
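The pattern-matching features listed above can be sketched as follows, reusing the Actor/Movie model from the earlier query. The `Director` label and the `year` relationship property are assumptions made for illustration; they do not come from this post.

```cypher
// WHERE inside a node pattern (Neo4j 4.4 and later).
MATCH (a:Actor WHERE a.name = 'Tom Hanks')-[:ACTED_IN]->(m:Movie)
RETURN m.title;

// WHERE inside a relationship pattern (Neo4j 5.0 and later).
// `year` is a hypothetical relationship property.
MATCH (a:Actor)-[r:ACTED_IN WHERE r.year > 2000]->(m:Movie)
RETURN a.name, m.title;

// Label expression (Neo4j 5.0 and later): nodes labeled Actor or Director.
// `Director` is a hypothetical label.
MATCH (p:Actor|Director)
RETURN p.name;

// Quantified path pattern (Neo4j 5.9 and later): co-actors reachable
// through one or two shared-movie hops.
MATCH (a:Actor WHERE a.name = 'Tom Hanks')
      (()-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-()){1,2}
      (b:Actor)
RETURN DISTINCT b.name;
```

The quantifier `{1,2}` repeats the parenthesized path pattern between one and two times, so a single query covers both direct co-stars and co-stars of co-stars.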
We have also implemented some more minor but valuable GQL additions, such as the Unicode normalize() function and normalization predicates (Neo4j 5.17).
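As a small sketch of the normalization features (assuming Neo4j 5.17 or later), the queries below use the Angstrom sign U+212B, whose NFC form is the single character U+00C5. The column aliases are arbitrary names chosen for this example.

```cypher
// normalize() uses NFC by default, so the Angstrom sign should
// compare equal to U+00C5 after normalization.
RETURN normalize('\u212B') = '\u00C5' AS sameAfterNfc;

// A normal form can also be named explicitly (NFC, NFD, NFKC, NFKD).
RETURN normalize('\u212B', NFD) AS decomposed;

// Normalization predicate: tests whether a string is already normalized.
WITH 'the \u212B char' AS s
RETURN s IS NORMALIZED AS alreadyNfc,
       s IS NOT NORMALIZED AS needsNormalizing;
```

Normalizing before comparison is useful when the same visible text can arrive in different Unicode encodings, for example from different input sources.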
We have also started accommodating the GQL standard by offering SQL-like synonyms of the Cypher terminology. For example, you can now (Neo4j 5.18) INSERT nodes and relationships and get the same results you would get if you CREATE nodes and relationships. You will notice that GQL tries to unify some of the terminology with SQL. But don’t despair: the existing Cypher terminology is here to stay, so you only need to change if you want to.
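A minimal sketch of the two spellings side by side, assuming Neo4j 5.18 or later. The movie title is an illustrative value, not something taken from this post.

```cypher
// Existing Cypher spelling.
CREATE (a:Actor {name: 'Tom Hanks'})-[:ACTED_IN]->(m:Movie {title: 'Cast Away'});

// SQL-like GQL spelling, producing the same kind of graph elements.
INSERT (a:Actor {name: 'Tom Hanks'})-[:ACTED_IN]->(m:Movie {title: 'Cast Away'});
```

Running both statements creates the data twice, since neither checks for existing nodes, so in practice you would pick one spelling per statement.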
And obviously, there are more GQL features on their way.
Some changes will involve Cypher users a bit more. A few GQL features might modify aspects of existing queries’ behavior (e.g., different error codes). As such, we classify these GQL features as possible breaking changes. We are working hard to introduce these GQL changes in the Neo4j product in the least disruptive way possible.
The Future is Bright for GQL and Cypher
In conclusion, Neo4j is committed to making the Cypher implementation GQL compliant and doing that incrementally and non-disruptively. Existing Neo4j users already have access to the majority of the GQL features, and more of them are being continuously added to the Cypher language. Cypher queries will keep working, and the language will become better, more powerful, and more GQL-compliant.
GQL is born; long live Cypher!
To learn more, the following blogs and documents provide additional information about the GQL standard, Neo4j Cypher, and openCypher:
[1] Or, since we are talking about precise standards: you say /ˌpəˈteɪtoʊ/, I say /ˌpəˈtɑːtoʊ/.