Better, Faster, and More Scalable Neo4j than ever before
Neo4j 2.2 aims to be our fastest and most scalable release ever. With Neo4j 2.2 our engineering team introduces massive enhancements to the internal architecture resulting in higher performance and scalability.
This first milestone (or beta release) pulls all of these new elements together, so that you can “dial it up to 11” with your applications.
You can download it here for your testing.
Three of the key areas being tackled in this release are:
1. Highly Concurrent Performance
With Neo4j 2.2, we introduce a brand new page cache designed to deliver extreme performance and scalability under highly concurrent workloads. This new page cache helps overcome the limitations of the current IO subsystem, so that it can support larger applications with hundreds of concurrent read and/or write IO requests. The new cache is auto-configured and matches the available memory, so there is no longer any need to tune memory-mapped IO settings.
2. Transactional & Batch Write Performance
We have made several enhancements in Neo4j 2.2 to improve both transactional and batch write performance by orders of magnitude under highly concurrent load. Several things are changing to make this happen.
- First, the 2.2 release improves coordination of commits between Lucene, the graph, and the transaction log, resulting in a much more efficient write channel.
- Next, the database kernel is enhanced to optimize the flushing of transactions to disk for a high number of concurrent write threads. This allows throughput to improve significantly with more write threads, since IO costs are spread across transactions. Applications with many small transactions being piped through large numbers (10-100+) of concurrent write threads will experience the greatest improvement.
- Finally, we have improved and fully integrated the “Superfast Batch Loader”. Introduced in Neo4j 2.1, this utility now supports large scale non-transactional initial loads (of 10M to 10B+ elements) with sustained throughputs around 1M records (nodes, relationships, or properties) per second. This seriously fast utility is (unsurprisingly) called neo4j-import, and is accessible from the command line.
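To make the second point above more concrete, here is a minimal back-of-the-envelope sketch of why sharing one disk flush among many pending commits spreads the IO cost. It is not Neo4j's implementation: the function names, the batch size, and the 10 ms flush cost are all assumptions chosen for illustration.

```python
import math


def flushes_needed(num_transactions: int, batch_size: int) -> int:
    """Disk flushes required when up to batch_size commits share one flush.

    batch_size is a hypothetical stand-in for how many concurrent write
    threads have a commit pending at the moment a flush happens.
    """
    if num_transactions < 0 or batch_size < 1:
        raise ValueError("need num_transactions >= 0 and batch_size >= 1")
    return math.ceil(num_transactions / batch_size)


def flush_cost_per_transaction(num_transactions: int, batch_size: int,
                               flush_cost_ms: float = 10.0) -> float:
    """Average flush cost (ms) carried by each transaction.

    flush_cost_ms is an assumed, illustrative figure, not a measurement.
    """
    if num_transactions == 0:
        return 0.0
    total_ms = flushes_needed(num_transactions, batch_size) * flush_cost_ms
    return total_ms / num_transactions


if __name__ == "__main__":
    # One flush per commit versus one flush shared by 100 pending commits.
    print(flush_cost_per_transaction(1000, batch_size=1))
    print(flush_cost_per_transaction(1000, batch_size=100))
```

Under these assumed numbers, the per-transaction flush cost falls as more commits share a flush, which is the intuition behind the improvement for many small transactions on many write threads.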
3. Cypher Performance
We’re very excited to be releasing the first version of a new Cost-Based Optimizer for Cypher, under development for nearly a year. While Cypher is hands-down the most convenient way to formulate queries, it hasn’t always been as fast as we’d like. Starting with Neo4j 2.2, Cypher will determine the optimal query plan by using statistics about your particular data set. Both the cost-based query planner and the ability of the database to gather statistics are new, and we’re very interested in your feedback. Sample queries & data sets are welcome!
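As a rough intuition for what planning from statistics means, the toy sketch below picks the starting point of a pattern by choosing the label with the fewest nodes. It is only an illustration, not the actual Cypher planner: the function name, the labels, and the counts are hypothetical, and a real optimizer weighs far more than label cardinality.

```python
def choose_start_label(pattern_labels, label_counts):
    """Return the label in the pattern with the lowest node count.

    label_counts plays the role of gathered statistics (hypothetical data);
    starting from the rarest label means fewer candidate nodes to expand.
    """
    if not pattern_labels:
        raise ValueError("pattern must mention at least one label")
    return min(pattern_labels, key=lambda label: label_counts[label])


if __name__ == "__main__":
    # Made-up statistics for a made-up data set.
    stats = {"Person": 1_000_000, "Company": 5_000}
    # Pattern: (p:Person)-[:WORKS_AT]->(c:Company)
    print(choose_start_label(["Person", "Company"], stats))
```

With these made-up statistics the sketch starts from Company. A rule-based planner with a fixed preference could not adapt in this way if the data distribution were reversed.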
Despite the strong focus on performance & scalability, we delivered some functional improvements too:
Cypher Profiling
As part of the work on the Cypher planner, we extended the profiling output in the neo4j-shell. You can now choose to only EXPLAIN or fully PROFILE a query by simply prefixing it with one of these keywords. You can also manually select a query planner with the CYPHER 2.2-cost and CYPHER 2.2-rule prefixes.
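Because these are plain text prefixes, a client can add them with simple string handling. The helper below is a hypothetical convenience function (not part of any Neo4j driver) that prepends exactly one of the prefixes named above; the example query is also made up. Combining several prefixes is deliberately left out, since the accepted ordering is not described here.

```python
# Prefixes taken from the text above; everything else is illustrative.
ALLOWED_PREFIXES = ("EXPLAIN", "PROFILE", "CYPHER 2.2-cost", "CYPHER 2.2-rule")


def with_prefix(query: str, prefix: str) -> str:
    """Return the query with one of the supported prefixes prepended."""
    if prefix not in ALLOWED_PREFIXES:
        raise ValueError(f"unsupported prefix: {prefix!r}")
    return f"{prefix} {query.strip()}"


if __name__ == "__main__":
    query = "MATCH (n) RETURN count(n)"
    print(with_prefix(query, "EXPLAIN"))          # show the plan only
    print(with_prefix(query, "PROFILE"))          # run and profile
    print(with_prefix(query, "CYPHER 2.2-rule"))  # pick a planner
```

The resulting strings can be pasted into the neo4j-shell as they are.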
Neo4j Browser UI
Many small improvements have been made to the UI, including panning, and the ability to kill a running Cypher query. (Query killing is also supported in the Neo4j Shell using CTRL-C.) Please explore these and tell us what you think of them. The Neo4j Browser also handles long running queries more reliably, which is especially important for imports with LOAD CSV.
Basic Authentication
We’ve received requests for a variety of authentication features, and while these are largely planned for the next (Neo4j 2.3) release, we are very pleased to introduce token-based authentication in Neo4j 2.2.
This is enabled by default in Neo4j 2.2 M, which means you must either (a) explicitly disable it in conf/neo4j-server.properties (if you have assured security using another means), or (b) change your app to use the authentication token.
This default setting in the milestone release is by design, to ensure we get your indispensable feedback!
Some things to be aware of
As with all milestones: do not run this in production. This is an early access version to help you prepare for the upcoming production release and provide feedback on the new features. Make sure you refer to the manual for information about new features (especially token-based authentication, neo4j-import, and Cypher optimization and statistics gathering).
We are eager to hear your feedback. Please post it to the Neo4j Google Group, or send us a direct email at feedback@neotechnology.com.
Enjoy, and please tell us what you discover!
Philip Rathle, VP Product
on behalf of the Neo4j Team