Statistics and execution plans
When a Cypher query is issued, it gets compiled to an execution plan that can run and answer the query. The Cypher query engine uses the available information about the database, such as schema information about which indexes and constraints exist in the database. This page describes how to configure the Neo4j statistics collection and the query replanning in the Cypher query engine.
Neo4j also uses statistical information about the database to optimize the execution plan. For more information, see Cypher Manual → Query tuning and Cypher Manual → Execution plans.
Configure statistics collection
The Cypher query planner depends on accurate statistics to create efficient plans, so Neo4j keeps these statistics up to date as the database evolves.
For each database in the DBMS, Neo4j collects and maintains the following statistical information:
- For graph entities
  - The number of nodes with a certain label.
  - The number of relationships by type.
  - The number of relationships by type between nodes with a specific label.

  These numbers are updated whenever you set or remove a label from a node.
- For database schema
  - Selectivity per index.

  To produce a selectivity number, Neo4j runs a full index scan in the background. Because this could potentially be a very time-consuming operation, a full index scan is triggered only when the changed data reaches a specified threshold.
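As a rough illustration of the graph-entity counts listed above, the following queries compute the same kinds of numbers directly. The `Person` label and the `KNOWS` relationship type are hypothetical names chosen for this sketch; substitute names from your own graph.

```cypher
// Number of nodes with a certain label (hypothetical label: Person).
MATCH (n:Person)
RETURN count(n) AS nodesWithLabel;

// Number of relationships by type (hypothetical type: KNOWS).
MATCH ()-[r:KNOWS]->()
RETURN count(r) AS relationshipsOfType;

// Number of relationships by type between nodes with a specific label.
MATCH (:Person)-[r:KNOWS]->(:Person)
RETURN count(r) AS relationshipsOfTypeBetweenLabels;
```

These queries only show what each statistic counts. The planner relies on the maintained statistics rather than on running queries like these.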
Automatic statistics collection
You can control whether and how often statistics are collected automatically by configuring the following settings:
Parameter name | Default value | Description |
---|---|---|
 | | Enable the automatic (background) index sampling. |
 | | Percentage of the total index size that must be updated before sampling of a given index is triggered. |
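The percentage-based trigger can be sketched as plain arithmetic. All numbers below are made up, and the comparison is an assumed reading of the description above, not a statement of how Neo4j implements the check internally.

```cypher
// Illustrative arithmetic only; every value here is hypothetical.
WITH 200000 AS indexSize,          // hypothetical number of entries in the index
     12000 AS updatesSinceSample,  // hypothetical updates since the last sampling
     5 AS thresholdPercent         // hypothetical update percentage threshold
RETURN 100.0 * updatesSinceSample / indexSize AS changedPercent,
       100.0 * updatesSinceSample / indexSize >= thresholdPercent AS samplingTriggered;
```

With these numbers, 6 percent of the index has changed, which exceeds the 5 percent threshold, so sampling would be triggered under this reading.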
Manual statistics collection
You can manually trigger index resampling by using the built-in procedures db.resampleIndex() and db.resampleOutdatedIndexes().
db.resampleIndex()
- Trigger resampling of a specified index.
CALL db.resampleIndex("indexName")
db.resampleOutdatedIndexes()
- Trigger resampling of all outdated indexes.
CALL db.resampleOutdatedIndexes()
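Because db.resampleIndex() takes an index name, it can help to look the name up first. The following is a minimal sketch, assuming a Neo4j version that supports `SHOW INDEXES` with `YIELD`; the index name `person_name_index` is hypothetical.

```cypher
// List the existing indexes to find the name to pass to db.resampleIndex().
SHOW INDEXES YIELD name, type, labelsOrTypes, properties;

// Resample one index by name ("person_name_index" is a hypothetical name).
CALL db.resampleIndex("person_name_index");
```

If many indexes are affected, for example after a large import, calling db.resampleOutdatedIndexes() once is likely simpler than resampling each index by name.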
Configure the replanning of execution plans
Execution plans are cached and are not replanned until the statistical information used to produce the plan changes.
Automatic replanning
You can control how sensitive the replanning should be to database updates by configuring the following settings:
Parameter name | Default value | Description |
---|---|---|
 | | The threshold for statistics above which a plan is considered stale. |
 | | The minimum amount of time between two query replanning executions. After this time, the graph statistics are evaluated, and if they have changed more than the threshold described in the row above, the plan is considered stale. |
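The interaction of the two settings can be sketched as follows. The counts, the threshold, and the relative-change formula are all assumptions made for illustration; the measure that Neo4j uses internally to compare statistics may differ.

```cypher
// Illustrative only: made-up values and an assumed relative-change formula.
WITH 10000 AS countAtPlanning,    // hypothetical statistic when the plan was cached
     14000 AS countNow,           // hypothetical value of the same statistic now
     0.25 AS threshold,           // hypothetical staleness threshold
     true AS minIntervalElapsed   // assume the minimum replan interval has passed
WITH abs(countNow - countAtPlanning) * 1.0 / countAtPlanning AS relativeChange,
     threshold, minIntervalElapsed
RETURN relativeChange,
       minIntervalElapsed AND relativeChange > threshold AS planIsStale;
```

Here the statistic has changed by 40 percent, which is above the 25 percent threshold, so the cached plan would count as stale once the minimum interval has elapsed.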
Manual replanning
You can manually force the database to replan the execution plans that are already in the cache by using the following built-in procedures:
db.clearQueryCaches()
- Clear all query caches. Does not change the database statistics.
CALL db.clearQueryCaches()
db.prepareForReplanning()
- Completely recalculate all database statistics to be used for any subsequent query planning. The procedure triggers an index resampling, waits for it to complete, and clears all query caches. Afterwards, queries are planned based on the latest database statistics.
CALL db.prepareForReplanning()
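Because db.prepareForReplanning() waits for resampling to complete, it can take a while on large databases. The sketch below assumes that the procedure accepts an optional timeout in seconds, as listed in the procedures reference for some versions; verify the signature for your version before relying on it. The value 600 is arbitrary.

```cypher
// Recalculate statistics and clear the query caches, waiting up to
// 600 seconds for index resampling (timeout argument assumed available).
CALL db.prepareForReplanning(600);
```

A typical moment to run this is after a bulk data load, so that the first queries afterwards are planned against fresh statistics.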
You can use Cypher replanning to specify whether to force a replan, even if the existing plan is valid according to the planning rules, or to skip replanning entirely and use a valid plan that already exists.
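A minimal sketch of both choices, assuming the `CYPHER replan=...` query option described in the Cypher Manual; the `Person` label and `name` property are hypothetical.

```cypher
// Force a fresh plan for this query, even if a valid cached plan exists.
CYPHER replan=force
MATCH (p:Person {name: "Alice"})
RETURN p;

// Reuse a valid cached plan if one exists and skip replanning.
CYPHER replan=skip
MATCH (p:Person {name: "Alice"})
RETURN p;
```

Forcing a replan affects only the query it is prefixed to, which makes it a narrower tool than clearing all query caches.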
For more information, see: