What is Neo4j?

neo4j architecture diagram

Neo4j is a native graph database, which means that it implements a true graph model all the way down to the storage level. Instead of using a "graph abstraction" on top of another technology, the data is stored in Neo4j in the same way you may whiteboard your ideas.
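To make the whiteboard analogy concrete, the following is a minimal in-memory sketch of a property graph in Python. It only illustrates the data model (nodes with labels and properties, connected by typed relationships) and is not how Neo4j stores data internally. The class names `Node`, `Relationship`, and `Graph`, and the `Person`/`KNOWS` example data, are assumptions made for this illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """An entity with zero or more labels and key-value properties."""
    labels: set[str] = field(default_factory=set)
    properties: dict[str, object] = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed connection from a source node to a target node."""
    type: str
    source: Node
    target: Node
    properties: dict[str, object] = field(default_factory=dict)


@dataclass
class Graph:
    nodes: list[Node] = field(default_factory=list)
    relationships: list[Relationship] = field(default_factory=list)

    def add_node(self, *labels, **properties):
        node = Node(set(labels), dict(properties))
        self.nodes.append(node)
        return node

    def relate(self, source, rel_type, target, **properties):
        rel = Relationship(rel_type, source, target, dict(properties))
        self.relationships.append(rel)
        return rel

    def neighbors(self, node, rel_type):
        """Follow outgoing relationships of one type from a node."""
        return [r.target for r in self.relationships
                if r.source is node and r.type == rel_type]


# Hypothetical example data: two people connected by a KNOWS relationship.
graph = Graph()
alice = graph.add_node("Person", name="Alice")
bob = graph.add_node("Person", name="Bob")
graph.relate(alice, "KNOWS", bob, since=2020)
friend_names = [n.properties["name"] for n in graph.neighbors(alice, "KNOWS")]
```

Here `friend_names` holds the names reached by following `KNOWS` relationships from `alice`, which mirrors how you would trace an arrow between two circles on a whiteboard.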

Since 2007, Neo4j has evolved into a rich ecosystem of tools, applications, and libraries. This ecosystem lets you integrate graph technologies with your working environment in several ways, which are described on this page.

Beyond the core graph, Neo4j also provides ACID transactions, cluster support, and runtime failover.

Neo4j is written in Java and Scala. You can check the source code on GitHub.

How to interact with Neo4j

Neo4j uses Cypher®, a declarative query language similar to SQL, but optimized for graphs. The same language is also used by other databases such as SAP HANA Graph via the openCypher project.

Another option is to use libraries. Neo4j currently supports Java, JavaScript, .NET, Python, Go, GraphQL, Spring, and more.
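As a rough sketch of how Cypher and a library fit together, the example below pairs a Cypher query with the Neo4j Python driver. It assumes the `neo4j` package is installed and a driver version that provides `execute_query` (5.x). The `Person` label, the `KNOWS` relationship type, and the helper names `connect` and `friends_of` are hypothetical and chosen only for illustration.

```python
# Cypher describes the pattern to match; $name is a query parameter.
FRIENDS_QUERY = """
MATCH (p:Person {name: $name})-[:KNOWS]->(friend:Person)
RETURN friend.name AS friend
ORDER BY friend
"""


def connect(uri, user, password):
    """Open a driver. Requires the `neo4j` package (pip install neo4j)."""
    # Imported lazily so the rest of this sketch loads without the package.
    from neo4j import GraphDatabase
    return GraphDatabase.driver(uri, auth=(user, password))


def friends_of(driver, name, database="neo4j"):
    """Return the names of the people that `name` KNOWS."""
    # Extra keyword arguments (here: name) are passed as query parameters.
    records, summary, keys = driver.execute_query(
        FRIENDS_QUERY, name=name, database_=database
    )
    return [record["friend"] for record in records]
```

With a running instance, you could call `friends_of(connect("neo4j://localhost:7687", "neo4j", "<password>"), "Alice")`. The URI and credentials shown are placeholders, not defaults you can rely on.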

Create a Neo4j instance

Deploying a database is the first step towards exploring Neo4j. Select a deployment method that suits your project from the following options:

Fully managed cloud service

Neo4j AuraDB is a fully managed cloud service that allows you to start exploring Neo4j right from your browser.

If you are a data scientist, you might also want to check Neo4j AuraDS and get access to more than 65 pretuned graph algorithms.

Neo4j Aura has both free and subscription-based editions. See full comparison.

Self-managed cloud services

You can also deploy your graph database on a cloud platform of your choice. Neo4j works with Amazon Web Services (AWS), Google Cloud (GCP), and Microsoft Azure.

For self-managed cloud services, you need to install Neo4j locally or, if your project is not in a production environment, use Neo4j Desktop.

Neo4j is available for installation on Linux, macOS, and Windows.

Self-managed local deployment

If you prefer to work with a local deployment, install Neo4j Desktop (if you are not working in a production environment) or install Neo4j locally.

Neo4j on Docker

Neo4j can run in a Docker container. The official Neo4j image on Docker Hub provides a standard, ready-to-run package of Neo4j Community Edition and Enterprise Edition for a variety of versions. It is available for macOS, Windows, and Linux.
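As an illustration, the following Python sketch assembles a typical `docker run` command line for the official `neo4j` image. It assumes the commonly documented defaults: port 7474 for HTTP, port 7687 for Bolt, the `NEO4J_AUTH` environment variable for the initial password, and `/data` as the data directory inside the container. The helper name `docker_run_command` and the example password are hypothetical.

```python
import shlex


def docker_run_command(password, tag="latest", data_dir=None):
    """Build a `docker run` argument list for the official `neo4j` image."""
    args = [
        "docker", "run", "--rm",
        "--publish", "7474:7474",  # HTTP, used by Neo4j Browser
        "--publish", "7687:7687",  # Bolt, used by the drivers
        "--env", f"NEO4J_AUTH=neo4j/{password}",
    ]
    if data_dir is not None:
        # Keep the database files on the host so they survive the container.
        args += ["--volume", f"{data_dir}:/data"]
    args.append(f"neo4j:{tag}")
    return args


command = docker_run_command("example-password")
# Print the command so it can be reviewed before running it in a shell.
print(shlex.join(command))
```

Building the argument list as data keeps the sketch side-effect free. To actually start the container, you would pass `command` to `subprocess.run` or paste the printed line into a terminal.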

Neo4j on Kubernetes

With Neo4j Helm charts, you can deploy both a standalone and a cluster deployment of Neo4j on Kubernetes, and use configuration options suitable for the most common scenarios.

Neo4j has free and subscription-based licensing options. Read more about the available editions.

Work with data

After creating your database, your learning can take different paths depending on whether you want to work with your own data or use Neo4j’s example datasets:

  • Own data: There are several ways to import data to Neo4j and to model it for a better experience.

  • Example datasets: Both Aura and Neo4j Browser feature embedded guides that allow you to create example datasets and start querying. To access them, use the graduation cap icon in the top-right section of Aura, or enter :guide in Neo4j Browser.
    You can also download the example datasets and then import them into your instance.

Neo4j tools

Neo4j has a catalogue of tools that can be used for various ends such as database administration, data visualization, and more. You can check all products in the Tools hub.

Supported libraries

Neo4j supports several of the most popular query languages and also offers proprietary libraries for a customized experience:

  • The Neo4j Graph Data Science (GDS) library provides implementations of common graph algorithms and machine learning pipelines to train predictive supervised models. You can use them to solve graph problems, such as predicting missing relationships.

  • The Object Graph Mapping (OGM) library maps nodes and relationships in the graph to objects and references in a domain model. You can use this resource to start tracking changes and minimize necessary updates and transitive persistence (reading and updating neighborhoods of an object).

APIs

Neo4j currently offers three proprietary APIs:

  • The Neo4j HTTP API allows you to execute a series of Cypher statements against a Neo4j instance through HTTP requests.

  • The Change Data Capture (CDC) API allows you to capture and track changes to your database in real time, as well as keep data sources up to date.

  • The Neo4j Query API allows you to develop client applications in languages not currently supported by Neo4j.
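To illustrate the HTTP-based APIs, the sketch below prepares (but does not send) a request that runs one parameterized Cypher statement. It assumes the transaction-commit endpoint `/db/<database>/tx/commit` and the `statements` payload shape used by the Neo4j HTTP API in versions 4 and 5, plus HTTP Basic authentication. The helper name `build_cypher_request`, the host, and the credentials are hypothetical placeholders.

```python
import base64
import json
import urllib.request


def build_cypher_request(base_url, database, user, password,
                         statement, parameters=None):
    """Prepare a POST request that runs one Cypher statement and commits."""
    payload = {
        "statements": [
            {"statement": statement, "parameters": parameters or {}}
        ]
    }
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url}/db/{database}/tx/commit",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


request = build_cypher_request(
    "http://localhost:7474", "neo4j", "neo4j", "example-password",
    "MATCH (p:Person {name: $name}) RETURN p.name AS name",
    {"name": "Alice"},
)
```

Against a running instance, `urllib.request.urlopen(request)` would send the request and return the JSON response. Because only standard HTTP and JSON are involved, the same request can be reproduced from any language with an HTTP client.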

At Neo4j Labs, you can find experimental projects including APIs, libraries, and visualization tools.

Keep learning

To learn more about what a graph database is and the concepts behind the technology, continue reading the documentation or browse other curated resources.

You can also reach out to other members of the Neo4j community on the Neo4j Community Site.

Glossary

label

Marks a node as a member of a named and indexed subset. A node may be assigned zero or more labels.

node

A node represents an entity or discrete object in your graph data model. Nodes can be connected by relationships, hold data in properties, and are classified by labels.

relationship

A relationship represents a connection between nodes in your graph data model. Relationships connect a source node to a target node, hold data in properties, and are classified by type.

property

Properties are key-value pairs that are used for storing data on nodes and relationships.

cluster

A Neo4j DBMS that spans multiple servers working together to increase fault tolerance and/or read scalability. Databases on a cluster may be configured to replicate across servers in the cluster, thus achieving read scalability or high availability.

graph

A logical representation of a set of nodes where some pairs are connected by relationships.

schema

The prescribed property existence and datatypes for nodes and relationships.

database schema

The prescribed property existence and datatypes for nodes and relationships.

indexes

Data structure that improves read performance of a database. Read more about supported categories of indexes.

constraints

Constraints are sets of data modeling rules that ensure the data is consistent and reliable. See what constraints are available in Cypher.