Volume mounts and persistent volumes
The Neo4j Helm chart uses volume mounts and persistent volumes to manage the storage of data and other Neo4j files.
Volume mounts
A volume mount is part of a Kubernetes Pod spec that describes how and where a volume is mounted within a container.
The Neo4j Helm chart creates the following volume mounts:
- backups, mounted at /backups
- data, mounted at /data
- import, mounted at /import
- licenses, mounted at /licenses
- logs, mounted at /logs
- metrics, mounted at /metrics (Neo4j Community Edition does not generate metrics.)
It is also possible to specify a plugins volume mount (mounted at /plugins), but this is not created by the default Helm chart.
For more information, see Add plugins using a plugins volume.
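To make the pod-spec relationship concrete, the following is a minimal, illustrative fragment (not literal chart output; the PVC name is an assumption) showing how a volume mount wires a named volume into the Neo4j container:

# Illustrative Kubernetes pod spec fragment, not generated by the chart
spec:
  containers:
    - name: neo4j
      volumeMounts:
        - name: data         # refers to a volume defined below
          mountPath: /data   # where the volume appears inside the container
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: data-my-neo4j-0   # hypothetical PVC name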
Persistent volumes
A PersistentVolume (PV) is a storage resource in the Kubernetes cluster with a lifecycle independent of any individual pod that uses it.
A PersistentVolumeClaim (PVC) is a user's request for a storage resource.
PVCs consume PV resources.
For more information about what PVs are and how they work, see the Kubernetes official documentation.
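You can inspect the PVs and PVCs in a cluster at any time, for example:

# List all persistent volumes (cluster-scoped) and the claims in the current namespace
kubectl get pv
kubectl get pvc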
The type of PV used and its configuration can have a significant effect on the performance of Neo4j. Some PV types are not suitable for use with Neo4j at all.
The volume type used for the data volume mount is particularly important.
Neo4j supports the following PV types for the data volume mount:
- persistentVolumeClaim
- hostPath, when using Docker Desktop [1]

The Neo4j data volume mount does not support azureFile and nfs.
For volume mounts other than the data volume mount, all PV types are generally presumed to work.
It is also not recommended to use an HDD or cloud storage, such as AWS S3 mounted as a drive.
Mapping volume mounts to persistent volumes
By default, the Neo4j Helm chart uses a single PV, named data, to support all of the volume mounts.
The volume used for each volume mount can be changed by modifying the volumes.<volume name> object in the Helm chart values.
The Neo4j Helm chart volumes object supports different modes, such as dynamic, share, defaultStorageClass, volume, selector, and volumeClaimTemplate.
You can also set labels at creation time on volumes with mode dynamic, defaultStorageClass, selector, or volumeClaimTemplate; these labels can be used to filter the PVs used for the volume mount, as shown below.
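For example, with the labels data: "true" used in the examples below, the matching PVs can be listed with a label selector:

# List PVs carrying the label applied via the chart's 'labels' field
kubectl get pv -l data=true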
mode: dynamic (Recommended)

Description:
Dynamic volumes are recommended for most production workloads due to ease of management. The volume mount is backed by a PV that Kubernetes dynamically provisions using a dedicated StorageClass. The StorageClass is specified in the storageClassName field.

Example:
The data volume uses a dedicated storage class:

storage-class-values.yaml
neo4j:
  name: standalone-with-storage-class
volumes:
  data:
    labels:
      data: "true"
    mode: dynamic
    dynamic:
      storageClassName: "neo4j-data"
      requests:
        storage: 10Gi

See Provision a PV using a dedicated StorageClass for more information.
mode: share
Description:
The volume mount shares the underlying volume of one of the other volume objects.

Example:
The logs volume mount uses the data volume (this is the default behavior):

volumes:
  logs:
    mode: "share"
    share:
      name: "data"
mode: defaultStorageClass
Description:
The volume mount is backed by a PV that Kubernetes dynamically provisions using the default StorageClass.

Example:
A dynamically provisioned data volume with a size of 10Gi:

volumes:
  data:
    labels:
      data: "true"
    mode: "defaultStorageClass"
    defaultStorageClass:
      requests:
        storage: 10Gi

For the data volume, if requests.storage is not set, defaultStorageClass defaults to a 10Gi volume. For all other volumes, defaultStorageClass.requests.storage must be set explicitly when using defaultStorageClass mode.
mode: volume
Description:
A complete Kubernetes volume object can be specified for the volume mount. Volumes specified in this way generally have to be manually provisioned. The volume can be any valid Kubernetes volume type. This mode is typically used to mount a pre-existing PersistentVolumeClaim (PVC). For details on how to specify volume objects, see the Kubernetes documentation.

Set file permissions on mounted volumes:
The Neo4j Helm chart supports an additional field not present in normal Kubernetes volume objects: setOwnerAndGroupWritableFilePermissions: true|false. If set to true, an initContainer is run to modify the file permissions of the mounted volume so that its contents can be written and read by the Neo4j process. This helps with certain volume implementations that are not aware of the SecurityContext set on the pods using them.

Example - reference an existing PersistentVolumeClaim:
The backups volume mount is backed by the specified PVC. When this method is used, the persistentVolumeClaim object must already exist:

volumes:
  backups:
    mode: volume
    volume:
      persistentVolumeClaim:
        claimName: my-neo4j-pvc
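If the volume implementation needs the permissions fix described above, a sketch combining it with the same PVC (the placement of the chart-specific field alongside the volume definition is an assumption based on the description above):

volumes:
  backups:
    mode: volume
    volume:
      setOwnerAndGroupWritableFilePermissions: true  # chart-specific field; runs an initContainer to fix permissions
      persistentVolumeClaim:
        claimName: my-neo4j-pvc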
mode: selector
Description:
The volume to use is chosen from the existing PVs based on the provided selector object, and a PVC is dynamically generated. If no matching PVs exist, the Neo4j pod will be unable to start. To match, a PV must have the specified StorageClass, match the selectorTemplate label selector, and have sufficient storage capacity to meet the requested storage amount.

Example:
The data volume is chosen from the available volumes with the neo4j storage class and the label developer: alice:

volumes:
  data:
    labels:
      data: "true"
    mode: selector
    selector:
      storageClassName: "neo4j"
      requests:
        storage: 128Gi
      selectorTemplate:
        matchLabels:
          developer: "alice"
mode: volumeClaimTemplate
Description:
A complete Kubernetes volumeClaimTemplate object is specified for the volume mount. Volumes specified in this way are dynamically provisioned.

Example - provision Neo4j storage using a volume claim template:
The data volume uses a dynamically provisioned PVC from the default storage class:

volumes:
  data:
    labels:
      data: "true"
    mode: volumeClaimTemplate
    volumeClaimTemplate:
      storageClassName: "default"
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi

In all cases, do not forget to set the mode field when customizing the volumes object. If mode is not set, the default mode is used, regardless of the other properties set on the volume object.
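As a cautionary sketch, the following fragment omits mode, so the chart falls back to the default mode and the settings below have no effect:

volumes:
  data:
    # WRONG: 'mode' is not set, so the default mode is used and
    # the 'dynamic' block below is ignored.
    dynamic:
      storageClassName: "neo4j-data"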
Provision persistent volumes with Neo4j Helm chart
Provision persistent volumes dynamically
With the Neo4j Helm chart, you can provision a PV dynamically using the default or a custom StorageClass.
To see a list of available storage classes in your Kubernetes cluster, run the following command:
kubectl get storageclass
Provision a PV using a dedicated StorageClass (Recommended)
For production workloads, it is recommended to create a dedicated storage class for Neo4j, which uses the Retain reclaim policy.
This avoids data loss from disks being deleted when the persistent volume resource is removed.
Example: Deploy Neo4j using a dedicated StorageClass

The following example shows how to deploy a Neo4j server with a dynamically provisioned PV that uses a dedicated StorageClass.

1. Create a dedicated storage class that uses the Retain reclaim policy.

   GKE: Create a storage class that uses the Retain reclaim policy and pd-ssd high-performance SSD disks:

   cat <<EOF | kubectl apply -f -
   apiVersion: storage.k8s.io/v1
   kind: StorageClass
   metadata:
     name: neo4j-data
   provisioner: pd.csi.storage.gke.io
   parameters:
     type: pd-ssd
   reclaimPolicy: Retain
   volumeBindingMode: WaitForFirstConsumer
   allowVolumeExpansion: true
   EOF

   Check that the storage class is created:

   kubectl get storageclass neo4j-data
   NAME         PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
   neo4j-data   pd.csi.storage.gke.io   Retain          WaitForFirstConsumer   true                   7s
   EKS: Create a storage class that uses the Retain reclaim policy and gp3 high-performance SSD disks. The EBS CSI Driver addon is required to provision EBS disks in EKS clusters; see the AWS documentation for instructions on installing the driver.

   cat <<EOF | kubectl apply -f -
   kind: StorageClass
   apiVersion: storage.k8s.io/v1
   metadata:
     name: neo4j-data
   provisioner: ebs.csi.aws.com
   parameters:
     type: gp3
   reclaimPolicy: Retain
   allowVolumeExpansion: true
   volumeBindingMode: WaitForFirstConsumer
   EOF

   Check that the storage class is created:

   kubectl get storageclass neo4j-data
   NAME         PROVISIONER       RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
   neo4j-data   ebs.csi.aws.com   Retain          WaitForFirstConsumer   true                   2m41s
   AKS: Create a storage class that uses the Retain reclaim policy and Premium_LRS high-performance SSD disks:

   cat <<EOF | kubectl apply -f -
   apiVersion: storage.k8s.io/v1
   kind: StorageClass
   metadata:
     name: neo4j-data
   provisioner: disk.csi.azure.com
   parameters:
     skuName: Premium_LRS
   reclaimPolicy: Retain
   volumeBindingMode: WaitForFirstConsumer
   allowVolumeExpansion: true
   EOF

   Check that the storage class is created:

   kubectl get storageclass neo4j-data
   NAME         PROVISIONER          RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
   neo4j-data   disk.csi.azure.com   Retain          WaitForFirstConsumer   true                   7s
2. Install a Neo4j server with a data volume that uses the new storage class.

   a. Create a file storage-class-values.yaml that configures the data volume to use the new storage class:

      storage-class-values.yaml
      neo4j:
        name: standalone-with-storage-class
      volumes:
        data:
          mode: dynamic
          dynamic:
            storageClassName: "neo4j-data"
            requests:
              storage: 10Gi

   b. Install a single Neo4j server:

      helm install standalone-with-storage-class neo4j -f storage-class-values.yaml

   c. When the installation completes, verify that a PVC has been created:

      kubectl get pvc
      NAME                                    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
      data-standalone-with-storage-class-0   Bound    pvc-5d400f06-f99f-43ac-bf37-6079d692eaac   10Gi       RWO            neo4j-data     23m
3. Clean up the resources.

   The storage class uses the Retain reclaim policy, meaning the disk will not be deleted after removing the PVC. To delete the disk, patch the underlying PV to use the Delete reclaim policy and then delete the PVC:

   export pv_name=$(kubectl get pvc data-standalone-with-storage-class-0 -o jsonpath='{.spec.volumeName}')
   kubectl patch pv $pv_name -p '{"spec":{"persistentVolumeReclaimPolicy": "Delete"}}'
   kubectl delete pvc data-standalone-with-storage-class-0

For the data volume, if requests.storage is not set, dynamic defaults to a 100Gi volume. For all other volumes, dynamic.requests.storage must be set explicitly when using dynamic mode.
Provision a PV using defaultStorageClass
Using the default StorageClass of the running Kubernetes cluster is the quickest way to spin up and run Neo4j for simple tests that handle small amounts of data.
However, it is not recommended for large amounts of data, as it may lead to performance issues.
Example: Deploy Neo4j using defaultStorageClass

The following example shows how to deploy a Neo4j server with a dynamically provisioned PV that uses the default StorageClass.

1. Create a file default-storage-class-values.yaml that configures the data volume to use the default StorageClass and a storage size of 100Gi:

   default-storage-class-values.yaml
   volumes:
     data:
       mode: "defaultStorageClass"
       defaultStorageClass:
         requests:
           storage: 100Gi

2. Install a single Neo4j server:

   helm install standalone-with-default-storage-class neo4j -f default-storage-class-values.yaml
Provision persistent volumes manually
Optionally, the Helm chart can use manually created disks for Neo4j storage. This installation option has more steps than using dynamic volumes, but it does provide more control over how disks are provisioned.
The instructions for the manual provisioning of PVs vary according to the type of PV being used and the underlying infrastructure. In general, there are two steps:
1. Create the disk/volume to be used for storage in the underlying infrastructure. For example:

   - If using a csi volume, create the persistent disk using the cloud provider CLI or console.
   - If using a hostPath volume, create the path (directory) on the host node.

2. Create a PV in Kubernetes that references the underlying resource created in step 1.

   - Ensure that the created PV's app label matches the name of the Neo4j Helm release.
   - Ensure that the created PV's capacity.storage matches the storage available on the underlying infrastructure.

If no suitable PV or PVC exists, the Neo4j pod will not start. A sketch of such a manually created PV follows.
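The following is a minimal sketch of a manually provisioned hostPath PV, assuming a Helm release named my-neo4j; the path, size, and storage class name are illustrative assumptions:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-neo4j-data
  labels:
    app: my-neo4j              # must match the Helm release name
spec:
  capacity:
    storage: 10Gi              # must match the storage available on the host
  accessModes:
    - ReadWriteOnce
  storageClassName: manual
  hostPath:
    path: /data/neo4j          # directory created on the host node beforehand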
Provision a PV for Neo4j Storage using a PV selector
The Neo4j StatefulSet can select a persistent volume to use based on its labels. A Neo4j Helm release uses only manually provisioned PVs that have:
- A storageClassName that uses the provisioner kubernetes.io/no-provisioner.
- An app label, set in their metadata, that matches the neo4j.name value of the Helm installation.
- Sufficient storage capacity: the PV capacity must be greater than or equal to the value of volumes.data.selector.requests.storage set for the Neo4j Helm release (the default is 100Gi).
The neo4j/neo4j-persistent-volume Helm chart provides a convenient way to provision the persistent volume.
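If you provision the storage class yourself instead of using that chart, a minimal sketch of a storage class satisfying the first requirement above (the name manual is an assumption matching the examples below):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: manual
provisioner: kubernetes.io/no-provisioner   # no dynamic provisioning; PVs are created by hand
volumeBindingMode: WaitForFirstConsumer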
Example: Deploy Neo4j using a selector volume

The following example shows how to deploy Neo4j using a selector volume.

GKE:

1. Create a file persistent-volume-selector.yaml that configures the data volume to use a selector:

   persistent-volume-selector.yaml
   neo4j:
     name: volume-selector
   volumes:
     data:
       mode: selector
       selector:
         storageClassName: "manual"
         accessModes:
           - ReadWriteOnce
         requests:
           storage: 10Gi

2. Export environment variables to be used by the commands:

   export RELEASE_NAME=volume-selector
   export GCP_ZONE="$(gcloud config get compute/zone)"
   export GCP_PROJECT="$(gcloud config get project)"

3. Create the disk to be used by the persistent volume:

   gcloud compute disks create --size 10Gi --type pd-ssd "${RELEASE_NAME}"

4. Use the neo4j/neo4j-persistent-volume chart to configure the persistent volume. This command creates a persistent volume and a manual storage class that uses the kubernetes.io/no-provisioner provisioner:

   helm install "${RELEASE_NAME}"-disk neo4j/neo4j-persistent-volume \
     --set neo4j.name="${RELEASE_NAME}" \
     --set data.driver=pd.csi.storage.gke.io \
     --set data.storageClassName="manual" \
     --set data.reclaimPolicy="Delete" \
     --set data.createPvc=false \
     --set data.createStorageClass=true \
     --set data.volumeHandle="projects/${GCP_PROJECT}/zones/${GCP_ZONE}/disks/${RELEASE_NAME}" \
     --set data.capacity.storage=10Gi

5. Install Neo4j using the persistent-volume-selector.yaml created earlier:

   helm install "${RELEASE_NAME}" neo4j/neo4j -f persistent-volume-selector.yaml

6. Clean up the Helm installations and the disk created for the example:

   helm uninstall ${RELEASE_NAME} ${RELEASE_NAME}-disk
   kubectl delete pvc data-${RELEASE_NAME}-0
   gcloud compute disks delete ${RELEASE_NAME} --quiet
EKS:

The EBS CSI Driver addon is required to provision EBS disks in EKS clusters. You can run kubectl get daemonset ebs-csi-node -n kube-system to check whether it is installed. See the AWS documentation for instructions on installing the driver.

1. Create a file persistent-volume-selector.yaml that configures the data volume to use a selector:

   persistent-volume-selector.yaml
   neo4j:
     name: volume-selector
   volumes:
     data:
       mode: selector
       selector:
         storageClassName: "manual"
         accessModes:
           - ReadWriteOnce
         requests:
           storage: 10Gi

2. Export environment variables to be used by the commands:

   readonly RELEASE_NAME=volume-selector
   readonly AWS_ZONE={availability zone of EKS cluster}

3. Create the disk to be used by the persistent volume:

   export volumeId=$(aws ec2 create-volume \
     --availability-zone="${AWS_ZONE}" \
     --size=10 \
     --volume-type=gp3 \
     --tag-specifications 'ResourceType=volume,Tags=[{Key=volume,Value='"${RELEASE_NAME}"'}]' \
     --no-cli-pager \
     --output text \
     --query VolumeId)

4. Use the neo4j/neo4j-persistent-volume chart to configure the persistent volume. This command creates a persistent volume and a manual storage class that uses the kubernetes.io/no-provisioner provisioner:

   helm install "${RELEASE_NAME}"-disk neo4j/neo4j-persistent-volume \
     --set neo4j.name="${RELEASE_NAME}" \
     --set data.driver=ebs.csi.aws.com \
     --set data.reclaimPolicy="Delete" \
     --set data.createPvc=false \
     --set data.createStorageClass=true \
     --set data.volumeHandle="${volumeId}" \
     --set data.capacity.storage=10Gi

5. Install Neo4j using the persistent-volume-selector.yaml created earlier:

   helm install "${RELEASE_NAME}" neo4j/neo4j -f persistent-volume-selector.yaml

6. Clean up the Helm installations and the disk created for the example:

   helm uninstall ${RELEASE_NAME} ${RELEASE_NAME}-disk
   kubectl delete pvc data-${RELEASE_NAME}-0
   aws ec2 delete-volume --volume-id ${volumeId}
AKS:

1. Create a file persistent-volume-selector.yaml that configures the data volume to use a selector:

   persistent-volume-selector.yaml
   neo4j:
     name: volume-selector
   volumes:
     data:
       mode: selector
       selector:
         storageClassName: "manual"
         accessModes:
           - ReadWriteOnce
         requests:
           storage: 10Gi

2. Export environment variables to be used by the commands:

   readonly RELEASE_NAME=volume-selector
   readonly AKS_CLUSTER_NAME={AKS Cluster name}
   readonly AZ_RESOURCE_GROUP={Resource group of cluster}
   readonly AZ_LOCATION={Location of cluster}

3. Create the disk to be used by the persistent volume:

   export node_resource_group=$(az aks show --resource-group "${AZ_RESOURCE_GROUP}" --name "${AKS_CLUSTER_NAME}" --query nodeResourceGroup -o tsv)
   export disk_id=$(az disk create --name "${RELEASE_NAME}" --size-gb "10" --max-shares 1 --resource-group "${node_resource_group}" --location ${AZ_LOCATION} --output tsv --query id)

4. Use the neo4j/neo4j-persistent-volume chart to configure the persistent volume. This command creates a persistent volume and a manual storage class that uses the kubernetes.io/no-provisioner provisioner:

   helm install "${RELEASE_NAME}"-disk neo4j/neo4j-persistent-volume \
     --set neo4j.name="${RELEASE_NAME}" \
     --set data.driver=disk.csi.azure.com \
     --set data.storageClassName="manual" \
     --set data.reclaimPolicy="Delete" \
     --set data.createPvc=false \
     --set data.createStorageClass=true \
     --set data.volumeHandle="${disk_id}" \
     --set data.capacity.storage=10Gi

5. Install Neo4j using the persistent-volume-selector.yaml created earlier:

   helm install "${RELEASE_NAME}" neo4j/neo4j -f persistent-volume-selector.yaml

6. Clean up the Helm installations and the disk created for the example:

   helm uninstall ${RELEASE_NAME} ${RELEASE_NAME}-disk
   kubectl delete pvc data-${RELEASE_NAME}-0
   az disk delete --name ${RELEASE_NAME} --resource-group "${node_resource_group}" -y
Provision a PVC for Neo4j Storage
An alternative method for manual provisioning is to use a manually provisioned PVC.
This is supported by the Neo4j Helm chart using the volume mode.
The neo4j/neo4j-persistent-volume Helm chart can be used to create a PV and PVC for a manually provisioned disk.
A full example can be found in the Neo4j GitHub repository.
For example, to use a pre-existing PVC called my-neo4j-pvc, set these values:
volumes:
  data:
    mode: "volume"
    volume:
      persistentVolumeClaim:
        claimName: my-neo4j-pvc
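For reference, a manually created PVC with this name might look like the following minimal sketch; the storage class and size are illustrative assumptions:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-neo4j-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: manual   # assumption: a manually provisioned storage class
  resources:
    requests:
      storage: 10Gi          # assumption: sized to match the backing PV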
Reuse a persistent volume
After uninstalling the Neo4j Helm chart, both the PVC and the PV remain and can be reused by a new install of the Helm chart.
If you delete the PVC, the PV moves into a Released status and will not be reusable.
To be able to reuse the PV by a new install of the Neo4j Helm chart, remove its connection to the previous PVC:
1. Edit the PV by running the following command:

   kubectl edit pv <pv-name>

2. Remove the spec.claimRef section.

The PV returns to the Available status and can be reused by a new install of the Neo4j Helm chart.
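Alternatively, the same claimRef removal can be done non-interactively with a single patch command:

# Remove the stale claim reference so the PV becomes Available again
kubectl patch pv <pv-name> --type json -p '[{"op": "remove", "path": "/spec/claimRef"}]'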
The performance of Neo4j is very dependent on the latency, IOPS capacity, and throughput of the storage it is using. For the best performance of Neo4j, use the best available disks (e.g., SSD) and set IOPS throttling/quotas to high values. For some cloud providers, IOPS throttling is proportional to the size of the volume. In these cases, the best performance is achieved by setting the size of the volume based on the desired IOPS rather than the amount required for data storage.
[1] hostPath volumes.