Configure Apache Arrow server
GDS supports importing graphs and exporting properties via Apache Arrow Flight. This chapter is dedicated to configuring the Arrow Flight Server as part of the Neo4j and GDS installation. For using Arrow Flight with an Arrow client, please refer to our documentation for projecting graphs and streaming properties.
The simplest way to use Arrow is through our Neo4j Graph Data Science Client, which uses Arrow by default if available.
Arrow is bundled with GDS Enterprise Edition, which must be installed.
Installation
Arrow is installed by default on Neo4j AuraDS.
On a standalone Neo4j Server, Arrow needs to be explicitly enabled and configured.
The Flight Server is disabled by default. To enable it, add the following to your $NEO4J_HOME/conf/neo4j.conf file:
gds.arrow.enabled=true
The following additional settings are available:
| Name | Default | Optional | Description |
|---|---|---|---|
| | | Yes | This setting specifies how the Arrow Flight Server listens for incoming connections. It consists of two parts; an IP address (e.g. 127.0.0.1 or 0.0.0.0) and a port number (e.g. 7687), and is expressed in the format <ip-address>:<port-number>. |
| | | Yes | This setting specifies the address that clients should use for connecting to the Arrow Flight Server. This is useful if the server runs behind a proxy that forwards the advertised address to an internal address. The advertised address consists of two parts; an address (fully qualified domain name, hostname, or IP address) and a port number (e.g. 8491), and is expressed in the format <address>:<port-number>. |
| | | Yes | The maximum time in minutes to wait for the next command before aborting the import process. |
| | | Yes | The batch size used for arrow property export. |
Note that any change to the configuration requires a database restart.
You can run CALL gds.debug.arrow() to check that Arrow is available.
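The listen and advertised address settings both use the `<address>:<port-number>` format. The following is a minimal Python sketch for validating such a value before writing it to neo4j.conf. The helper `parse_listen_address` and the `ListenAddress` type are hypothetical names introduced for illustration; they are not part of GDS or any Neo4j client.

```python
from typing import NamedTuple


class ListenAddress(NamedTuple):
    """Hypothetical container for the two parts of an address setting."""
    host: str
    port: int


def parse_listen_address(value: str) -> ListenAddress:
    """Split '<address>:<port-number>' into its host and port parts."""
    # Split on the last colon so the port is always the final component.
    host, sep, port_text = value.strip().rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected <address>:<port-number>, got {value!r}")
    if not port_text.isdecimal():
        raise ValueError(f"port must be a number, got {port_text!r}")
    port = int(port_text)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return ListenAddress(host, port)


# Example: an address that listens on all interfaces on port 8491.
print(parse_listen_address("0.0.0.0:8491"))
```

This check only covers the textual format. It does not verify that the address is reachable or that the port is free.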
Authentication
Client connections to the Arrow Flight server are authenticated using the Neo4j native auth provider. Any authenticated user can perform all available Arrow operations, i.e., graph projection and property streaming. There are no dedicated roles to configure.
To enable authentication, use the following DBMS setting:
dbms.security.auth_enabled=true
Encryption
Communication between client and server can optionally be encrypted.
The Arrow Flight server reuses the Neo4j native SSL framework.
In terms of configuration scope, the Arrow Server supports https and bolt.
If both scopes are configured, the Arrow Server prioritizes the https scope.
To enable encryption for https, use the following DBMS settings:
dbms.ssl.policy.https.enabled=true
dbms.ssl.policy.https.private_key=private.key
dbms.ssl.policy.https.public_certificate=public.crt
It is currently not possible to use a certificate where the private key is protected by a password. Such a certificate can be used to secure Neo4j. For Arrow Flight, only certificates with a password-less private key are accepted.
Flight server encryption can also be deactivated, even if it is configured for Neo4j. To disable encryption, use the following setting:
gds.arrow.encryption.never=true
The setting can only be used to deactivate encryption for the GDS Flight server. It cannot be used to deactivate encryption for the Neo4j server, and it cannot be used to activate encryption for the GDS Flight server if the Neo4j server has no encryption configured.
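The rules above (two supported scopes, https taking priority, and an opt-out that can only switch encryption off) can be summarized in a small Python sketch. It only models the behavior described in this section and is not the server's actual implementation. The function name `resolve_arrow_encryption` and its parameters are hypothetical.

```python
from typing import Optional


def resolve_arrow_encryption(
    https_policy_enabled: bool,
    bolt_policy_enabled: bool,
    encryption_never: bool = False,
) -> Optional[str]:
    """Return the SSL policy scope the Flight server would use, or None for no encryption."""
    # gds.arrow.encryption.never=true can only switch encryption off.
    if encryption_never:
        return None
    # The https scope takes priority when both scopes are configured.
    if https_policy_enabled:
        return "https"
    if bolt_policy_enabled:
        return "bolt"
    # No Neo4j SSL policy is configured, so there is nothing to reuse.
    return None


# Both scopes configured: the https scope wins.
print(resolve_arrow_encryption(https_policy_enabled=True, bolt_policy_enabled=True))
```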
Monitoring
To return details about the status of the GDS Flight server, GDS provides the gds.debug.arrow procedure.
CALL gds.debug.arrow()
YIELD
running: Boolean,
enabled: Boolean,
listenAddress: String,
batchSize: Integer,
abortionTimeout: Integer
Name | Type | Description |
---|---|---|
running | Boolean | True, if the Arrow Flight Server is currently running. |
enabled | Boolean | True, if the corresponding setting is enabled. |
versions | List | A list of supported command versions (e.g. |
listenAddress | String | The address (host and port) the Arrow Flight Client should connect to. |
batchSize | Integer | The batch size used for arrow property export. |
abortionTimeout | Duration | The maximum time to wait for the next command before aborting the import process. |
advertisedListenAddress | String | DEPRECATED: Same as |
serverLocation | String | DEPRECATED: Always |
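As an illustration of how the `enabled`, `running`, and `listenAddress` fields might be read together, here is a minimal Python sketch. It assumes a result row from `gds.debug.arrow` has already been fetched and is available as a plain dictionary. How the row is fetched depends on the client in use and is not shown. The function `describe_arrow_status` and the sample values are hypothetical.

```python
def describe_arrow_status(row: dict) -> str:
    """Summarize one result row of gds.debug.arrow as a short message."""
    # 'enabled' reflects the gds.arrow.enabled setting.
    if not row.get("enabled"):
        return "disabled: set gds.arrow.enabled=true and restart the database"
    # The setting can be on while the server is not running.
    if not row.get("running"):
        return "enabled but not running"
    return f"running, clients should connect to {row['listenAddress']}"


# Hypothetical sample row; the values are for illustration only.
sample_row = {"enabled": True, "running": True, "listenAddress": "localhost:8491"}
print(describe_arrow_status(sample_row))
```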
Versioning
All features that the GDS Arrow Flight server exposes are versioned. This allows us to make changes to existing features, introduce new ones or remove deprecated ones without breaking existing clients. The versioning scheme is applied to the commands that the client sends to the server. A command is a GDS-specific abstraction over Arrow Flight Actions, Descriptors and Tickets.
Commands are sent by the client as UTF-8-encoded JSON documents. Each command is associated with additional metadata, such as the version of the command.
{ name: "MY_COMMAND", version: "v1", body: { ... } }
The only exception is Flight Actions, where the version is part of the action type.
The version is always at the beginning of the action type, separated by a forward slash (/).
Action type: V1/CREATE_GRAPH
Action body: { ... }
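The two shapes described above, a JSON command carrying its own version and an action type prefixed with the version, can be sketched with the Python standard library. The helper names `encode_command`, `action_type`, and `split_action_type` are hypothetical. Real command bodies are defined by the GDS Arrow commands themselves, so the body is left empty here.

```python
import json
from typing import Tuple


def encode_command(name: str, version: str, body: dict) -> bytes:
    """Serialize a command as a UTF-8-encoded JSON document."""
    document = {"name": name, "version": version, "body": body}
    return json.dumps(document).encode("utf-8")


def action_type(version: str, name: str) -> str:
    """Build a Flight action type with the version prefix, e.g. 'V1/CREATE_GRAPH'."""
    return f"{version}/{name}"


def split_action_type(value: str) -> Tuple[str, str]:
    """Split a versioned action type back into (version, name)."""
    version, sep, name = value.partition("/")
    if not sep or not version or not name:
        raise ValueError(f"expected <version>/<name>, got {value!r}")
    return version, name


# A command carries its version as a JSON field.
print(encode_command("MY_COMMAND", "v1", {}))
# A Flight Action carries its version in the action type.
print(action_type("V1", "CREATE_GRAPH"))
```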
All available actions can be requested from the GDS Arrow Flight Server by using the LIST_ACTIONS endpoint.
Up until GDS 2.6, commands were not versioned, as GDS Arrow features were still in alpha. In GDS 2.6, the GDS Arrow server supports both versioned commands and the earlier alpha commands. Alpha commands are deprecated and will be removed in a future release.