New in Neo4j AuraDB: Direct Import From Cloud Data Warehouses

John Stegeman

Graph Database Product Specialist, Neo4j

Direct Import From Cloud Data Warehouses

We’re adding support for cloud data warehouses, including Snowflake, Databricks, and BigQuery, to the Import Service in Neo4j AuraDB. This will enable you to bring your data into Neo4j more easily. The Import Service already supports importing directly from CSV files and relational databases, using available primary and foreign key definitions to infer a starting graph data model. However, cloud data warehouses often lack such key definitions, so we’re launching a new generative AI feature that inspects the warehouse schema, identifies likely primary and foreign keys, and uses them to generate a candidate graph data model. This feature will also work with CSV files that lack primary and foreign key information.
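To give a feel for what key inference involves, here's a minimal sketch of the idea in Python. This is a naive name-matching heuristic for illustration only, not the generative AI approach AuraDB actually uses:

```python
# Naive sketch of foreign-key inference from column names alone.
# Illustrative only -- not AuraDB's actual algorithm.

def infer_foreign_keys(schema):
    """schema: dict mapping table name -> list of column names.
    Returns (referencing_table, column, referenced_table) triples,
    guessing that a column like 'patient_id' points at a 'patients' table."""
    fks = []
    for table, columns in schema.items():
        for col in columns:
            if col.endswith("_id"):
                base = col[:-3]            # 'patient_id' -> 'patient'
                for target in schema:
                    if target in (base, base + "s") and target != table:
                        fks.append((table, col, target))
    return fks

schema = {
    "patients":   ["id", "name", "birthdate"],
    "encounters": ["id", "patient_id", "date"],
}
print(infer_foreign_keys(schema))  # [('encounters', 'patient_id', 'patients')]
```

A heuristic like this breaks down quickly on real warehouse schemas with inconsistent naming, which is where a model that can reason over column names, types, and sample values earns its keep.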

Neo4j customers can use the new features to quickly design a graph data model and import data from existing sources within their enterprise data ecosystem, simplifying the task of managing, analyzing, and gaining insight from today’s highly connected data.

Here’s a quick walkthrough of the new capability:

1. Start by defining a new data source in the Aura Console (under the Import Service):

2. Next, depending on the data source type, supply the connection information, credentials, and a name for your data source. For example, here’s the information needed to connect to Snowflake and load a synthetic health dataset generated using Synthea:

The Import Service will use JDBC to connect to your relational sources. In this initial release, the databases must allow inbound connections from internet sources. In future releases, we plan to enable the Import Service to connect from known, static IP addresses to support firewall restrictions for inbound connections. Once connected, you’ll see a preview of the available tables for confirmation: 
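For reference, a Snowflake JDBC connection string generally takes the following form (the account identifier, warehouse, database, and schema shown here are all placeholders for your own values):

```
jdbc:snowflake://<account_identifier>.snowflakecomputing.com/?warehouse=<warehouse>&db=<database>&schema=<schema>
```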

3. After you’ve created a data source, you can create a graph model. In addition to defining your model manually, you can have the Import Service generate one automatically from the data source’s tables, using primary and foreign keys (if available) to infer relationships. If that key information isn’t available, the new AI option can estimate likely keys and generate a corresponding model.
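As an illustration of what the inferred mapping means in graph terms (the exact model depends on your schema), a foreign key such as `patient_id` on an `encounters` table would typically become a relationship between two node labels. The labels, properties, and relationship type below are hypothetical, not the tool's exact output:

```cypher
// Hypothetical mapping for the Synthea data: the encounters table's
// patient_id foreign key becomes a relationship from Encounter to Patient.
CREATE (p:Patient {id: "p-1", name: "Alice"})
CREATE (e:Encounter {id: "e-1", date: date("2024-01-15")})
CREATE (e)-[:FOR_PATIENT]->(p);
```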

4. After selecting “Generate with AI,” the first draft looks like this:

Especially with AI-generated models, it’s essential to review them to ensure they match your intent. After reviewing the model above, I removed the parts that weren’t needed for my particular use case and corrected a few others, leaving me with a model ready to import into my database:

Running the import job:

5. Once the data is imported, you can navigate to the Query or Explore tools in the Aura Console to start working with your data. A great way to start exploring is to use the generative AI capabilities of the Query tool to ask questions in plain English. AuraDB will use its understanding of your data model to translate your question into Cypher, the Neo4j query language:
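For example, a plain-English question might be translated along these lines. The labels and properties below are assumptions based on the Synthea dataset, and the exact Cypher generated will vary:

```cypher
// Question: "Which five conditions are most common among patients over 65?"
// One plausible translation (schema names are hypothetical):
MATCH (p:Patient)-[:HAS_CONDITION]->(c:Condition)
WHERE duration.between(date(p.birthdate), date()).years > 65
RETURN c.description AS condition, count(p) AS patients
ORDER BY patients DESC
LIMIT 5;
```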

We’re excited to see how you use these new capabilities to integrate Neo4j AuraDB with your existing data ecosystem and generate insights from understanding complex relationships in your data. We’re continually adding new features to the Import Service, and we’re currently working on the following:

  • Exposing static IP addresses used by the Import Service to permit whitelisting for source database firewalls
  • Support for importing CSV and Parquet files from your own cloud storage buckets

These new features are available today for Snowflake, with support for BigQuery and Databricks coming soon. Please don’t hesitate to contact us with any questions about the new features, and if you’d like to try them out for yourself, head over to the AuraDB product page and create a free AuraDB account today.