Export to GraphML
The export GraphML procedures export data into a format that is used by other tools like Gephi and CytoScape to read graph data.
The GraphML does not support the full flexibility of the property graph data model. The two main restriction are:
-
GraphML assumes that all node or relationship properties with the same property name have property values of the same type. Neo4j allows mixed types per property name. If a property name has a value of mixed value types, then its values are exported as a
STRING
. -
GraphML supports fewer value types than Neo4j. Hence, all
POINT
and temporal values are exported formatted as aSTRING
.e.g:
Point 3d
{"crs":"wgs-84-3d","latitude":56.7,"longitude":12.78,"height":100.0}
Point 2d
{"crs":"wgs-84-3d","latitude":56.7,"longitude":12.78,"height":null}
Date
2018-10-10
LocalDateTime
2018-10-10T00:00
Note that, to perform a correct Point serialization, it is not recommended to export a point with coordinates x,y and crs: 'wgs-84', for example
point({x: 56.7, y: 12.78, crs: 'wgs-84'})
. Otherwise, the point will be exported with longitude and latitude (and height) instead of x and y (and z)
Available Procedures
The table below describes the available procedures:
Qualified Name | Type |
---|---|
apoc.export.graphml.all |
|
apoc.export.graphml.data |
|
apoc.export.graphml.graph |
|
apoc.export.graphml.query |
|
The labels exported are ordered alphabetically.
The output of labels()
function is not sorted, use it in combination with apoc.coll.sort()
.
Configuration parameters
The procedures support the following config parameters:
param | default | description |
---|---|---|
format |
gephi |
In export to Graphml script define the export format. Possible value are: "gephi" and "tinkerpop" |
caption |
It’s an array of string (i.e. ['name','title']) that define an ordered set of properties eligible as value for the |
|
useTypes |
false |
Write the property type information to the graphml output. Note that the values of properties with the same property name have to be of the same type. If a node property name is associated with property value of different types, e.g. as in |
batchSize |
20000 |
define the batch size |
delim |
"," |
define the delimiter character (export csv) |
arrayDelim |
";" |
define the delimiter character for arrays (used in the bulk import) |
storeNodeIds |
false |
set nodes' ids (import/export graphml) |
readLabels |
false |
read nodes' labels (import/export graphml) |
defaultRelationshipType |
"RELATED" |
set relationship type (import/export graphml) |
separateFiles |
false |
export results in separated files by type (nodes, relationships) |
stream |
false |
stream the xml directly to the client into the |
source |
Empty map |
To be used together with |
target |
Empty map |
Same as |
Exporting to a file
By default exporting to the file system is disabled.
We can enable it by setting the following property in apoc.conf
:
apoc.export.file.enabled=true
For more information about accessing apoc.conf
, see the chapter on Configuration Options.
If we try to use any of the export procedures without having first set this property, we’ll get the following error message:
Failed to invoke procedure: Caused by: java.lang.RuntimeException: Export to files not enabled, please set apoc.export.file.enabled=true in your apoc.conf.
Otherwise, if you are running in a cloud environment without filesystem access, use the |
Export files are written to the import
directory, which is defined by the server.directories.import
property.
This means that any file path that we provide is relative to this directory.
If we try to write to an absolute path, such as /tmp/filename
, we’ll get an error message similar to the following one:
Failed to invoke procedure: Caused by: java.io.FileNotFoundException: /path/to/neo4j/import/tmp/fileName (No such file or directory) |
We can enable writing to anywhere on the file system by setting the following property in apoc.conf
:
apoc.import.file.use_neo4j_config=false
Neo4j will now be able to write anywhere on the file system, so be sure that this is your intention before setting this property. |
Exporting a stream
If we don’t want to export to a file, we can stream results back in the data
column instead by passing a file name of null
and providing the stream:true
config.
Examples
This section includes examples showing how to use the export to Cypher procedures. These examples are based on a movies dataset, which can be imported by running the following Cypher query:
CREATE (TheMatrix:Movie {title:'The Matrix', released:1999, tagline:'Welcome to the Real World'})
CREATE (Keanu:Person {name:'Keanu Reeves', born:1964})
CREATE (Carrie:Person {name:'Carrie-Anne Moss', born:1967})
CREATE (Laurence:Person {name:'Laurence Fishburne', born:1961})
CREATE (Hugo:Person {name:'Hugo Weaving', born:1960})
CREATE (LillyW:Person {name:'Lilly Wachowski', born:1967})
CREATE (LanaW:Person {name:'Lana Wachowski', born:1965})
CREATE (JoelS:Person {name:'Joel Silver', born:1952})
CREATE
(Keanu)-[:ACTED_IN {roles:['Neo']}]->(TheMatrix),
(Carrie)-[:ACTED_IN {roles:['Trinity']}]->(TheMatrix),
(Laurence)-[:ACTED_IN {roles:['Morpheus']}]->(TheMatrix),
(Hugo)-[:ACTED_IN {roles:['Agent Smith']}]->(TheMatrix),
(LillyW)-[:DIRECTED]->(TheMatrix),
(LanaW)-[:DIRECTED]->(TheMatrix),
(JoelS)-[:PRODUCED]->(TheMatrix);
The Neo4j Browser visualization below shows the imported graph:
Round trip separated GraphML files
With this dataset:
CREATE (f:Foo:Foo2:Foo0 {name:'foo', born:Date('2018-10-10'), place:point({ longitude: 56.7, latitude: 12.78, height: 100 })})-[:KNOWS]->(b:Bar {name:'bar',age:42, place:point({ longitude: 56.7, latitude: 12.78})});
CREATE (:Foo {name: 'zzz'})-[:KNOWS]->(:Bar {age: 0});
CREATE (:Foo {name: 'aaa'})-[:KNOWS {id: 1}]->(:Bar {age: 666});
we can execute these 3 export queries:
// Foo nodes
call apoc.export.graphml.query('MATCH (start:Foo)-[:KNOWS]->(:Bar) RETURN start', 'queryNodesFoo.graphml', {useTypes: true});
// Bar nodes
call apoc.export.graphml.query('MATCH (:Foo)-[:KNOWS]->(end:Bar) RETURN end', 'queryNodesBar.graphml', {useTypes: true});
// KNOWS rels
MATCH (:Foo)-[rel:KNOWS]->(:Bar)
WITH collect(rel) as rels
call apoc.export.graphml.data([], rels, 'queryRelationship.graphml', {useTypes: true})
YIELD nodes, relationships RETURN nodes, relationships;
In this case we will have these 3 files: .queryNodesFoo.graphml
<?xml version='1.0' encoding='UTF-8'?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
<key id="born" for="node" attr.name="born" attr.type="string"/>
<key id="name" for="node" attr.name="name" attr.type="string"/>
<key id="place" for="node" attr.name="place" attr.type="string"/>
<key id="labels" for="node" attr.name="labels" attr.type="string"/>
<graph id="G" edgedefault="directed">
<node id="n0" labels=":Foo:Foo0:Foo2"><data key="labels">:Foo:Foo0:Foo2</data><data key="born">2018-10-10</data><data key="name">foo</data><data key="place">{"crs":"wgs-84-3d","latitude":12.78,"longitude":56.7,"height":100.0}</data></node>
<node id="n3" labels=":Foo"><data key="labels">:Foo</data><data key="name">zzz</data></node>
<node id="n5" labels=":Foo"><data key="labels">:Foo</data><data key="name">aaa</data></node>
</graph>
</graphml>
<?xml version='1.0' encoding='UTF-8'?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
<key id="name" for="node" attr.name="name" attr.type="string"/>
<key id="place" for="node" attr.name="place" attr.type="string"/>
<key id="age" for="node" attr.name="age" attr.type="long"/>
<key id="labels" for="node" attr.name="labels" attr.type="string"/>
<graph id="G" edgedefault="directed">
<node id="n1" labels=":Bar"><data key="labels">:Bar</data><data key="name">bar</data><data key="age">42</data><data key="place">{"crs":"wgs-84","latitude":12.78,"longitude":56.7,"height":null}</data></node>
<node id="n4" labels=":Bar"><data key="labels">:Bar</data><data key="age">0</data></node>
<node id="n6" labels=":Bar"><data key="labels">:Bar</data><data key="age">666</data></node>
</graph>
</graphml>
<?xml version='1.0' encoding='UTF-8'?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
<key id="label" for="edge" attr.name="label" attr.type="string"/>
<key id="id" for="edge" attr.name="id" attr.type="long"/>
<graph id="G" edgedefault="directed">
<edge id="e0" source="n0" target="n1" label="KNOWS"><data key="label">KNOWS</data></edge>
<edge id="e1" source="n3" target="n4" label="KNOWS"><data key="label">KNOWS</data></edge>
<edge id="e2" source="n5" target="n6" label="KNOWS"><data key="label">KNOWS</data><data key="id">1</data></edge>
</graph>
</graphml>
So we can import, in another db, in this way, to recreate the original dataset, using these queries:
CALL apoc.import.graphml('queryNodesFoo.graphml', {readLabels: true, storeNodeIds: true});
CALL apoc.import.graphml('queryNodesBar.graphml', {readLabels: true, storeNodeIds: true});
CALL apoc.import.graphml('queryRelationship.graphml', {readLabels: true, source: {label: 'Foo'}, target: {label: 'Bar'}});
Note that we have to execute the import of nodes before,
and we used the useTypes: true
to import the attribute id
of node
tags as a property and readLabels
to populate nodes with labels.
With custom property key
Otherwise, we can leverage a custom property and avoid importing the id
attribute (via useTypes:true
)
in this way (same dataset and nodes export query as before):
// KNOWS rels
MATCH (:Foo)-[rel:KNOWS]->(:Bar)
WITH collect(rel) as rels
call apoc.export.graphml.data([], rels, 'queryRelationship.graphml',
{useTypes: true, source: {id: 'name'}, label: {id: 'age'}})
YIELD nodes, relationships RETURN nodes, relationships;
Is strongly recommended using an unique constraint to ensure uniqueness,
so in this case for label Foo
and property name
and for label Bar
and property age
The above query generate this rel file:
<?xml version='1.0' encoding='UTF-8'?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
<key id="label" for="edge" attr.name="label" attr.type="string"/>
<key id="id" for="edge" attr.name="id" attr.type="long"/>
<graph id="G" edgedefault="directed">
<edge id="e0" source="foo" sourceType="string" target="42" targetType="long" label="KNOWS"><data key="label">KNOWS</data></edge>
<edge id="e1" source="zzz" sourceType="string" target="0" targetType="long" label="KNOWS"><data key="label">KNOWS</data></edge>
<edge id="e2" source="aaa" sourceType="string" target="666" targetType="long" label="KNOWS"><data key="label">KNOWS</data><data key="id">1</data></edge>
</graph>
</graphml>
Finally, we can import the files using the same id (name and age) as above:
CALL apoc.import.graphml('queryNodesFoo.graphml', {readLabels: true});
CALL apoc.import.graphml('queryNodesBar.graphml', {readLabels: true});
CALL apoc.import.graphml('queryRelationship.graphml',
{readLabels: true, source: {label: 'Foo', id: 'name'}, target: {label: 'Bar', id: 'age'}});
Export whole database to GraphML
The apoc.export.graphml.all
procedure exports the whole database to a GraphML file or as a stream.
movies.graphml
CALL apoc.export.graphml.all("movies.graphml", {})
file | source | format | nodes | relationships | properties | time | rows | batchSize | batches | done | data |
---|---|---|---|---|---|---|---|---|---|---|---|
"movies.graphml" |
"database: nodes(8), rels(7)" |
"graphml" |
8 |
7 |
21 |
4 |
15 |
-1 |
0 |
TRUE |
NULL |
<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
<key id="born" for="node" attr.name="born"/>
<key id="name" for="node" attr.name="name"/>
<key id="tagline" for="node" attr.name="tagline"/>
<key id="label" for="node" attr.name="label"/>
<key id="title" for="node" attr.name="title"/>
<key id="released" for="node" attr.name="released"/>
<key id="roles" for="edge" attr.name="roles"/>
<key id="label" for="edge" attr.name="label"/>
<graph id="G" edgedefault="directed">
<node id="n188" labels=":Movie"><data key="labels">:Movie</data><data key="title">The Matrix</data><data key="tagline">Welcome to the Real World</data><data key="released">1999</data></node>
<node id="n189" labels=":Person"><data key="labels">:Person</data><data key="born">1964</data><data key="name">Keanu Reeves</data></node>
<node id="n190" labels=":Person"><data key="labels">:Person</data><data key="born">1967</data><data key="name">Carrie-Anne Moss</data></node>
<node id="n191" labels=":Person"><data key="labels">:Person</data><data key="born">1961</data><data key="name">Laurence Fishburne</data></node>
<node id="n192" labels=":Person"><data key="labels">:Person</data><data key="born">1960</data><data key="name">Hugo Weaving</data></node>
<node id="n193" labels=":Person"><data key="labels">:Person</data><data key="born">1967</data><data key="name">Lilly Wachowski</data></node>
<node id="n194" labels=":Person"><data key="labels">:Person</data><data key="born">1965</data><data key="name">Lana Wachowski</data></node>
<node id="n195" labels=":Person"><data key="labels">:Person</data><data key="born">1952</data><data key="name">Joel Silver</data></node>
<edge id="e267" source="n189" target="n188" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="roles">["Neo"]</data></edge>
<edge id="e268" source="n190" target="n188" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="roles">["Trinity"]</data></edge>
<edge id="e269" source="n191" target="n188" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="roles">["Morpheus"]</data></edge>
<edge id="e270" source="n192" target="n188" label="ACTED_IN"><data key="label">ACTED_IN</data><data key="roles">["Agent Smith"]</data></edge>
<edge id="e271" source="n193" target="n188" label="DIRECTED"><data key="label">DIRECTED</data></edge>
<edge id="e272" source="n194" target="n188" label="DIRECTED"><data key="label">DIRECTED</data></edge>
<edge id="e273" source="n195" target="n188" label="PRODUCED"><data key="label">PRODUCED</data></edge>
</graph>
</graphml>
data
columnCALL apoc.export.graphml.all(null, {stream:true})
YIELD file, nodes, relationships, properties, data
RETURN file, nodes, relationships, properties, data;
file | nodes | relationships | properties | data |
---|---|---|---|---|
|
|
|
|
|
Export specified nodes and relationships to GraphML
The apoc.export.graphml.data
procedure exports the specified nodes and relationships to a CSV file or as a stream.
:Person
label with a name
property that starts with L
to the file movies-l.csv
MATCH (person:Person)
WHERE person.name STARTS WITH "L"
WITH collect(person) AS people
CALL apoc.export.graphml.data(people, [], "movies-l.graphml", {})
YIELD file, source, format, nodes, relationships, properties, time, rows, batchSize, batches, done, data
RETURN file, source, format, nodes, relationships, properties, time, rows, batchSize, batches, done, data
file | source | format | nodes | relationships | properties | time | rows | batchSize | batches | done | data |
---|---|---|---|---|---|---|---|---|---|---|---|
"movies-l.csv" |
"data: nodes(3), rels(0)" |
"csv" |
3 |
0 |
6 |
2 |
3 |
20000 |
1 |
TRUE |
NULL |
<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
<key id="born" for="node" attr.name="born"/>
<key id="name" for="node" attr.name="name"/>
<key id="label" for="node" attr.name="label"/>
<graph id="G" edgedefault="directed">
<node id="n191" labels=":Person"><data key="labels">:Person</data><data key="born">1961</data><data key="name">Laurence Fishburne</data></node>
<node id="n193" labels=":Person"><data key="labels">:Person</data><data key="born">1967</data><data key="name">Lilly Wachowski</data></node>
<node id="n194" labels=":Person"><data key="labels">:Person</data><data key="born">1965</data><data key="name">Lana Wachowski</data></node>
</graph>
</graphml>
ACTED_IN
relationships and the nodes with Person
and Movie
labels on either side of that relationship in the data
columnMATCH (person:Person)-[actedIn:ACTED_IN]->(movie:Movie)
WITH collect(DISTINCT person) AS people, collect(DISTINCT movie) AS movies, collect(actedIn) AS actedInRels
CALL apoc.export.graphml.data(people + movies, actedInRels, null, {stream: true})
YIELD file, nodes, relationships, properties, data
RETURN file, nodes, relationships, properties, data;
file | nodes | relationships | properties | data |
---|---|---|---|---|
|
|
|
|
|
Export results of Cypher query to GraphML
The apoc.export.graphml.query
procedure exports the results of a Cypher query to a CSV file or as a stream.
DIRECTED
relationships and the nodes with Person
and Movie
labels on either side of that relationship to the file movies-directed.graphml
WITH "MATCH path = (person:Person)-[directed:DIRECTED]->(movie)
RETURN person, directed, movie" AS query
CALL apoc.export.graphml.query(query, "movies-directed.graphml", {})
YIELD file, source, format, nodes, relationships, properties, time, rows, batchSize, batches, done, data
RETURN file, source, format, nodes, relationships, properties, time, rows, batchSize, batches, done, data;
file | source | format | nodes | relationships | properties | time | rows | batchSize | batches | done | data |
---|---|---|---|---|---|---|---|---|---|---|---|
"movies-directed.graphml" |
"statement: nodes(3), rels(2)" |
"graphml" |
3 |
2 |
7 |
2 |
5 |
-1 |
0 |
TRUE |
NULL |
The contents of movies-directed.csv
are shown below:
<?xml version="1.0" encoding="UTF-8"?>
<graphml xmlns="http://graphml.graphdrawing.org/xmlns" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://graphml.graphdrawing.org/xmlns http://graphml.graphdrawing.org/xmlns/1.0/graphml.xsd">
<key id="born" for="node" attr.name="born"/>
<key id="name" for="node" attr.name="name"/>
<key id="tagline" for="node" attr.name="tagline"/>
<key id="label" for="node" attr.name="label"/>
<key id="title" for="node" attr.name="title"/>
<key id="released" for="node" attr.name="released"/>
<key id="label" for="edge" attr.name="label"/>
<graph id="G" edgedefault="directed">
<node id="n188" labels=":Movie"><data key="labels">:Movie</data><data key="title">The Matrix</data><data key="tagline">Welcome to the Real World</data><data key="released">1999</data></node>
<node id="n193" labels=":Person"><data key="labels">:Person</data><data key="born">1967</data><data key="name">Lilly Wachowski</data></node>
<node id="n194" labels=":Person"><data key="labels">:Person</data><data key="born">1965</data><data key="name">Lana Wachowski</data></node>
<edge id="e271" source="n193" target="n188" label="DIRECTED"><data key="label">DIRECTED</data></edge>
<edge id="e272" source="n194" target="n188" label="DIRECTED"><data key="label">DIRECTED</data></edge>
</graph>
</graphml>
DIRECTED
relationships and the nodes with Person
and Movie
labels on either side of that relationshipWITH "MATCH path = (person:Person)-[directed:DIRECTED]->(movie)
RETURN person, directed, movie" AS query
CALL apoc.export.graphml.query(query, null, {stream: true})
YIELD file, nodes, relationships, properties, data
RETURN file, nodes, relationships, properties, data;
file | nodes | relationships | properties | data |
---|---|---|---|---|
|
|
|
|
|
You can also compress the files to export. See here for more information |