Release Date: 9 September 2021
GDS 1.7.0-preview is compatible with Neo4j 4.1, 4.2, and 4.3, but not with Neo4j 3.5.x. For a 3.5-compatible release, please see GDS 1.1.6. For a 4.0-compatible release, please see GDS 1.1.6.
Breaking changes
- This release does not support Neo4j 4.0.x
- Aligned the returned `modelInfo` entry names of `gds.alpha.ml.linkPrediction.train` and `gds.alpha.ml.nodeClassification.train` with the model catalog. They now contain `modelName` and `modelInfo` instead of `name` and `info`.
- Removed the `sharedUpdater` parameter from `gds.alpha.ml.linkPrediction` and `gds.alpha.ml.nodeClassification`.
- `gds.beta.graph.export.csv` now exports into a subdirectory called `export`. Previously, the exported graphs were written directly into the configured directory.
- Renamed all `graphalgo` packages to `gds`.
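
Client code that reads the training result under the old entry names needs a small adjustment. The following is a minimal sketch, assuming the result map has already been fetched into a Python dict and contains either the old or the new names, not both. The helper `normalize_model_info` and the constant `RENAMED_KEYS` are hypothetical names used for illustration only; they are not part of GDS.

```python
# Hypothetical helper: maps the pre-1.7 entry names to the 1.7 names.
RENAMED_KEYS = {"name": "modelName", "info": "modelInfo"}


def normalize_model_info(model_info):
    """Return a copy of the map that uses the 1.7 entry names.

    Entries that were not renamed are copied unchanged.
    """
    return {RENAMED_KEYS.get(key, key): value for key, value in model_info.items()}


# A map shaped like a pre-1.7 result (values are made up for illustration).
old_style = {"name": "my-model", "info": {"classes": [0, 1]}}
print(normalize_model_info(old_style))
```

A map that already uses `modelName` and `modelInfo` passes through unchanged, so the same helper can be used while clients run against both older and newer GDS versions.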
New features
- New algorithm: Approximate Maximum K-Cut
  - Includes procedures: `gds.alpha.maxkcut.[mutate|mutate.estimate|stream|stream.estimate]`.
- Introduced Link Prediction Pipelines to make it easier to define and calculate features, split your graph, and make predictions.
  - Includes procedures: `gds.alpha.ml.pipeline.linkPrediction.[create|addNodeProperty|addFeature|configureSplit|configureParams|train|predict.mutate]`.
- Introduced support for exporting additional node properties, including strings, from the underlying database.
  - Added `additionalNodeProperties` parameter to `gds.graph.export`.
  - Added `additionalNodeProperties` parameter to `gds.graph.export.csv`.
- Introduced experimental support for querying the in-memory graph with Cypher.
  - Added `gds.alpha.create.cypherdb` to allow Neo4j to recognize the in-memory graph as a database for Cypher queries.
- To help manage multiple concurrent users, we've added a system monitoring procedure, `gds.alpha.systemMonitor`, which provides an overview of the system's workload and available resources.
- Progress logging is now turned on by default and no longer requires changing your configuration settings. Progress can be accessed with `gds.beta.listProgress`.
- GraphSAGE now supports deterministic results with the `randomSeed` configuration parameter to `gds.beta.graphSage.train`.
- Improved performance (up to 20x speedup) of weakly connected components, `gds.wcc`, for undirected graphs by applying a subgraph sampling optimization.
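
The notes do not describe the subgraph sampling optimization for `gds.wcc` in detail. The sketch below illustrates one well-known sampling approach for connected components on undirected graphs: link only a few neighbors per node first, estimate the largest component from a random sample, then skip nodes already in that component when processing the remaining edges. This is an assumption about the general idea, not the GDS implementation; the function names and the parameters `neighbor_rounds`, `sample_size`, and `seed` are hypothetical.

```python
import random
from collections import Counter


def find(parent, node):
    """Return the root of node's set, halving the path along the way."""
    while parent[node] != node:
        parent[node] = parent[parent[node]]
        node = parent[node]
    return node


def union(parent, a, b):
    """Merge the sets of a and b; the smaller root id becomes the new root."""
    root_a, root_b = find(parent, a), find(parent, b)
    if root_a < root_b:
        parent[root_b] = root_a
    elif root_b < root_a:
        parent[root_a] = root_b


def sampled_wcc(adjacency, neighbor_rounds=2, sample_size=1024, seed=42):
    """Label each node with the smallest node id in its component.

    adjacency must be symmetric (undirected): if v is in adjacency[u],
    then u is in adjacency[v]. Skipping nodes in phase 3 relies on this.
    """
    node_count = len(adjacency)
    if node_count == 0:
        return []
    parent = list(range(node_count))

    # Phase 1: link only the first few neighbors of every node.
    for node in range(node_count):
        for neighbor in adjacency[node][:neighbor_rounds]:
            union(parent, node, neighbor)

    # Phase 2: estimate the largest intermediate component from a sample.
    rng = random.Random(seed)
    draws = min(sample_size, node_count)
    sampled_roots = [find(parent, rng.randrange(node_count)) for _ in range(draws)]
    giant_root = Counter(sampled_roots).most_common(1)[0][0]

    # Phase 3: process the remaining edges, skipping nodes that already
    # belong to the estimated largest component. Their edges are still
    # covered from the other endpoint because the graph is undirected.
    for node in range(node_count):
        if find(parent, node) == giant_root:
            continue
        for neighbor in adjacency[node][neighbor_rounds:]:
            union(parent, node, neighbor)

    return [find(parent, node) for node in range(node_count)]


# Edges: 0-1, 0-2, 0-3, 3-4, 5-6; node 7 is isolated.
graph = [[1, 2, 3], [0], [0], [0, 4], [3], [6], [5], []]
print(sampled_wcc(graph))
```

The saving comes from phase 3: when most nodes already sit in one large component after phase 1, most adjacency lists are never scanned in full. The sample only affects how much work is skipped, not the resulting components.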
Bug fixes
- Fixed a bug regarding weighted graphs with multiple relationship types, which affected `gds.beta.graphSage` and `gds.alpha.spanningTree`.
- Supervised Machine Learning (Node Classification & Link Prediction):
  - Fixed a `NaN` issue in NodeClassification where computations with very small probability values could cause the result to flip to infinity.
  - Fixed a bug in seeded NodeClassification and LinkPrediction which led to non-deterministic behaviour.
  - Corrected the training size used in `gds.alpha.ml.linkPrediction.train`. This affects the `penalty` parameter used in logistic regression.
- Progress Logging:
  - Fixed a bug in beta progress event tracking where progress events would not be released if computation was abandoned before completion.
  - Fixed a bug in beta progress event tracking for Pregel algorithms where progress events would not be released on algorithm completion.
- Node Similarity & KNN:
  - Fixed a bug where KNN and NodeSimilarity could write out of bounds on a node-filtered graph with multiple relationship types.
  - Fixed a bug which affected `gds.nodeSimilarity.write` and `gds.alpha.knn.write` when executed in combination with a `nodeLabels` filter. The bug led either to an exception or to wrong results due to an incorrect mapping between internal and Neo4j node ids.
  - Fixed a bug where `gds.nodeSimilarity.[write|mutate]` and `gds.beta.knn.[write|mutate]` wrote duplicate relationships if the input graph was undirected.
- KNN:
  - Fixed a bug in `gds.beta.knn` where negative values in node properties of type float array caused a failure when returning the `similarityDistribution`.
- FastRP:
  - FastRP stream mode now explicitly returns a list of floats rather than a list of numbers. This is consistent with the other embedding algorithms and saves users from having to cast or transform the values when processing the results further in Cypher.
- GraphSAGE:
  - Fixed a bug in weighted GraphSAGE where the `relationshipWeightProperty` was not loaded.
  - Fixed a bug in `gds.beta.graphSage`, where the `concurrency` parameter was not considered.
- Graph Operations:
  - Fixed a bug in `gds.graph.removeNodeProperties` where `removedPropertiesWritten` was too large for properties shared across multiple labels.
  - Fixed a bug in `gds.beta.graph.generate`, where random graphs with relationship properties could not be generated.
  - Fixed a bug in `gds.create.subgraph` which could lead to undefined behaviour or an `ArrayIndexOutOfBoundsException` when executed on GDS Enterprise Edition.
  - Fixed a bug in `gds.graph.create`, where default values for array properties would throw for convertible types.
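
The `NaN` fix listed under Supervised Machine Learning concerns very small probability values. The notes do not say how it was fixed. One common way such problems arise is taking the logarithm of a probability that has underflowed to zero, and one common remedy is to clamp the probability before taking the logarithm. The sketch below shows that remedy only as an illustration; `EPSILON` and `cross_entropy` are hypothetical names and values, not GDS internals.

```python
import math

# Hypothetical lower bound for probabilities; not a GDS constant.
EPSILON = 1e-15


def cross_entropy(true_class_probability, epsilon=EPSILON):
    """Negative log-likelihood of the true class.

    The probability is clamped to [epsilon, 1.0] so that the logarithm
    is never taken of zero.
    """
    clamped = min(max(true_class_probability, epsilon), 1.0)
    return -math.log(clamped)


# exp(-800) underflows to exactly 0.0 in double precision, so an
# unclamped logarithm would not yield a finite loss here.
tiny_probability = math.exp(-800.0)
print(tiny_probability, cross_entropy(tiny_probability))
```

Clamping trades a small bias for a finite loss, so a single extreme prediction cannot turn an aggregated loss into infinity or `NaN`.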
Improvements
- Pathfinding: Added existence checks for `sourceNode` and `targetNode` to all shortest path procedures in the product tier.
- Improved runtime of `gds.fastRP` via better workload balancing between threads.
- Lower memory footprint for LinkPrediction and NodeClassification.
- Improved the procedure output of `gds.beta.listProgress`.
- Scaled down the scores computed by `gds.articleRank`.