Minimum Weight k-Spanning TreeAlpha
This feature is in the alpha tier. For more information on feature tiers, see API Tiers.
Glossary
- Directed
-
Directed trait. The algorithm is well-defined on a directed graph.
- Directed
-
Directed trait. The algorithm ignores the direction of the graph.
- Directed
-
Directed trait. The algorithm does not run on a directed graph.
- Undirected
-
Undirected trait. The algorithm is well-defined on an undirected graph.
- Undirected
-
Undirected trait. The algorithm ignores the undirectedness of the graph.
- Heterogeneous nodes
-
Heterogeneous nodes fully supported. The algorithm has the ability to distinguish between nodes of different types.
- Heterogeneous nodes
-
Heterogeneous nodes allowed. The algorithm treats all selected nodes similarly regardless of their label.
- Heterogeneous relationships
-
Heterogeneous relationships fully supported. The algorithm has the ability to distinguish between relationships of different types.
- Heterogeneous relationships
-
Heterogeneous relationships allowed. The algorithm treats all selected relationships similarly regardless of their type.
- Weighted relationships
-
Weighted trait. The algorithm supports a relationship property to be used as weight, specified via the relationshipWeightProperty configuration parameter.
- Weighted relationships
-
Weighted trait. The algorithm treats each relationship as equally important, discarding the value of any relationship weight.
Introduction
Sometimes, we might require a spanning tree(a tree where its nodes are connected with each via a single path) that does not necessarily span all nodes in the graph.
The K-Spanning tree heuristic algorithm returns a tree with k
nodes and k − 1
relationships.
Our heuristic processes the result found by Prim’s algorithm for the Minimum Weight Spanning Tree problem.
Like Prim, it starts from a given source node, finds a spanning tree for all nodes and then removes nodes using heuristics to produce a tree with 'k' nodes.
Note that the source node will not be necessarily included in the final output as the heuristic tries to find a globally good tree.
Considerations
The Minimum weight k-Spanning Tree is NP-Hard. The algorithm in the Neo4j GDS Library is therefore not guaranteed to find the optimal answer, but should hopefully return a good approximation in practice.
Like Prim algorithm, the algorithm focuses only on the component of the source node. If that component has fewer than k
nodes, it will not look into other components, but will instead return the component.
Syntax
This section covers the syntax used to execute the k-Spanning Tree heuristic algorithm in each of its execution modes. We are describing the named graph variant of the syntax. To learn more about general syntax variants, see Syntax overview.
- Write mode
CALL gds.kSpanningTree.write(
graphName: String,
configuration: Map
)
YIELD effectiveNodeCount: Integer,
preProcessingMillis: Integer,
computeMillis: Integer,
postProcessingMillis: Integer,
writeMillis: Integer,
configuration: Map
Name | Type | Default | Optional | Description |
---|---|---|---|---|
graphName |
String |
|
no |
The name of a graph stored in the catalog. |
configuration |
Map |
|
yes |
Configuration for algorithm-specifics and/or graph filtering. |
Name | Type | Default | Optional | Description |
---|---|---|---|---|
List of String |
|
yes |
Filter the named graph using the given node labels. Nodes with any of the given labels will be included. |
|
List of String |
|
yes |
Filter the named graph using the given relationship types. Relationships with any of the given types will be included. |
|
Integer |
|
yes |
The algorithm is single-threaded and changing the concurrency parameter has no effect on the runtime. |
|
String |
|
yes |
An ID that can be provided to more easily track the algorithm’s progress. |
|
Boolean |
|
yes |
If disabled the progress percentage will not be logged. |
|
Integer |
|
yes |
The number of concurrent threads used for writing the result to Neo4j. |
|
String |
|
no |
The node property in the Neo4j database to which the spanning tree is written. |
|
k |
Number |
|
no |
The size of the tree to be returned |
sourceNode |
Integer |
|
n/a |
The starting source node ID. |
String |
|
yes |
Name of the relationship property to use as weights. If unspecified, the algorithm runs unweighted. |
|
objective |
String |
|
yes |
If specified, the parameter dictates whether to seek a minimum or the maximum weight k-spanning tree. By default, the procedure looks for a minimum weight k-spanning tree. Permitted values are 'minimum' and 'maximum'. |
Name | Type | Description |
---|---|---|
effectiveNodeCount |
Integer |
The number of visited nodes. |
preProcessingMillis |
Integer |
Milliseconds for preprocessing the data. |
computeMillis |
Integer |
Milliseconds for running the algorithm. |
postProcessingMillis |
Integer |
Milliseconds for postprocessing results of the algorithm. |
writeMillis |
Integer |
Milliseconds for writing result data back. |
configuration |
Map |
The configuration used for running the algorithm. |
Minimum Weight k-Spanning Tree algorithm examples
In this section we will show examples of running the k-Spanning Tree heuristic algorithm on a concrete graph. The intention is to illustrate what the results look like and to provide a guide in how to make use of the algorithm in a real setting. We will do this on a small road network graph of a handful nodes connected in a particular pattern. The example graph looks like this:
CREATE (a:Place {id: 'A'}),
(b:Place {id: 'B'}),
(c:Place {id: 'C'}),
(d:Place {id: 'D'}),
(e:Place {id: 'E'}),
(f:Place {id: 'F'}),
(g:Place {id: 'G'}),
(d)-[:LINK {cost:4}]->(b),
(d)-[:LINK {cost:6}]->(e),
(b)-[:LINK {cost:1}]->(a),
(b)-[:LINK {cost:3}]->(c),
(a)-[:LINK {cost:2}]->(c),
(c)-[:LINK {cost:5}]->(e),
(f)-[:LINK {cost:1}]->(g);
MATCH (source:Place)
OPTIONAL MATCH (source)-[r:LINK]->(target:Place)
RETURN gds.graph.project(
'graph',
source,
target,
{ relationshipProperties: r { .cost } },
{ undirectedRelationshipTypes: ['*'] }
)
K-Spanning tree examples
Minimum K-Spanning Tree example
In our sample graph we have 7 nodes.
By setting the k=3
, we define that we want to find a 3-minimum spanning tree that covers 3 nodes and has 2 relationships.
MATCH (n:Place{id: 'A'})
CALL gds.kSpanningTree.write('graph', {
k: 3,
sourceNode: n,
relationshipWeightProperty: 'cost',
writeProperty:'kmin'
})
YIELD preProcessingMillis, computeMillis, writeMillis, effectiveNodeCount
RETURN preProcessingMillis,computeMillis,writeMillis, effectiveNodeCount;
MATCH (n)
WITH n.kmin AS p, count(n) AS c
WHERE c = 3
MATCH (n)
WHERE n.kmin = p
RETURN n.id As Place, p as Partition
Place | Partition |
---|---|
"A" |
0 |
"B" |
0 |
"C" |
0 |
Nodes A, B, and C form the discovered 3-minimum spanning tree of our graph.
Maximum K-Spanning Tree example
MATCH (n:Place{id: 'D'})
CALL gds.kSpanningTree.write('graph', {
k: 3,
sourceNode: n,
relationshipWeightProperty: 'cost',
writeProperty:'kmax',
objective: 'maximum'
})
YIELD preProcessingMillis, computeMillis, writeMillis, effectiveNodeCount
RETURN preProcessingMillis,computeMillis,writeMillis, effectiveNodeCount;
MATCH (n)
WITH n.kmax AS p, count(n) AS c
WHERE c = 3
MATCH (n)
WHERE n.kmax = p
RETURN n.id As Place, p as Partition
Place | Partition |
---|---|
"C" |
4 |
"D" |
4 |
"E" |
4 |
Nodes C, D, and E form a 3-maximum spanning tree of our graph.