Twin4j

This Week in Neo4j – RMarkdown, New APOC release, Finding Duplicates

Developer Relations Engineer

March 3, 2018

Welcome to This Week in Neo4j, where we round up what’s been happening in the world of graph databases in the last 7 days.

This week we have a new release of APOC, Neo4j in RMarkdown, finding duplicate users, Cypher for Gremlin, and more.

Featured Community Member: Irene Iriarte Carretero

This week’s featured community member is Irene Iriarte Carretero, Data Scientist at Gousto.

I first met Irene at a Neo4j London meetup 18 months ago when she was exploring whether her team could use graphs to make sense of recipe data.

Within 6 months of attending that meetup, Irene had presented at the London meetup and written a blog post explaining how her team were now using Neo4j to build recipe ontologies.

Irene went on to present at GraphConnect NYC 2017 and at this week’s GraphTour London event. The London event wasn’t recorded but you can find the video from Irene’s talk in New York below.

On behalf of the Neo4j community, thanks for all your work, Irene!

Documentum Recommendation Engine, Russian Twitter Trolls, Finding Duplicate Users

Yuri Simione has written a blog post about a content recommendation engine for the Documentum Content Server that he’s been working on. Yuri is looking for Documentum users to beta test the technology so get in touch if that’s you.
In Max De Marzi’s latest blog post, he shows how to write a user-defined procedure that combines graph pattern matching and fuzzy text search to find duplicate people in a graph.
Will Lyon presented Graph Analysis of Russian Twitter Trolls Using Neo4j at Stanford’s EE Computer Systems Colloquium.
Melvin Vivas wrote an experience report of using Neo4j for the first time. Melvin shares some useful resources for learning about Cypher and shows how to build a graph of your family tree.
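To make the duplicate-finding idea from Max De Marzi’s post a bit more concrete, here is a minimal Python sketch of the general approach: a structural check (two people share an email address) combined with a fuzzy name comparison. This is not Max’s procedure, which runs inside Neo4j. The data, the `name_similarity` and `find_duplicates` helpers, and the 0.85 threshold are all hypothetical, and `difflib` from the standard library stands in for a real fuzzy text index.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical toy data: person id -> (name, set of email addresses).
people = {
    1: ("Jonathan Smith", {"jsmith@example.com"}),
    2: ("Jonathon Smith", {"jsmith@example.com", "jon@example.org"}),
    3: ("Maria Garcia", {"maria@example.com"}),
    4: ("Jonathan Smith", {"other@example.net"}),
}


def name_similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means identical after lowercasing."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_duplicates(people, threshold=0.85):
    """Return id pairs that share an email AND have similar names."""
    pairs = []
    for (id_a, (name_a, emails_a)), (id_b, (name_b, emails_b)) in combinations(
        sorted(people.items()), 2
    ):
        shares_email = bool(emails_a & emails_b)  # the "pattern" part
        similar = name_similarity(name_a, name_b) >= threshold  # the "fuzzy" part
        if shares_email and similar:
            pairs.append((id_a, id_b))
    return pairs


duplicates = find_duplicates(people)
print(duplicates)
```

Requiring both conditions is the point of the combination: persons 1 and 4 have identical names but nothing connecting them, so they are not flagged, while persons 1 and 2 are flagged despite the spelling difference.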

APOC adds HDFS support, aggregation functions, and path functions

This week Michael released a new version of the popular APOC library. The library just crossed 500 GitHub stars so thanks to everyone for your support.

This release has lots of new goodies to play with, including support for writing and reading from HDFS, using S3 URIs when loading data, aggregation functions, full document indexing, path expander sequences, and much more.

The library now contains more than 400 procedures and functions so there’s bound to be something in there that’s useful for your project.

Don’t forget to star the project if it’s been helpful so that more people can find it.

Cypher for Gremlin, Neo4j in RMarkdown, Cypher vs SQL aggregations

Benjamin Raethlein shared his notes from GraphTour Berlin, at which the Cypher for Gremlin plugin was announced. If you want to give it a try you can grab it from the cypher-for-gremlin repository.
Colin Fay has started working on rmd4j, which aims to provide a knitr engine for running Neo4j inside RMarkdown. This one is still in its early stages, so don’t forget to give Colin some feedback if you try it out.
Conrad Taylor has written up some notes from the NetIKX January meeting. There’s an interesting comparison of Relational and Graph Databases and also a discussion of linked data and the semantic web.
I rediscovered a 2015 post on the jOOQ blog that compares and contrasts the way that Cypher and SQL deal with aggregation queries. If you’ve ever wondered how Cypher’s count() function works, this post has a great explanation.
In “Implementing, Testing and Running Procedures for Neo4j”, Micha Kops shows, in a step-by-step tutorial, how to implement a stored procedure that fetches quality metrics from a graph.
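As a rough illustration of the Cypher aggregation behaviour mentioned above: in Cypher, any non-aggregated expression in a RETURN clause acts as an implicit grouping key, where SQL needs an explicit GROUP BY. In addition, count(*) counts every row while count(expression) skips nulls. The Python sketch below emulates that behaviour on a few hypothetical rows. It is not taken from the jOOQ post, and the `aggregate` helper and the column names are made up for the example.

```python
from collections import defaultdict

# Hypothetical rows, as if produced by a MATCH clause: (city, nickname).
rows = [
    ("London", "Al"),
    ("London", None),
    ("Paris", "Bea"),
]


def aggregate(rows):
    """Emulate: RETURN city, count(*) AS total, count(nickname) AS withNickname"""
    groups = defaultdict(lambda: {"total": 0, "withNickname": 0})
    for city, nickname in rows:
        group = groups[city]      # city is not aggregated, so it is the grouping key
        group["total"] += 1       # count(*) counts every row
        if nickname is not None:  # count(expr) skips null values
            group["withNickname"] += 1
    return dict(groups)


result = aggregate(rows)
print(result)
```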

On the podcast: Jonathan Schmidt

This week on the podcast Rik interviewed Jonathan Schmidt, founder and CTO of Waykonect, a startup that offers intelligent vehicle management based on Neo4j.

Jonathan explains how Waykonect use Neo4j to map the relationship between the telematic dongle, the vehicle, the account that manages the vehicle, the driver that drives the vehicle, the trips that are recorded, events that happen on that trip, and the maintenance of the vehicle.

In a very interesting technical discussion, they also talk about the rest of Waykonect’s architecture, including Kafka as the messaging backbone and InfluxDB for time series analysis.

Next Week

What’s happening next week in the world of graph databases?

Date	Title	Group	Speaker
March 5th 2018	Machine learning, Knowledge base & Amazone Alexia, le tout avec du Graphe !	Graph Database – Paris	Dr. Vlasta Kus, Christophe Willemsen
March 6th 2018	GraphTour Paris	Graph Database – Paris	Mix of Neo4j and customer speakers
March 6th 2018	Copenhagenizing Graph Databases: Demos and Real-World Applications	Copenhagen Graph Databases Meetup	Thomas Frisendal, Maria Scharin, Fabio Lamanna and Omar Rampado, Thomas Thejn, Pedro Parraguez
March 8th 2018	GraphTour Stockholm	Friends of Neo4j Stockholm	Mix of Neo4j and customer speakers
March 8th 2018	Data Science in Practice: Importing and Visualizing Facebook Data Using Graphs!	Graph Database – San Francisco	Ray Bernard, Jennifer Webb

Tweet of the Week

My favourite tweet this week was by David Meza:

I am really liking this stream to gephi procedure in @neo4j. Here is our lesson learned db, green Topic nodes sized by # of pink lessons in the topic. Next to show correlation between topics. I’ll stop soon. #rstats #neo4j pic.twitter.com/mEmZSbhrOK

— David Meza (@davidmeza1) February 26, 2018