Welcome to this week in Neo4j where we round up what’s been happening in the world of graph databases in the last 7 days.
This week we have a new release of APOC, Neo4j in RMarkdown, finding duplicate users, Cypher for Gremlin, and more.
Featured Community Member: Irene Iriarte Carretero
This week’s featured community member is Irene Iriarte Carretero, Data Scientist at Gousto.
I first met Irene at a Neo4j London meetup 18 months ago when she was exploring whether her team could use graphs to make sense of recipe data.
Within 6 months of attending that meetup Irene had presented at the London meetup and written a blog post explaining how her team were now using Neo4j to build recipe ontologies.
Irene went on to present at GraphConnect NYC 2017 and at this week’s GraphTour London event. The London event wasn’t recorded but you can find the video from Irene’s talk in New York below.
On behalf of the Neo4j community, thanks for all your work Irene!
Documentum Recommendation Engine, Russian Twitter Trolls, Finding Duplicate Users
- Yuri Simione has written a blog post about a content recommendation engine he's been building for the Documentum Content Server. Yuri is looking for Documentum users to beta test the technology, so get in touch if that's you.
- In his latest blog post, Max De Marzi shows how to write a user-defined procedure that combines graph pattern matching and fuzzy text search to find duplicate people in a graph.
- Will Lyon presented Graph Analysis of Russian Twitter Trolls Using Neo4j at Stanford’s EE Computer Systems Colloquium.
- Melvin Vivas wrote an experience report about using Neo4j for the first time. Melvin shares some useful resources for learning Cypher and shows how to build a graph of your family tree.
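To give a feel for the duplicate-finding idea mentioned above, here is a minimal sketch in plain Python. It does not reproduce the procedure from Max's post. It only illustrates the general combination of a structural check (two people share a neighbouring node) with a fuzzy name comparison. The toy data, the `find_duplicate_candidates` helper, and the 0.85 threshold are all assumptions made up for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical toy graph: person id -> (name, ids of neighbouring nodes
# such as addresses or phone numbers). Not taken from the blog post.
people = {
    "p1": ("Jonathan Smith", {"addr:1", "phone:7"}),
    "p2": ("Jonathon Smith", {"addr:1"}),
    "p3": ("Maria Lopez", {"addr:1"}),
    "p4": ("Jonathan Smith", {"addr:9"}),
}


def name_similarity(a, b):
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_duplicate_candidates(people, threshold=0.85):
    """Return (id_a, id_b, score) for pairs that share at least one
    neighbour AND have similar names. The threshold is an assumption."""
    candidates = []
    for (id_a, (name_a, nbrs_a)), (id_b, (name_b, nbrs_b)) in combinations(
        sorted(people.items()), 2
    ):
        # "Graph pattern" step: skip pairs with no shared neighbour.
        if not nbrs_a & nbrs_b:
            continue
        # "Fuzzy text" step: compare the names.
        score = name_similarity(name_a, name_b)
        if score >= threshold:
            candidates.append((id_a, id_b, round(score, 2)))
    return candidates


duplicates = find_duplicate_candidates(people)
print(duplicates)
```

In this toy data only `p1` and `p2` are flagged: `p4` has an identical name but shares no neighbour, and `p3` shares an address but has a different name. In a real graph the structural step would be a pattern match in the database rather than a set intersection.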
APOC adds HDFS support, aggregation functions, and path functions
This week Michael released a new version of the popular APOC library. The library just crossed 500 GitHub stars so thanks to everyone for your support.
This release has lots of new goodies to play with, including support for writing and reading from HDFS, using S3 URIs when loading data, aggregation functions, full document indexing, path expander sequences, and much more.
The library now contains more than 400 procedures and functions so there’s bound to be something in there that’s useful for your project.
Don’t forget to star the project if it’s been helpful so that more people can find it.
Cypher for Gremlin, Neo4j in RMarkdown, Cypher vs SQL aggregations
- Benjamin Raethlein shared his notes from GraphTour Berlin, at which the Cypher for Gremlin plugin was announced. If you want to give it a try you can grab it from the cypher-for-gremlin repository.
- Colin Fay has started working on rmd4j, which aims to provide a knitr engine for running Neo4j inside RMarkdown. This one is still in its early stages, so don't forget to give Colin some feedback if you try it out.
- Conrad Taylor has written up some notes from the NetIKX January meeting. There’s an interesting comparison of Relational and Graph Databases and also a discussion of linked data and the semantic web.
- I rediscovered a 2015 post on the jOOQ blog that compares and contrasts the way that Cypher and SQL deal with aggregation queries. If you've ever wondered how Cypher's count() function works, this post has a great explanation.
- In the step-by-step tutorial Implementing, Testing and Running Procedures for Neo4j, Micha Kops shows how to implement a stored procedure that fetches quality metrics from a graph.
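For readers curious about the aggregation comparison mentioned above, the following is a rough Python emulation of how Cypher aggregation behaves in general, not a summary of the jOOQ post. In Cypher, any non-aggregated expression in a RETURN clause acts as a grouping key, `count(*)` counts rows, and `count(expr)` skips null values. The rows, names, and the `count_with_implicit_grouping` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical result rows, as if produced by something like:
#   MATCH (p:Person) OPTIONAL MATCH (p)-[:ACTED_IN]->(m:Movie)
#   RETURN p.name AS person, count(*), count(m)
# A None movie stands in for a person with no matching movie.
rows = [
    {"person": "Alice", "movie": "Movie A"},
    {"person": "Alice", "movie": "Movie B"},
    {"person": "Bob", "movie": "Movie A"},
    {"person": "Carol", "movie": None},
]


def count_with_implicit_grouping(rows, key, counted):
    """Group rows by the non-aggregated column `key` (there is no explicit
    GROUP BY in Cypher) and compute both count(*) and count(counted)."""
    groups = defaultdict(lambda: {"count(*)": 0, f"count({counted})": 0})
    for row in rows:
        group = groups[row[key]]
        group["count(*)"] += 1  # counts every row in the group
        if row[counted] is not None:
            group[f"count({counted})"] += 1  # nulls are skipped
    return dict(groups)


result = count_with_implicit_grouping(rows, key="person", counted="movie")
for person, counts in result.items():
    print(person, counts)
```

The interesting row is Carol: she still forms a group and `count(*)` is 1, but `count(movie)` is 0 because the only value in her group is null.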
On the podcast: Jonathan Schmidt
This week on the podcast Rik interviewed Jonathan Schmidt, founder and CTO of Waykonect, a startup that offers intelligent vehicle management based on Neo4j.
Jonathan explains how Waykonect use Neo4j to map the relationship between the telematic dongle, the vehicle, the account that manages the vehicle, the driver that drives the vehicle, the trips that are recorded, events that happen on that trip, and the maintenance of the vehicle.
In a very interesting technical discussion they also talk about the rest of Waykonect’s architecture, including Kafka as the messaging backbone and InfluxDB for time series analysis.
Next Week
What’s happening next week in the world of graph databases?
Date | Title | Group | Speaker |
---|---|---|---|
March 5th 2018 | Machine learning, Knowledge base & Amazone Alexia, le tout avec du Graphe ! (Machine learning, knowledge bases and Amazon Alexa, all with graphs!) | | |
March 6th 2018 | Mix of Neo4j and customer speakers | | |
March 6th 2018 | Copenhagenizing Graph Databases: Demos and Real-World Applications | | Thomas Frisendal, Maria Scharin, Fabio Lamanna and Omar Rampado, Thomas Thejn, Pedro Parraguez |
March 8th 2018 | Mix of Neo4j and customer speakers | | |
March 8th 2018 | Data Science in Practice: Importing and Visualizing Facebook Data Using Graphs! | | Ray Bernard, Jennifer Webb |
Tweet of the Week
My favourite tweet this week was by David Meza:
I am really liking this stream to gephi procedure in @neo4j. Here is our lesson learned db, green Topic nodes sized by # of pink lessons in the topic. Next to show correlation between topics. I'll stop soon. #rstats #neo4j pic.twitter.com/mEmZSbhrOK
— David Meza (@davidmeza1) February 26, 2018
Don’t forget to RT if you liked it too.
That’s all for this week. Have a great weekend!
Cheers, Mark