When building scalable applications, developers have a myriad of technologies to choose from, especially when it comes to databases.
We want to choose the technology that best enhances functionality or performance. However, adding new technologies to our stack often adds more complexity to managing our systems than the benefit those technologies provide.
The idea of polyglot persistence promises to allow us to take advantage of the strengths of different persistence layers to enhance functionality in our application.
In this blog post, we will examine some use cases where it makes sense to use MongoDB and Neo4j together, drawing on the strengths of each database. Finally, we will examine a new community project, the Neo4j Doc Manager for Mongo Connector, that enables real-time synchronization of documents from MongoDB to Neo4j.
Figure 1: By using multiple database technologies, we can enhance our application. Here we use a key-value store to power the user shopping cart, a document database for product catalog search and browsing, and a graph database for real-time personalized recommendations.
MongoDB – Document Database
MongoDB is one of the leading document databases, a type of NoSQL database with a document-based data model.
Documents can contain key-value pairs, arrays and nested documents (a document of documents). Although MongoDB uses BSON, you can think of JSON as a close analog of the data model.
Using a document data model allows MongoDB to make use of unstructured data and query it in an efficient way, using indexes.
Example: A Product Catalog Use Case
A core use case for a document database is to back search and browsing for a product catalog, supporting an e-commerce application.
The functional requirements here need to support a diverse product portfolio with complex querying and filtering across many product attributes. We need to be able to populate the application view of our product catalog with a single query.
To understand how using a graph database alongside a document database can enhance our application, let’s consider a specific use case: that of a course catalog for an online course system.
Our theoretical system offers courses online for users to take, something like Coursera, Udacity, EdX, etc. Let’s examine a specific view in our application, that of the course catalog:
Figure 2: An online course catalog is a good use case for a document database.
This view allows a user to see a list of all courses available to them, search by keyword and filter by time, language or category. This is a core use case for a document database. The view needs to be populated in a single database query, so we’re returning lots of information here.
Queries need to use different types of indexes (full-text, category filter, time range filter) and return all information needed to render the view in a single query.
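To make the single-query requirement concrete, here is a minimal sketch that assembles one MongoDB filter document combining keyword search, category and language filters, and a time range. The operators `$text`, `$search`, `$gte` and `$lte` are standard MongoDB query operators; the function name and the field names (`category`, `language`, `start_date`) are assumptions made for illustration and do not come from a real schema.

```python
from datetime import datetime


def build_catalog_query(keyword=None, category=None, language=None,
                        starts_after=None, starts_before=None):
    """Build one MongoDB filter document for the course catalog view.

    The field names used here (category, language, start_date) are
    hypothetical; adapt them to the actual collection schema.
    """
    query = {}
    if keyword:
        # $text relies on a text index existing on the collection.
        query["$text"] = {"$search": keyword}
    if category:
        query["category"] = category
    if language:
        query["language"] = language

    # Collect the optional bounds of the time range filter.
    date_range = {}
    if starts_after:
        date_range["$gte"] = starts_after
    if starts_before:
        date_range["$lte"] = starts_before
    if date_range:
        query["start_date"] = date_range
    return query


query = build_catalog_query(
    keyword="graph databases",
    category="Computer Science",
    starts_after=datetime(2015, 9, 1),
)
print(query)
```

With a driver such as PyMongo, a filter like this would typically be passed to `collection.find(query)`, so the whole view is still served by one database call.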
However, something is missing here: all of these operations are done in the context of the user.
Our application knows what courses our user has taken previously and how the user has interacted with other users in our platform so we should be able to offer some personalized content based on these user preferences.
Figure 3: Instead of just showing the results of search / filters, our course catalog should include personalized content, such as course recommendations based on information we know about the preferences of the current user.
We know that Neo4j is very good for generating recommendations. In fact, generating real-time recommendations is a core use case for Neo4j and graph databases. MongoDB might be great at serving up a product catalog, but what if we want to generate user-centric, personalized product recommendations?
We know this is easy in Neo4j, so let’s bring Neo4j into it – but how?
Typically this would involve syncing data at the application layer: write to MongoDB, and write to Neo4j.
Did both operations complete successfully? What should we do if one of the transactions fails? At what point do we update data? This can quickly turn into a very complicated component of our application to maintain.
The benefits of polyglot persistence come at the expense of complexity. We now have to implement logic at the application layer to write data to both MongoDB and Neo4j.
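The failure mode described above can be shown with a small sketch. The two stores are hypothetical in-memory stand-ins, not real MongoDB or Neo4j drivers; the point is only that when the second write fails, the first has already been applied and the two stores disagree.

```python
class InMemoryStore:
    """Hypothetical stand-in for a database client; not a real driver."""

    def __init__(self, name, fail_on_write=False):
        self.name = name
        self.fail_on_write = fail_on_write
        self.records = {}

    def write(self, key, value):
        if self.fail_on_write:
            raise ConnectionError(f"{self.name} is unavailable")
        self.records[key] = value


def dual_write(document_store, graph_store, key, value):
    """Naive application-layer sync: write to one store, then the other."""
    document_store.write(key, value)
    # If this second write raises, the first write has already been applied
    # and nothing here rolls it back.
    graph_store.write(key, value)


mongo = InMemoryStore("document store")
neo4j = InMemoryStore("graph store", fail_on_write=True)

try:
    dual_write(mongo, neo4j, "course-101", {"title": "Intro to Graphs"})
except ConnectionError as error:
    print("sync failed:", error)

# The two stores now disagree about whether course-101 exists.
print("course-101" in mongo.records, "course-101" in neo4j.records)
```

Handling this properly means adding retries, compensation or reconciliation logic to the application, which is exactly the complexity the paragraph above refers to.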
Introducing Neo4j Doc Manager: Enabling Polyglot Persistence for MongoDB and Neo4j
To help facilitate polyglot persistence with Neo4j and MongoDB while minimizing complexity for the developer, we have been working on a project that automatically synchronizes data from MongoDB to Neo4j.
Figure 4: With Neo4j Doc Manager, documents inserted into MongoDB are converted to a property graph model and immediately inserted into Neo4j
Neo4j Doc Manager for Mongo Connector
The Neo4j Doc Manager project is a doc manager implementation for the Mongo Connector project, provided by the folks at MongoDB. Mongo Connector provides a mechanism for your application to be notified of all updates in MongoDB and write those updates to a target system – essentially an external transaction handler for MongoDB.
Neo4j Doc Manager works by tailing the oplog (a log of all operations in MongoDB). Whenever there is a document operation (such as an insertion, update or removal), the doc manager is notified of it.
Neo4j Doc Manager contains the logic for converting this document into a property graph model and then immediately writes this update to Neo4j.
Figure 5: Neo4j Doc Manager is notified of all operations in MongoDB and converts those updates to a property graph model and immediately writes those updates to Neo4j.
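The notification flow can be sketched roughly as follows. The oplog entries use MongoDB's real field names (`op`, `ns`, `o`, `o2`), but the entries themselves are hand-written samples, and `RecordingDocManager` and `dispatch` are hypothetical simplifications rather than the actual Mongo Connector or Neo4j Doc Manager code. A real connector would also read the oplog continuously instead of iterating over a list.

```python
class RecordingDocManager:
    """Hypothetical, simplified doc manager that only records what it is told."""

    def __init__(self):
        self.calls = []

    def upsert(self, doc, namespace):
        self.calls.append(("upsert", namespace, doc["_id"]))

    def update(self, document_id, update_spec, namespace):
        self.calls.append(("update", namespace, document_id))

    def remove(self, document_id, namespace):
        self.calls.append(("remove", namespace, document_id))


def dispatch(oplog_entries, doc_manager):
    """Route each oplog entry to the matching doc manager method."""
    for entry in oplog_entries:
        op, namespace = entry["op"], entry["ns"]
        if op == "i":    # insert: "o" holds the full document
            doc_manager.upsert(entry["o"], namespace)
        elif op == "u":  # update: "o2" identifies the document, "o" is the change
            doc_manager.update(entry["o2"]["_id"], entry["o"], namespace)
        elif op == "d":  # delete: "o" holds the _id of the removed document
            doc_manager.remove(entry["o"]["_id"], namespace)
        # Other entry types (for example no-ops, "n") are ignored here.


# Hand-written sample entries in the shape of oplog records.
sample_oplog = [
    {"op": "i", "ns": "test.talks", "o": {"_id": 1, "room": "Auditorium"}},
    {"op": "u", "ns": "test.talks", "o2": {"_id": 1},
     "o": {"$set": {"room": "Main Hall"}}},
    {"op": "d", "ns": "test.talks", "o": {"_id": 1}},
]

manager = RecordingDocManager()
dispatch(sample_oplog, manager)
print(manager.calls)
```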
Turning Documents into Property Graphs
Documents are converted into property graphs based on the structure of the document. The root-level document becomes a node, and each key that holds a nested document becomes an additional node. The values nested under each key become properties of that node.
Figure 6: Documents inserted in MongoDB are converted to property graphs based on the structure of the document.
Consider the following document:
{
  "session": {
    "title": "12 Years of Spring: An Open Source Journey",
    "abstract": "Spring emerged as a core open source project in early 2003 and evolved to a broad portfolio of open source projects up until 2015."
  },
  "topics": ["keynote", "spring"],
  "room": "Auditorium",
  "timeslot": "Wed 29th, 09:30-10:30",
  "speaker": {
    "name": "Juergen Hoeller",
    "bio": "Juergen Hoeller is co-founder of the Spring Framework open source project.",
    "twitter": "https://twitter.com/springjuergen",
    "picture": "https://www.springio.net/wp-content/uploads/2014/11/juergen_hoeller-220x220.jpeg"
  }
}
If we insert this document into MongoDB with Neo4j Doc Manager running, this document would be converted into the following property graph:
Figure 7: The talks document is converted to a property graph – one node for the root level document and two additional nodes, one for each sub document.
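As a rough illustration of this conversion, the sketch below turns a nested dictionary into a list of nodes and relationships. It is not the Neo4j Doc Manager's actual code: the function name, the in-memory node representation, the relationship naming (`parent_child`) and the handling of lists as plain properties are all assumptions made for the example. It uses an abbreviated version of the talks document.

```python
def document_to_graph(document, label):
    """Convert a nested dict into (nodes, relationships).

    Each dict becomes one node; scalar and list values become properties;
    nested dicts become child nodes linked to their parent.
    """
    nodes, relationships = [], []

    def visit(doc, node_label):
        node_id = len(nodes)
        node = {"id": node_id, "label": node_label, "properties": {}}
        nodes.append(node)
        for key, value in doc.items():
            if isinstance(value, dict):
                child_id = visit(value, key)
                relationships.append({
                    "from": node_id,
                    "to": child_id,
                    # Hypothetical naming scheme for relationship types.
                    "type": f"{node_label}_{key}",
                })
            else:
                node["properties"][key] = value
        return node_id

    visit(document, label)
    return nodes, relationships


# Abbreviated version of the talks document.
talk = {
    "session": {"title": "12 Years of Spring: An Open Source Journey"},
    "topics": ["keynote", "spring"],
    "room": "Auditorium",
    "timeslot": "Wed 29th, 09:30-10:30",
    "speaker": {"name": "Juergen Hoeller"},
}

nodes, relationships = document_to_graph(talk, "talks")
for node in nodes:
    print(node["label"], sorted(node["properties"]))
for rel in relationships:
    print(rel["type"])
```

Run on the abbreviated document, this produces three nodes (the root and one per subdocument) joined by two relationships, which matches the shape described in Figure 7.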
Attain Your Polyglot Persistence Goals
Ideally, Neo4j Doc Manager lets application developers build applications that take advantage of polyglot persistence with MongoDB and Neo4j.
In the context of our course catalog, we can seamlessly provide search and browsing from MongoDB, but serve our personalized real-time recommendations (based on what courses the user has taken and how they have interacted with the system) from Neo4j without having to worry about writing data to both MongoDB and Neo4j.
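To give a feel for the kind of recommendation logic involved, here is a minimal pure-Python sketch of "users who took the same courses also took" ranking. In Neo4j this would be expressed as a graph traversal; the code below is only a stand-in for that idea, and the enrollment data, user names and the `recommend` function are all hypothetical.

```python
from collections import Counter

# Hypothetical enrollment data: user -> set of course identifiers.
TOOK = {
    "alice": {"graphs-101", "python-101"},
    "bob": {"graphs-101", "cypher-201"},
    "carol": {"graphs-101", "cypher-201", "ml-301"},
    "dave": {"history-101"},
}


def recommend(user, took, limit=3):
    """Rank courses taken by users who share at least one course with `user`."""
    mine = took[user]
    scores = Counter()
    for other, courses in took.items():
        if other == user or not (mine & courses):
            continue  # skip the user themselves and users with no overlap
        for course in courses - mine:
            scores[course] += 1
    # Sort by score (descending), then by name, so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [course for course, _ in ranked[:limit]]


print(recommend("alice", TOOK))
```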
Figure 8: Neo4j Doc Manager – enabling polyglot persistence for Neo4j and MongoDB.
The Neo4j Doc Manager is available to the community today, and full documentation is available here. Please try it out and let us know if this will help serve your needs for building polyglot applications with Neo4j and MongoDB.
Mongo, MongoDB and the MongoDB leaf logo are registered trademarks of MongoDB, Inc.