The recent proliferation of database technologies is a testament to the fact that relational databases are not the right tool for every job.
Of course, they have their uses: Tabular data with a consistent structure and fixed schema is a perfect fit for a relational database (RDBMS). But if your application demands flexibility or highly connected data, then it’s time to look for an alternative to your RDBMS.
In this series on SQL strain, we’ll dive into the causes – and cures – of relational database performance issues, including the future-proof alternative of graph databases.
Last week, we covered five sure signs of SQL strain. This week, we’ll discuss the impact of a graph data model and two approaches to solving the problems of connected data.
The Impact of the Graph Database Model
Relational databases such as Oracle and MySQL excel when it comes to capturing repetitive, tabular data. Despite the word “relational” in their name, relational databases are much less effective at storing or expressing relationships between stored data elements.
The word “relational” in relational databases comes from relating columns within a table, not relating information in different tables. Relationships between columns exist to support set operations. This is very different from the real world where relationships exist between individual data elements.
Consider the impact that using a graph data model can have in three important areas:
- Modeling data with a high number of data relationships
- Flexibly expanding the model to add new data or data relationships
- Querying data relationships in real time
In discussions, we draw on a whiteboard and sketch connections between data elements, creating a natural and intuitive data model. Taking a data model based on relationships and forcing it into a tabular framework creates a mental disconnect between the way business stakeholders think about data and processes and the way the database model is implemented.
Developer productivity also suffers because the tabular data model is complex, hard to understand, and does not match the developer’s mental model of the application (this concept is also called “object-relational impedance mismatch”).
The mismatch between the intuitive, related data model from our whiteboard and the tables that will be created in the relational database leads to longer development time, higher project costs, and significant delays in getting to market, as the logical model is painstakingly crafted into a physical model.
The value of the graph data model becomes even clearer when it’s time to flexibly expand the model to add new data or data relationships. Projects with rapidly evolving requirements or data sources (which are often the most business-critical) are hit hardest by the rigidity of relational database models, as changes to the model often require reworking the application and, if data has already been loaded, migrating the data itself.
With a graph data model, changes to the data model can be made with little or no impact on the application.
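To make that claim concrete, here is a minimal Python sketch of a property graph held in plain data structures. It is an illustration only, not how any particular graph database is implemented; the node names, the relationship types `BOUGHT` and `MADE_BY`, and the `neighbors` helper are all hypothetical. The point it tries to show is that adding a new kind of node and relationship does not disturb a query written against the old model.

```python
# A tiny in-memory property graph (illustrative assumption, not a real database).
# Nodes and typed relationships are stored as ordinary data, not as schema.
nodes = {
    "alice": {"label": "Customer"},
    "tent": {"label": "Product"},
}
relationships = [("alice", "BOUGHT", "tent")]


def neighbors(node_id, rel_type):
    """Return the targets of outgoing relationships of the given type."""
    return [dst for src, rel, dst in relationships
            if src == node_id and rel == rel_type]


# An existing application query, written before the model changed.
before = neighbors("alice", "BOUGHT")

# New requirement: track brands. We only add data; nothing is migrated.
nodes["acme"] = {"label": "Brand"}
relationships.append(("tent", "MADE_BY", "acme"))

# The existing query is unaffected, and the new relationship is queryable.
after = neighbors("alice", "BOUGHT")
brand = neighbors("tent", "MADE_BY")
```

In a relational model the same change would typically mean a new table, a foreign key, and possibly updates to existing queries that join through the affected tables.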
Unlike a relational database, a graph database is structured entirely around data relationships. Graph databases treat relationships not as a schema structure but as data, like other values.
From a relational database standpoint, you could think of this as pre-materializing JOINs once at insertion time instead of computing them for every query. Because relationships are stored rather than recomputed, the cost of a query depends largely on the part of the graph it actually touches rather than on the total size of the dataset, which is what keeps real-time queries practical as the data grows larger and more connected.
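The “pre-materialized JOIN” idea can be sketched in a few lines of Python. This is a simplified analogy under stated assumptions: the sample customers, products, and the two lookup functions are invented for illustration, and a real RDBMS would normally use an index rather than the full scan shown here.

```python
from collections import defaultdict

# Relational-style storage: two "tables" plus a join table of
# (customer_id, product_id) rows. All sample data is hypothetical.
customers = {1: "Alice", 2: "Bob"}
products = {10: "Tent", 11: "Stove", 12: "Lantern"}
purchases = [(1, 10), (1, 11), (2, 11), (2, 12)]


def products_bought_join(customer_id):
    """Resolve the relationship at query time by scanning the join table."""
    return [products[p] for c, p in purchases if c == customer_id]


# Graph-style storage: each relationship is recorded once, at insertion
# time, as a direct reference from the customer to the product.
bought = defaultdict(list)
for c, p in purchases:
    bought[c].append(p)


def products_bought_graph(customer_id):
    """Follow stored references; work is proportional to this customer's edges."""
    return [products[p] for p in bought[customer_id]]
```

Both functions return the same answer. The difference is where the work happens: the first inspects every row of `purchases` on each call, while the second only walks the relationships attached to the starting customer.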
The best way to understand the difference between relational databases and graph databases is to walk through a sample use case.
Example Use Case: Two Approaches to Solving a Connected Data Problem
How do relational and graph databases compare from a project standpoint? To contrast how you approach development with a relational versus a graph database, let’s look at a specific example: a simple product recommendation engine.
The data in this case is highly connected: Customers relate to products and brands, products relate to other products and brands, and finally, customers relate to other customers. Almost every online retail organization is interested in building a recommendation engine where value is derived from data relationships.
A recommendation engine has three key requirements:
- Model data and the data relationships to understand how recommendations can be made
- Make recommendations in real time by querying the data relationships
- Continually make the model richer by adding more data and more relationships
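As a rough illustration of these three requirements, the Python sketch below models purchases as typed relationships and recommends products bought by customers with overlapping purchases. It is a toy under stated assumptions, not the approach the series goes on to describe: the `RecommendationGraph` class, the `BOUGHT` relationship type, the scoring rule, and the sample data are all hypothetical.

```python
from collections import Counter, defaultdict


class RecommendationGraph:
    """Toy in-memory graph with typed, directed relationships."""

    def __init__(self):
        self.out_edges = defaultdict(set)  # (source, rel_type) -> targets
        self.in_edges = defaultdict(set)   # (target, rel_type) -> sources

    def relate(self, src, rel_type, dst):
        """Requirements 1 and 3: store a relationship as data."""
        self.out_edges[(src, rel_type)].add(dst)
        self.in_edges[(dst, rel_type)].add(src)

    def recommend(self, customer, limit=3):
        """Requirement 2: traverse customer -> product -> customer -> product."""
        owned = self.out_edges.get((customer, "BOUGHT"), set())
        scores = Counter()
        for product in owned:
            for other in self.in_edges.get((product, "BOUGHT"), set()):
                if other == customer:
                    continue
                for candidate in self.out_edges.get((other, "BOUGHT"), set()):
                    if candidate not in owned:
                        scores[candidate] += 1
        # Highest score first; ties broken by name for a stable order.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [name for name, _ in ranked[:limit]]


graph = RecommendationGraph()
for customer, product in [
    ("alice", "tent"), ("alice", "stove"),
    ("bob", "tent"), ("bob", "lantern"),
    ("carol", "stove"), ("carol", "lantern"), ("carol", "cooler"),
]:
    graph.relate(customer, "BOUGHT", product)

recommendations = graph.recommend("alice")  # ["lantern", "cooler"]
```

Enriching the model later, for example by calling `graph.relate("lantern", "MADE_BY", "acme")`, adds a new relationship type without changing `recommend` or the data already stored.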
In the following weeks, we’ll compare how to approach creating a recommendation engine with an RDBMS and with a graph database, covering the topics of creating database models, writing queries, query performance and evolving the application.
Are your data-driven insights being hindered by the limited capabilities of a relational database? Download a free copy of this white paper, Overcoming SQL Strain and SQL Pain, and discover how to harness connected data like never before.