In case you haven’t noticed already, graph databases like Neo4j are hot nowadays. People ask questions and write about them, also in the contexts of NoSQL and RDF. Recently Twitter open sourced their graphdb implementation, targeted at shallow, distributed graphs. And then Facebook revealed their new Graph API using the Open Graph Protocol.
Today, we’re going to show you how easy it is to use the Facebook Graph API to mash up data from Facebook with data in a locally hosted graph database!
It’s movie time!
Let’s say you want to see a movie with one of your friends. Wouldn’t it be neat to have a service that uses the Facebook social graph to collect movies your friend liked, and combines this with IMDB data to produce a movie suggestion? It turns out that an app like that is pretty straightforward to build with a graph database.
The first step is to connect to Facebook to fetch a list of your friends, so that’s where the app will start out:
Next a list of your friends will show up:
Now, just click one of your friends and a movie suggestion will be generated:
Under the Hood
What we need to do is simply to let our mashup talk to both the Facebook Graph API and the IMDB API. Uh-oh – IMDB doesn’t have a public API that you can throw requests at. Well, that’s simple enough: we’ll just import the data into a local Neo4j graph database and then access it through the Facebook Graph API!
So, let’s see how to solve this. Here’s the basic structure of our app:
MovieNight.js is the mashup itself, embedded in the web page. It uses the Facebook Graph API to get information about the visitor’s friends and the movies that those friends like.
SuggestionEngine.js uses the Graph API to talk to a Neo4j database containing movie information (a small example data set from IMDB). The movie suggestion is based on what movies your friend has liked in the past. It simply tries to find other movies starring some actor from the liked ones.
Using the same Graph API to connect to both Facebook and the Neo4j graph database backend makes for convenience: it means that you can use tools written for Facebook for locally hosted data as well – and that’s what we’re doing here. To download the source, go to the download page.
Facebook data
To get your friends from Facebook, just use the common Facebook Graph API:
FB.api('/me/friends', function(response) {
  friends = response.data;
  // Load friends into UI
  friend_list.empty();
  for (var i = 0; i < friends.length; i++) {
    add_friend(friends[i]); // write to UI
  }
});
Getting the movies a friend likes is very similar to getting the friends list:
FB.api("/" + friend.id + "/movies", function(result) {
  /* handle the response here */
});
For more information, see the Graph API documentation.
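As a minimal sketch of what such a response handler might do, assume the response has the usual Graph API list shape: an object with a data array whose entries carry a name field. The helper name likedMovieNames is our own for illustration and is not part of the Facebook SDK.

```javascript
// Hypothetical helper (not part of the Facebook SDK): collect the names of the
// movies from a Graph API style response of the form { data: [ { name: ... } ] }.
function likedMovieNames(result) {
  // An error response typically has no data array, so fall back to an empty list.
  var entries = (result && result.data) || [];
  var names = [];
  for (var i = 0; i < entries.length; i++) {
    if (entries[i].name) {
      names.push(entries[i].name);
    }
  }
  return names;
}

// Example with a hand-written response object (sample data, not real output):
var sample = { data: [ { id: "1", name: "Wild Things" }, { id: "2" } ] };
console.log(likedMovieNames(sample)); // [ 'Wild Things' ]
```

The resulting names could then be passed on to the movie search described further down.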
Neo4j data
To connect to the Neo4j graph server we had to hack the connect-js library slightly, as it’s hard coded to send requests to facebook.com. What we added is the ability to register prefixes for different data sources. It still defaults to graph.facebook.com etc., but also makes an “fb:” prefix available to make your code easier to read. To hook in a data source, we modify the FB.init() call like this:
FB.init({
  appId  : '', // NOTE: create an appid and add it here
  status : true, cookie : true, xfbml : true,
  // time to add our IMDB backend to the mix
  external_domains : {
    imdb : 'https://localhost:4567/'
  }
});
Now we’re able to send requests to our own server as well, using code similar to the following:
FB.api("imdb:/path/to/data/in/graph", function(data) {
  // data is available here :)
});
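To make the prefix idea concrete, here is a small sketch of how a prefixed path could be resolved to a full URL. This is not the patched connect-js code: the function name resolveUrl, the shape of the domains table and the https scheme for the Facebook default are assumptions made for illustration.

```javascript
// Hypothetical sketch of prefix resolution; the real patched connect-js code may
// differ. The imdb entry mirrors the external_domains setting in FB.init().
var domains = {
  fb: 'https://graph.facebook.com/',
  imdb: 'https://localhost:4567/'
};

function resolveUrl(path, domains) {
  // Split "imdb:/56/acted_in" into the prefix "imdb" and the path "/56/acted_in".
  var match = /^([a-z]+):(\/.*)$/.exec(path);
  var prefix = match ? match[1] : 'fb'; // no prefix: default to Facebook
  var rest = match ? match[2] : path;
  if (!Object.prototype.hasOwnProperty.call(domains, prefix)) {
    throw new Error('Unknown data source prefix: ' + prefix);
  }
  // Strip a trailing slash from the base to avoid a double slash when joining.
  return domains[prefix].replace(/\/$/, '') + rest;
}

console.log(resolveUrl('imdb:/56/acted_in', domains)); // https://localhost:4567/56/acted_in
console.log(resolveUrl('/me/friends', domains));       // https://graph.facebook.com/me/friends
```

Keeping the lookup table separate from the resolver means another backend only needs one more entry in the table.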
So now that we can send requests, what can we do with the Neo4j backend? Here’s a list of the available requests with example responses (all requests are GET requests to https://localhost:4567):
Get Actor (or Movie) by Id
Request: /56
Response:
{
  "name": "Bacon, Kevin",
  "id": 56
}
Extended information about Actor(/Movie)
Request: /56?metadata=1
Response:
{
  "name": "Bacon, Kevin",
  "id": 56,
  "metadata": {
    "connections": "https://localhost:4567/56/acted_in"
  },
  "type": "actor"
}
All the Movies an Actor had a Role in
Request: /56/acted_in
Response:
{
  "data": [
    {
      "id": 57,
      "title": "Woodsman, The (2004)"
    },
    {
      "id": 59,
      "title": "Wild Things (1998)"
    }
    // tons of movies here ...
  ]
}
Get (Actor or) Movie by Id
Request: /59
Response:
{
  "title": "Wild Things (1998)",
  "year": "1998",
  "id": 59
}
Extended information about (Actor/)Movie
Request: /59?metadata=1
Response:
{
  "title": "Wild Things (1998)",
  "year": "1998",
  "id": 59,
  "metadata": {
    "connections": "https://localhost:4567/59/actors"
  },
  "type": "movie"
}
All the Actors that have a Role in this Movie
Request: /59/actors
Response:
{
  "data": [
    {
      "id": 56,
      "name": "Bacon, Kevin"
    },
    {
      "id": 528,
      "name": "Dillon, Matt (I)"
    }
    // loads of actors here ...
  ]
}
Search for Actors with “bacon” in their name
Request: /search?q=bacon&type=actor
Response:
[
  {
    "name": "Bacon, Kevin",
    "id": 56
  },
  {
    "name": "Bacon, Travis",
    "id": 14242
  }
  // more bacons here ...
]
Search for Movies with “wild things” in their title
Request: /search?q=wild%20things&type=movie
Response:
[
  {
    "title": "Wild Things (1998)",
    "year": "1998",
    "id": 59
  },
  {
    "title": "River Wild, The (1994)",
    "year": "1994",
    "id": 74
  }
  // more wild movies here ...
]
OK, but how do we use this stuff? Let’s look at the Facebook Graph API used from JavaScript with a Neo4j/IMDB backend. To get started, here’s how to perform a search:
self.movie_info = function( movie_name, callback ) {
  // The search API uses commas for AND-type searches, spaces become OR, so for
  // the movie names, we switch spaces out for commas.
  movie_name = movie_name.replace(/ /g, ",");
  FB.api("imdb:/search", { type: 'movie', q: movie_name }, callback);
};
The request to get the movies an actor has acted in goes like this:
FB.api("imdb:/" + actor.id + "/acted_in", function(result) {
  for (var i = 0; i < result.data.length; i++) {
    var movie = result.data[i];
    // do something with the movie here!
  }
});
To get all actors in a movie, simply use the following request:
FB.api("imdb:/" + movie.id + "/actors", function(result) {
  for (var i = 0; i < result.data.length; i++) {
    var actor = result.data[i];
    // do something with the actor here!
  }
});
Actually, these three different requests are all our small suggestion engine needs to fulfill its task. Have a look at SuggestionEngine.js to see the full code.
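To give a feel for how the three requests fit together, here is a rough sketch of a suggestion lookup. It is not the code from SuggestionEngine.js: the function suggestMovies, the api parameter (standing in for FB.api) and the in-memory stub are assumptions made so that the example runs without a server.

```javascript
// Sketch only: `api` stands in for FB.api and is passed in so the example can
// run against the small in-memory stub below instead of a real backend.
function suggestMovies(api, likedTitle, callback) {
  var query = likedTitle.replace(/ /g, ',');
  // 1. Look up the liked movie in the IMDB data.
  api('imdb:/search', { type: 'movie', q: query }, function (hits) {
    if (!hits.length) { return callback([]); }
    var liked = hits[0];
    // 2. Fetch the actors starring in it.
    api('imdb:/' + liked.id + '/actors', {}, function (actors) {
      var suggestions = {};
      var pending = actors.data.length;
      if (pending === 0) { return callback([]); }
      actors.data.forEach(function (actor) {
        // 3. Fetch every movie each of those actors had a role in.
        api('imdb:/' + actor.id + '/acted_in', {}, function (movies) {
          movies.data.forEach(function (movie) {
            if (movie.id !== liked.id) { suggestions[movie.id] = movie.title; }
          });
          pending -= 1;
          if (pending === 0) {
            // All actor lookups are done: hand back the collected titles.
            callback(Object.keys(suggestions).map(function (id) {
              return suggestions[id];
            }));
          }
        });
      });
    });
  });
}

// In-memory stand-in for the backend, using ids from the examples above.
var responses = {
  'imdb:/search': [ { title: 'Wild Things (1998)', year: '1998', id: 59 } ],
  'imdb:/59/actors': { data: [ { id: 56, name: 'Bacon, Kevin' } ] },
  'imdb:/56/acted_in': { data: [
    { id: 57, title: 'Woodsman, The (2004)' },
    { id: 59, title: 'Wild Things (1998)' }
  ] }
};
function stubApi(path, params, cb) { cb(responses[path]); }

suggestMovies(stubApi, 'Wild Things', function (titles) {
  console.log(titles); // [ 'Woodsman, The (2004)' ]
});
```

The pending counter is there because the per-actor requests complete independently; the callback fires only once the last one has returned.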
How to create a Graph API service on top of Neo4j
Let’s take a closer look at the movie backend now. It’s built using the Neo4j Ruby bindings. In our example data set we have Actors and Movies connected through Roles. Here’s how these look in Ruby code:
# forward declaration so that Actor can refer to Movie below
class Movie; end

class Role
  include Neo4j::RelationshipMixin
  property :title, :character
end

class Actor
  include Neo4j::NodeMixin
  property :name
  has_n(:acted_in).to(Movie).relationship(Role)
  index :name, :tokenized => true
end

class Movie
  include Neo4j::NodeMixin
  property :title
  property :year
  index :title, :tokenized => true
  # defines a method for traversing incoming acted_in relationships from Actor
  has_n(:actors).from(Actor, :acted_in)
end
The code above is from the backend/model.rb file. On the Neo4j level, this is the kind of structure we’ll have:
By defining indexes on Actor and Movie we can later use the find method on the classes to perform searches.
Our next step is to expose this model over the Graph API, where we’ll use Sinatra and WEBrick to do the heavy lifting. The application is defined in the backend/neo4j_app.rb file, and we’ll dive into portions of that code right here. To begin with, how do we return data for an Actor or Movie by Id?
get '/:id' do # show a node
  content_type 'text/javascript'
  node = node_by_id(params[:id])
  props = external_props_for(node)
  props.merge! metadata_for(node) if params[:metadata] == "1"
  json = JSON.pretty_generate(props)
  json = callback_wrapper(json, params[:callback])
  json
end
The Sinatra route above uses a few small utility functions, so let’s look into them as well. The first one is very simple, but useful if we want to extend the URIs to allow for requesting, for example, /{moviename}/actors and not only numeric IDs.
def node_by_id(id)
  # only purely numeric ids are looked up; anything else results in a 404
  node = Neo4j.load_node(id) if id =~ /^(\d+)$/
  halt 404 if node.nil?
  node
end
The next function returns the properties of a node, while filtering out those that have a name starting with a “_” character. It also adds the node id to the result.
def external_props_for(node)
  ext_props = node.props.delete_if{|key, value| key =~ /^_/}
  ext_props[:id] = node.neo_id
  ext_props
end
Then there’s a function that gathers metadata for a node, including a link to the list of connections to other nodes, and the type of the node.
def metadata_for(node)
  if node.kind_of? Actor
    connections = url_for(node, "acted_in")
  elsif node.kind_of? Movie
    connections = url_for(node, "actors")
  end
  metadata = { :metadata => { :connections => connections }, :type => node.class.name.downcase }
end
There are a couple more utility functions, but we’ll skip them here as they are unrelated to Neo4j.
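One of the skipped helpers is callback_wrapper, which the routes call with the callback request parameter. Its source is not shown here, but given the text/javascript content type, a plausible reading is JSONP-style wrapping. The sketch below expresses that assumed behaviour in JavaScript, to stay in the language of the client code. The name wrapCallback and the validation rule are our own and not taken from the backend.

```javascript
// Assumed behaviour, not the original Ruby helper: wrap a JSON string in a
// function call when a callback name is supplied (JSONP), else return it as-is.
function wrapCallback(json, callback) {
  if (!callback) {
    return json;
  }
  // Only accept plain identifiers (optionally dotted) as callback names, so a
  // request cannot smuggle arbitrary script into the response.
  if (!/^[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)*$/.test(callback)) {
    throw new Error('Invalid callback name: ' + callback);
  }
  return callback + '(' + json + ');';
}

console.log(wrapCallback('{"id": 56}', 'handleActor')); // handleActor({"id": 56});
console.log(wrapCallback('{"id": 56}', undefined));     // {"id": 56}
```

With wrapping like this, a browser can load the response through a script tag and have the named function receive the data, which is one common way to get around cross-domain restrictions.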
Next up is getting the relationships from an Actor or Movie. The code only cares about valid paths, that is, paths ending in /acted_in or /actors. In other cases, an empty data set is returned. Other than that, it simply delegates the work to the domain classes by calling node.send(relationship) to get the relationships. Using Ruby’s send method here is equivalent to calling node.acted_in or node.actors.
get '/:id/:relation' do # show a relationship
  content_type 'text/javascript'
  node = node_by_id(params[:id])
  data = []
  [ :acted_in, :actors ].each do |relationship|
    if params[:relation] == relationship.to_s and node.respond_to? relationship
      data = node.send(relationship)
    end
  end
  data = data.map{|node| node_data(node)}
  json = JSON.pretty_generate({:data => data})
  json = callback_wrapper(json, params[:callback])
  json
end
When viewing the relationships, we only want to show the most basic node info, so there’s a utility function to do that as well:
def node_data(node)
  data = { :id => node.neo_id }
  [ :name, :title ].each do |property|
    data.merge!({ property => node[property] }) unless node[property].nil?
  end
  data
end
Searching is basically handled by adding indexes to the model (see the code further above). So what’s left to do in the application is to perform some sanity checks, delegate the search to the model and finally format the output properly. Here goes:
get '/search' do
  content_type 'text/javascript'
  q = params[:q]
  type = params[:type]
  halt 400 unless q && type
  result = case type
           when 'actor'
             Actor.find(to_lucene(:name, q))
           when 'movie'
             Movie.find(to_lucene(:title, q))
           else
             []
           end
  json = JSON.pretty_generate(result.map{|node| external_props_for(node)})
  json = callback_wrapper(json, params[:callback])
  json
end
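The to_lucene helper used by the search route is not shown in this post. Going by the comment in the client code (commas mean AND, spaces mean OR), it presumably turns the q parameter into a Lucene query on the given field. Here is a sketch of such a translation, written in JavaScript for consistency with the client examples. The name toLucene and the exact output format are assumptions, and special Lucene characters are not escaped.

```javascript
// Hypothetical counterpart of the backend's to_lucene helper (its real source
// is not shown here). Commas join terms with AND, spaces join groups with OR.
function toLucene(field, query) {
  return query
    .split(/\s+/)                 // spaces separate OR groups
    .map(function (group) {
      var terms = group
        .split(',')               // commas separate AND terms
        .filter(Boolean)          // ignore empty pieces such as in "a,,b"
        .map(function (term) { return field + ':' + term; });
      if (terms.length === 0) { return ''; }
      return terms.length > 1 ? '(' + terms.join(' AND ') + ')' : terms[0];
    })
    .filter(Boolean)              // drop groups that ended up empty
    .join(' OR ');
}

console.log(toLucene('title', 'wild,things')); // (title:wild AND title:things)
console.log(toLucene('name', 'bacon dillon')); // name:bacon OR name:dillon
```

A production version would also need to escape characters that have a special meaning in Lucene query syntax before building the query string.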
Wrap up
Here are some major takeaways from this post:
- Graphs are going mainstream, as evidenced by initiatives like the Facebook Graph API.
- It’s often convenient to look at your data in the form of a graph, and with recent support in graph databases like Neo4j, it’s easy to use different data sources in tandem through the Graph API.
- Exposing data through the Graph API is simple if you have a graphdb backend.
And once you put your data in a graphdb, you can of course do more advanced graphy things too, like finding shortest paths, routing with A*, modeling of complex domains and whatnot. Just get started!
Example source code
To get the source code of the example, go to the download page.
Credits
Here are the guys who wrote the code of the example: