I am working on a recommendation engine. The user data (friendships, locations, likes, education, ...) has been collected and is already stored in MongoDB. I need to recommend rel
Have you looked at Reco4j? It uses Neo4j as its underlying graph database, and a few algorithms have been implemented as part of the project. Here is the link: reco4j. The site is currently unavailable, but the features looked good when I visited it.
https://github.com/tinkerpop/gremlin/wiki
Gremlin!
It is specifically designed to work with graph databases such as Neo4j. You can also easily download GMongo and connect to MongoDB.
Check out this article that describes how to interact with polyglot data in Gremlin:
http://thinkaurelius.com/2013/02/04/polyglot-persistence-and-query-with-gremlin/
I found two ways to integrate MongoDB and Neo4j. The first one, suggested by ryan1234, uses Gremlin together with GMongo. According to this excellent blog, the steps are as follows:
1- Download GMongo and the MongoDB Java driver
2- Copy the two jar files into the neo4j/lib directory
3- Here is an example. Suppose we have this collection (called follows) in MongoDB:
{ "_id" : ObjectId("4ff74c4ae4b01be7d54cb2d3"), "followed" : "1", "followedBy" : "3", "createdAt" : ISODate("2013-01-01T20:36:26.804Z") }
{ "_id" : ObjectId("4ff74c58e4b01be7d54cb2d4"), "followed" : "2", "followedBy" : "3", "createdAt" : ISODate("2013-01-15T20:36:40.211Z") }
{ "_id" : ObjectId("4ff74d13e4b01be7d54cb2dd"), "followed" : "1", "followedBy" : "2", "createdAt" : ISODate("2013-01-07T20:39:47.283Z") }
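From the Gremlin shell in Neo4j, run the following commands:
import com.gmongo.GMongo
// Connect to the local MongoDB instance and select the "local" database
mongo = new GMongo()
db = mongo.getDB("local")
// Sanity check: read one document from the follows collection
db.follows.findOne().followed
// Collect the distinct user ids that appear on either side of a follow
x=[] as Set; db.follows.find().each{x.add(it.followed); x.add(it.followedBy)}
// Add one vertex per user id (g is the graph handle of the Gremlin shell)
x.each{g.addVertex(it)}
// Add one 'follows' edge per document, with the creation time in milliseconds as a property
db.follows.find().each{g.addEdge(g.v(it.followedBy),g.v(it.followed),'follows',[followsTime:it.createdAt.getTime()])}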
And that is it: we have created the equivalent graph in Neo4j.
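To make the two passes in the Gremlin commands above easier to follow, here is a minimal Python sketch of the same logic: first collect the distinct ids as vertices, then emit one directed edge per document. It runs on plain dicts that mirror the three sample documents (timestamps written as ISO strings) and does not talk to MongoDB or Neo4j. The function name `follows_to_graph` and the tuple layout of an edge are illustrative assumptions, not part of GMongo or Gremlin.

```python
from datetime import datetime, timedelta, timezone

# Plain-dict stand-ins for the three sample documents in the follows collection.
follows = [
    {"followed": "1", "followedBy": "3", "createdAt": "2013-01-01T20:36:26.804+00:00"},
    {"followed": "2", "followedBy": "3", "createdAt": "2013-01-15T20:36:40.211+00:00"},
    {"followed": "1", "followedBy": "2", "createdAt": "2013-01-07T20:39:47.283+00:00"},
]

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def follows_to_graph(docs):
    """Return (vertices, edges) for a list of follow documents (hypothetical helper)."""
    # Pass 1: distinct user ids, in order of first appearance.
    vertices = []
    for doc in docs:
        for user_id in (doc["followed"], doc["followedBy"]):
            if user_id not in vertices:
                vertices.append(user_id)

    # Pass 2: one directed edge follower -> followed per document,
    # carrying the creation time as milliseconds since the epoch.
    edges = []
    for doc in docs:
        created = datetime.fromisoformat(doc["createdAt"])
        follows_time = (created - EPOCH) // timedelta(milliseconds=1)
        edges.append(
            (doc["followedBy"], doc["followed"], "follows", {"followsTime": follows_time})
        )
    return vertices, edges


vertices, edges = follows_to_graph(follows)
print(vertices)
for edge in edges:
    print(edge)
```

Each edge tuple is (source, target, label, properties), which corresponds to the arguments passed to g.addEdge in the shell session.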
There is another solution if we want to use R. The following R code gets the data from MongoDB:
library(RMongo)
library('bitops')
library('RCurl')
library('RJSONIO')
mg <- mongoDbConnect("local", "127.0.0.1", 27017)
mongoData <- dbGetQuery(mg, 'follows',"{}")
The result looks like this:
  followed followedBy                    createdAt
1        1          3 Tue Jan 01 15:36:26 EST 2013
2        2          3 Tue Jan 15 15:36:40 EST 2013
3        1          2 Mon Jan 07 15:39:47 EST 2013
The following R code connects to Neo4j and creates the graph. It is not efficient, but it works.
# Send a Cypher query to the Neo4j REST endpoint and return the result as a data frame
query <- function(querystring) {
  h = basicTextGatherer()
  curlPerform(url="http://localhost:7474/db/data/ext/CypherPlugin/graphdb/execute_query",
    postfields=paste('query',curlEscape(querystring), sep='='),
    writefunction = h$update,
    verbose = TRUE
  )
  # Parse the JSON response: one row per result record, named by the returned columns
  result <- fromJSON(h$value())
  data <- data.frame(t(sapply(result$data, unlist)))
  names(data) <- result$columns
  data
}
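# Build one identifier per user: _1, _2, _3
nodes<-unique(c(mongoData$followed,mongoData$followedBy))
nodes=paste("_",nodes,sep="")
# Wrap each identifier in parentheses for the create clause
nodes<-paste(paste("(",nodes,collapse="),"),")")
# One follower-to-followed pattern per row, selecting the columns by name rather than by position
edges<-apply(mongoData[,c("followedBy","followed")],1,function(x) paste("_",x,sep="",collapse="-[:follows]->"))
edges<-paste(edges,collapse=",")
# Combine nodes and edges into a single Cypher create statement and run it
cmd<-paste(nodes,edges,sep=",")
cmd=paste("create",cmd)
query(cmd)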
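To see what statement the string manipulation above ends up sending, here is a small Python sketch that assembles the same kind of Cypher text from the three sample rows. It only builds the string and does not contact Neo4j. The function name `build_create_statement` is hypothetical, and the output matches the R version only up to whitespace. It keeps the bare `_3-[:follows]->_1` pattern used in the R code; as far as I know, newer Neo4j releases expect node identifiers in parentheses, so treat the syntax as tied to the older server this answer targets.

```python
def build_create_statement(rows):
    """Build a single Cypher create statement from (followed, followed_by) pairs.

    Hypothetical helper that mirrors the R string manipulation above.
    """
    # Distinct ids, all followed values first and then followedBy,
    # like unique(c(followed, followedBy)) in R.
    ids = list(dict.fromkeys([r[0] for r in rows] + [r[1] for r in rows]))

    # Node part: (_1),(_2),(_3)
    nodes = ",".join("(_%s)" % node_id for node_id in ids)

    # Edge part: follower first, then followed, e.g. _3-[:follows]->_1
    edges = ",".join(
        "_%s-[:follows]->_%s" % (followed_by, followed)
        for followed, followed_by in rows
    )
    return "create " + nodes + "," + edges


# (followed, followedBy) pairs taken from the sample collection.
rows = [("1", "3"), ("2", "3"), ("1", "2")]
cmd = build_create_statement(rows)
print(cmd)
# prints: create (_1),(_2),(_3),_3-[:follows]->_1,_3-[:follows]->_2,_2-[:follows]->_1
```

Sending everything as one statement keeps the node identifiers in scope for the relationship patterns, which is why nodes and edges are concatenated rather than created in separate queries.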
To use MongoDB and Neo4j together, there is now the Neo4j Doc Manager project, which automatically syncs data from MongoDB to Neo4j, converting documents into a property graph structure.