What's currently the best choice to persist graph-like structures? Graph databases (e.g. Neo4j) or RDF triple stores (e.g. Virtuoso)?
For example, we have the following use case:
- a weakly connected graph (similar to the citation graph of scholarly papers in a collection) with nearly 10M nodes;
- quite rare updates;
- critical operations: retrieving particular sub-graphs, updating nodes in a given sub-graph, re-computing link analysis measures (e.g. HITS or PageRank) after updating some nodes.
Providing a standard API that lets third-party applications query the data (à la Facebook's or Twitter's) is desired as well.
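To make the "re-computing link analysis measures" step concrete, here is a minimal sketch of PageRank by power iteration over an in-memory edge list. It is store-agnostic and only illustrates the computation that would run after a sub-graph has been retrieved; the function name, the damping factor of 0.85, the fixed iteration count, and the tiny `citations` graph are all assumptions for illustration, not anything specific to Neo4j or Virtuoso.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly,
        # so the scores keep summing to 1.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n
                    for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical citation graph: papers a and b cite c, and c cites d.
citations = [("paper_a", "paper_c"), ("paper_b", "paper_c"), ("paper_c", "paper_d")]
scores = pagerank(citations)
```

For a graph with around 10M nodes the same loop would typically run over a sparse matrix or inside the database itself, but the update rule stays the same.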
With Virtuoso you have the following working for you:
-- SPARQL, SQL, SPASQL (SPARQL inside SQL), and SQL inside SPARQL support (e.g., for dealing with N-ary relations via magic/function predicates and properties).
-- Works as a compact engine (e.g., as used by the KDE desktop) or as a massive DBMS, as demonstrated by the live LOD Cloud Cache with 17+ billion triples or the smaller DBpedia live instance.
-- Includes full-text indexing and text patterns in SPARQL (via bif:contains); it also includes XPath/XQuery (via xcontains).
-- ACID or non-ACID mode; likewise schema-last operation when used as a property graph store.
-- Via transformation middleware, it can pull data from 80+ data sources (including REST APIs, SOAP services, hypermedia resources, and ODBC- or JDBC-accessible relational data sources) and transform it into transient or persistent Linked Data graphs.
-- Linked Data publishing is automatic, i.e., once a DBMS record is created you get built-in Linked Data pages that act as views into the DBMS. No messing around with URL-rewrite rules, 303 redirects, or anything like that. InterWeb-scale super keys just work!
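As a rough illustration of the full-text point above, this sketch builds a SPARQL query string that uses Virtuoso's `bif:contains` predicate. The helper name `build_fulltext_query` and the generic `?s ?p ?o` pattern are assumptions for illustration; it only assembles the query text, and sending it to an actual endpoint (which needs the full-text index enabled) is left out.

```python
def build_fulltext_query(term, limit=10):
    """Return a SPARQL query matching literals that contain `term`.

    Hypothetical helper: it only builds the query string and does not
    talk to any server.
    """
    # Keep the sketch simple: refuse characters that would need escaping.
    if not term or any(ch in term for ch in "\"'\\"):
        raise ValueError("term must be non-empty and free of quotes or backslashes")
    return "\n".join([
        "SELECT ?s ?o",
        "WHERE {",
        "  ?s ?p ?o .",
        # The inner single quotes mark the term as a phrase for bif:contains.
        f"  ?o bif:contains \"'{term}'\" .",
        "}",
        f"LIMIT {int(limit)}",
    ])


query = build_fulltext_query("PageRank", limit=5)
```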
That's it for now :-)
For horizontal scale (thus small to medium-sized databases), graph databases like Neo4j currently give better performance for graph traversals, though triple stores are catching up. The big advantage of a triple store over a graph database is that the data dump formats and the query language are standardized, which makes it a lot easier to move to another product and avoid vendor lock-in.
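To show what a standardized dump looks like in practice, here is a small sketch that writes a citation edge list as N-Triples, a line-based RDF format that triple stores can generally import. The `to_ntriples` helper, the `http://example.org/` base IRI, and the `cites` predicate are made up for illustration and assume node names that are already safe to put in an IRI.

```python
def to_ntriples(edges,
                base="http://example.org/paper/",
                predicate="http://example.org/vocab/cites"):
    """Serialize (source, target) edges as N-Triples, one triple per line.

    The base IRI and predicate are placeholders, not a real vocabulary.
    """
    lines = [f"<{base}{src}> <{predicate}> <{base}{dst}> ."
             for src, dst in edges]
    return "\n".join(lines)


dump = to_ntriples([("a", "b"), ("b", "c")])
```

Each output line is a complete subject, predicate, object statement, which is what makes such a dump easy to move from one product to another.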
Source: https://stackoverflow.com/questions/4974243/graph-databases-vs-triple-stores