What are the pros and cons of MongoDB (document-based), HBase (column-based) and Neo4j (object graph)?
I'm particularly interested to know some of the typical use cas
Check out this at-a-glance comparison of NoSQL databases:
http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
MongoDB
Scalability: Highly available and consistent, but sucks at relations and at many distributed writes. Its primary benefit is storing and indexing schemaless documents. Document size is capped at 4 MB (raised to 16 MB in later releases), and indexing only makes sense for limited depth. See http://www.paperplanes.de/2010/2/25/notes_on_mongodb.html
Best suited for: Tree structures with limited depth
Use Cases: Diverse Type Hierarchies, Biological Systematics, Library Catalogs
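To make "tree structures with limited depth" concrete, here is a minimal sketch in plain Python. It does not use MongoDB or any driver: a nested dict stands in for a document, and the record contents and the helper names `get_path` and `depth` are made up for illustration. The dotted lookup only imitates the general idea of addressing nested fields by path.

```python
from typing import Any

# Hypothetical catalog record: a shallow tree stored as a single document.
book = {
    "title": "Example Title",
    "author": {"name": "Example Author"},
    "classification": {"genre": "fiction", "shelf": {"floor": 2, "row": "F"}},
    "tags": ["classic", "paperback"],
}


def get_path(doc: dict, dotted: str, default: Any = None) -> Any:
    """Follow a dotted path such as 'author.name' through nested dicts."""
    current: Any = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return default  # the path does not exist in this document
        current = current[key]
    return current


def depth(value: Any) -> int:
    """Nesting depth of a document; scalars, lists and empty dicts count as 0."""
    if not isinstance(value, dict) or not value:
        return 0
    return 1 + max(depth(v) for v in value.values())
```

Records with different fields can sit side by side without a shared schema, and as long as `depth` stays small every field remains reachable by a short path, which is the shape the answer describes as a good fit.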
Neo4j
Scalability: Highly available but not distributed. Powerful traversal framework for high-speed traversals in the node space. Limited to graphs of around several billion nodes/relationships. See http://highscalability.com/neo4j-graph-database-kicks-buttox
Best suited for: Deep graphs with unlimited depth and cyclical, weighted connections
Use Cases: Social Networks, Topological analysis, Semantic Web Data, Inferencing
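As a rough illustration of what a traversal over a cyclic, weighted graph involves, here is a minimal sketch in plain Python. It is not Neo4j's traversal framework or API: the graph is an ordinary dict, the node names and weights are invented, and Dijkstra's algorithm is used simply as a well-known example of a weighted traversal that has to cope with cycles.

```python
import heapq

# Hypothetical social graph: node -> {neighbor: weight}.
# It contains a cycle: alice -> bob -> carol -> alice.
graph = {
    "alice": {"bob": 1.0, "dave": 5.0},
    "bob": {"carol": 2.0},
    "carol": {"alice": 1.0, "dave": 1.0},
    "dave": {},
}


def cheapest_paths(graph, start):
    """Dijkstra's algorithm: lowest total weight from start to each reachable node."""
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return best
```

Here the direct edge from alice to dave costs 5.0, but the longer route through bob and carol costs 4.0. Finding that requires following relationships several hops deep, which is the kind of query a graph store is built around and a document or column store is not.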
HBase
Scalability: Reliable, consistent storage in the petabytes and beyond. Supports very large numbers of objects with a limited set of sparse attributes. Works in tandem with Hadoop for large data processing jobs. See http://www.ibm.com/developerworks/opensource/library/os-hbase/index.html
Best suited for: directed, acyclic graphs
Use Cases: Log analysis, Semantic Web Data, Machine Learning
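The "limited set of sparse attributes" point can be pictured with a small sketch in plain Python. This is a toy model, not the HBase client API: the `table` dict, the `put` and `scan` helpers, and the log rows are all hypothetical. It assumes only two general properties of HBase, that rows are kept sorted by row key and that columns are named `family:qualifier`.

```python
from bisect import bisect_left

# Hypothetical log table: row key -> {"family:qualifier": value}.
# A row stores only the columns it actually has (sparse attributes).
table = {}


def put(row_key, column, value):
    """Store one cell; absent columns take no space in the row."""
    table.setdefault(row_key, {})[column] = value


def scan(start, stop):
    """Yield (row_key, columns) for start <= row_key < stop, in key order."""
    keys = sorted(table)
    i = bisect_left(keys, start)
    while i < len(keys) and keys[i] < stop:
        yield keys[i], table[keys[i]]
        i += 1


put("web01#2010-07-20T10:00", "req:path", "/index")
put("web01#2010-07-20T10:00", "req:status", "200")
put("web01#2010-07-20T10:05", "req:path", "/login")
put("web01#2010-07-20T10:05", "err:message", "timeout")
put("web02#2010-07-20T10:01", "req:path", "/index")

# All rows for host web01, found by key range rather than by filtering every row.
web01_rows = list(scan("web01#", "web01#~"))
```

Choosing a row key such as host plus timestamp is what turns "all log lines for one host" into a contiguous range, which suits the log-analysis use case listed above.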
You could also evaluate a multi-model DBMS, the second generation of NoSQL products. With a multi-model database you don't have to accept the compromises of choosing just one model, because it supports more than one.
The first multi-model NoSQL database is OrientDB.
Pretty decent article here on MongoDB and NoRM (.NET extensions for MongoDB): http://lukencode.com/2010/07/09/getting-started-with-mongodb-and-norm/
MongoDB:
MongoDB is a document database, unlike a relational database. Each document stores semi-structured data, such as a JSON object, and is schema-free.
Key features:
When to use:
When not to use:
HBase:
HBase is an open source, non-relational, distributed column-family database.
Key features:
When to use HBase:
When not to use HBase:
Neo4j:
Neo4j is a graph database that uses the property graph data model: data is stored as a graph of nodes and relationships, both of which can carry properties.
Key features:
When to use:
When not to use:
Have a look at the comparison of various NoSQL technologies in this article.
Sources:
Wiki, SlideShare, Cloudera, Tutorials Point, Neo4j
I know this might seem like an odd place to point to, but Heroku has recently gone nuts with their NoSQL offerings and has an OK overview of many of the current projects. It is in no way a SlideShare press, but it will help you start the comparison process:
http://blog.heroku.com/archives/2010/7/20/nosql/?utm_medium=email&utm_source=EmailBlast&utm_content=619506254&utm_campaign=HerokuSeptemberNewsletter-VersionB&utm_term=NoSQLHerokuandYou