How to decide which NoSQL technology to use?

陌清茗 2021-01-30 02:55

What are the pros and cons of MongoDB (document-based), HBase (column-based) and Neo4j (graph-based)?

I'm particularly interested to know some of the typical use cases

6 answers
  • 2021-01-30 03:07

    Check out this at-a-glance comparison of NoSQL databases:

    http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis

  • 2021-01-30 03:08

    MongoDB

    Scalability: Highly available and consistent, but poor at relations and at many distributed writes. Its primary benefit is storing and indexing schemaless documents. Document size is capped at 16 MB (4 MB in early releases), and indexing only makes sense for limited depth. See http://www.paperplanes.de/2010/2/25/notes_on_mongodb.html

    Best suited for: Tree structures with limited depth

    Use Cases: Diverse Type Hierarchies, Biological Systematics, Library Catalogs
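    To make "schemaless documents" and "limited depth" concrete, here is a minimal plain-Python sketch. It does not use MongoDB or any driver; the `catalog` data and the `get_path` and `find` helpers are hypothetical names invented for illustration. It only mimics the idea of querying nested documents of differing shapes by a dotted path.

```python
from typing import Any

# Two "documents" in one collection with different shapes (no fixed schema).
# The data is invented for illustration.
catalog = [
    {"_id": 1, "title": "Dune",
     "classification": {"genre": "fiction", "sub": {"name": "sci-fi"}}},
    {"_id": 2, "title": "Atlas of Birds",
     "classification": {"genre": "reference"}, "pages": 412},
]


def get_path(doc: dict, path: str, default: Any = None) -> Any:
    """Follow a dotted path such as 'classification.sub.name' into nested dicts."""
    current: Any = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default  # the field is simply absent in this document
        current = current[key]
    return current


def find(docs: list, path: str, value: Any) -> list:
    """Return every document whose value at `path` equals `value`."""
    return [d for d in docs if get_path(d, path) == value]


if __name__ == "__main__":
    print(find(catalog, "classification.genre", "reference"))
```

    The deeper the path, the more work each lookup does, which is one way to read the "limited depth" advice above.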

    Neo4j

    Scalability: Highly available but not distributed. Powerful traversal framework for high-speed traversals in the node space. Limited to graphs around several billion nodes/relationships. See http://highscalability.com/neo4j-graph-database-kicks-buttox

    Best suited for: Deep graphs with unlimited depth and cyclical, weighted connections

    Use Cases: Social Networks, Topological analysis, Semantic Web Data, Inferencing
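    As a rough illustration of traversing a cyclic, weighted graph, the sketch below runs Dijkstra's algorithm over a small in-memory adjacency map. This is plain Python, not Neo4j's traversal framework; the `graph` data and the `cheapest_paths` name are assumptions made up for the example.

```python
import heapq

# Hypothetical weighted, cyclic graph: node -> {neighbour: edge weight}.
graph = {
    "alice": {"bob": 1, "carol": 4},
    "bob": {"carol": 1, "alice": 1},
    "carol": {"dave": 2, "alice": 4},
    "dave": {"bob": 5},
}


def cheapest_paths(graph: dict, start: str) -> dict:
    """Dijkstra's algorithm: lowest total weight from `start` to each reachable node."""
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a cheaper route was already found
        for neighbour, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return dist


if __name__ == "__main__":
    print(cheapest_paths(graph, "alice"))
```

    Cycles are harmless here because a node is only revisited when a strictly cheaper route to it is found.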

    HBase

    Scalability: Reliable, consistent storage in the petabytes and beyond. Supports very large numbers of objects with a limited set of sparse attributes. Works in tandem with Hadoop for large data processing jobs. http://www.ibm.com/developerworks/opensource/library/os-hbase/index.html

    Best suited for: directed, acyclic graphs

    Use Cases: Log analysis, Semantic Web Data, Machine Learning
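    The phrase "a limited set of sparse attributes" can be pictured as a map of row keys to only the columns that actually hold a value. The following is a minimal sketch in plain Python, not the HBase client API; `SparseTable` and its methods are hypothetical names.

```python
from collections import defaultdict


class SparseTable:
    """Toy model of a sparse, column-oriented table: absent cells cost nothing."""

    def __init__(self):
        # row key -> {column name -> value}; only populated cells are stored
        self._rows = defaultdict(dict)

    def put(self, row_key: str, column: str, value) -> None:
        self._rows[row_key][column] = value

    def get(self, row_key: str, column: str, default=None):
        # .get() on the outer mapping avoids creating an empty row on reads
        return self._rows.get(row_key, {}).get(column, default)

    def stored_cells(self) -> int:
        return sum(len(columns) for columns in self._rows.values())


if __name__ == "__main__":
    table = SparseTable()
    table.put("log#0001", "info:status", 200)
    table.put("log#0001", "info:agent", "curl")
    table.put("log#0002", "error:trace", "timeout")
    print(table.stored_cells())
```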

  • 2021-01-30 03:14

    You could also evaluate a multi-model DBMS, the second generation of NoSQL products. With a multi-model database you avoid the compromises of committing to a single data model, because it supports more than one.

    The first multi-model NoSQL is OrientDB.

  • 2021-01-30 03:19

    There is a pretty decent article on MongoDB and NoRM (.NET extensions for MongoDB) here: http://lukencode.com/2010/07/09/getting-started-with-mongodb-and-norm/

  • 2021-01-30 03:21

    MongoDB:

    MongoDB is a document database rather than a relational one. Each document stores semi-structured data, much like a JSON object (schema-free).

    Key features:

    1. Schema can change over evolution of application
    2. Full indexing
    3. Load balancing & Data sharding
    4. Data replication
    5. Favors consistency and partition tolerance (CP) in terms of the CAP theorem (Consistency, Availability, Partition tolerance)

    When to use:

    1. Real time analytics
    2. High speed logging
    3. Semi structured data management

    When not to use:

    1. Highly transactional applications with strong ACID requirements (Atomicity, Consistency, Isolation and Durability). An RDBMS is preferred in this use case.
    2. Operating on data sets involving relations - foreign keys etc

    HBase:

    HBase is an open-source, non-relational, distributed column-family database.

    Key features:

    1. It provides a fault-tolerant way of storing large quantities of sparse data (small amounts of information caught within a large collection of empty or unimportant data, such as finding the 50 largest items in a group of 2 billion records, or finding the non-zero items representing less than 0.1% of a huge collection)
    2. Supports variable schema where each row is different
    3. Can serve as the input and output for MapReduce job
    4. Compression, in-memory operation, and Bloom filters on a per-column basis (a Bloom filter is a data structure designed to tell you, rapidly and memory-efficiently, whether an element is present in a set)
    5. Achieves CP in CAP terms
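    Since the list above mentions Bloom filters, here is a minimal sketch of the data structure in plain Python. It is not how HBase implements them; the `BloomFilter` class, its default sizes, and the SHA-256-based hashing are assumptions chosen for readability. The key property is that a "no" answer is always correct, while a "yes" answer may occasionally be a false positive.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple; a real filter packs bits.
        self.bits = bytearray(size_bits)

    def _positions(self, item: str):
        # Derive several positions by salting one hash function with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(item))


if __name__ == "__main__":
    seen = BloomFilter()
    seen.add("row-0001")
    print(seen.might_contain("row-0001"))
```

    A store can consult such a filter before reading a file from disk and skip the read entirely when the answer is "definitely absent".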

    When to use HBase:

    1. If you’re loading data by key, searching data by key (or range), serving data by key, querying data by key
    2. Storing data by row that doesn’t conform well to a schema (variable schema)
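    Access "by key (or range)" works well when row keys are kept in sorted order, so a range scan becomes two binary searches plus a slice. The sketch below shows that idea in plain Python with the standard `bisect` module. `SortedKeyStore` and the `user42#date` key layout are hypothetical, not HBase's API or a recommended schema.

```python
import bisect


class SortedKeyStore:
    """Toy key-ordered store supporting point lookups and range scans."""

    def __init__(self):
        self._keys = []    # row keys, kept sorted
        self._values = {}  # row key -> value

    def put(self, key: str, value) -> None:
        if key not in self._values:
            bisect.insort(self._keys, key)  # insert while preserving order
        self._values[key] = value

    def get(self, key: str):
        return self._values.get(key)

    def scan(self, start: str, stop: str) -> list:
        """Return (key, value) pairs with start <= key < stop, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]


if __name__ == "__main__":
    store = SortedKeyStore()
    store.put("user42#2021-01-30", "login")
    store.put("user42#2021-01-29", "signup")
    store.put("user07#2021-01-30", "login")
    # '$' is the character right after '#', so this covers every "user42#..." key.
    print(store.scan("user42#", "user42$"))
```

    Queries that cannot be phrased as a key or key-range lookup fall back to scanning everything, which is why full table scans appear in the "when not to use" list.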

    When not to use HBase:

    1. For relational analytics
    2. Full table scans
    3. Data to be aggregated, analyzed by rows instead of columns

    Neo4j:

    Neo4j is a graph database that uses the property graph data model: data is stored as a graph of nodes and relationships, both of which can carry properties.

    Key features:

    1. Supports full ACID (Atomicity, Consistency, Isolation and Durability) rules
    2. Supports indexes by using Apache Lucene
    3. Schema free, bottom-up data model design
    4. High scalability has been achieved due to compact storage and memory caching available for graphs

    When to use:

    1. Master data management
    2. Network and IT Operations
    3. Real time recommendations
    4. Fraud detection
    5. Social networks (like Facebook)
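    The property graph model and the recommendation use case can be sketched together in plain Python. This is not Neo4j or Cypher; `Node`, `Relationship`, `PropertyGraph`, and the `FRIEND` relationship type are hypothetical names, and relationships are treated as undirected for simplicity. The recommendation is a basic friends-of-friends lookup.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: str
    end: str
    rel_type: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Toy property graph: nodes and relationships both carry properties."""

    def __init__(self):
        self.nodes = {}
        self.relationships = []

    def add_node(self, node_id: str, labels=(), **properties) -> None:
        self.nodes[node_id] = Node(node_id, set(labels), properties)

    def relate(self, start: str, end: str, rel_type: str, **properties) -> None:
        self.relationships.append(Relationship(start, end, rel_type, properties))

    def neighbours(self, node_id: str, rel_type: str) -> set:
        # Relationships are followed in both directions in this sketch.
        found = set()
        for rel in self.relationships:
            if rel.rel_type != rel_type:
                continue
            if rel.start == node_id:
                found.add(rel.end)
            elif rel.end == node_id:
                found.add(rel.start)
        return found

    def recommend_friends(self, node_id: str) -> list:
        """Friends of friends who are not already friends (and not the user)."""
        friends = self.neighbours(node_id, "FRIEND")
        candidates = set()
        for friend in friends:
            candidates |= self.neighbours(friend, "FRIEND")
        return sorted(candidates - friends - {node_id})


if __name__ == "__main__":
    g = PropertyGraph()
    for name in ("alice", "bob", "carol", "dave", "erin"):
        g.add_node(name, labels=("Person",), name=name.title())
    g.relate("alice", "bob", "FRIEND", since=2019)
    g.relate("bob", "carol", "FRIEND", since=2020)
    g.relate("carol", "dave", "FRIEND", since=2021)
    g.relate("alice", "erin", "FRIEND", since=2018)
    g.relate("erin", "carol", "FRIEND", since=2020)
    print(g.recommend_friends("alice"))
```

    This toy version scans every relationship per lookup; a graph database earns its keep by making that neighbour step cheap regardless of total graph size.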

    When not to use:

    1. Bulk queries/Scans
    2. If your application requires Partitioning & Sharding of data

    Have a look at the comparison of various NoSQL technologies in this article

    Sources:

    Wiki, SlideShare, Cloudera, Tutorials Point, Neo4j

  • 2021-01-30 03:28

    I know this might seem like an odd place to point to, but Heroku has recently gone nuts with its NoSQL offerings and has an OK overview of many of the current projects. It is in no way a Slideshare press, but it will help you start the comparison process:

    http://blog.heroku.com/archives/2010/7/20/nosql/?utm_medium=email&utm_source=EmailBlast&utm_content=619506254&utm_campaign=HerokuSeptemberNewsletter-VersionB&utm_term=NoSQLHerokuandYou
