What makes Cassandra (and NoSQL in general) a better solution than an RDBMS?

醉梦人生 2021-02-02 12:00

Well, NoSQL is a buzzword right now so I've been looking into it. I'm yet to get my head around ColumnFamilies and SuperColumns, etc... But I have been looking at how the data

11 Answers
  • 2021-02-02 12:26

    The advantage of NoSQL is that it's simpler and, if you have your OO blinkers on, it fulfills all your persistence needs.

    The advantage of a SQL-based relational database is that you can easily reuse and extend your data in ways that were not envisaged in the original design. Also, "object" databases tend to perform very badly (if it is possible at all) when you want to do the equivalent of SQL's aggregate queries like COUNT, SUM, AVG.
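
    To make the point about aggregates concrete, here is a minimal sketch using Python's built-in sqlite3 module. The orders data and the function names are invented for illustration. It compares a one-line GROUP BY query with the loop you would have to write by hand over a store of opaque objects.

```python
import sqlite3

# Hypothetical order data, invented for this illustration.
ORDERS = [
    ("alice", 30.0),
    ("bob", 20.0),
    ("alice", 50.0),
]

def aggregate_with_sql(rows):
    """Let the database compute COUNT, SUM and AVG per customer."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT customer, COUNT(*), SUM(amount), AVG(amount) "
        "FROM orders GROUP BY customer ORDER BY customer"
    ).fetchall()
    conn.close()
    return result

def aggregate_in_application(rows):
    """Do the same work by hand, as you would over a store of opaque objects."""
    totals = {}
    for customer, amount in rows:
        count, total = totals.get(customer, (0, 0.0))
        totals[customer] = (count + 1, total + amount)
    return [
        (customer, count, total, total / count)
        for customer, (count, total) in sorted(totals.items())
    ]
```

    Both functions return the same per-customer rows. The difference is that the SQL version states what is wanted, while the hand-written version has to read every record into the application first.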

    Google's Bigtable, which is the biggest OO database anywhere (and probably the biggest database, period), also supports SQL and SQL features like indexing and strong typing.

  • 2021-02-02 12:27

    The main advantage of NoSQL is horizontal scalability and distributed storage. That means you can have a large number of 'cluster nodes' and write to them in parallel. The cluster will ensure changes are propagated to the other cluster nodes eventually (eventual consistency).
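
    As a rough illustration of eventual consistency, the toy model below lets two nodes accept writes independently and then reconciles them. The Node class, the propagate function and the "newest timestamp wins" rule are assumptions made for this sketch, not the internals of any particular product.

```python
class Node:
    """One cluster node holding key -> (timestamp, value)."""

    def __init__(self, name):
        self.name = name
        self.data = {}

    def write(self, key, value, timestamp):
        # Last write wins: keep only the value with the newest timestamp.
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]

def propagate(nodes):
    """One sync round: every node ends up with the newest version of every key."""
    merged = {}
    for node in nodes:
        for key, (timestamp, value) in node.data.items():
            if key not in merged or timestamp > merged[key][0]:
                merged[key] = (timestamp, value)
    for node in nodes:
        for key, (timestamp, value) in merged.items():
            node.write(key, value, timestamp)

a, b, c = Node("a"), Node("b"), Node("c")
a.write("user:1", "alice", timestamp=1)   # accepted by node a only
b.write("user:1", "alicia", timestamp=2)  # accepted by node b only
stale = (a.read("user:1"), b.read("user:1"), c.read("user:1"))
propagate([a, b, c])
converged = (a.read("user:1"), b.read("user:1"), c.read("user:1"))
```

    Before the sync round the three nodes disagree (stale is ("alice", "alicia", None)). Afterwards they all return "alicia". Readers may see old data in between, which is the trade-off that buys parallel writes.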

    NoSQL is not so much about SQL (the term means "not only SQL"). In fact, some NoSQL products do support a subset of SQL. The reason the data format is different (JSON or list of property / value pairs versus tabular data) is: within relational databases, the number of columns (and column names) is defined in a central place, which doesn't work well with horizontal scalability (you would need to stop all cluster nodes for schema changes). Also, joins are not supported as much because that would break horizontal scalability (data from multiple cluster nodes may need to be read, if the data is distributed).
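
    The difference between a centrally defined column list and per-row property/value pairs can be sketched as follows. TABLE_COLUMNS, insert_tabular and insert_schemaless are hypothetical names for this example only.

```python
# Hypothetical central schema for the tabular model.
TABLE_COLUMNS = ("id", "name", "email")

def insert_tabular(table, row):
    """Reject any row whose columns differ from the central schema."""
    if set(row) != set(TABLE_COLUMNS):
        raise ValueError(
            f"columns {sorted(row)} do not match schema {sorted(TABLE_COLUMNS)}"
        )
    table.append(row)

def insert_schemaless(store, key, properties):
    """Accept whatever property/value pairs the row carries."""
    store[key] = dict(properties)

table, store = [], {}
insert_tabular(table, {"id": 1, "name": "Ada", "email": "ada@example.com"})
insert_schemaless(store, "user:1", {"name": "Ada", "email": "ada@example.com"})
# A new "twitter" property needs no coordinated schema change here.
insert_schemaless(store, "user:2", {"name": "Bob", "twitter": "@bob"})
```

    In the tabular model, adding the "twitter" column means changing TABLE_COLUMNS everywhere at once. In the schemaless store each row simply carries its own property names.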

  • 2021-02-02 12:28

    I gave a talk at OSCON about when NoSQL can be the right choice, and some of the different sub-categories to be aware of: http://assets.en.oreilly.com/1/event/45/The%20NoSQL%20Ecosystem%20Presentation.pdf

  • 2021-02-02 12:29

    Cassandra in and of itself is not better than an RDBMS. It is better under some circumstances. An RDBMS is vastly superior for transaction processing, master data management, reference data, data warehousing and (some forms of) BI.

    Use NoSQL if your application requires a flexible schema, variable-length rows, variable types of columns, eventual consistency, horizontal scalability on commodity servers, and high availability achieved by means of a distributed architecture.

    NoSQL does not do joins for two reasons: you have already joined the data before it was loaded into the NoSQL store, so there is no need to; and a distributed join over far-flung servers would be resource intensive. The first reason is simple: you have embedded all the data you need into a single structure. If you do not embed the data and have to link instead, don't expect great performance out of it. Linking is a euphemism for application-provided joining without the benefit of consolidating the data as a join does. Assuming hashing a key is the method of data distribution, different records that have the same hash key would be collocated. So if joining were permitted, the joined data would all be on the same server.
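
    The collocation argument can be sketched in a few lines. This assumes the simplest possible scheme, a hash of the partition key taken modulo the node count. The node_for and store helpers and the "customer:42" key are made up for the example.

```python
import hashlib

def node_for(partition_key, node_count):
    """Pick a node by hashing the partition key (md5 is stable across runs)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % node_count

def store(cluster, partition_key, record):
    """Append a record to the node that owns its partition key."""
    node = cluster[node_for(partition_key, len(cluster))]
    node.setdefault(partition_key, []).append(record)

# Four nodes, each a dict of partition_key -> list of records.
cluster = [{} for _ in range(4)]

# Different kinds of records that share a partition key land on the same node.
store(cluster, "customer:42", {"type": "customer", "name": "Ada"})
store(cluster, "customer:42", {"type": "order", "total": 99.0})

owner = cluster[node_for("customer:42", len(cluster))]
collocated = owner["customer:42"]
```

    Here collocated holds both the customer and the order record, so reading them together touches one node. Records stored under different keys give no such guarantee, which is why linking across keys tends to be slow.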

    It's not just black and white.

  • 2021-02-02 12:31

    As many books about NoSQL mention, it's not about which database is better than the other. It's more about what you need.

    As everyone says in the other answers, many NoSQL databases support horizontal scalability and are focused on high availability, but they are not always the best fit for your needs.

    For example, Cassandra makes it easy to add or remove nodes from a cluster, which is what allows that high scalability. But when you compare Cassandra with MySQL in an environment with just one node (one server) and no distributed architecture, there isn't much difference, since the main advantages of Cassandra are not used.
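
    Cassandra spreads data over a token ring using consistent hashing, which is what keeps adding or removing a node cheap. The sketch below is a simplified model of that idea, not Cassandra's implementation: the Ring class, the md5-based tokens and the 64 virtual nodes per server are illustrative assumptions.

```python
import bisect
import hashlib

def _token(value):
    """Map a string to a position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

class Ring:
    def __init__(self, vnodes=64):
        self.vnodes = vnodes
        self._tokens = []   # sorted ring positions
        self._owners = {}   # ring position -> node name

    def add_node(self, name):
        for i in range(self.vnodes):
            token = _token(f"{name}#{i}")
            self._owners[token] = name
            bisect.insort(self._tokens, token)

    def remove_node(self, name):
        self._tokens = [t for t in self._tokens if self._owners[t] != name]
        self._owners = {t: n for t, n in self._owners.items() if n != name}

    def owner(self, key):
        # First ring position clockwise from the key, wrapping around at the end.
        index = bisect.bisect_right(self._tokens, _token(key)) % len(self._tokens)
        return self._owners[self._tokens[index]]

ring = Ring()
for name in ("node-a", "node-b", "node-c"):
    ring.add_node(name)

keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.owner(k) for k in keys}
ring.add_node("node-d")
after = {k: ring.owner(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

    Every key in moved now belongs to node-d, and no key is shuffled between the three existing nodes. With a plain "hash modulo node count" scheme, most keys would change owner whenever the node count changes.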

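    Now, why should you use SQL? The most common reason is transaction management. Most popular NoSQL databases offer only limited native transaction support; Cassandra, for example, provides lightweight (compare-and-set) transactions scoped to a single partition rather than general multi-row ACID transactions. You can emulate broader transactions in the application, but they are not part of the native functionality as in most SQL databases.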
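
    For comparison, this is roughly what native transaction support buys you in a SQL database. The sketch uses Python's built-in sqlite3 module, and the accounts table and helper names are invented for the example. Either both updates of a transfer are applied or neither is.

```python
import sqlite3

def make_accounts():
    """Create a hypothetical in-memory accounts table with two rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)]
    )
    conn.commit()
    return conn

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))

def transfer(conn, source, target, amount):
    """Move money atomically: both updates apply, or neither does."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, source),
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (source,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, target),
        )
```

    A transfer that would overdraw the source account raises ValueError, and the debit that was already executed is rolled back. Emulating this across several nodes of a distributed store is the part the application would otherwise have to build itself.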

    For Cassandra, there is full, free training at https://academy.datastax.com

    There you will find training not only on installing and configuring Cassandra, but also on using its tools. It even gives you completion certificates.

    DataStax has its own distribution of Cassandra, but it follows all the same guidelines as the Apache project; it offers some extra tools.
