What Keeps Relational Databases From Horizontal Scaling?

最后都变了 - submitted on 2019-12-13 04:05:06

Question


When I researched horizontal scaling for relational databases, I got the impression that the only option that scales writes as well as reads is sharding. Sharding seems to be a manual design process that involves complex, application-specific configuration and is hard to maintain if you ever need to change your sharding structure.

On the other hand, NoSQL seems to support horizontal scaling natively, but it has the drawback of not supporting transactions, ACID guarantees, and so on.

Another concept that seems to have become popular recently is NewSQL. These databases promise to hit the sweet spot by being both ACID compliant and able to scale horizontally, either through automatic sharding or some other innovative architecture.

My question is: if we are using a SAN with our relational database, won't adding more database servers to the cluster and more disks to the SAN achieve horizontal scaling? (Adding disks will increase total disk IOPS and throughput as well as disk space.) What would the bottleneck be, such that we need a NewSQL database to get both ACID and horizontal scaling?


Answer 1:


Horizontal scaling in relational databases is hard to achieve because when tables (or shards of the same table) are spread across different cluster nodes, joins usually become very inefficient. There is also the problem of replication: keeping ACID guarantees while ensuring that all replicas have fresh data. However, there is an RDBMS that scales horizontally: MySQL Cluster. From the docs:

MySQL Cluster automatically shards (partitions) tables across nodes, enabling databases to scale horizontally on low cost..

Auto-Sharding in MySQL Cluster

Unlike other sharded databases, users do not lose the ability to perform JOIN operations, sacrifice ACID-guarantees or referential integrity (Foreign Keys) when performing queries and transactions across shards.

At my company, we have been using MySQL Cluster for quite some time and it works really well (and scales horizontally). There is also Citus (recently released), which is built on top of PostgreSQL, but I haven't tried it personally.
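To make the point about inefficient joins concrete, here is a minimal sketch in Python. It is a toy in-memory model, not how MySQL Cluster or Citus work internally; the shard count, the modulo routing in `shard_for`, and the `users`/`orders` tables are all assumptions made up for illustration. Orders are sharded by order ID rather than user ID, so joining an order to its user often means reaching into another shard.

```python
from collections import defaultdict

NUM_SHARDS = 3  # hypothetical cluster size, chosen only for illustration


def shard_for(key: int) -> int:
    """Route a row to a shard by its sharding key (simple modulo routing)."""
    return key % NUM_SHARDS


# Each shard holds its own slice of both tables.
users = defaultdict(dict)   # shard -> {user_id: name}
orders = defaultdict(list)  # shard -> [(order_id, user_id, amount)]


def insert_user(user_id: int, name: str) -> None:
    users[shard_for(user_id)][user_id] = name


def insert_order(order_id: int, user_id: int, amount: float) -> None:
    # Orders are sharded by order_id, NOT by user_id, so an order may
    # live on a different shard than the user it belongs to.
    orders[shard_for(order_id)].append((order_id, user_id, amount))


def join_orders_with_users():
    """Join orders to users and count lookups that leave the local shard."""
    rows, remote_lookups = [], 0
    for shard, shard_orders in orders.items():
        for order_id, user_id, amount in shard_orders:
            user_shard = shard_for(user_id)
            if user_shard != shard:
                # In a real cluster this would be a network round trip.
                remote_lookups += 1
            rows.append((order_id, users[user_shard][user_id], amount))
    return sorted(rows), remote_lookups
```

In a real system each remote lookup is a network hop, which is why sharded setups usually try to co-locate rows that are joined together (for example, by sharding both tables on the user ID).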




Answer 2:


The answer is the CAP theorem.

You can have at most two of Consistency, Availability, and Partition Tolerance, but in practice it typically boils down to:

(Consistency OR availability) AND Partition Tolerance

Database systems designed with traditional ACID guarantees in mind such as RDBMS choose consistency over availability, whereas systems designed around the BASE philosophy, common in the NoSQL movement for example, choose availability over consistency.[6]

With NoSQL, if a node drops out, the system stays up, but you may not get the latest data. This is of course a huge no-no in, say, banking or billing systems, but in a social media application it is usually of little consequence.
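The trade-off can be sketched with a toy two-replica model in Python. This is only an illustration under stated assumptions: the `Cluster`, `Replica`, and `Unavailable` names are hypothetical, the "partition" is a boolean flag, and no real database behaves this simply. In "CP" mode the cluster refuses requests during a partition; in "AP" mode it keeps answering but the second replica can serve stale data.

```python
class Unavailable(Exception):
    """Raised when a CP-style cluster refuses to serve during a partition."""


class Replica:
    def __init__(self):
        self.value = None


class Cluster:
    """Toy model: two replicas, mode is 'CP' or 'AP' (hypothetical names)."""

    def __init__(self, mode: str):
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False  # True means a and b cannot communicate

    def write(self, value) -> None:
        # All writes arrive at replica a in this sketch.
        if self.partitioned and self.mode == "CP":
            raise Unavailable("cannot replicate to b; refusing the write")
        self.a.value = value
        if not self.partitioned:
            self.b.value = value  # synchronous replication while connected

    def read_from_b(self):
        if self.partitioned and self.mode == "CP":
            raise Unavailable("b cannot confirm it is up to date")
        return self.b.value  # in AP mode this may be stale


# CP: consistent, but unavailable while partitioned.
cp = Cluster("CP")
cp.write("balance=100")
cp.partitioned = True
try:
    cp.write("balance=50")
except Unavailable as exc:
    print("CP refused:", exc)

# AP: stays up, but replica b serves the old value.
ap = Cluster("AP")
ap.write("balance=100")
ap.partitioned = True
ap.write("balance=50")
print("AP read from b:", ap.read_from_b())
```

The stale read in AP mode is exactly the "you may not get the latest data" case described above.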

More examples

  • http://blog.flux7.com/blogs/nosql/cap-theorem-why-does-it-matter
  • https://dzone.com/articles/understanding-the-cap-theorem
  • https://codahale.com/you-cant-sacrifice-partition-tolerance/

From this site

  • CAP theorem - Availability and Partition Tolerance


Source: https://stackoverflow.com/questions/48825977/what-keeps-relational-databases-from-horizontal-scaling
