Cassandra batch query vs single insert performance

Posted by 做~自己de王妃 on 2020-03-13 07:46:48

Question


I use the Cassandra Java driver.

I receive 150k requests per second, which I insert into 8 tables that have different partition keys.

My question is which approach is better:

  • batch inserting to these tables
  • inserting one by one.

I am asking because, given my request rate (150k per second), batching sounds like the better option, but since all the tables have different partition keys, a batch appears expensive.


Answer 1:


Please see my answer at the link below:

Cassandra batch query performance on tables having different partition keys

Batches are not for improving performance. They are used for ensuring atomicity and isolation.

Batching can be effective for single partition write operations. But batches are often mistakenly used in an attempt to optimize performance. Depending on the batch operation, the performance may actually worsen.

https://docs.datastax.com/en/cql/3.3/cql/cql_using/useBatch.html

If you do not need data consistency among those tables, use single inserts. Single requests are distributed properly among the nodes (depending on the load balancing policy). If you are concerned about request handling and use a batch anyway, the batch will put a lot of extra work on the coordinator node, which I guess will not be efficient :)
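As a rough illustration of the single-insert approach, the sketch below sends each statement individually while capping how many are in flight at once. It uses only the JDK: the `executeAsync` parameter is a hypothetical stand-in for the driver's asynchronous execute call (adapted to return a `CompletableFuture<Void>`), and `ThrottledSingleWrites`, `writeAll`, and `maxInFlight` are names made up for this example, not part of any driver API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.Semaphore;
import java.util.function.Function;

public class ThrottledSingleWrites {

    /**
     * Sends every statement as its own request, with at most maxInFlight
     * requests outstanding at a time. Returns how many completed successfully.
     *
     * executeAsync is an assumption: a placeholder for whatever asynchronous
     * write call the application wraps around its driver session.
     */
    public static int writeAll(List<String> statements,
                               Function<String, CompletableFuture<Void>> executeAsync,
                               int maxInFlight) {
        Semaphore permits = new Semaphore(maxInFlight);
        List<CompletableFuture<Void>> pending = new ArrayList<>();

        for (String cql : statements) {
            // Block the producer when too many requests are outstanding.
            permits.acquireUninterruptibly();
            CompletableFuture<Void> request;
            try {
                request = executeAsync.apply(cql);
            } catch (RuntimeException e) {
                permits.release();
                throw e;
            }
            // Release the permit whether the write succeeded or failed.
            pending.add(request.whenComplete((result, error) -> permits.release()));
        }

        int succeeded = 0;
        for (CompletableFuture<Void> request : pending) {
            try {
                request.join();
                succeeded++;
            } catch (CompletionException | CancellationException e) {
                // A real application would log or retry here.
            }
        }
        return succeeded;
    }
}
```

Because each statement travels as its own request, the load balancing policy can route each one independently, and no single coordinator has to fan a multi-partition batch out to other nodes.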




Answer 2:


On the contrary, batches have a HUGE impact on performance. As I understand your case, the solution that suits you best is to split the inserts into different lists per partition key and then send each list as a batch statement. You will see a huge impact on performance.
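The per-partition grouping described here could look roughly like the sketch below. It is plain JDK code with no driver calls: `PendingWrite`, `groupByPartition`, and `toUnloggedBatch` are hypothetical names for this example, and grouping by table plus partition key is an assumption about what "per partition key" means when the writes span several tables. The output is CQL text using the standard `BEGIN UNLOGGED BATCH ... APPLY BATCH` syntax.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartitionGrouper {

    /** Hypothetical pending write: target table, partition key value, one CQL statement. */
    public static final class PendingWrite {
        final String table;
        final String partitionKey;
        final String cql;

        public PendingWrite(String table, String partitionKey, String cql) {
            this.table = table;
            this.partitionKey = partitionKey;
            this.cql = cql;
        }
    }

    /** Splits the writes into lists that each target a single partition of a single table. */
    public static Map<String, List<PendingWrite>> groupByPartition(List<PendingWrite> writes) {
        Map<String, List<PendingWrite>> groups = new LinkedHashMap<>();
        for (PendingWrite write : writes) {
            String groupKey = write.table + "|" + write.partitionKey;
            groups.computeIfAbsent(groupKey, k -> new ArrayList<>()).add(write);
        }
        return groups;
    }

    /** Renders one group as an unlogged CQL batch. */
    public static String toUnloggedBatch(List<PendingWrite> group) {
        StringBuilder sb = new StringBuilder("BEGIN UNLOGGED BATCH\n");
        for (PendingWrite write : group) {
            sb.append("  ").append(write.cql).append(";\n");
        }
        sb.append("APPLY BATCH;");
        return sb.toString();
    }
}
```

With 8 tables that have different partition keys, a single incoming request may produce only one write per group, in which case there is little to batch; the grouping pays off when many rows land in the same partition.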



Source: https://stackoverflow.com/questions/42930498/cassandra-batch-query-vs-single-insert-performance