What is the difference between scylla read path and cassandra read path?

后端 未结 5 1047
囚心锁ツ
囚心锁ツ 2021-01-13 12:05

What is the difference between Scylla read path and Cassandra read path? When I stress Cassandra and Scylla then Scylla read performance poor by 5 times than Cassandra using

相关标签:
5条回答
  • 2021-01-13 12:29

    Some other responses focused on write performance, but this isn't what you asked about - you asked about reads.

    Uncached read performance on HDDs is bound to be poor in both Cassandra and Scylla, because reads from disk each requires several seeks on the HDD, and even the best HDD cannot do more than, say, 200 of those seeks per second. Even with a RAID of several of these disks, you will rarely be able to do more than, say, 1000 requests per second. Since a modern multi-core can do orders of magnitude more CPU work than 1000 requests per second, in both Scylla and Cassandra cases, you'll likely see free CPU. So Scylla's main benefit, of using much less CPU per request, will not even matter when the disk is the performance bottleneck. In such cases I would expect Scylla's and Cassandra's performance (I am assuming that you're measuring throughput when you talk about performance?) should be roughly the same.

    If, still, you're seeing better throughput from Cassandra than Scylla, there are several details that may explain why, beyond the general client mis-configuration issues raised in other responses:

    1. If you have low amounts of data, that can fit in memory, Cassandra's caching policy is better for your workload. Cassandra uses the OS's page cache, which reads whole disk pages and may cache multiple items in one read, as well as multiple index entries. While Scylla works differently, and has a row cache - only caching the specific data read. Scylla's caching is better for large volumes of data that do not fit in memory, but much worse when the data can fit in memory, until the entire data set has been cached (after everything is cached, it becomes very efficient again).

    2. On HDDs, the details of compaction are very important for read performance - if in one setup you have more sstables to read, it can increase the number of reads and lower the performance. This can change depending on your compaction configuration, or even randomly (depending on when compaction was run last). You can check if this explains your performance issues by doing a major compaction ("nodetool compact") on both systems and checking the read performance afterwards. You can switch the compaction strategy to LCS to ensure that random-access read performance is better, at the cost of more write work (on HDDs, this can be a worthwhile compromise).

    3. If you are measuring scan performance (reading an entire table) instead of reading individual rows, other issues become relevant: As you may have heard, Scylla subdivides each nodes into shards (each shard is a single CPU). This is fantastic for CPU-bounded work, but could be worse for scanning tables which aren't huge, because each sstable is now smaller and the amount of contiguous data you can read before needing to seek again is lower.

    I don't know which of these differences - or something else - is causing performance of your use-case to be lower in Scylla, but I please keep in mind that whatever you fix, your performance is always going to be bad with HDDs. With SDDs, we've measured in the past more than a million random-access read requests per second on a single node. HDDs cannot come anything close. If you really need optimum performance or performance per dollar, SDDs are really the way to go.

    0 讨论(0)
  • 2021-01-13 12:36

    There can be various reasons why you are not getting the most out of your Scylla Cluster.

    1. Number of concurrent connections from your clients/loaders is not high enough, or you're not using sufficient amount of loaders. In such case, some shards will be doing all the work, while others will be mostly idle. You want to keep your parallelism high.

    2. Scylla likes have a minimum of 2 connections per shard (you can see the number of shards in /etc/scylla.d/cpuset.conf)

    3. What's the size of your dataset? Are you reading a large amount of partitions or just a few? You might be hitting a hot partition situation

    I strongly recommend reading the following docs that will provide you more insights:

    • https://www.scylladb.com/2019/03/27/best-practices-for-scylla-applications/

    • https://docs.scylladb.com/operating-scylla/benchmarking-scylla/

    0 讨论(0)
  • 2021-01-13 12:39

    As a summary I would say Scylladb and cassandra have the same read / write path memtable, commitlog, sstable.

    However implementation is very different: - cassandra rely on OS for low level IO and network (most DBMS does) - scylladb rely on its own lib (seastar) to handle IO and network at a low level independently from OS page cache etc. This is why they can provide feature such as workload scheduling within the same cluster that would be very hard to implement in cassandra.

    0 讨论(0)
  • 2021-01-13 12:49

    Both databases use LSM tree but Scylla has thread-per-core architecture on top plus we use O_Direct while C* uses the page cache. Scylla also has a sophisticated IO scheduler that makes sure not to overload the disk and thus scylla_setup runs a benchmark automatically to tune. Check your output of it in io.conf.

    There are far more things to review, better to send your data to the mailing list. In general, Scylla should perform better in this case as well but your disk is likely to be the bottleneck in both cases.

    0 讨论(0)
  • 2021-01-13 12:52

    @Sateesh, I want to add to the answer by @TomerSan that both Cassandra and ScyllaDB utilize the same disk storage architecture (LSM). That means that they have relatively the same disk access patterns because the algorithms are largely the same. The LSM trees were built with the idea in mind that it is not necessary to do instant in-place updates. It consists of immutable data buckets that are large continuous pieces of data on disk. That means less random IO, more sequential IO for which the HDD works great (not counting utilized parallelism by modern database implementations).

    All the above means that the difference that you see, is not induced by the difference in how those databases use a disk. It must be related to the configuration differences and what happens underneath. Maybe ScyllaDB tries to utilize more parallelism or more aggressively do compaction. It depends.

    In order to be able to say anything specific, please share your tests, envs, and configurations.

    0 讨论(0)
提交回复
热议问题