Cassandra timing out when queried for a key that has over 10,000 rows, even after raising the timeout to 10 seconds

Submitted by 和自甴很熟 on 2019-12-08 09:28:50

Question


I'm using DataStax Community v2.1.2-1 (AMI v2.5) with the preinstalled default settings, and I have this table:

CREATE TABLE notificationstore.note (
  user_id text,
  real_time timestamp,
  insert_time timeuuid,
  read boolean,
  PRIMARY KEY (user_id, real_time, insert_time))
WITH CLUSTERING ORDER BY (real_time DESC, insert_time ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND default_time_to_live = 20160;
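-- (default_time_to_live is given in seconds, so 20160 s is roughly 5.6 hours)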

The other configurations are:

I have 2 nodes on m3.large instances, each with 1 x 32 GB SSD. I'm facing timeouts on this particular table even with the consistency level set to ONE.

  1. I increased the heap space to 3 GB [RAM size is 8 GB].
  2. I increased the read timeout to 10 seconds (see the cassandra.yaml sketch after this list), but the query still times out:
    select count (*) from note where user_id = 'xxx' limit 2; // errors={}, last_host=127.0.0.1.
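
For context, a read timeout like that is normally raised per node in cassandra.yaml; here is a minimal sketch with illustrative values (assuming the stock 2.1 settings, and noting that a node restart is needed for the change to take effect):

# cassandra.yaml -- read timeouts, in milliseconds
read_request_timeout_in_ms: 10000     # single-partition reads (default 5000)
range_request_timeout_in_ms: 10000    # range scans such as count(*) (default 10000)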

I am wondering whether the problem could be with the time to live, or whether there is some other configuration or tuning that matters here.

The data in the database is pretty small.
Also, the problem does not occur right after inserting; it only appears after some time (more than 6 hours).

Thanks.


Answer 1:


[Copying my answer from here because it's the same environment/problem: amazon ec2 - Cassandra Timing out because of TTL expiration.]

You're running into a problem where the number of tombstones (deleted values) scanned by your query passes a threshold, and the query then times out.

You can see this if you turn on tracing and then try your select statement, for example:

cqlsh> tracing on;
cqlsh> select count(*) from test.simple;

 activity                                                                        | timestamp    | source       | source_elapsed
---------------------------------------------------------------------------------+--------------+--------------+----------------
...snip...
 Scanned over 100000 tombstones; query aborted (see tombstone_failure_threshold) | 23:36:59,324 |  172.31.0.85 |         123932
                                                    Scanned 1 rows and matched 1 | 23:36:59,325 |  172.31.0.85 |         124575
                           Timed out; received 0 of 1 responses for range 2 of 4 | 23:37:09,200 | 172.31.13.33 |       10002216

You're kind of running into an anti-pattern for Cassandra where data is stored for just a short time before being deleted. There are a few options for handling this better, including revisiting your data model if needed. Here are some resources:

  • The cassandra.yaml configuration file - See the section on tombstone settings (a short sketch of those settings follows this list)
  • Cassandra anti-patterns: Queues and queue-like datasets
  • About deletes
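
For the first resource, the tombstone thresholds it refers to live in cassandra.yaml. A minimal sketch, assuming the 2.1 defaults:

# cassandra.yaml -- per-query tombstone thresholds
tombstone_warn_threshold: 1000        # log a warning when a read scans this many tombstones
tombstone_failure_threshold: 100000   # abort the read, as in the trace above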

For your sample problem, I tried lowering the gc_grace_seconds setting to 300 (5 minutes). That causes the tombstones to be cleaned up more frequently than the default 10 days, but that may or may not be appropriate for your application. Read up on the implications of deletes and adjust as needed.
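
As a sketch against the note table from the question, that change is a single statement (the default gc_grace_seconds is 864000 seconds, i.e. 10 days):

ALTER TABLE notificationstore.note WITH gc_grace_seconds = 300;  -- 5 minutes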



Source: https://stackoverflow.com/questions/27340812/cassandra-timing-out-when-queried-for-key-that-have-over-10-000-rows-even-after
