Cassandra timeout during read query at consistency ONE

借酒劲吻你 2021-01-17 02:12

I have a problem with the Cassandra database and hope somebody can help me. I have a table "log", into which I have inserted about 10000 rows. Everything works fine. I can

2 answers
  •  悲哀的现实
    2021-01-17 03:13

    count() is a very costly operation: Cassandra has to scan every row on every node just to give you the count. With a small number of rows it works, but on larger data sets you should use another approach to avoid the timeout.

    • First of all, retrieve the rows themselves and count them on the client side, and forget about count(*).
    • Run several (dozens, hundreds?) queries, each filtered by partition key and clustering key, and sum the number of rows returned by each query.
    • It helps to understand what partition and clustering keys are. In your case day is the partition key, and the composite clustering key consists of two columns: date and ip.
    • This is most likely impossible to do with the cqlsh command-line client, so you should write a script yourself. Official drivers for popular programming languages: http://docs.datastax.com/en/developer/driver-matrix/doc/common/driverMatrix.html

    Example of one such query:

    select day, date, ip, iid, request, src, tid, txt from test.log where day='Saturday' and date='2017-08-12 00:00:00' and ip='127.0.0.1'
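The approach above (one query per key combination, with the counts summed on the client) could be scripted roughly as follows. This is only a sketch: it assumes the DataStax Python driver (`cassandra-driver`), a node reachable at 127.0.0.1, and that you already know which (day, date, ip) combinations exist. The helper names `count_rows` and `connect` are made up for illustration.

```python
# Minimal sketch: count rows by summing many small, key-restricted queries
# instead of running a single count(*) over the whole table.

# Same filter as the example query above; only one column is selected
# because we only need to count the rows, not read them.
QUERY = "SELECT iid FROM test.log WHERE day = %s AND date = %s AND ip = %s"


def count_rows(session, key_combinations, query=QUERY):
    """Sum the number of rows returned for each (day, date, ip) tuple.

    `session` is anything with an execute(query, params) method that
    returns an iterable of rows, e.g. a cassandra-driver Session.
    """
    total = 0
    for day, date, ip in key_combinations:
        rows = session.execute(query, (day, date, ip))
        # Iterating the result fetches further pages as needed.
        total += sum(1 for _ in rows)
    return total


def connect(hosts=("127.0.0.1",)):
    """Open a session (assumption: cassandra-driver is installed)."""
    from cassandra.cluster import Cluster  # pip install cassandra-driver
    return Cluster(list(hosts)).connect()


if __name__ == "__main__":
    # Hypothetical key values; replace them with the ones in your data.
    keys = [("Saturday", "2017-08-12 00:00:00", "127.0.0.1")]
    print(count_rows(connect(), keys))
```

Each query touches a single partition, so it stays cheap and is unlikely to hit the read timeout. The price is that the script has to know, or be able to enumerate, the key values to iterate over.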

    Remarks:

    • If you only need the count and nothing more, it probably makes sense to look for a tool such as https://github.com/brianmhess/cassandra-count

    • If Cassandra refuses to run your query without ALLOW FILTERING, that means the query is not efficient: https://stackoverflow.com/a/38350839/2900229
