问题
I have a table where I am logging user activity performed on my website. My table structure looks like:
CREATE TABLE key_space.log (
id uuid,
time bigint,
ip text,
url text,
user_id int,
PRIMARY KEY (id, time)
) WITH CLUSTERING ORDER BY (time DESC)
Now I want to fetch all the records which came in last 5 minutes.
For doing the same, I am using
select * from key_space.log where
time>current_timestamp - 5 minutes ALLOW FILTERING;
But this query is not returning any result & i am getting timedoutexception error. How to solve this problem? Any help on this would be really appreciated.
回答1:
Your table has id as the partition key and time as the clustering key. In order to run range query on clustering key, you need to specify the partition key also. So the query should be
select * from key_space.log where id="xyz" and time>current_timestamp - 5 minutes ALLOW FILTERING;
This answer describes the different Cassandra keys very well.
回答2:
As with all Cassandra models, you'll need to start by building a table specifically designed to support that query. Even if you could make it work with your current table, it would have to scan every node in the cluster, which would probably time-out (as you are seeing).
One way to do this will be to use a time "bucket" as a partition key. If you just care about records for the last 5 minutes, then "day" should work, as long as you don't get millions of new records per day. If you do, then you'll need a smaller time component for your "bucket."
CREATE TABLE log_by_day (
id uuid,
day bigint,
time bigint,
ip text,
url text,
user_id int,
PRIMARY KEY (day, time, id)
) WITH CLUSTERING ORDER BY (time DESC, id ASC);
Now a query like this will work:
aaron@cqlsh:stackoverflow> SELECT day,time,id,user_id FROM log_by_day
WHERE day=201920 AND time > 1563635871941;
day | time | id | user_id
--------+---------------+--------------------------------------+---------
201920 | 1563635872259 | 7fef03da-6c23-4bf2-9e98-fd126ab17944 | 1234
201920 | 1563635872259 | 9a0c49ce-5ad2-45c5-b570-cd9de1c060d1 | 4607
201920 | 1563635872209 | 9227166e-cda2-4909-b8ac-4168922a0128 | 2112
(3 rows)
Pro-tip: Partitioning on a unique id
is great for data distribution, but doesn't give you much in the way of query flexibility.
来源:https://stackoverflow.com/questions/57106586/range-query-on-clustering-key