Question
I have to create and query a column family with a composite key of [timestamp, long]. While querying, I also want to run a range query on the timestamp (like timestamp between xxx and yyy). Is this possible?
Currently I am doing something really funny (which I know is not correct): I create keys from the timestamp string for each point in the given range and concatenate each one with the long,
like:
1254345345435-1234
3423432423432-1234
1231231231231-9999
and pass the set of keys to the Hector API. (So if I have a date range of 1 month and I want per-minute data, I create 30 * 24 * 60 * [number of secondary keys (long)] keys.)
I can solve the concatenation issue with a composite key, but the query part is what I am trying to understand.
As far as I understand, since we are using RandomPartitioner we cannot really query by range, because keys are placed by their MD5 checksum. What is the ideal design for this kind of use case?
My schema and requirements are as follows (actual csh):
CREATE TABLE report(
  ts timestamp,
  user_id bigint,
  svc1 bigint,
  svc2 bigint,
  svc3 bigint,
  PRIMARY KEY(ts, user_id));
select from report where ts between (123445345435 and 32423423424) and user_id is in (123,567,987)
Answer 1:
You cannot do range queries on the first component of a composite key. Instead, you should write a sentinel value such as a daystamp (the Unix epoch at midnight on the current day) as the row key, then write a composite column as timestamp:long. This way you can provide the keys that comprise your range, and slice on the timestamp component of the composite column.
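As a rough illustration of the daystamp idea, here is a minimal Python sketch that works out which row keys a time range touches and which composite column bounds to slice on. It assumes millisecond timestamps and UTC midnight as the bucket boundary; the function names (day_bucket, row_keys_for_range, slice_bounds) are hypothetical helpers, not part of Hector or Cassandra.

```python
from typing import List, Tuple

DAY_MS = 24 * 60 * 60 * 1000  # milliseconds in one UTC day (assumed bucket size)


def day_bucket(ts_ms: int) -> int:
    """Return the Unix epoch (ms) at UTC midnight of the day containing ts_ms."""
    return ts_ms - (ts_ms % DAY_MS)


def row_keys_for_range(start_ms: int, end_ms: int) -> List[int]:
    """List every daystamp row key touched by the inclusive range [start_ms, end_ms]."""
    if end_ms < start_ms:
        raise ValueError("end_ms must not be before start_ms")
    return list(range(day_bucket(start_ms), day_bucket(end_ms) + 1, DAY_MS))


def slice_bounds(
    start_ms: int,
    end_ms: int,
    lowest_id: int = -(2**63),
    highest_id: int = 2**63 - 1,
) -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """Composite column bounds (timestamp, long) for the slice inside each row."""
    return (start_ms, lowest_id), (end_ms, highest_id)


if __name__ == "__main__":
    start, end = 1354320000000, 1354500000000  # example range, values are made up
    print(row_keys_for_range(start, end))      # one row key per day in the range
    print(slice_bounds(start, end))            # column slice start and finish
```

With this layout, a one-month query needs about 30 row keys instead of one key per minute per id; the per-minute filtering happens in the column slice.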
Answer 2:
Denormalize! You must model your schema in a manner that will enable the types of queries you wish to perform. We create a reverse (aka inverted, inverse) index for such scenarios.
CREATE TABLE report(
KEY uuid PRIMARY KEY,
svc1 bigint,
svc2 bigint,
svc3 bigint
);
CREATE TABLE ReportsByTime(
KEY ascii PRIMARY KEY
) with default_validation=uuid AND comparator=uuid;
CREATE TABLE ReportsByUser(
KEY bigint PRIMARY KEY
)with default_validation=uuid AND comparator=uuid;
See here for a nice explanation. What you are doing now is generating your own ascii key in the times table, to enable yourself to perform the range slice query you want. It doesn't have to be ascii though, just something you can use to programmatically generate your own slice keys with.
You can use this approach to facilitate all of your queries. It likely isn't going to suit your application directly, but the idea is the same. You can squeeze more out of it by adding meaningful values to the column keys of each table above.
cqlsh:tester> select * from report;
KEY | svc1 | svc2 | svc3
--------------------------------------+------+------+------
1381b530-1dd2-11b2-0000-242d50cf1fb5 | 332 | 333 | 334
13818e20-1dd2-11b2-0000-242d50cf1fb5 | 222 | 223 | 224
13816710-1dd2-11b2-0000-242d50cf1fb5 | 112 | 113 | 114
cqlsh:tester> select * from times;
KEY,1212051037 | 13818e20-1dd2-11b2-0000-242d50cf1fb5,13818e20-1dd2-11b2-0000-242d50cf1fb5 | 1381b530-1dd2-11b2-0000-242d50cf1fb5,1381b530-1dd2-11b2-0000-242d50cf1fb5
KEY,1212051035 | 13816710-1dd2-11b2-0000-242d50cf1fb5,13816710-1dd2-11b2-0000-242d50cf1fb5 | 13818e20-1dd2-11b2-0000-242d50cf1fb5,13818e20-1dd2-11b2-0000-242d50cf1fb5
KEY,1212051036 | 13818e20-1dd2-11b2-0000-242d50cf1fb5,13818e20-1dd2-11b2-0000-242d50cf1fb5
cqlsh:tester> select * from users;
KEY | 13816710-1dd2-11b2-0000-242d50cf1fb5 | 13818e20-1dd2-11b2-0000-242d50cf1fb5
-------------+--------------------------------------+--------------------------------------
23123123231 | 13816710-1dd2-11b2-0000-242d50cf1fb5 | 13818e20-1dd2-11b2-0000-242d50cf1fb5
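To make the read path of this inverted-index layout concrete, here is a minimal Python sketch in which plain dicts stand in for the three column families. It is only an illustration of the lookup pattern, not a Cassandra or Hector client; write_report and query are hypothetical names, and the ids and values in the usage section are made up.

```python
from collections import defaultdict

# Plain dicts stand in for the three column families (illustration only).
report = {}                         # report id -> {"svc1": ..., "svc2": ..., "svc3": ...}
reports_by_time = defaultdict(set)  # time key  -> report ids written under that key
reports_by_user = defaultdict(set)  # user id   -> report ids written for that user


def write_report(report_id, time_key, user_id, values):
    """Write the data row and both index entries together (the denormalization)."""
    report[report_id] = values
    reports_by_time[time_key].add(report_id)
    reports_by_user[user_id].add(report_id)


def query(time_keys, user_ids):
    """Return reports indexed under any of time_keys AND under any of user_ids."""
    by_time = set().union(*(reports_by_time.get(k, set()) for k in time_keys))
    by_user = set().union(*(reports_by_user.get(u, set()) for u in user_ids))
    return {rid: report[rid] for rid in sorted(by_time & by_user)}


if __name__ == "__main__":
    write_report("r1", 1212051035, 123, {"svc1": 112, "svc2": 113, "svc3": 114})
    write_report("r2", 1212051036, 567, {"svc1": 222, "svc2": 223, "svc3": 224})
    write_report("r3", 1212051037, 999, {"svc1": 332, "svc2": 333, "svc3": 334})
    # The caller generates the time keys for the range itself, then intersects.
    print(query(range(1212051035, 1212051037), [123, 567, 987]))
```

The caller still has to generate the time keys covering the range, which is why a coarse, predictable key format keeps the number of index rows to read small.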
Answer 3:
Why don't you use wide rows, where the key is the timestamp and the column name is the long value? Then you can pass multiple keys (timestamps) to getKeySlice and select multiple columns by their name (which is the id) with withColumnSlice.
As I don't know what your column names and values are, I can only suggest that this may help. Can you provide more details of your column family definition?
Source: https://stackoverflow.com/questions/13700288/timestamp-date-as-key-for-cassandra-column-family-hector