Timestamp / date as key for cassandra column family / hector

Posted by 浪尽此生 on 2019-12-01 08:23:56

Question


I have to create and query a column family with a composite key of [timestamp, long]. When querying, I also want to run a range query on the timestamp (for example, timestamp between xxx and yyy). Is this possible?

Currently I am doing something rather odd, which I know is not correct: I generate a key string for every timestamp in the given range and concatenate each one with the long.

For example:
1254345345435-1234
3423432423432-1234
1231231231231-9999

I then pass this set of keys to the Hector API. (So for a date range of one month at one-minute resolution, I create 30 * 24 * 60 * [number of secondary keys, i.e. the long values] keys.)
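To make the size of that key set concrete, here is a minimal Python sketch of the enumeration described above. The function name `minute_keys`, the start time of 0, and the three sample ids are assumptions chosen only for illustration; they are not part of the original setup.

```python
from itertools import product

MINUTE_MS = 60 * 1000

def minute_keys(start_ms, end_ms, ids):
    """Yield one 'timestamp-id' row key per minute in [start_ms, end_ms) for each id."""
    for ts, uid in product(range(start_ms, end_ms, MINUTE_MS), ids):
        yield f"{ts}-{uid}"

# Hypothetical inputs: a 30-day window starting at 0 and three secondary ids.
thirty_days_ms = 30 * 24 * 60 * MINUTE_MS
keys = list(minute_keys(0, thirty_days_ms, [1234, 5678, 9999]))
print(len(keys))  # 30 * 24 * 60 * 3 = 129600
```

Every one of those keys would have to be sent in the multiget, which is why this approach scales poorly as the range or the number of ids grows.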

I can solve the concatenation issue with a composite key, but the query part is what I am trying to understand.

As far as I understand, because we are using RandomPartitioner we cannot really run range queries on row keys, since rows are placed by the MD5 hash of the key. What is the ideal design for this kind of use case?
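The ordering problem can be illustrated with a small Python sketch. It is a simplification: `md5_token` is a hypothetical helper that treats the raw MD5 digest as an integer, which only approximates how RandomPartitioner derives its tokens, and the sample keys are made up.

```python
import hashlib

def md5_token(key: str) -> int:
    """Treat the MD5 digest of a row key as an integer (a stand-in for a partitioner token)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")

# 100 hypothetical keys, one per minute, already in timestamp order.
keys = [f"{1254345300000 + i * 60000}-1234" for i in range(100)]

# Ordering the same keys by their hash scatters them, so a contiguous
# timestamp range does not map to a contiguous token range.
by_token = sorted(keys, key=md5_token)
print(by_token[:3])
```

Because neighbouring timestamps land at unrelated positions on the ring, a range scan over row keys cannot return them as one ordered slice.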

My schema and requirements are as follows (cqlsh):

    CREATE TABLE report(
        ts timestamp,
        user_id bigint,
        svc1 bigint,
        svc2 bigint,
        svc3 bigint,
        PRIMARY KEY(ts, user_id));

    SELECT * FROM report WHERE ts >= 123445345435 AND ts <= 32423423424 AND user_id IN (123, 567, 987);

Answer 1:


You cannot do range queries on the first component of a composite key. Instead, use a sentinel value such as a daystamp (the Unix epoch time at midnight of the current day) as the row key, and write each entry as a composite column named timestamp:long. This way you can supply the row keys that cover your range and slice on the timestamp component of the composite column.
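A minimal in-memory sketch of this day-bucket idea, in Python, might look like the following. It models each row as a sorted list of `(timestamp, user_id)` column names; the names `day_bucket` and `range_query` and the dict-based storage are assumptions for illustration, not Hector or Cassandra calls.

```python
from bisect import bisect_left, bisect_right

DAY_MS = 24 * 60 * 60 * 1000

def day_bucket(ts_ms):
    """Row key: the timestamp rounded down to midnight UTC, in milliseconds."""
    return ts_ms - ts_ms % DAY_MS

def range_query(rows, start_ms, end_ms):
    """rows maps a day bucket to a sorted list of (ts_ms, user_id) column names.

    Returns every column whose timestamp lies in [start_ms, end_ms].
    """
    hits = []
    # Enumerate the (few) row keys that cover the range, one per day.
    for key in range(day_bucket(start_ms), day_bucket(end_ms) + DAY_MS, DAY_MS):
        cols = rows.get(key, [])
        # Slice on the first component of the composite column name.
        lo = bisect_left(cols, (start_ms,))
        hi = bisect_right(cols, (end_ms, float("inf")))
        hits.extend(cols[lo:hi])
    return hits

# Hypothetical data: two day rows with two columns each.
rows = {
    0: [(10, 1), (20, 2)],
    DAY_MS: [(DAY_MS + 5, 1), (DAY_MS + 50, 3)],
}
result = range_query(rows, 20, DAY_MS + 5)
print(result)
```

With this layout a one-month query touches about 30 row keys instead of one key per minute per id, and the timestamp filtering happens inside each row, where columns are stored in sorted order.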




Answer 2:


Denormalize! You must model your schema in a manner that will enable the types of queries you wish to perform. We create a reverse (aka inverted, inverse) index for such scenarios.

    CREATE TABLE report(
        KEY uuid PRIMARY KEY,
        svc1 bigint,
        svc2 bigint,
        svc3 bigint
    );

    CREATE TABLE ReportsByTime(
        KEY ascii PRIMARY KEY
    ) WITH default_validation=uuid AND comparator=uuid;

    CREATE TABLE ReportsByUser(
        KEY bigint PRIMARY KEY
    ) WITH default_validation=uuid AND comparator=uuid;

See here for a nice explanation. What you are doing now is generating your own ascii key in the times table so that you can perform the range slice query you want. It doesn't have to be ascii, though; it just has to be something you can use to generate your own slice keys programmatically.

You can use this approach to facilitate all of your queries. It likely isn't going to suit your application directly, but the idea is the same. You can squeeze more out of it by adding meaningful values to the column keys of each table above.

cqlsh:tester> select * from report;
 KEY                                  | svc1 | svc2 | svc3
--------------------------------------+------+------+------
 1381b530-1dd2-11b2-0000-242d50cf1fb5 |  332 |  333 |  334
 13818e20-1dd2-11b2-0000-242d50cf1fb5 |  222 |  223 |  224
 13816710-1dd2-11b2-0000-242d50cf1fb5 |  112 |  113 |  114


cqlsh:tester> select * from times;
 KEY,1212051037 | 13818e20-1dd2-11b2-0000-242d50cf1fb5,13818e20-1dd2-11b2-0000-242d50cf1fb5 | 1381b530-1dd2-11b2-0000-242d50cf1fb5,1381b530-1dd2-11b2-0000-242d50cf1fb5
 KEY,1212051035 | 13816710-1dd2-11b2-0000-242d50cf1fb5,13816710-1dd2-11b2-0000-242d50cf1fb5 | 13818e20-1dd2-11b2-0000-242d50cf1fb5,13818e20-1dd2-11b2-0000-242d50cf1fb5
 KEY,1212051036 | 13818e20-1dd2-11b2-0000-242d50cf1fb5,13818e20-1dd2-11b2-0000-242d50cf1fb5

cqlsh:tester> select * from users;
 KEY         | 13816710-1dd2-11b2-0000-242d50cf1fb5 | 13818e20-1dd2-11b2-0000-242d50cf1fb5
-------------+--------------------------------------+--------------------------------------
 23123123231 | 13816710-1dd2-11b2-0000-242d50cf1fb5 | 13818e20-1dd2-11b2-0000-242d50cf1fb5
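The inverted-index pattern in this answer can be sketched in memory with Python dictionaries. This is only an illustration under assumptions: `write_report` and `find_reports` are hypothetical helper names, the three dicts stand in for the report, time-index and user-index column families, and the sample values are made up.

```python
import uuid
from collections import defaultdict

report = {}                          # report id -> service counters
reports_by_time = defaultdict(set)   # time key -> report ids
reports_by_user = defaultdict(set)   # user id  -> report ids

def write_report(time_key, user_id, svc1, svc2, svc3):
    """Write the report row and both index rows (the denormalized write)."""
    rid = uuid.uuid1()
    report[rid] = {"svc1": svc1, "svc2": svc2, "svc3": svc3}
    reports_by_time[time_key].add(rid)
    reports_by_user[user_id].add(rid)
    return rid

def find_reports(time_keys, user_ids):
    """Look up both indexes by explicit keys and intersect the report ids."""
    by_time = set().union(*(reports_by_time.get(t, set()) for t in time_keys))
    by_user = set().union(*(reports_by_user.get(u, set()) for u in user_ids))
    return [report[rid] for rid in by_time & by_user]

write_report("1212051035", 123, 112, 113, 114)
write_report("1212051036", 567, 222, 223, 224)
write_report("1212051037", 999, 332, 333, 334)

found = find_reports(["1212051035", "1212051036", "1212051037"], [123, 567])
print(sorted(r["svc1"] for r in found))
```

The cost of this design is that every write touches three rows, in exchange for reads that only ever ask for keys the application can compute in advance.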



Answer 3:


Why not use wide rows, where the row key is the timestamp and the column name is the long value? Then you can pass multiple keys (timestamps) to getKeySlice and select multiple columns by name (the id) with withColumnSlice.

Since I don't know what your column names and values are, I can only guess that this will help. Can you provide more details of your column family definition?
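As a rough model of what such a multi-key, named-column read returns, here is a small Python sketch. The function `multiget_columns` and the nested-dict layout are assumptions for illustration; it does not call getKeySlice or withColumnSlice and makes no claim about their exact signatures.

```python
def multiget_columns(rows, keys, column_names):
    """For each requested row key that exists, return only the requested columns.

    rows maps timestamp -> {user_id: value}, i.e. one wide row per timestamp.
    """
    wanted = set(column_names)
    return {
        key: {col: val for col, val in rows[key].items() if col in wanted}
        for key in keys
        if key in rows
    }

# Hypothetical wide rows: timestamp row key, user id column name.
rows = {
    1254345300000: {123: 10, 567: 20, 999: 30},
    1254345360000: {123: 11, 987: 21},
}
result = multiget_columns(rows, [1254345300000, 1254345360000, 1254345420000], [123, 567, 987])
print(result)
```

Note that the caller still has to enumerate every timestamp key in the range, so this layout fixes the column selection but not the key explosion described in the question.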



Source: https://stackoverflow.com/questions/13700288/timestamp-date-as-key-for-cassandra-column-family-hector
