Cassandra cqlsh - how to show microseconds/milliseconds for timestamp columns?

名媛妹妹 2020-12-02 17:28

I'm inserting into a Cassandra table with timestamp columns. The data I have comes with microsecond precision, so the time data string looks like this:

2015

5 Answers
  • 2020-12-02 17:58

    You can configure the output format of datetime objects in the .cassandra/cqlshrc file, using Python's strftime syntax.

    Unfortunately, the %f directive for microseconds (there does not seem to be a directive for milliseconds) does not work in older Python versions, in which case you have to fall back to the blobAsBigint(timestampAsBlob(date)) solution.
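
As a rough illustration of the strftime syntax mentioned here, the Python sketch below formats a datetime with %f and derives a millisecond string by keeping only the first three digits of the microsecond field. The sample value and the helper name format_millis are made up for this example; it shows how the directive behaves in Python 3 and is not cqlsh's own code.

```python
from datetime import datetime, timezone

# Hypothetical sample value: 2015-02-16 18:00:03.234567 UTC
SAMPLE = datetime(2015, 2, 16, 18, 0, 3, 234567, tzinfo=timezone.utc)

# Same pattern as a time format that uses %f
TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f%z"


def format_millis(dt):
    """Format dt with millisecond precision (hypothetical helper).

    strftime has no millisecond directive, so the six-digit
    microsecond field is formatted separately and cut to three digits.
    """
    head = dt.strftime("%Y-%m-%d %H:%M:%S")
    millis = dt.strftime("%f")[:3]
    zone = dt.strftime("%z")
    return f"{head}.{millis}{zone}"


if __name__ == "__main__":
    print(SAMPLE.strftime(TIME_FORMAT))  # microsecond precision
    print(format_millis(SAMPLE))         # millisecond precision
```

Cutting the %f output is a truncation, not a rounding, which matches the way a millisecond count drops the finer digits.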

  • 2020-12-02 18:03

    I think by "microseconds" (e.g. 03.234567) you mean "milliseconds" (e.g. 03.234).

    The issue here was a cqlsh bug that failed to support fractional seconds when dealing with timestamps.

    So, while your millisecond value was preserved in the actual persistence layer (cassandra), the shell (cqlsh) failed to display them.

    This was true even if you were to change time_format in .cqlshrc to display fractional seconds with an %f directive (e.g. %Y-%m-%d %H:%M:%S.%f%z). In this configuration cqlsh would render 3.000000 for our 3.234 value, since the issue was in how cqlsh loaded the datetime objects without loading the partial seconds.

    That said, this issue was fixed in CASSANDRA-10428, and the fix was released in Cassandra 3.4.

  • 2020-12-02 18:16

    It is impossible to show microseconds (1 millionth of a second) using the Cassandra datatype 'timestamp' because the greatest precision available for that datatype is milliseconds (1 thousandth of a second).

    http://docs.datastax.com/en/cql/3.1/cql/cql_reference/timestamp_type_r.html

    Values for the timestamp type are encoded as 64-bit signed integers representing a number of milliseconds since the standard base time known as the epoch
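
To make the precision limit concrete, here is a small Python sketch of what a millisecond count since the epoch can and cannot represent. The function name to_epoch_millis and both sample values are assumptions for illustration; the sketch only models the integer encoding described in the quote and says nothing about how Cassandra parses an input string with extra digits.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(dt):
    """Whole milliseconds since the epoch; finer digits are discarded."""
    delta = dt - EPOCH
    return (
        delta.days * 86_400_000
        + delta.seconds * 1000
        + delta.microseconds // 1000
    )


# Hypothetical inputs: one with microsecond digits, one without
micro = datetime(2015, 2, 16, 18, 0, 3, 234567, tzinfo=timezone.utc)
milli = datetime(2015, 2, 16, 18, 0, 3, 234000, tzinfo=timezone.utc)

# Both collapse to the same 64-bit millisecond value
print(to_epoch_millis(micro))
print(to_epoch_millis(milli))
```

Because the two inputs map to the same integer, the trailing 567 microseconds cannot be recovered from a value stored this way.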

  • 2020-12-02 18:17

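    Some related code: a Java UDF that returns the time embedded in a timeuuid as microseconds since the Unix epoch, compared with toUnixTimestamp():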

    cqlsh> CREATE KEYSPACE udf
      WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3};
    
    cqlsh> USE udf;
    
    cqlsh:udf> CREATE OR REPLACE FUNCTION udf.timeuuid_as_us ( t timeuuid ) 
    RETURNS NULL ON NULL INPUT
    RETURNS bigint LANGUAGE JAVA AS '
      long msb = t.getMostSignificantBits();
      return
        ( ((msb >> 32) & 0x00000000FFFFFFFFL)
        | ((msb & 0x00000000FFFF0000L) << 16)
        | ((msb & 0x0000000000000FFFL) << 48)
        ) / 10
        - 12219292800000000L;
    ';
    
    cqlsh:udf> SELECT
      toUnixTimestamp(now())    AS now_ms
    , udf.timeuuid_as_us(now()) AS now_us
    FROM system.local;
    
     now_ms        | now_us
    ---------------+------------------
     1525995892841 | 1525995892841000
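
For readers who want to check the bit arithmetic outside of Cassandra, the following Python sketch mirrors the UDF for a version 1 UUID and cross-checks it against the standard library's own timestamp field. The function name is reused from the UDF for readability; everything else is an assumption of this example rather than part of the original answer.

```python
import uuid

# Microseconds between 1582-10-15 (UUID epoch) and 1970-01-01 (Unix epoch)
UUID_EPOCH_OFFSET_US = 12219292800000000


def timeuuid_as_us(u):
    """Mirror the UDF's bit arithmetic for a version 1 UUID."""
    msb = u.int >> 64                   # most significant 64 bits
    ticks = (
        ((msb >> 32) & 0xFFFFFFFF)      # time_low
        | ((msb & 0xFFFF0000) << 16)    # time_mid
        | ((msb & 0x0FFF) << 48)        # time_hi without the version bits
    )                                   # 100 ns ticks since 1582-10-15
    return ticks // 10 - UUID_EPOCH_OFFSET_US


u = uuid.uuid1()
# uuid.UUID.time exposes the same 60-bit tick count
assert timeuuid_as_us(u) == u.time // 10 - UUID_EPOCH_OFFSET_US
print(timeuuid_as_us(u))
```

The division by 10 converts 100-nanosecond ticks to microseconds, and the subtraction shifts the origin from the UUID epoch to the Unix epoch.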
    
  • 2020-12-02 18:25

    In an effort to answer your questions, I did a little digging on this one.

    1. Does Cassandra capture microseconds with timestamp data type?

    Microseconds no, milliseconds yes. If I create your table, insert a row, and try to query it by the truncated time, it doesn't work:

    aploetz@cqlsh:stackoverflow> INSERT INTO data (datetime, id, type, data) 
    VALUES ('2015-02-16T18:00:03.234+00:00','B26354','Blade Runner','Deckard- Filed and monitored.');
    aploetz@cqlsh:stackoverflow> SELECT * FROM data 
    WHERE id='B26354' AND type='Blade Runner' AND datetime='2015-02-16 12:00:03-0600';
    
     id | type | datetime | data
    ----+------+----------+------
    
    (0 rows)
    

    But when I query for the same id and type values while specifying milliseconds:

    aploetz@cqlsh:stackoverflow> SELECT * FROM data 
    WHERE id='B26354' AND type='Blade Runner' AND datetime='2015-02-16 12:00:03.234-0600';
    
     id     | type         | datetime                 | data
    --------+--------------+--------------------------+-------------------------------
     B26354 | Blade Runner | 2015-02-16 12:00:03-0600 | Deckard- Filed and monitored.
    
    (1 rows)
    

    So the milliseconds are definitely there. There was a JIRA ticket created for this issue (CASSANDRA-5870), but it was resolved as "Won't Fix."

    2. How can I see that with cqlsh to verify?

    One possible way to actually verify that the milliseconds are indeed there, is to nest the timestampAsBlob() function inside of blobAsBigint(), like this:

    aploetz@cqlsh:stackoverflow> SELECT id, type, blobAsBigint(timestampAsBlob(datetime)), 
    data FROM data;
    
     id     | type         | blobAsBigint(timestampAsBlob(datetime)) | data
    --------+--------------+-----------------------------------------+-------------------------------
     B26354 | Blade Runner |                           1424109603234 | Deckard- Filed and monitored.
    
    (1 rows)
    

    While not optimal, here you can clearly see the millisecond value of "234" on the very end. This becomes even more apparent if I add a row for the same timestamp, but without milliseconds:

    aploetz@cqlsh:stackoverflow> INSERT INTO data (id, type, datetime, data)
    VALUES ('B25881','Blade Runner','2015-02-16T18:00:03+00:00','Holden- Fine as long as nobody unplugs him.');
    aploetz@cqlsh:stackoverflow> SELECT id, type, blobAsBigint(timestampAsBlob(datetime)), 
                     ...     data FROM data;
    
     id     | type         | blobAsBigint(timestampAsBlob(datetime)) | data
    --------+--------------+-----------------------------------------+---------------------------------------------
     B25881 | Blade Runner |                           1424109603000 | Holden- Fine as long as nobody unplugs him.
     B26354 | Blade Runner |                           1424109603234 |               Deckard- Filed and monitored.
    
    (2 rows)
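
If reading the raw bigint is awkward, a client-side conversion can make it legible. This is a minimal Python sketch, assuming the two values from the query output above; the helper name describe is hypothetical, and timespec="milliseconds" needs Python 3.6 or later.

```python
from datetime import datetime, timedelta, timezone


def describe(epoch_millis):
    """Return (ISO 8601 UTC string, millisecond remainder) for a bigint."""
    seconds, millis = divmod(epoch_millis, 1000)
    # Integer arithmetic avoids float rounding of the fractional part
    dt = datetime.fromtimestamp(seconds, tz=timezone.utc)
    dt += timedelta(milliseconds=millis)
    return dt.isoformat(timespec="milliseconds"), millis


# Values taken from the blobAsBigint(timestampAsBlob(datetime)) column
for value in (1424109603000, 1424109603234):
    print(value, *describe(value))
```

The remainder after dividing by 1000 is the millisecond part, so the two rows differ only in that last component.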
    