Cassandra returns variable results when looking for time series data

人盡茶涼 submitted on 2019-12-12 04:34:14

Question


When I run this query in DataStax DevCenter, 2 rows are returned. The rows returned are for Dec 30th, as they should be.

    SELECT * FROM abc.alerts_by_type_and_timestamp WHERE alert_type IN ('Permanent Fault', 'Temporary Fault') AND alert_timeStamp >= '2015-12-30T15:00+0000' AND alert_timeStamp <= '2015-12-31T15:00+0000'

But running it as a PreparedStatement like this

    SELECT * FROM abc.alerts_by_type_and_timestamp WHERE alert_type IN :alertTypes AND alert_timeStamp >= :minTimestamp AND alert_timeStamp <= :maxTimestamp

returns the 4 rows below.

    17:52:48,587 INFO  [stdout] (default task-39) minTimestamp: 2015-12-30 15:00:00.0 - maxTimestamp : 2015-12-31 15:00:00.0
    17:52:50,904 INFO  [stdout] (default task-39) row : Row[Permanent Fault, Thu Dec 31 12:09:22 PST 2015, 2015, 365, .....]
    17:52:50,904 INFO  [stdout] (default task-39) row : Row[Permanent Fault, Thu Dec 31 12:08:14 PST 2015, 2015, 365, ....]
    17:52:50,905 INFO  [stdout] (default task-39) row : Row[Temporary Fault, Thu Dec 31 12:09:22 PST 2015, 2015, 365, ...]
    17:52:50,906 INFO  [stdout] (default task-39) row : Row[Temporary Fault, Thu Dec 31 12:08:14 PST 2015, 2015, 365, ...]

    17:52:50,906 INFO  [stdout] (default task-39) count is : 4

I believe this is due to time zone conversion. The data is stored as GMT, but somehow the PreparedStatement seems to be passing the bounds in PST.

How can I resolve this issue?

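I also tried this:

    // Note: wrapping the millis in a UTC DateTime does not change the instant;
    // getMillis() returns the same epoch value that was passed in.
    DateTime dateTime = new DateTime(minTimestamp.getTime(), DateTimeZone.UTC);
    DateTime dateTime2 = new DateTime(maxTimestamp.getTime(), DateTimeZone.UTC);
    BoundStatement stmtByAlertTypeAndTimestamp = pStmt.bind()
        .setTimestamp("minTimestamp", new Timestamp(dateTime.getMillis()))
        .setTimestamp("maxTimestamp", new Timestamp(dateTime2.getMillis()))
        // The name must match the :alertTypes marker in the query.
        .setList("alertTypes", Types);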

Printing out the resulting timestamps gives:

    minTimestamp: 2016-07-19 17:00:00.0
    maxTimestamp: 2016-07-26 00:00:00.0

Thanks


Answer 1:


You should change the file $CASSANDRA_HOME/pylib/cqlshlib/formatter.py

Change the function strftime to:

    import pytz  # needed for the time zone conversion below

    def strftime(time_format, seconds):
        # Treat the stored value as UTC, then convert it to the display time zone.
        tzless_dt = datetime_from_timestamp(seconds)
        return tzless_dt.replace(tzinfo=pytz.utc).astimezone(pytz.timezone('Asia/Kolkata')).strftime(time_format)

and import the pytz library.

I did this to change the cqlsh output to IST. You can change the time zone to suit your needs.

Explanation: Cassandra always stores timestamps in GMT, whereas the values passed to the prepared statement are interpreted in the local (system) time zone, so the two queries cover different time ranges and return different results.
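
The shift can be reproduced without Cassandra. The sketch below is only an illustration, not code from the question: the class name `ZoneShiftDemo` and its helper methods are hypothetical, and it assumes the application server runs in US Pacific time, since the log prints PST. It resolves the same wall-clock bounds once as UTC and once in the local zone, then checks one of the unexpected rows against both windows.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class ZoneShiftDemo {
    // Assumption: the application server runs in US Pacific time (the log prints PST).
    static final ZoneId LOCAL = ZoneId.of("America/Los_Angeles");

    // Resolve a wall-clock value as UTC, which is what the DevCenter literal (+0000) asks for.
    public static Instant asUtc(LocalDateTime wallClock) {
        return wallClock.toInstant(ZoneOffset.UTC);
    }

    // Resolve the same wall-clock value in the local zone.
    public static Instant asLocal(LocalDateTime wallClock) {
        return wallClock.atZone(LOCAL).toInstant();
    }

    // Inclusive range check, mirroring the >= and <= conditions in the query.
    public static boolean inWindow(Instant t, Instant min, Instant max) {
        return !t.isBefore(min) && !t.isAfter(max);
    }

    public static void main(String[] args) {
        LocalDateTime min = LocalDateTime.of(2015, 12, 30, 15, 0);
        LocalDateTime max = LocalDateTime.of(2015, 12, 31, 15, 0);
        // One of the unexpected rows: Thu Dec 31 12:09:22 PST 2015
        Instant row = asLocal(LocalDateTime.of(2015, 12, 31, 12, 9, 22));

        System.out.println(asUtc(min));   // 2015-12-30T15:00:00Z
        System.out.println(asLocal(min)); // 2015-12-30T23:00:00Z
        System.out.println(inWindow(row, asUtc(min), asUtc(max)));     // false
        System.out.println(inWindow(row, asLocal(min), asLocal(max))); // true
    }
}
```

Under that assumption the local-zone window starts and ends 8 hours later than the intended one, which is why the Dec 31 rows show up in the prepared-statement result.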

There is one more workaround: pass the date-time values to the prepared statement with an explicit time zone. As far as I can tell, that should work fine.
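
One way this workaround might look in Java is sketched below. It is a hedged illustration rather than a confirmed fix: the class `ExplicitZoneBounds` and the method `parseBound` are hypothetical names, and the code only builds the `java.util.Date` bounds from text that carries its own offset. Binding them to `:minTimestamp` and `:maxTimestamp` is left to the existing `BoundStatement` code.

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Date;

public class ExplicitZoneBounds {
    // Matches literals such as 2015-12-30T15:00+0000, the form used in the DevCenter query.
    private static final DateTimeFormatter CQL_LITERAL =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mmZ");

    // The offset is part of the text, so the JVM default time zone plays no role.
    public static Date parseBound(String literal) {
        return Date.from(OffsetDateTime.parse(literal, CQL_LITERAL).toInstant());
    }

    public static void main(String[] args) {
        Date min = parseBound("2015-12-30T15:00+0000");
        Date max = parseBound("2015-12-31T15:00+0000");
        // Log via toInstant() so the values are printed in UTC rather than the local zone.
        System.out.println("minTimestamp: " + min.toInstant() + " - maxTimestamp: " + max.toInstant());
        // These Date values would then be bound to :minTimestamp and :maxTimestamp.
    }
}
```

Logging through `toInstant()` also avoids the confusion in the original output, where `Date.toString()` and `Timestamp.toString()` render the value in the server's local zone.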

Hope it helps



Source: https://stackoverflow.com/questions/40986740/cassandra-returns-variable-results-when-looking-for-time-series-data
