问题
I have Cassandra 2.x cluster with 3 nodes and the db scheme like this:
cqlsh> CREATE KEYSPACE test_ks WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 3} AND durable_writes = true;
cqlsh> CREATE TABLE IF NOT EXISTS test_ks.test_cf (
... time timeuuid,
... user_id varchar,
... info varchar,
... PRIMARY KEY (time, user_id)
... ) WITH compression = {'sstable_compression': 'LZ4Compressor'} AND compaction = {'class': 'LeveledCompactionStrategy'};
Lets add some data (wait some time betweeb inserts):
cqlsh> INSERT INTO test_ks.test_cf (time, user_id, info) VALUES (now(), 'user1', 'pythonista');
cqlsh> INSERT INTO test_ks.test_cf (time, user_id, info) VALUES (now(), 'user1', 'mr. Haskell');
Lets look at our data:
cqlsh> SELECT dateOf(time), user_id, info FROM test_ks.test_cf;
dateOf(time) | user_id | info
--------------------------+---------+-------------
2014-06-24 16:00:31+0700 | user1 | mr. Haskell
2014-06-24 15:59:32+0700 | user1 | pythonista
(2 rows)
I get some strange results while querying test_cf CF:
cqlsh> SELECT dateOf(time) FROM test_ks.test_cf WHERE user_id='user1' AND token(time) >= token(maxTimeuuid('2014-06-24 16:00:31+0700')) ALLOW FILTERING;
dateOf(time)
--------------------------
2014-06-24 15:59:32+0700
(1 rows)
cqlsh> SELECT dateOf(time) FROM test_ks.test_cf WHERE user_id='user1' AND token(time) >= token(maxTimeuuid('2014-06-24 16:00:32+0700')) ALLOW FILTERING;
dateOf(time)
--------------------------
2014-06-24 15:59:32+0700
(1 rows)
cqlsh> SELECT dateOf(time) FROM test_ks.test_cf WHERE user_id='user1' AND token(time) >= token(maxTimeuuid('2014-06-24 16:00:33+0700')) ALLOW FILTERING;
dateOf(time)
--------------------------
2014-06-24 16:00:31+0700
2014-06-24 15:59:32+0700
(2 rows)
As you can see comparison gives the wrong results, although, timeuuid
must not be greater than maxTimeuuid
and must be greater than 'minTimeuuid' (for a same datetime, of course =) ). Could somebody explain to me this strange behavior?
TIA!
回答1:
First off, your primary key is backwards—if you want to perform queries like this, your CREATE TABLE
statement should look like this:
CREATE TABLE IF NOT EXISTS test_ks.test_cf (
time timeuuid,
user_id varchar,
info varchar,
PRIMARY KEY (user_id, time)
) WITH compression = {'sstable_compression': 'LZ4Compressor'}
AND compaction = {'class': 'LeveledCompactionStrategy'};
That makes user_id
the partition key and time
the clustering column, which fits your query pattern.
With this change, you no longer need to use the TOKEN
function, which is needed to make range queries over partition keys, but is meaningless for clustering columns (and generally does not correspond to semantic ordering of column values). Since time
is now a clustering column, you just need to do a normal comparison:
SELECT dateOf(time)
FROM test_ks.test_cf
WHERE user_id='user1'
AND time >= maxTimeuuid('2014-06-24 16:00:31+0700');
You can also drop the ALLOW FILTERING
, since you're now doing a standard range slice rather than the token comparison, which required scanning all the rows and making the comparison.
来源:https://stackoverflow.com/questions/24383015/strange-behavior-of-timeuuid-comparison