How to filter a Cassandra query by a field in a user-defined type

醉酒成梦 2020-12-04 01:47

How can I filter a Cassandra query by a field of a user-defined type? I want to create a people table in my Cassandra database, so I created this user-defined type in the database.

1 answer
  • 2020-12-04 02:42

    Short answer: you can use a secondary index to query by the whole fullname UDT value, but you cannot query by only a part of the UDT.

    // create table, type and index
    create type fullname ( firstname text, lastname text );
    create table people ( id UUID primary key, name frozen <fullname> );
    create index fname_index on your_keyspace.people (name);
    
    // insert some data into it
    insert into people (id, name) values (now(), {firstname: 'foo', lastname: 'bar'});
    insert into people (id, name) values (now(), {firstname: 'baz', lastname: 'qux'});
    
    // query it by fullname
    select * from people where name = { firstname: 'baz', lastname: 'qux' };
    
    // the following will NOT return the row above (only part of the UDT is given):
    select * from people where name = { firstname: 'baz'};
    

    The reason for this behaviour is the way C* secondary indexes are implemented. In general, an index is just another hidden table maintained by C*, in your case defined as:

    create table fname_index (name frozen <fullname> primary key, id uuid);
    

    In effect, the indexed column and the primary key are swapped in this table. So your case reduces to a more general question, 'why can't I query by only a part of the PK?':

    • the whole PK value (firstname+lastname) is hashed, and the resulting number defines the partition that stores your row.
    • for that partition, your row is appended to a memtable (and later flushed to disk as an SSTable, a file sorted by key).
    • when you query by only part of the PK (for example by firstname only), C* is not able to guess which partition to look in: it cannot compute the hash for the whole fullname because lastname is unknown. Your match could be in any partition, which would require a full-table scan. C* explicitly forbids these scans, so you have no choice :)
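
    To make the hashing step visible, here is a rough sketch that models the hidden index as an ordinary table. The table name people_by_name is a hypothetical stand-in, not something C* creates, and the sketch assumes the fullname type from above already exists:

    ```cql
    // hypothetical stand-in for the hidden index table
    create table people_by_name ( name frozen <fullname> primary key, id uuid );
    insert into people_by_name (name, id) values ({firstname: 'baz', lastname: 'qux'}, now());

    // token() returns the partition hash, computed from the whole name value
    select token(name), name, id from people_by_name;
    ```

    The token depends on both firstname and lastname together, so knowing firstname alone does not tell you which partition to read.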

    Suggested solutions:

    • split your UDT into its essential parts, such as firstname and lastname, and create secondary indexes on them.
    • use Cassandra 3.0 with the materialized views feature (in effect, forcing Cassandra to maintain a custom index for part of your UDT).
    • revisit your data model to be less strict (nothing forces you to use UDTs where they are not helpful).
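
    A minimal sketch of the first two suggestions, assuming the name is stored as two plain text columns instead of a UDT. The names people_flat, people_flat_firstname_idx and people_by_firstname are made up for illustration, and the materialized view part assumes Cassandra 3.0 or later:

    ```cql
    // option 1: plain columns with a secondary index (hypothetical names)
    create table people_flat ( id uuid primary key, firstname text, lastname text );
    create index people_flat_firstname_idx on people_flat (firstname);
    insert into people_flat (id, firstname, lastname) values (now(), 'baz', 'qux');

    // query by firstname alone through the index
    select * from people_flat where firstname = 'baz';

    // option 2: a materialized view keyed by firstname (Cassandra 3.0+)
    create materialized view people_by_firstname as
      select id, firstname, lastname from people_flat
      where firstname is not null and id is not null
      primary key (firstname, id);

    // query the view by its partition key
    select * from people_by_firstname where firstname = 'baz';
    ```

    Either option alone is enough to query by firstname; they are shown together only for comparison. Note that the view is defined over plain columns here, since its key is built from columns of the base table.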