ORDER BY reloaded, cassandra

I would like to sort a given column family, and to do this I am trying to create a table with the CLUSTERING ORDER BY option. I always encounter one of the following errors:

1.) Variant A, resulting in "Bad Request: Missing CLUSTERING ORDER for column userid". Statement:

CREATE TABLE test.user (
  userID timeuuid,
  firstname varchar,
  lastname varchar,
  PRIMARY KEY (lastname, userID)
)WITH CLUSTERING ORDER BY (lastname desc);

2.) Variant B, resulting in "Bad Request: Only clustering key columns can be defined in CLUSTERING ORDER directive". Statement:

CREATE TABLE test.user (
  userID timeuuid,
  firstname varchar,
  lastname varchar,
  PRIMARY KEY (lastname, userID)
)WITH CLUSTERING ORDER BY (lastname desc, userID asc);

As far as I can see in the manual, this is the correct syntax for creating a table on which I could run queries such as "SELECT .... FROM user WHERE ... ORDER BY lastname". How can I achieve this? (I would like to keep the column 'lastname' as the first part of the primary key, so that I can use it in the WHERE clause of DELETE statements.)

Thanks a lot, Tamas

Clustering is limited to the columns defined in the primary key, in your case (lastname, userID). Cassandra stores rows in an order determined by that combination: lastname, the first component, is the partition key, and userID is the clustering column that sorts rows within each partition. That is why both columns matter for retrieval. It is still not a useful schema if you want all the data in the table sorted by last name: userID is a unique timeuuid, so the clustering key only orders users who share a lastname.

CREATE TABLE test.user (
  userID timeuuid,
  firstname varchar,
  lastname varchar,
  bucket int,
  PRIMARY KEY (bucket)
)WITH CLUSTERING ORDER BY (lastname desc);

Here, if you provide the same bucket value, say 1, for all user records, then all users go into the same bucket, and the rows are retrieved sorted by last name. (By no means is this a good design; it is just to give you an idea.)

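Revised (the statement above is rejected because lastname is not part of the primary key; it has to be a clustering column before it can appear in CLUSTERING ORDER BY):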

CREATE TABLE user1 (
  userID uuid,
  firstname varchar,
  lastname varchar,
  bucket int,
  PRIMARY KEY ((bucket), lastname, userID)
) WITH CLUSTERING ORDER BY (lastname DESC, userID ASC);
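As a rough sketch of how the revised user1 table would be used, the statements below put two rows into the same bucket and read them back. The names and the bucket value are made-up sample data, and uuid() is the built-in CQL function that generates a random uuid.

```cql
-- Hypothetical sample rows; every row uses bucket = 1, so all of them
-- land in a single partition.
INSERT INTO user1 (bucket, lastname, userID, firstname)
VALUES (1, 'Adams', uuid(), 'Alice');
INSERT INTO user1 (bucket, lastname, userID, firstname)
VALUES (1, 'Young', uuid(), 'Bob');

-- Rows come back in clustering order, lastname descending:
-- 'Young' is returned before 'Adams'.
SELECT lastname, firstname FROM user1 WHERE bucket = 1;
```

Because every row shares one partition, that partition grows without bound, which is the reason this is only an illustration and not a recommended design.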

You can only specify clustering order on your clustering keys.

PRIMARY KEY (lastname, userID)
)WITH CLUSTERING ORDER BY (lastname desc);

In your first example, your only clustering key is userID. Thus, it is the only valid entry for CLUSTERING ORDER BY.

PRIMARY KEY (lastname, userID)
)WITH CLUSTERING ORDER BY (lastname desc, userID asc);

The second example fails because you are specifying your partition key in CLUSTERING ORDER BY, and that's not going to work either.
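For the primary key exactly as written in the question, a directive that names only userID should be accepted. A minimal sketch (the lastname value 'Smith' is made up for illustration):

```cql
-- lastname is the partition key; userID is the only clustering column,
-- so it is the only column CLUSTERING ORDER BY may mention.
CREATE TABLE test.user (
  userID timeuuid,
  firstname varchar,
  lastname varchar,
  PRIMARY KEY (lastname, userID)
) WITH CLUSTERING ORDER BY (userID DESC);

-- Rows for one lastname come back sorted by userID, newest timeuuid first.
SELECT * FROM test.user WHERE lastname = 'Smith';
```

This keeps lastname usable in the WHERE clause of DELETE statements, but it orders rows only within a single lastname, not across the whole table.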

Cassandra works by ordering CQL rows according to clustering keys, but only when a partition key is specified. This is because the whole idea of Cassandra wide-row modeling is to query by partition key, and read a series of ordered rows in one query operation.

I would like to run queries as "SELECT .... FROM user WHERE ... ORDER BY lastname".

Given this statement, I am going to suggest that you need another column in this model before it will work the way you want. What you need is an appropriate partition key for your users table. Say...like group. With your users partitioned by group, and clustered by lastname, your definition would look something like this:

CREATE TABLE test.usersbygroup (
  userID timeuuid,
  firstname varchar,
  lastname varchar,
  group text,
  PRIMARY KEY (group,lastname)
)WITH CLUSTERING ORDER BY (lastname desc);

Then, this query will work, returning users (in this case) who are fans of the show "Firefly," ordered by lastname (descending):

SELECT * FROM usersbygroup WHERE group='Firefly Fans';
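One thing to watch with PRIMARY KEY (group, lastname): two users in the same group with the same lastname share a primary key, so the second insert overwrites the first. A possible variant, sketched under the assumption that userID should act as a tie-breaker, appends it as a second clustering column. The table name usersbygroup_v2 and the sample rows are hypothetical.

```cql
-- Variant table: userID is a second clustering column, so users who share
-- a group and a lastname are stored as separate rows.
CREATE TABLE test.usersbygroup_v2 (
  userID timeuuid,
  firstname varchar,
  lastname varchar,
  group text,
  PRIMARY KEY (group, lastname, userID)
) WITH CLUSTERING ORDER BY (lastname DESC, userID ASC);

-- Hypothetical sample data; now() generates a fresh timeuuid.
INSERT INTO test.usersbygroup_v2 (group, lastname, userID, firstname)
VALUES ('Firefly Fans', 'Reynolds', now(), 'Malcolm');
INSERT INTO test.usersbygroup_v2 (group, lastname, userID, firstname)
VALUES ('Firefly Fans', 'Washburne', now(), 'Zoe');

-- Returns 'Washburne' before 'Reynolds' (lastname descending).
SELECT lastname, firstname FROM test.usersbygroup_v2 WHERE group = 'Firefly Fans';
```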

Read through this DataStax doc on Compound Keys and Clustering to get a better understanding.

NOTE: You don't need to specify ORDER BY in your SELECT. The rows will come back ordered by their clustering key(s), and ORDER BY cannot change which columns they are sorted on. All ORDER BY can really do is reverse the sort direction (DESCending vs. ASCending).
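To illustrate the note, here is a short sketch against the usersbygroup table defined earlier (the keyspace prefix test is assumed from the CREATE TABLE statement):

```cql
-- Default read: rows arrive in the table's clustering order (lastname DESC).
SELECT * FROM test.usersbygroup WHERE group = 'Firefly Fans';

-- Same partition, reversed direction.
SELECT * FROM test.usersbygroup WHERE group = 'Firefly Fans' ORDER BY lastname ASC;

-- Rejected: firstname is not a clustering column, so ORDER BY cannot use it.
-- SELECT * FROM test.usersbygroup WHERE group = 'Firefly Fans' ORDER BY firstname ASC;
```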

Source: https://stackoverflow.com/questions/28753656/order-by-reloaded-cassandra