Cassandra sorting results by count


I am recording data on users searching for various keywords. What I'd like to produce is a report of all of the unique keywords that the users have searched for, sorted in


2 Answers

广开言路

2021-01-19 06:01

According to the eBay tech blog, it's not unusual to store your counter values in the key itself. So to store the number of times Bob, Ken, and Jimmy logged into a website, a single row would look as follows:

logins: [(0001_Bob,''), (0002_Bob, ''), ..., (0010_Ken, ''), (0012_Jimmy, ''), ...]

Notice that your keys will automatically sort themselves with the highest count at the tail end, so reading the top entry is close to a constant-time look-up.

Note that every time a user logs in, a new column key is created. You'd have to keep track of the number of logins in another row so that you have a fast look-up for how many logins have occurred so far and what integer value your next key should have:

login_count: [(Bob, 2), (Ken, 10), (Jimmy, 10), ...]
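The two rows above can be sketched in plain Python to show how they work together. This is only an in-memory stand-in, not Cassandra client code: the `LoginRow` class, its method names, and the four-digit padding width are assumptions made for illustration. A sorted list plays the role of the column keys that Cassandra would keep ordered within the row.

```python
import bisect


class LoginRow:
    """In-memory stand-in for the 'logins' row plus its 'login_count' companion row."""

    def __init__(self, width=4):
        # Zero-padding width (an assumption): keeps string order equal to numeric order.
        self.width = width
        self.columns = []      # sorted column keys, e.g. "0002_Bob"
        self.login_count = {}  # user -> number of logins so far

    def record_login(self, user):
        # Look up the running count to decide what the next key should be.
        count = self.login_count.get(user, 0) + 1
        self.login_count[user] = count
        key = f"{count:0{self.width}d}_{user}"
        # Insert while keeping the keys sorted, as the column comparator would.
        bisect.insort(self.columns, key)

    def top(self):
        """Return (user, count) from the key at the tail end of the row."""
        count, user = self.columns[-1].split("_", 1)
        return user, int(count)


row = LoginRow()
for _ in range(2):
    row.record_login("Bob")
for _ in range(10):
    row.record_login("Ken")
```

Reading `row.top()` only looks at the last key, which is the cheap look-up the answer describes. The padding matters: without it, "10_Ken" would sort before "2_Bob".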

情话喂你

2021-01-19 06:13

You could use each keyword as a row key, and use a counter column for each row to track the number of searches. You could then produce a report by scanning over every row and reading the counters. Cassandra won't sort the results (assuming you use the default RandomPartitioner rather than an OrderPreservingPartitioner), but given that there will presumably only be a few tens of thousands of keywords, you can easily sort them at the client.
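A minimal sketch of the client-side sort, assuming the scan over all rows has already been collected into a dictionary of keyword to counter value. The `keyword_report` function and the sample data are hypothetical, and no Cassandra client calls are shown.

```python
def keyword_report(counters):
    """Sort (keyword, count) pairs by descending count, breaking ties by keyword."""
    return sorted(counters.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical data standing in for the result of scanning every row.
sample_counters = {"cassandra": 42, "hadoop": 7, "counters": 42, "thrift": 1}
report = keyword_report(sample_counters)
```

With a few tens of thousands of keywords, a sort like this is cheap on the client.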
