I\'ve got millions of items ordered by a precomputed score. Each item has many boolean attributes. Let says that there is about ten thousand possible attributes totally, each
Redis would be a perfect candidate for
Redis has a built in data structure that you can start from: Sorted Set
=> every member of a Sorted Set is associated with score. Which for example can be ranked by score with ZRANGEBYSCORE:
ZRANGEBYSCORE key min max [WITHSCORES] [LIMIT offset count]
I encourage you to look at Sorted Set commands and get a feel for Redis, as your problem (as it is stated) asks for it. You may of course keep as many attributes as you'd like within a single Set element.
As far as MongoDB, since you mentioned millions, unless you can bent incremental queries to work for your problem, I would not expect a sub second response.
As @nickdos mentioned Solr Relevancy is a quite powerful feature, but the number of attributes will be a problem, since it would need to keep all this attributes in memory for each item. Although a dozen for each may not be that bad => just try and see.
This is exactly what Mongo can deal with. The fact that your attributes are boolean type helps here. A possible schema is listed below:
[
{
true_tags:[attr1, attr2, attr3, ...],
false_tags: [attr4, attr5, attr6, ...]
},
]
Then we can index on true_tags and false_tags. And it should be efficient to search with $in, $all, ... query operators.
Mongodb can handle what you want, if you stored your objects like this
{ score:2131, attributes: ["attr1", "attr2", "attr3"], ... }
Then the following query will match all the items that have att1 and attr2
c = db.mycol.find({ attributes: { $all: [ "attr1", "attr2" ] } })
but this won't match it
c = db.mycol.find({ attributes: { $all: [ "attr1", "attr4" ] } })
the query returns a cursor, if you want this cursor to be sorted, then just add the sort parameters to the query like so
c = db.mycol.find({ attributes: { $all: [ "attr1", "attr2" ] }}).sort({score:1})
Have a look at Advanced Queries to see what's possible.
Appropriate indexes can be setup as follows
db.mycol.ensureIndex({attributes:1, score:1})
And you can get performance information using
db.mycol.find({ attributes: { $all: [ "attr1" ] }}).explain()
Mongo explains how many objects were scanned, how long the operation took and various other statistics.