Question
I have thousands of documents in MongoDB; a few samples are shown below:
{"title":"Foo", "hash": "1234567890abcedf", "num_sold": 49,
"created": "2013-03-09 00:00:00"}
{"title":"Bar", "hash": "1234567890abcedf", "num_sold": 55,
"created": "2013-03-11 00:00:00"}
{"title":"Baz", "hash": "1234567890abcedf", "num_sold": 55,
"created": "2013-03-10 00:00:00"}
{"title":"Spam", "hash": "abcedef1234567890", "num_sold": 20,
"created": "2013-03-11 00:00:00"}
{"title":"Eggs", "hash": "abc1234567890def", "num_sold": 20,
"created": "2013-03-11 00:00:00"}
Is it possible to select, for each distinct hash, the document with the maximum num_sold and, if more than one document has the same num_sold, the latest one according to the created field?
I use PyMongo as the client.
Answer 1:
I am no Python expert, so I will write this in JavaScript. You can do this with the aggregation framework using the $sort, $group and $first operators:
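db.col.aggregate([
    // Highest num_sold first; ties are broken by the newest created
    {$sort: {num_sold: -1, created: -1}},
    // Keep the fields of the first document seen for each distinct hash
    {$group: {_id: '$hash', num_sold: {$first: '$num_sold'}, _id_seen: {$first: '$_id'}}}
])
Essentially, I sort the incoming documents by num_sold descending and then by created descending, and then group on hash, which collapses documents sharing a hash into a single group. $first takes the first document of each sorted group, which is the one with the highest num_sold and, on a tie, the newest.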
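Since the question mentions PyMongo, here is a rough Python sketch of the same idea. The names PIPELINE, SAMPLE, best_per_hash and run_with_pymongo are made up for illustration, and the extra title and created fields in the $group stage are an assumption about what you want back. best_per_hash is only a pure-Python mirror of the sort-then-take-first logic, so it can be checked without a server.

```python
# Pipeline as plain dicts. Python 3.7+ dicts keep insertion order,
# so num_sold is the primary sort key and created breaks ties.
PIPELINE = [
    {"$sort": {"num_sold": -1, "created": -1}},
    {"$group": {
        "_id": "$hash",
        "title": {"$first": "$title"},
        "num_sold": {"$first": "$num_sold"},
        "created": {"$first": "$created"},
    }},
]

# Sample documents copied from the question.
SAMPLE = [
    {"title": "Foo", "hash": "1234567890abcedf", "num_sold": 49, "created": "2013-03-09 00:00:00"},
    {"title": "Bar", "hash": "1234567890abcedf", "num_sold": 55, "created": "2013-03-11 00:00:00"},
    {"title": "Baz", "hash": "1234567890abcedf", "num_sold": 55, "created": "2013-03-10 00:00:00"},
    {"title": "Spam", "hash": "abcedef1234567890", "num_sold": 20, "created": "2013-03-11 00:00:00"},
    {"title": "Eggs", "hash": "abc1234567890def", "num_sold": 20, "created": "2013-03-11 00:00:00"},
]


def best_per_hash(docs):
    """Pure-Python mirror of the pipeline: sort descending, keep the first doc per hash."""
    ordered = sorted(docs, key=lambda d: (d["num_sold"], d["created"]), reverse=True)
    best = {}
    for doc in ordered:
        # setdefault only stores the first document seen for each hash
        best.setdefault(doc["hash"], doc)
    return best


def run_with_pymongo(collection):
    """Run the pipeline on a pymongo Collection (assumes a reachable server)."""
    return list(collection.aggregate(PIPELINE))
```

With a live connection you would pass your collection object to run_with_pymongo. Note that created is stored as a string here; the sort is only chronological because the "YYYY-MM-DD HH:MM:SS" format happens to order lexicographically.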
References:
- http://docs.mongodb.org/manual/reference/aggregation/first/#_S_first
Source: https://stackoverflow.com/questions/15334408/find-distinct-documents-with-max-value-of-a-field-in-mongodb