How to use distinct, sort, and limit with MongoDB

Submitted by 时间秒杀一切 on 2020-01-13 06:01:44

Question


I have a document structure {'text': 'here is text', 'count' : 13, 'somefield': value}

The collection has several thousand records, and the value of the text key may be repeated many times. I want to find each distinct text with its highest count value, and the whole document should be returned. I am already able to sort the documents in descending order.

distinct only returns the unique values as a list.

I want to use all three functions together and get whole documents back. I am still learning and have not covered MapReduce yet.


Answer 1:


Can you please clarify exactly what you would like to do? Do you want to return documents with unique "text" values with the highest "count" value?

For example, given the collection:

> db.text.find({}, {_id:0})
{ "text" : "here is text", "count" : 13, "somefield" : "value" }
{ "text" : "here is text", "count" : 12, "somefield" : "value" }
{ "text" : "here is text", "count" : 10, "somefield" : "value" }
{ "text" : "other text", "count" : 4, "somefield" : "value" }
{ "text" : "other text", "count" : 3, "somefield" : "value" }
{ "text" : "other text", "count" : 2, "somefield" : "value" }
>
(I have omitted _id values for brevity)

Would you like to return only the documents that contain unique text with the highest 'count' value?

{ "text" : "here is text", "count" : 13, "somefield" : "value" }

and

{ "text" : "other text", "count" : 4, "somefield" : "value" }

One way to do this is with the $group stage and the $max accumulator in the aggregation framework. The documentation on $group may be found here: http://docs.mongodb.org/manual/aggregation/

> db.text.aggregate({$group : {_id:"$text", "maxCount":{$max:"$count"}}})
{
    "result" : [
        {
            "_id" : "other text",
            "maxCount" : 4
        },
        {
            "_id" : "here is text",
            "maxCount" : 13
        }
    ],
    "ok" : 1
}

As you can see, the above does not return the original documents. If the original documents are desired, a query may then be done to find documents matching the unique text and count values.
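The follow-up query mentioned above could be sketched roughly as below. This is only an illustration: the helper name buildMaxCountFilter is made up for this example, and the groups array simply copies the "result" values from the aggregation output shown earlier.

```javascript
// Hypothetical helper: turn the $group output into one query filter.
// Each { _id, maxCount } pair becomes a { text, count } clause inside $or.
function buildMaxCountFilter(groups) {
  return {
    $or: groups.map(function (g) {
      return { text: g._id, count: g.maxCount };
    })
  };
}

// Values copied from the "result" array of the aggregation above.
var groups = [
  { _id: "other text", maxCount: 4 },
  { _id: "here is text", maxCount: 13 }
];
var filter = buildMaxCountFilter(groups);

// In the mongo shell, the original documents could then be fetched with:
// db.text.find(filter)
```

Note that if several documents share the same text and the same maximum count, this query returns all of them. MongoDB also rejects an empty $or array, so the aggregation result should be checked for emptiness first.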

As an alternative, you can first run the 'distinct' command to return an array of all the distinct values and then run a query for each value with sort and limit to return the document with the highest value of 'count'. The sort() and limit() methods are explained in the "Cursor Methods" section of the "Advanced Queries" documentation: http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-CursorMethods

> var values = db.runCommand({distinct:"text", key:"text"}).values
> values
[ "here is text", "other text" ]
> values.forEach(function(v){ db.text.find({"text":v}).sort({count:-1}).limit(1).forEach(printjson); })
{
    "_id" : ObjectId("4f7b50b2df77a5e0fd4ccbf1"),
    "text" : "here is text",
    "count" : 13,
    "somefield" : "value"
}
{
    "_id" : ObjectId("4f7b50b2df77a5e0fd4ccbf4"),
    "text" : "other text",
    "count" : 4,
    "somefield" : "value"
}
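
Both approaches above take more than one round trip to the server. As a possible third option, and assuming a MongoDB version that supports the $$ROOT variable (2.6 or later), a single pipeline might sort first and then keep the first whole document in each group. The field name doc is just a placeholder chosen for this sketch.

```javascript
// Sketch of a single-pipeline alternative; assumes MongoDB 2.6+ for $$ROOT.
var pipeline = [
  // Order documents so the highest count comes first for each text value.
  { $sort: { text: 1, count: -1 } },
  // Keep the first (highest-count) whole document per distinct text.
  { $group: { _id: "$text", doc: { $first: "$$ROOT" } } }
];

// In the mongo shell:
// db.text.aggregate(pipeline)
```

Each result would then carry the distinct text in _id and the complete original document, including its own _id, under doc.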

It is unclear if this is exactly what you are trying to do, but I hope that it will at least give you some ideas to get started. If I have misunderstood, please explain in more detail the exact operation that you would like to perform, and hopefully I or another member of the Community will be able to help you out. Thanks.



Source: https://stackoverflow.com/questions/9998915/how-to-use-distinct-sort-limit-with-mongodb
