mongodb get distinct records

杀马特。学长 韩版系。学妹 提交于 2019-11-26 09:42:15

问题


I am using mongoDB in which I have collection of following format.

{\"id\" : 1 , name : x  ttm : 23 , val : 5 }
{\"id\" : 1 , name : x  ttm : 34 , val : 1 }
{\"id\" : 1 , name : x  ttm : 24 , val : 2 }
{\"id\" : 2 , name : x  ttm : 56 , val : 3 }
{\"id\" : 2 , name : x  ttm : 76 , val : 3 }
{\"id\" : 3 , name : x  ttm : 54 , val : 7 }

On that collection I have queried to get records in descending order like this:

db.foo.find({\"id\" : {\"$in\" : [1,2,3]}}).sort(ttm : -1).limit(3)

But it gives two records of same id = 1 and I want records such that it gives 1 record per id.

Is it possible in mongodb?


回答1:


There is a distinct command in mongodb, that can be used in conjunction with a query. However, I believe this just returns a distinct list of values for a specific key you name (i.e. in your case, you'd only get the id values returned) so I'm not sure this will give you exactly what you want if you need the whole documents - you may require MapReduce instead.

Documentation on distinct: http://www.mongodb.org/display/DOCS/Aggregation#Aggregation-Distinct




回答2:


You want to use aggregation. You could do that like this:

db.test.aggregate([
    // each Object is an aggregation.
    {
        $group: {
            originalId: {$first: '$_id'}, // Hold onto original ID.
            _id: '$id', // Set the unique identifier
            val:  {$first: '$val'},
            name: {$first: '$name'},
            ttm:  {$first: '$ttm'}
        }

    }, {
        // this receives the output from the first aggregation.
        // So the (originally) non-unique 'id' field is now
        // present as the _id field. We want to rename it.
        $project:{
            _id : '$originalId', // Restore original ID.

            id  : '$_id', // 
            val : '$val',
            name: '$name',
            ttm : '$ttm'
        }
    }
])

This will be very fast... ~90ms for my test DB of 100,000 documents.

Example:

db.test.find()
// { "_id" : ObjectId("55fb595b241fee91ac4cd881"), "id" : 1, "name" : "x", "ttm" : 23, "val" : 5 }
// { "_id" : ObjectId("55fb596d241fee91ac4cd882"), "id" : 1, "name" : "x", "ttm" : 34, "val" : 1 }
// { "_id" : ObjectId("55fb59c8241fee91ac4cd883"), "id" : 1, "name" : "x", "ttm" : 24, "val" : 2 }
// { "_id" : ObjectId("55fb59d9241fee91ac4cd884"), "id" : 2, "name" : "x", "ttm" : 56, "val" : 3 }
// { "_id" : ObjectId("55fb59e7241fee91ac4cd885"), "id" : 2, "name" : "x", "ttm" : 76, "val" : 3 }
// { "_id" : ObjectId("55fb59f9241fee91ac4cd886"), "id" : 3, "name" : "x", "ttm" : 54, "val" : 7 }


db.test.aggregate(/* from first code snippet */)

// output
{
    "result" : [
        {
            "_id" : ObjectId("55fb59f9241fee91ac4cd886"),
            "val" : 7,
            "name" : "x",
            "ttm" : 54,
            "id" : 3
        },
        {
            "_id" : ObjectId("55fb59d9241fee91ac4cd884"),
            "val" : 3,
            "name" : "x",
            "ttm" : 56,
            "id" : 2
        },
        {
            "_id" : ObjectId("55fb595b241fee91ac4cd881"),
            "val" : 5,
            "name" : "x",
            "ttm" : 23,
            "id" : 1
        }
    ],
    "ok" : 1
}

PROS: Almost certainly the fastest method.

CONS: Involves use of the complicated Aggregation API. Also, it is tightly coupled to the original schema of the document. Though, it may be possible to generalize this.




回答3:


The issue is that you want to distill 3 matching records down to one without providing any logic in the query for how to choose between the matching results.

Your options are basically to specify aggregation logic of some kind (select the max or min value for each column, for example), or to run a select distinct query and only select the fields that you wish to be distinct.

querymongo.com does a good job of translating these distinct queries for you (from SQL to MongoDB).

For example, this SQL:

SELECT DISTINCT columnA FROM collection WHERE columnA > 5

Is returned as this MongoDB:

db.runCommand({
    "distinct": "collection",
    "query": {
        "columnA": {
            "$gt": 5
        }
    },
    "key": "columnA"
});



回答4:


I believe you can use aggregate like this

collection.aggregate({
   $group : {
        "_id" : "$id",
        "docs" : { 
            $first : { 
            "name" : "$name",
            "ttm" : "$ttm",
            "val" : "$val",
            }
        } 
    }
});



回答5:


If you want to write the distinct result in a file using javascript...this is how you do

cursor = db.myColl.find({'fieldName':'fieldValue'})

var Arr = new Array();
var count = 0;

cursor.forEach(

function(x) {

    var temp = x.id;    
var index = Arr.indexOf(temp);      
if(index==-1)
   {
     printjson(x.id);
     Arr[count] = temp;
         count++;
   }
})


来源:https://stackoverflow.com/questions/5089162/mongodb-get-distinct-records

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!