Group Mongo documents by id and get the latest document by timestamp

前端 未结 3 1415
孤独总比滥情好
孤独总比滥情好 2021-01-13 09:44

Imagine we have the following set of documents stored in mongodb:

{ \"fooId\" : \"1\", \"status\" : \"A\", \"timestamp\" : ISODate(\"2016-01-01T00:00:00.000Z         


        
相关标签:
3条回答
  • 2021-01-13 09:52

    Though there is no direct way to bring back original documents and I don't see any value, but try following aggregation query:

    db.collection.aggregate([
       {$sort: {fooId:1, timestamp: -1}},
       {$group:{_id:"$fooId", doc:{$first:"$$ROOT"}}},
       {$project:{_id:0, doc:["$doc"]}}
    ]).forEach(function(item){
    
      printjson(item.doc[0]);
    
    });
    

    This query will emit:

    { 
        "_id" : ObjectId("570e76d5e94e6584078f02c4"), 
        "fooId" : "2", 
        "status" : "B", 
        "timestamp" : ISODate("2016-01-02T00:00:00.000+0000"), 
        "otherInfo" : "BAR"
    }
    { 
        "_id" : ObjectId("570e76d5e94e6584078f02c8"), 
        "fooId" : "3", 
        "status" : "D", 
        "timestamp" : ISODate("2016-01-04T00:00:00.000+0000"), 
        "otherInfo" : "BAR"
    }
    { 
        "_id" : ObjectId("570e76d5e94e6584078f02c2"), 
        "fooId" : "1", 
        "status" : "C", 
        "timestamp" : ISODate("2016-01-03T00:00:00.000+0000"), 
        "otherInfo" : "BAR"
    }
    
    0 讨论(0)
  • 2021-01-13 10:03

    If you are doing and aggregation, you need to do similar to SQL , which mean specify the aggregation operation per column, the only option you have is use the $$ROOT operator

    db.test.aggregate(
       [
        { $sort: { timestamp: 1 } },
         {
           $group:
             {
               _id: "$fooId",
               timestamp: { $last: "$$ROOT" }
             }
         }
       ]
    );
    

    But that will change the output a little bit

    { "_id" : "1", "timestamp" : { "_id" : ObjectId("570e6be3e81c8b195818e7fa"), 
      "fooId" : "1", "status" : "A", "timestamp" :ISODate("2016-01-01T00:00:00Z"), 
      "otherInfo" : "BAR" } }
    

    If you want to return the original document format, you probably need a $project stage after that

    0 讨论(0)
  • 2021-01-13 10:15

    You can use the $$ROOT system variable with the $last operator to return the last document.

    db.collectionName.aggregate([      
        { "$sort": { "timestamp": 1 } },     
        { "$group": { 
            "_id": "$fooId",   
            "last_doc": { "$last": "$$ROOT" } 
        }}
    ])
    

    Of course this will the last document for each group as a value of a field.

    {
            "_id" : "2",
            "doc" : {
                    "_id" : ObjectId("570e6df92f5bb4fcc8bb177e"),
                    "fooId" : "2",
                    "status" : "B",
                    "timestamp" : ISODate("2016-01-02T00:00:00Z")
            }
    }
    

    If you are not happy with that output then your best bet will be to add another $group stage to the pipeline when you simply return an array of those documents using the $push accumulator operator.

    db.collectionName.aggregate([      
        { "$sort": { "timestamp": 1 } },     
        { "$group": { 
            "_id": "$fooId",   
            "last_doc": { "$last": "$$ROOT" } 
        }},
        { "$group": { 
            "_id": null, 
            "result": { "$push": "$last_doc" } 
        }}
    
    ])
    
    0 讨论(0)
提交回复
热议问题