I have collection in which documents are like:
{
_id: ObjectId(),
user: ObjectId(),
studentName: Str
This is a good example of need to group data by specific filed (createdAt) and then compare result set match criteria.
Code:
db.student.aggregate([{
$group : {
_id : "$user",
created : {
$max : "$createdAt"
},
documents : {
$push : "$$ROOT"
}
}
}, {
$project : {
_id : 0,
documents : {
$filter : {
input : "$documents",
as : "item",
cond : {
$eq : ["$$item.createdAt", "$created"]
}
}
}}
}, {
$match : {
"documents.isAbandoned" : true
}},
{ $unwind : "$documents" },
{
$project : {
_id : "$documents._id",
user : "$documents.user",
studentName : "$documents.studentName",
createdAt : "$documents.createdAt",
isAbandoned : "$documents.isAbandoned",
}}
])
The best way to do this is using the aggregation framework. You need to $group your documents by "user" and return the last document for each user using the $last accumulator operator but for this to work, you need a preliminary sorting stage using the $sort aggregation pipeline operator. To sort your documents, you need to consider both the "createdAt" field and the "user" field.
The last stage in the pipeline is the $match stage where you select only those last documents where "isAbandoned" equals true
.
db.students.aggregate([
{ "$sort": { "user": 1, "createdAt": 1 } },
{ "$group": {
"_id": "$user",
"last": { "$last": "$$ROOT" }
}},
{ "$match": { "last.isAbandoned": true } }
])
which returns something like this:
{
"_id" : ObjectId("56c85244bd5f92cd78ae4bc1"),
"last" : {
"_id" : ObjectId("56cee51503b7cb7b0eda9c4c"),
"user" : ObjectId("56c85244bd5f92cd78ae4bc1"),
"studentName" : "Rajeev",
"createdAt" : ISODate("2016-02-25T11:27:17.281Z"),
"isAbandoned" : true
}
}
To get the expected result, we need to use the $replaceRoot pipeline operator starting from verion 3.4 to promote the embedded document to the top level
{
$replaceRoot: { newRoot: "$last" }
}
In older version, you need to use the $project aggregation pipeline operation to reshape our documents. So if we extend our pipeline with the following stage:
{
"$project": {
"_id": "$last._id",
"user": "$last.user",
"studentName": "$last.studentName",
"createdAt": "$last.createdAt",
"isAbandoned": "$last.isAbandoned"
}}
it produces the expected output:
{
"_id" : ObjectId("56cee51503b7cb7b0eda9c4c"),
"user" : ObjectId("56c85244bd5f92cd78ae4bc1"),
"studentName" : "Rajeev",
"createdAt" : ISODate("2016-02-25T11:27:17.281Z"),
"isAbandoned" : true
}