I have a collection that is a log of activity on objects, like this:
{
"_id" : ObjectId("55e3fd1d7cb5ac9a458b4567"),
"object_id" : "1",
"ac
The most "performant" way to do this is to skip the $unwind altogether and simply $group to count. Essentially, you "filter" each array, take the $size of the result, and $sum those sizes across all matched documents:
db.objects.aggregate([
{ "$match": {
"createddate": {
"$gte": ISODate("2015-08-30T00:00:00.000Z")
},
"activity.action": "test_action"
}},
{ "$group": {
"_id": null,
"count": {
"$sum": {
"$size": {
"$setDifference": [
{ "$map": {
"input": "$activity",
"as": "el",
"in": {
"$cond": [
{ "$eq": [ "$$el.action", "test_action" ] },
"$$el",
false
]
}
}},
[false]
]
}
}
}
}}
])
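To make the per-document logic concrete, here is a minimal sketch in plain JavaScript of what the pipeline above computes. The sample documents, the field `n`, and the helper name `countMatchingActions` are made up for illustration; only `createddate`, `activity` and `action` come from the question. One caveat worth knowing: `$setDifference` treats arrays as sets, so fully identical array entries inside one document would collapse into a single entry, whereas this sketch keeps duplicates.

```javascript
// Hypothetical sample documents; the values are invented for illustration.
const docs = [
  {
    object_id: "1",
    createddate: new Date("2015-08-31T00:00:00Z"),
    activity: [
      { action: "test_action", n: 1 },
      { action: "other_action", n: 2 },
      { action: "test_action", n: 3 },
    ],
  },
  {
    object_id: "2",
    createddate: new Date("2015-09-01T00:00:00Z"),
    activity: [{ action: "test_action", n: 4 }],
  },
  {
    object_id: "3",
    createddate: new Date("2015-08-01T00:00:00Z"), // older than the cutoff
    activity: [{ action: "test_action", n: 5 }],
  },
];

// Hypothetical helper mirroring the $match + $group stages.
function countMatchingActions(documents, since, action) {
  return documents
    // $match: date range, and at least one array entry with the action
    .filter(d => d.createddate >= since && d.activity.some(a => a.action === action))
    // $map/$cond + $setDifference + $size: matching entries per document
    .map(d => d.activity.filter(a => a.action === action).length)
    // $group with $sum: add the per-document sizes together
    .reduce((total, n) => total + n, 0);
}

const total = countMatchingActions(docs, new Date("2015-08-30T00:00:00Z"), "test_action");
console.log(total); // 3
```

Each document is visited once and contributes a single number, which is why no intermediate copies are needed.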
MongoDB 3.2 and later have $filter, which makes this much simpler:
db.objects.aggregate([
{ "$match": {
"createddate": {
"$gte": ISODate("2015-08-30T00:00:00.000Z")
},
"activity.action": "test_action"
}},
{ "$group": {
"_id": null,
"count": {
"$sum": {
"$size": {
"$filter": {
"input": "$activity",
"as": "el",
"cond": {
"$eq": [ "$$el.action", "test_action" ]
}
}
}
}
}
}}
])
Using $unwind de-normalizes the documents, effectively creating a copy of each document per array entry. Where possible, you should avoid this due to the often extreme cost. Filtering and counting array entries per document is much faster by comparison, as is a simple $match and $group pipeline compared to one with many stages.
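As a rough illustration of that cost, the following plain JavaScript sketch emulates `$unwind` with a hypothetical `unwind` helper and compares it with counting per document. The sample data is invented; it only shows how many intermediate documents each approach produces, not real server timings.

```javascript
// Hypothetical documents, made up for illustration.
const docs = [
  { _id: 1, activity: [{ action: "test_action" }, { action: "other" }, { action: "test_action" }] },
  { _id: 2, activity: [{ action: "other" }, { action: "test_action" }] },
];

// Rough equivalent of { $unwind: "$activity" }:
// one output document per array entry, each a copy of its parent.
function unwind(documents, field) {
  return documents.flatMap(d => d[field].map(el => ({ ...d, [field]: el })));
}

// Approach 1: unwind, then filter, then count the remaining documents.
const unwound = unwind(docs, "activity");
const viaUnwind = unwound.filter(d => d.activity.action === "test_action").length;

// Approach 2: filter and count inside each document, no copies made.
const viaFilter = docs.reduce(
  (sum, d) => sum + d.activity.filter(a => a.action === "test_action").length,
  0
);

console.log(unwound.length);       // 5 intermediate documents from 2 originals
console.log(viaUnwind, viaFilter); // 3 3
```

Both approaches give the same count, but the unwound form has to materialize one document per array entry before anything is discarded.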
You can do so by using aggregation:
db.objects.aggregate([
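// Select only documents that can contribute to the count
{$match: {"createddate": {$gte: ISODate("2015-08-30T00:00:00.000Z")}, "activity.action": "test_action"}},
// Produce one document per array entry
{$unwind: "$activity"},
// Keep only the entries with the wanted action
{$match: {"activity.action": "test_action"}},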
{$group: {_id: null, count: {$sum: 1}}}
])
This will produce a result like:
{
"_id" : null,
"count" : 4
}