Mongo aggregate query results in less document with sorting

随声附和 提交于 2020-05-16 22:40:46

问题


I have faced a weird issue while querying one of our collections and aggregating the result out to another collection. I am querying unique users who have done some action and projecting the count of action performed per user in the aggregation query.

var result = db.getCollection('user_actions').aggregate(
[
  {"$match":{"createdAt":{"$gte":1585161000000,"$lt":1585247340000}}},
  {"$group":{"_id":{"accountId":"$user.id"},"count":{"$sum":1}}},
  {"$sort": {"count": -1}},
  {"$project":{"_id":0,"count":"$count","accountId":"$_id.accountId"}}
]
)
db.getCollection('winners').insert(result.toArray())
db.getCollection('winners').find({}).count()

The result of the above query inserts 12643 documents in the newly created collection.

I have also tried the query with allowDiskUse and cursor initial batch size value as well. But the stats of the collection as an output of the aggregation pipeline remained the same and neither am I getting any exceptions.

{
    "ns" : "data.winners",
    "size" : 1121979,
    "count" : 12639,
    "avgObjSize" : 88,
    "storageSize" : 4096,
    "capped" : false,
    "wiredTiger" : {...},
    "nindexes" : 1,
    "totalIndexSize" : 4096,
    "indexSizes" : {
        "_id_" : 4096
    },
    "ok" : 1,
    "operationTime" : Timestamp(1585321431, 41),
    "$gleStats" : {...},
    "lastCommittedOpTime" : Timestamp(1585321431, 41),
    "$configServerState" : {...},
    "$clusterTime" : {...   }
}

That was more of a plot setting, now, when I remove the sort query from the aggregation, it produces 26397 documents as a result. I could pick up some individual accounts from this result and verify that this is the correct result I should have got in the first place(sorted additionally). I have tried looking into MongoDB Limits and Thresholds but couldn't really find a relation to the observation.

Database version

db.version()
4.0.0

Shell details

mongo -version
MongoDB shell version v4.2.1
git version: edf6d45851c0b9ee15548f0f847df141764a317e
allocator: system
modules: none
build environment:
    distarch: x86_64
    target_arch: x86_64

Edit: Worth mentioning as discussed with Valijon in comments, using another stage in the aggregation pipeline to count the documents processed:

{"$count": "winners"}

results in the same value 26397 with and without the {"$sort": {"count": -1}} stage.


It doesn't look to me like a sort memory limit behaviour either since that would have thrown an error according to the documentation.

来源:https://stackoverflow.com/questions/60897584/mongo-aggregate-query-results-in-less-document-with-sorting

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!