(How) can aggregate() break an index?

拜拜、爱过 提交于 2019-12-13 03:58:46

问题


(How) can this code break an index on a MongoDB collection?

db.users.aggregate([
   { $group: { _id: null, total_orders: { $sum: { $size: "$orders" } } } },
])

This is (obviously) trying to count all the orders of all the users.

I have inherited a project with this code commented out. I need to uncomment (re-enable) the code, but would like to understand what I risk by doing so. The author of this change is unavailable, and their commit message for the commenting out is "remove statistics calls - breaking index."

The MongoDB we're using is via mlab.com, and hosted on AWS. We're connecting to mlab from a JS app using mongoose.


回答1:


I suspect that the issue wasn't that the aggregation would break an index, but instead that the aggregation did not use indexes and would perform a collection scan.

Aggregations can take advantage of indexes when there are $match and/or $sort stages placed at the beginning of a pipeline. This aggregation is only a single $group stage, meaning the entire collection would need to be iterated through to compute the count.

I put a simple example below showing the aggregation performing a collection scan, even when the array field is indexed.

> db.foo.insert({ "x" : [ 1, 2 ] } )
> db.foo.insert({ "x" : [ 1 ] } )
> db.foo.createIndex({ "x" : 1 } )
...

> db.foo.aggregate([ { $group: { _id: null, cnt: { $sum : { $size: "$x" } } } } ] )
{ "_id" : null, "cnt" : 3 }

// Results of a .explain() - see 'winningPlan' below
> db.foo.explain(true).aggregate([ { $group: { _id: null, cnt: { $sum : { $size: "$x" } } } } ] )
{
    "stages" : [
        {
            "$cursor" : {
                "query" : {

                },
                "fields" : {
                    "x" : 1,
                    "_id" : 0
                },
                "queryPlanner" : {
                    "plannerVersion" : 1,
                    "namespace" : "stack.foo",
                    "indexFilterSet" : false,
                    "parsedQuery" : {

                    },
                    "winningPlan" : {
                        "stage" : "COLLSCAN",
                        "direction" : "forward"
                    },
                    "rejectedPlans" : [ ]
                },
                "executionStats" : {
                    "executionSuccess" : true,
                    "nReturned" : 2,
                    "executionTimeMillis" : 0,
                    "totalKeysExamined" : 0,
                    "totalDocsExamined" : 2,
                    "executionStages" : {
                        "stage" : "COLLSCAN",
                        "nReturned" : 2,
                        "executionTimeMillisEstimate" : 0,
                        "works" : 4,
                        "advanced" : 2,
                        "needTime" : 1,
                        "needYield" : 0,
                        "saveState" : 1,
                        "restoreState" : 1,
                        "isEOF" : 1,
                        "invalidates" : 0,
                        "direction" : "forward",
                        "docsExamined" : 2
                    },
                    "allPlansExecution" : [ ]
                }
            }
        },
        {
            "$group" : {
                "_id" : {
                    "$const" : null
                },
                "cnt" : {
                    "$sum" : {
                        "$size" : [
                            "$x"
                        ]
                    }
                }
            }
        }
    ],
    "ok" : 1,
    ...
}


来源:https://stackoverflow.com/questions/57927912/how-can-aggregate-break-an-index

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!