MongoDB not using my index

Posted by 心已入冬 on 2019-12-11 17:37:01

Question


I have a logs collection with millions of records. Creating a new index takes "forever", so I would prefer to use the existing indexes.

Now I want to count the occurrences of certain error codes. I use this query, and functionally it works fine:

db.getCollection('logs.res').aggregate([
    {
       $match:{    
           timeStamp: {
               $gte: new Date('2017-05-01').getTime(), // timeStamp is Number
               $lt : new Date('2017-05-02').getTime()  // of ms since epoch
           },
           'objData.@.ErrorCode': {
               $ne: null
           }
        }
    },
    {
        $group: {
            _id: '$objData.@.ErrorCode',
            count: {$sum: 1}
        }
    },
    {
        $sort: { count: -1}
    }
]);

The problem is that it takes nearly 10 seconds to execute this for a single day. I had assumed the following index, timeStamp_-1_objData.@.ErrorCode_1, would be used:

{
    "timeStamp" : -1,
    "objData.@.ErrorCode" : 1
}

However, MongoDB seems adamant about using a timeStamp: 1 index (a compound index whose other fields are unrelated to the query) and scanning all of the results to see whether a response has an ErrorCode attached, even though this information should be in the index.

Here is the explain():

  • Is there a way to use the timeStamp_-1_objData.@.ErrorCode_1 index to speed this up?
  • Why isn't it using this index? I'm probably misunderstanding how indexes are used in this query.

Running MongoDB 3.2.7 on OSX.

Note: I've also tried $exists: true instead of $ne: null. It yields the same results, but some say you cannot use $exists if you want to use a compound index. Many questions on Stack Overflow are old (mongo 2.x), though.


Answer 1:


The winning plan is a cached plan. You can try clearing the plan cache:

db.getCollection('logs.res').getPlanCache().clear()

If Mongo is still using the wrong index after you clear the cache, you can try setting the query plan or using "hint" to force your index.
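
As a rough sketch of the "hint" approach: db.collection.aggregate() accepts a hint in its options document from MongoDB 3.6 onward, so on the 3.2.7 server from the question this option may not be available. The index name and pipeline are taken from the question; the variable names pipeline and options are only illustrative. The explain option is included so that the chosen index can be checked in the output.

```javascript
// Pipeline copied from the question: one day of logs, grouped by error code.
var pipeline = [
    {
        $match: {
            timeStamp: {
                $gte: new Date('2017-05-01').getTime(), // ms since epoch
                $lt: new Date('2017-05-02').getTime()
            },
            'objData.@.ErrorCode': { $ne: null }
        }
    },
    { $group: { _id: '$objData.@.ErrorCode', count: { $sum: 1 } } },
    { $sort: { count: -1 } }
];

// Assumption: the server supports hint on aggregate (MongoDB 3.6+).
// The hint may be given as an index name or as the index key document.
var options = {
    hint: 'timeStamp_-1_objData.@.ErrorCode_1',
    explain: true // return the query plan instead of the results
};

// Only runs inside the mongo shell, where `db` is defined.
if (typeof db !== 'undefined') {
    printjson(db.getCollection('logs.res').aggregate(pipeline, options));
}
```

Once the plan shows the expected index, dropping explain from the options runs the aggregation itself.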




Answer 2:


Regular MongoDB indexes use both the field value and its type to build the tree.

Queries like $exists: true or $ne: null don't have a parameter of any type and cannot benefit from such indexes. This is a special case and requires a special sparse index.

If your timeStamp_-1_objData.@.ErrorCode_1 index is created as:

db.getCollection('logs.res').createIndex(
    {
        "timeStamp" : -1,
        "objData.@.ErrorCode" : 1
    },
    { sparse: true }
)

it should support your query best. Otherwise, there is not much difference between timeStamp_-1_objData.@.ErrorCode_1 and timeStamp_1_module_1_etc, since only the first field is being used.
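
Before rebuilding anything, it may help to check whether the existing index was already created with the sparse option. The sketch below inspects the output of getIndexes(); the helper names findIndex and isSparse and the sample data are hypothetical and only serve the illustration. Note that, per the MongoDB documentation, a sparse compound index on ascending/descending keys still indexes any document that contains at least one of the keys, so it is worth confirming with explain() that a sparse variant actually changes the plan.

```javascript
// Returns the index description with the given name, or null if absent.
function findIndex(indexes, name) {
    for (var i = 0; i < indexes.length; i++) {
        if (indexes[i].name === name) {
            return indexes[i];
        }
    }
    return null;
}

// getIndexes() only includes `sparse` when the option was set at creation.
function isSparse(index) {
    return index !== null && index.sparse === true;
}

// Hypothetical sample of getIndexes() output, for illustration only.
var sampleIndexes = [
    { v: 1, key: { _id: 1 }, name: '_id_' },
    {
        v: 1,
        key: { timeStamp: -1, 'objData.@.ErrorCode': 1 },
        name: 'timeStamp_-1_objData.@.ErrorCode_1'
    }
];

// In the mongo shell use the real collection; elsewhere fall back to the sample.
var indexes = (typeof db !== 'undefined')
    ? db.getCollection('logs.res').getIndexes()
    : sampleIndexes;

// true only if the index exists and was created with { sparse: true }
var sparse = isSparse(findIndex(indexes, 'timeStamp_-1_objData.@.ErrorCode_1'));
```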



Source: https://stackoverflow.com/questions/46326674/mongodb-not-using-my-index
