Question
I have a logs collection with millions of records. Creating a new index takes "forever", so I would prefer to use existing indexes.
Now I want to count the occurrences of certain error codes. I use this query, and functionally it works fine:
db.getCollection('logs.res').aggregate([
    {
        $match: {
            timeStamp: {
                $gte: new Date('2017-05-01').getTime(), // timeStamp is Number
                $lt: new Date('2017-05-02').getTime()   // of ms since epoch
            },
            'objData.@.ErrorCode': {
                $ne: null
            }
        }
    },
    {
        $group: {
            _id: '$objData.@.ErrorCode',
            count: { $sum: 1 }
        }
    },
    {
        $sort: { count: -1 }
    }
]);
The problem is that it takes nearly 10 seconds to execute this for a single day. I had assumed the following index, timeStamp_-1_objData.@.ErrorCode_1, would be used:
{
    "timeStamp" : -1,
    "objData.@.ErrorCode" : 1
}
However, MongoDB seems adamant about using some timeStamp: 1 index (along with other fields unrelated to the query) and scanning all of the results to see whether any response has an ErrorCode attached, even though this information should be in the index.
Here is the explain():
- Is there a way to use the timeStamp_-1_objData.@.ErrorCode_1 index to speed this up?
- Why isn't it using this index? I'm probably misunderstanding how indexes are used in this query.
Running MongoDB 3.2.7 on OSX.
Note: I've also tried $empty: true instead of $ne: null. It yields the same results, but some say you cannot use $empty if you want to use a compound index. Many of the questions on Stack Overflow are old (Mongo 2.x), though.
Answer 1:
The winning plan is a cached plan.
You can try clearing the plan cache:
db.getCollection('logs.res').getPlanCache().clear()
If Mongo is still using the wrong index after you clear the cache, you can try setting the query plan or using hint to force your index.
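As a rough sketch of the hint approach, the snippet below rebuilds the $match filter and pipeline from the question and forces the index by name on a plain find(), so the explain output shows what the planner does when it has no choice. The variable names (indexName, match, pipeline, coll) are only illustrative. The commented-out aggregate() call assumes a server that accepts a hint option for aggregation (3.6 or later, to my knowledge), which would not apply to the 3.2.7 server in the question.

```javascript
// Sketch only: filter and pipeline are copied from the question.
var indexName = 'timeStamp_-1_objData.@.ErrorCode_1';

var match = {
    timeStamp: {
        $gte: new Date('2017-05-01').getTime(), // ms since epoch (UTC midnight)
        $lt: new Date('2017-05-02').getTime()
    },
    'objData.@.ErrorCode': { $ne: null }
};

var pipeline = [
    { $match: match },
    { $group: { _id: '$objData.@.ErrorCode', count: { $sum: 1 } } },
    { $sort: { count: -1 } }
];

// `db` only exists inside the mongo shell, so guard the server calls.
if (typeof db !== 'undefined') {
    var coll = db.getCollection('logs.res');

    // Force the index on the equivalent find() and inspect the plan.
    printjson(coll.find(match).hint(indexName).explain('executionStats'));

    // Assumption: only on servers whose aggregate() accepts a hint option.
    // printjson(coll.aggregate(pipeline, { hint: indexName }).toArray());
}
```

Comparing totalKeysExamined and totalDocsExamined in the explain output with and without the hint should show whether the compound index actually saves any document fetches.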
Answer 2:
Regular MongoDB indexes use both the field value and its type to build the tree.
Queries like $empty: true or $ne: null don't have a parameter of any type and cannot benefit from such indexes. This is a special case and requires a special sparse index.
If your timeStamp_-1_objData.@.ErrorCode_1 index is created as:
db.getCollection('logs.res').createIndex(
    {
        "timeStamp" : -1,
        "objData.@.ErrorCode" : 1
    },
    { sparse: true }
)
it should support your query best. Otherwise there is not much difference between timeStamp_-1_objData.@.ErrorCode_1 and timeStamp_1_module_1_etc, since only the first field is being used.
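Before rebuilding anything, it may be worth checking how the existing index was actually created. The sketch below looks it up by name in the output of getIndexes() and reports whether it carries the sparse flag. The helper describeIndex and the sampleIndexes data (including the test.logs.res namespace) are made up for illustration and are only used when the code runs outside the mongo shell.

```javascript
// Hypothetical helper: summarize one index from getIndexes()-style output.
function describeIndex(indexes, name) {
    for (var i = 0; i < indexes.length; i++) {
        if (indexes[i].name === name) {
            return {
                found: true,
                // getIndexes() only includes `sparse` when it was set.
                sparse: indexes[i].sparse === true,
                key: indexes[i].key
            };
        }
    }
    return { found: false, sparse: false, key: null };
}

// Made-up sample in the shape getIndexes() returns; not real data.
var sampleIndexes = [
    { v: 1, key: { _id: 1 }, name: '_id_', ns: 'test.logs.res' },
    {
        v: 1,
        key: { timeStamp: -1, 'objData.@.ErrorCode': 1 },
        name: 'timeStamp_-1_objData.@.ErrorCode_1',
        ns: 'test.logs.res'
    }
];

// Use the real collection inside the mongo shell, the sample elsewhere.
var indexes = (typeof db !== 'undefined')
    ? db.getCollection('logs.res').getIndexes()
    : sampleIndexes;

var info = describeIndex(indexes, 'timeStamp_-1_objData.@.ErrorCode_1');
if (typeof printjson !== 'undefined') {
    printjson(info);
}
```

Since creating an index on this collection reportedly takes a very long time, confirming the current definition first avoids an unnecessary rebuild.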
Source: https://stackoverflow.com/questions/46326674/mongodb-not-using-my-index