mongodb multikey index seems like not being used when sort

那年仲夏 提交于 2019-12-11 19:13:31

问题


Let's assume that I have tx_collection which has 3 documents like below

{
    "block_number": 1,
    "value": 122
    "transfers": [
        {
            "from": "foo1", 
            "to": "bar1", 
            "amount": 111
        },
        {
            "from": "foo3", 
            "to": "bar3", 
            "amount": 11
        },
    ]
},
{
    "block_number": 2,
    "value": 88
    "transfers": [
        {
            "from": "foo11", 
            "to": "bar11", 
            "amount": 33
        },
        {
            "from": "foo22", 
            "to": "bar22", 
            "amount": 55
        },
    ]
},
{
    "block_number": 3,
    "value": 233
    "transfers": [
        {
            "from": "foo1", 
            "to": "bar1", 
            "amount": 33
        },
        {
            "from": "foo3", 
            "to": "bar3", 
            "amount": 200
        },
    ]
}

For the performance issue, I create multikey index on transfers.amount

When I sort by transfers.amount,

db.getCollection('tx_transaction').find({}).sort({"transfers.amount":-1})

what I expected order of documents is sorted by max value of subfield transfers.amount like

{
    "block_number": 3,
    "value": 233
    "transfers": [
        {
            "from": "foo1", 
            "to": "bar1", 
            "amount": 33
        },
        {
            "from": "foo3", 
            "to": "bar3", 
            "amount": 200
        },
    ]
},
{
    "block_number": 1,
    "value": 122
    "transfers": [
        {
            "from": "foo1", 
            "to": "bar1", 
            "amount": 111
        },
        {
            "from": "foo3", 
            "to": "bar3", 
            "amount": 11
        },
    ]
},
{
    "block_number": 2,
    "value": 88
    "transfers": [
        {
            "from": "foo11", 
            "to": "bar11", 
            "amount": 33
        },
        {
            "from": "foo22", 
            "to": "bar22", 
            "amount": 55
        },
    ]
}

The sort works well since there are only 3 documents. Sorted order is block number 3 -> block number 1 -> block_number 2 which I expected

My issue is that when there is 19 million documents, it throws error message

The massage is like

"errmsg" : "Executor error during find command: OperationFailed: Sort operation used more than the maximum 33554432 bytes of RAM. Add an index, or specify a smaller limit.",

It seems that multikey index is not used when sort.

do you have any idea why this error message is thrown?

JFYI.

  • My mongodb version is 3.6.3
  • tx_collection is sharded

回答1:


As of MongoDB 3.6 and newer, I think this is to be expected as mentioned in Use Indexes to Sort Query Results where it stated:

As a result of changes to sorting behavior on array fields in MongoDB 3.6, when sorting on an array indexed with a multikey index the query plan includes a blocking SORT stage. The new sorting behavior may negatively impact performance.

In a blocking SORT, all input must be consumed by the sort step before it can produce output. In a non-blocking, or indexed sort, the sort step scans the index to produce results in the requested order.

In other words, "blocking sort" means the presence of the SORT_KEY_GENERATOR stage, the stage that means in-memory sort. This was changed from pre-3.6 MongoDB due to SERVER-19402 to address the inconsistencies around sorting an array field.

There is a ticket to improve this situation: SERVER-31898. Unfortunately there is no workaround for this behaviour just yet.



来源:https://stackoverflow.com/questions/58314180/mongodb-multikey-index-seems-like-not-being-used-when-sort

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!