PyMongo: selecting sub-documents from a collection by regex

笑着哭i posted on 2019-12-13 06:17:05

Question


Let's take the following documents in a collection as an example:

{
    '_id': '0',
    'docs': [
        {'value': 'abcd', 'key': '1234'},
        {'value': 'abef', 'key': '5678'}
    ]
}
{
    '_id': '1',
    'docs': [
        {'value': 'wxyz', 'key': '1234'},
        {'value': 'abgh', 'key': '5678'}
    ]
}

I want to select only the sub-documents in the 'docs' list whose 'value' contains the string 'ab'. The result I expect is:

{
    '_id': '0',
    'docs': [
        {'value': 'abcd', 'key': '1234'},
        {'value': 'abef', 'key': '5678'}
    ]
}
{
    '_id': '1',
    'docs': [
        {'value': 'abgh', 'key': '5678'}
    ]
}

That is, the sub-documents that do not match are filtered out.


Answer 1:


You need an aggregation pipeline that matches each subdocument separately, then re-joins the matching subdocuments into arrays:

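from pprint import pprint
from bson import Regex

# Unanchored pattern: matches any 'value' that contains 'ab'.
regex = Regex(r'ab')
pprint(list(col.aggregate([{
    # Emit one document per element of the 'docs' array.
    '$unwind': '$docs'
}, {
    # Keep only the unwound sub-documents whose value matches.
    '$match': {'docs.value': regex}
}, {
    # Reassemble the surviving sub-documents into one array per _id.
    '$group': {
        '_id': '$_id',
        'docs': {'$push': '$docs'}
    }
}])))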

I assume "col" is a variable pointing to your PyMongo Collection object. This outputs:

[{u'_id': u'1', 
  u'docs': [{u'key': u'5678', u'value': u'abgh'}]},
 {u'_id': u'0',
  u'docs': [{u'key': u'1234', u'value': u'abcd'},
            {u'key': u'5678', u'value': u'abef'}]}]
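
To see what each stage of the pipeline above does, here is a rough pure-Python sketch that mimics it on the sample data from the question, without a MongoDB server. The helper name emulate_pipeline is made up for illustration, and Python's re module stands in for the server's regex engine, so treat it as an approximation of the behaviour rather than how MongoDB implements it:

```python
import re

# Sample documents copied from the question.
documents = [
    {'_id': '0', 'docs': [{'value': 'abcd', 'key': '1234'},
                          {'value': 'abef', 'key': '5678'}]},
    {'_id': '1', 'docs': [{'value': 'wxyz', 'key': '1234'},
                          {'value': 'abgh', 'key': '5678'}]},
]


def emulate_pipeline(documents, pattern):
    """Hypothetical helper: mimic $unwind -> $match -> $group in plain Python."""
    regex = re.compile(pattern)

    # $unwind: one intermediate document per element of the 'docs' array.
    unwound = [{'_id': doc['_id'], 'docs': sub}
               for doc in documents for sub in doc['docs']]

    # $match: keep only the intermediate documents whose docs.value matches.
    matched = [d for d in unwound if regex.search(d['docs']['value'])]

    # $group: push the surviving sub-documents back into one array per _id.
    grouped = {}
    for d in matched:
        grouped.setdefault(d['_id'], []).append(d['docs'])
    return [{'_id': _id, 'docs': subs} for _id, subs in grouped.items()]


result = emulate_pipeline(documents, r'ab')
```

One consequence worth noting: a document whose 'docs' array contains no matching sub-document does not appear in the result at all, because nothing from it survives the $match stage to be grouped.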

The "r" prefix makes the string a Python "raw" string, so backslashes in the regex are not interpreted as Python escape sequences. In this case the regex is just "ab", so the "r" prefix isn't necessary, but it's a good habit that prevents mistakes with more complex patterns later.
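
As a small illustration of why the raw prefix matters, here is a sketch with a pattern that does contain a backslash. It uses Python's own re module rather than the server, and the pattern and sample text are made up for the example, but the string-literal behaviour is the same whichever way the pattern is later used:

```python
import re

plain = '\bab'   # Python turns '\b' into a single backspace character
raw = r'\bab'    # the backslash is kept, so the regex engine sees \b (word boundary)

text = 'x abcd'
plain_match = re.search(plain, text)  # looks for backspace + 'ab'
raw_match = re.search(raw, text)      # looks for 'ab' at a word boundary

print(len(plain), len(raw))
print(plain_match, raw_match)
```

The two literals differ in length, and only the raw one behaves as a word-boundary pattern.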



Source: https://stackoverflow.com/questions/40331182/pymongo-selecting-sub-documents-from-collection-by-regex
