Bulk update in Pymongo using multiple ObjectId

早过忘川 submitted on 2019-11-28 01:13:25

Question


I want to update thousands of documents in a mongo collection. I want to find them by ObjectId and update every document that matches. The update is the same for all documents. I have a list of ObjectIds. For every ObjectId in the list, mongo should find the matching document and set the "isBad" key of that document to "N".

ids = [ObjectId('56ac9d3fa722f1029b75b128'), ObjectId('56ac8961a722f10249ad0ad1')]
bulk = db.testdata.initialize_unordered_bulk_op()
bulk.find( { '_id': ids} ).update( { '$set': {  "isBad" : "N" } } )
print bulk.execute()

This gives me the result:

{'nModified': 0, 'nUpserted': 0, 'nMatched': 0, 'writeErrors': [], 'upserted': [], 'writeConcernErrors': [], 'nRemoved': 0, 'nInserted': 0}

This is expected, because it is trying to match "_id" against the whole list. But I don't know how to proceed.

I know how to update each document individually, but my list has on the order of 25000 entries and I do not want to make 25000 separate calls. The collection contains far more documents than that. I am using python2 with pymongo 3.2.1.


Answer 1:


Iterate through the id list using a for loop and send the bulk updates in batches of 500:

bulk = db.testdata.initialize_unordered_bulk_op()
counter = 0

for oid in ids:
    # queue one update per ObjectId
    bulk.find({ '_id': oid }).update({ '$set': { 'isBad': 'N' } })
    counter += 1

    if counter % 500 == 0:
        # send the full batch, then start a fresh unordered bulk op
        bulk.execute()
        bulk = db.testdata.initialize_unordered_bulk_op()

# send whatever is left in the last, partial batch
if counter % 500 != 0:
    bulk.execute()

Because write commands can accept no more than 1000 operations (from the docs), you have to split bulk operations into multiple batches. In this case you can choose an arbitrary batch size of up to 1000.

The reason for choosing 500 is to keep the combined size of the query documents from Bulk.find() and the update documents at or below the maximum BSON document size, since there is no guarantee that a request using the default 1000 operations will fit under the 16MB BSON limit. The Bulk() operations in the mongo shell and comparable methods in the drivers do not have this limit.
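The same batching idea can be sketched with Collection.bulk_write() and UpdateOne, which PyMongo has offered since 3.0 and which replaced initialize_unordered_bulk_op() (deprecated in PyMongo 3.5, removed in 4.0). This is only a sketch under assumptions: the helper names batches and mark_not_bad and the batch_size default of 500 are made up for illustration, and collection is assumed to be a pymongo Collection such as db.testdata.

```python
def batches(items, size):
    """Yield consecutive slices of `items`, each holding at most `size` entries."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def mark_not_bad(collection, ids, batch_size=500):
    """Set isBad to 'N' for every _id in `ids`, one bulk_write call per batch.

    `collection` is assumed to be a pymongo Collection (for example db.testdata).
    Returns the total number of documents matched across all batches.
    """
    # Imported inside the function so batches() stays usable without pymongo.
    from pymongo import UpdateOne

    matched = 0
    for batch in batches(ids, batch_size):
        requests = [UpdateOne({'_id': oid}, {'$set': {'isBad': 'N'}})
                    for oid in batch]
        # ordered=False plays the role of initialize_unordered_bulk_op()
        result = collection.bulk_write(requests, ordered=False)
        matched += result.matched_count
    return matched
```

It would be called as mark_not_bad(db.testdata, ids). An empty id list produces no batches, so bulk_write is never called with an empty request list.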




Answer 2:


I found the answer. It can be done like this:

    bulk = db.testdata.initialize_unordered_bulk_op()
    for oid in ids:
        bulk.find( { '_id': oid }).update({ '$set': {  "isBad" : "N" }})
    print bulk.execute()
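Because the update is identical for every document, another option that is not part of the answers here is a single filter with the $in operator, passed to Collection.update_many(). The sketch below is illustrative only: the function name mark_not_bad_with_in and the batch_size default are assumptions, and the chunking is a precaution so that each filter document stays well below the 16MB BSON limit, not something the server requires at this list size.

```python
def mark_not_bad_with_in(collection, ids, batch_size=1000):
    """Set isBad to 'N' on every document whose _id appears in `ids`.

    `collection` is assumed to be a pymongo Collection (for example db.testdata).
    One update_many call is made per chunk of at most `batch_size` ids.
    Returns the total number of documents matched.
    """
    matched = 0
    for start in range(0, len(ids), batch_size):
        chunk = ids[start:start + batch_size]
        # {'_id': {'$in': chunk}} matches any document whose _id is in the chunk,
        # unlike {'_id': chunk}, which compares _id with the list itself.
        result = collection.update_many({'_id': {'$in': chunk}},
                                        {'$set': {'isBad': 'N'}})
        matched += result.matched_count
    return matched
```

With the 25000 ids from the question and the assumed default of 1000 per chunk, mark_not_bad_with_in(db.testdata, ids) would issue 25 update calls instead of 25000.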


Source: https://stackoverflow.com/questions/35480660/bulk-update-in-pymongo-using-multiple-objectid
