Update field type in mongo

Submitted by 此生再无相见时 on 2019-12-22 08:12:41

Question


I have a huge number of records in a collection:

{field: [value]}

How can I efficiently update to:

{field: value}

I've tried something like this: (pymongo syntax)

collection.update({"field.1": {"$exists": True}},
                  {"$set": {'field': "field.1"}},
                  multi=True)

which apparently does not work. Looping over every record to remove and re-insert it is not an option because of the large number of records.


Answer 1:


You need to loop over the cursor and update each document with the $set update operator, because a classic update document cannot reference the value of another field. To do this efficiently, send the updates as "bulk" operations. The exact approach depends on your MongoDB server version and your PyMongo version.

From MongoDB 3.2 onward, use Bulk Write Operations via the bulkWrite() method:

var requests = [];
// "field.0" matches every non-empty array; "field.1" would skip single-element arrays
var cursor = db.collection.find( { "field.0": { "$exists": true } }, { "field": 1 } );
cursor.forEach( document => { 
    requests.push({
        "updateOne": {
            "filter" : { "_id": document._id },
            // $set is the operator; the field name goes inside it
            "update" : { "$set": { "field": document.field[0] } }
        }
    });
    if (requests.length === 1000) {
        db.collection.bulkWrite(requests);
        requests = [];
    }
});

if (requests.length > 0) {
    db.collection.bulkWrite(requests);
}

With PyMongo 3.0 or newer, which provides the bulk_write() method, the same operation looks like this:

from pymongo import UpdateOne


requests = []
# "field.0" matches every non-empty array; "field.1" would skip single-element arrays
cursor = db.collection.find({"field.0": {"$exists": True}}, {"field": 1})
for document in cursor:
    requests.append(UpdateOne({'_id': document['_id']}, {'$set': {'field': document['field'][0]}}))
    if len(requests) == 1000:
        # Execute per 1000 operations
        db.collection.bulk_write(requests)
        requests = []
if len(requests) > 0:
    # Flush the remaining operations
    db.collection.bulk_write(requests)
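
The "collect 1000 operations, send, reset" bookkeeping can also be factored out of the loop. The following is a minimal sketch, not part of the original answer: in_batches is a hypothetical helper name, and it works on any iterable, so it can be tried without a database.

```python
from itertools import islice


def in_batches(operations, size=1000):
    """Yield lists of at most `size` items from any iterable."""
    iterator = iter(operations)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Hypothetical usage with the cursor and the UpdateOne import from the example above:
# ops = (UpdateOne({"_id": d["_id"]}, {"$set": {"field": d["field"][0]}})
#        for d in cursor)
# for batch in in_batches(ops):
#     db.collection.bulk_write(batch, ordered=False)
```

Because the generator never yields an empty list, no separate "flush the remainder" step is needed. Passing ordered=False is a choice made here, not something the answer prescribes; it lets the server keep applying the rest of a batch if one update fails.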

On MongoDB 2.6 and later versions prior to 3.2, use the now-deprecated Bulk API:

var bulk = db.collection.initializeUnorderedBulkOp();
var count = 0;

// cursor is the same as in the previous version using MongoDB 3.2
cursor.forEach(function(document) { 
    bulk.find( { "_id": document._id } ).updateOne( { "$set": { "field": document.field[0] } } ); 
    count++;
    if (count % 1000 === 0) {
        bulk.execute();
        bulk = db.collection.initializeUnorderedBulkOp();
    }
});

// Flush the remaining operations; executing an empty bulk would throw
if (count % 1000 !== 0) {
    bulk.execute();
}

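Translated into Python, this gives the following. The cursor is the same as in the bulk_write() example; note that PyMongo deprecated this API in 3.5 and removed it in 4.0.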

bulk = db.collection.initialize_unordered_bulk_op()
count = 0

for doc in cursor:
    bulk.find({'_id': doc['_id']}).update_one({'$set': {'field': doc['field'][0]}})
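    count += 1
    if count % 1000 == 0:
        # Execute per 1000 operations and start a fresh batch
        bulk.execute()
        bulk = db.collection.initialize_unordered_bulk_op()

# Flush the remaining operations; executing an empty bulk raises an error
if count % 1000 != 0:
    bulk.execute()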



Answer 2:


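If your arrays have only one element, your filter matches nothing, because array indexes are zero-based in MongoDB (as in JavaScript, which heavily influenced it). The filter needs to test "field.0":

collection.update({"field.0": {"$exists": True}},
                  {"$set": {'field': "field.0"}},
                  multi=True)

Note that this only fixes the filter. $set treats "field.0" as a literal string, so the matched documents end up holding the text "field.0" rather than the array's first element. Copying the value needs a different update, such as the per-document updates in answer 1.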
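
On MongoDB 4.2 or newer, an update can take an aggregation pipeline instead of a plain update document, which lets $set read the document's current value on the server and avoids the client-side loop. Neither answer covers this; the following is a minimal sketch, assuming PyMongo 3.9 or newer. unwrap_first_element is a hypothetical helper (not part of PyMongo) that only builds the two arguments, so it can be inspected without a server.

```python
def unwrap_first_element(field):
    """Build the filter and update pipeline that replace an array field
    with its first element (hypothetical helper, not part of PyMongo)."""
    query = {field + ".0": {"$exists": True}}
    pipeline = [{"$set": {field: {"$arrayElemAt": ["$" + field, 0]}}}]
    return query, pipeline


# Assumes `collection` is a PyMongo Collection on a MongoDB 4.2+ server:
# query, pipeline = unwrap_first_element("field")
# result = collection.update_many(query, pipeline)
# print(result.modified_count)
```

The pipeline is passed as a list, which is how update_many distinguishes it from a classic update document. Inside the pipeline, "$field" is a field path that resolves to the stored array, unlike the literal string in the classic form.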


Source: https://stackoverflow.com/questions/36429475/update-field-type-in-mongo
