Question
I'm working with a remote MongoDB database in my Python code. The code accessing the database and the database itself are on two different machines. The PyMongo module version I'm using is 1.9+. The script consists of the following code:
for s in coll.find({ "somefield.a_date" : { "$exists" : False },
                     "somefield.b_date" : { "$exists" : False } }):
    original = s['details']['c_date']
    utc = from_tz.localize(original).astimezone(pytz.utc)
    s['details']['c_date'] = utc
    if str(type(s['somefield'])) != "<type 'dict'>":
        s['somefield'] = {}
    s['somefield']['b_date'] = datetime.utcnow()
    coll.update({ '_id' : s['_id'] }, s)
After running this code, a strange thing happened. There were millions of records in the collection initially, and after running the script just 29% of the total records remained; the rest were automatically deleted. Is there any known issue with the PyMongo driver version 1.9+? What could the other reasons for this be, and is there any way I can find out what exactly happened?
Answer 1:
What could the other reasons for this be, and is there any way I can find out what exactly happened?
The first thing to check is: were there any exceptions?

In coll.update(), you are not setting the safe parameter. If there is an exception on the update, it will not be thrown.

In your code you do not catch exceptions (which is suggested), and your update does not check for errors, so you have no way of knowing what's going on.
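As a hedged sketch using the legacy PyMongo 1.x API (where update() accepts a safe keyword and, when it is True, waits for the server's acknowledgement, returns the getLastError document, and raises OperationFailure on a server-side error), the write in the question could be made to surface failures like this, reusing coll and s from the loop above:

from pymongo.errors import OperationFailure

try:
    # safe=True makes the driver wait for the server's acknowledgement,
    # so a failed write surfaces as an exception instead of vanishing.
    result = coll.update({'_id': s['_id']}, s, safe=True)
    # The returned getLastError document reports what actually happened,
    # e.g. {'updatedExisting': True, 'n': 1, 'err': None, 'ok': 1.0}
    if not result.get('updatedExisting'):
        print('No document matched _id %r' % s['_id'])
except OperationFailure as e:
    print('Update of %r failed: %s' % (s['_id'], e))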
The second thing to check is: how are you counting?

The update command can "blank out" data, but it cannot delete data (or change an _id).
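To see the distinction, here is a minimal sketch (the sample document is whatever your data happens to be): a full-document update like the one in the question replaces every field except _id, so fields can vanish, yet the number of documents cannot drop.

before = coll.count()

doc = coll.find_one()
# A full-document update *replaces* the stored document: every field
# except _id is overwritten by the new body, "blanking out" the rest.
coll.update({'_id': doc['_id']}, {'somefield': {}}, safe=True)

# The document still exists, just with most of its fields gone,
# so the collection count is unchanged.
assert coll.count() == before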
Do you have a copy of the original data? Can you run your code on a small sample of those records, say 10 or 100, and see what's happening?
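One way to run that experiment without touching the originals is to copy a small sample into a scratch collection first; a sketch, where db and the test_migration name are assumptions:

# Copy a small sample into a scratch collection and run the script
# against the copy, so the original documents stay untouched.
test_coll = db['test_migration']
for s in coll.find().limit(100):
    test_coll.insert(s)

# Point the migration loop at test_coll, then compare test_coll.count()
# with the sample size and spot-check a few documents by hand.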
What you describe is not normal with any of the MongoDB drivers. We definitely need more data to resolve this issue.
Source: https://stackoverflow.com/questions/7083291/updating-records-in-mongodb-through-pymongo-leads-to-deletion-of-most-of-them