pymongo

How to add a custom type to dill's pickleable types

ぐ巨炮叔叔 submitted on 2019-12-23 02:49:05
Question: I'm trying to serialize some code that I did not write and cannot modify, and it needs to be pickled/dilled. The script holds a MongoDB collection object; it isn't actually used later, but dilling it throws an error: 'Collection' object is not callable. If you meant to call the '__getnewargs__' method on a 'Database' object it is failing because no such method exists. I see code here that enumerates the accepted types: https://github.com/uqfoundation …
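
A workaround sketch (not from the question; it assumes reconnecting by host/port is acceptable): pickle, which dill builds on, consults copyreg's dispatch table, so you can register a reducer that stores just enough information to look the collection up again on unpickling. The helper names below are mine:

import copyreg
from pymongo import MongoClient
from pymongo.collection import Collection

def _rebuild_collection(host, port, db_name, coll_name):
    # Reconnect and look the collection up again when unpickling.
    return MongoClient(host, port)[db_name][coll_name]

def _reduce_collection(coll):
    # Serialize a Collection as (rebuild function, arguments), not its state.
    host, port = coll.database.client.address
    return _rebuild_collection, (host, port, coll.database.name, coll.name)

copyreg.pickle(Collection, _reduce_collection)

The confusing error message itself comes from Collection.__getattr__: attribute lookups such as __getnewargs__ are interpreted as sub-collection access, so pickling probes get back a Collection instead of a method.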

Iterating through PyMongo cursor throws InvalidBSON: year is out of range

只谈情不闲聊 submitted on 2019-12-22 20:35:01
Question: I am using PyMongo to simply iterate over a Mongo collection, but I'm struggling with handling large MongoDB date objects. For example, if I have some data in a collection that looks like this:

"bad_data" : [ { "id" : "id01", "label" : "bad_data", "value" : "exist", "type" : "String", "lastModified" : ISODate("2018-06-01T10:04:35.000Z"), "expires" : Date(9223372036854775000) } ]

I will do something like:

from pymongo import MongoClient, database, cursor, collection
client = MongoClient( …
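
The error occurs because Date(9223372036854775000) lies far outside Python's datetime range (years up to 9999), so the BSON decoder raises InvalidBSON. On PyMongo 4.3+ one way out, sketched here with a hypothetical database and collection name, is the datetime_conversion codec option, which clamps out-of-range dates instead of raising:

from pymongo import MongoClient
from bson.codec_options import CodecOptions, DatetimeConversion

client = MongoClient()
coll = client["mydb"].get_collection(
    "events",  # hypothetical collection name
    codec_options=CodecOptions(datetime_conversion=DatetimeConversion.DATETIME_CLAMP),
)
for doc in coll.find():
    print(doc)  # "expires" decodes as datetime.datetime.max instead of raising

On older driver versions the usual workarounds are projecting the offending field away or reading raw documents via bson.raw_bson.RawBSONDocument.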

pymongo - how to match on lookup?

蓝咒 submitted on 2019-12-22 10:59:32
Question: I have two collections, a models and a papers collection, and I need to be able to match fields from both of them. They have a field in common called reference which contains an identifier. I want to match documents that have 'authors' : 'Migliore M' from the papers collection and 'celltypes' : 'Hippocampus CA3 pyramidal cell' from the models collection. Here is what my code looks like:

pipeline = [{'$lookup': {'from' : 'models', 'localField' : 'references', 'foreignField' : 'references', …
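
A stage order that typically works for this (a sketch: the matched_models alias and the assumption that db is a pymongo Database handle are mine; field names are taken from the question as written): filter papers first, join models on the shared field, then filter on the joined documents:

pipeline = [
    {'$match': {'authors': 'Migliore M'}},
    {'$lookup': {
        'from': 'models',
        'localField': 'references',
        'foreignField': 'references',
        'as': 'matched_models',
    }},
    # $lookup produces an array; this keeps papers joined to at least
    # one model with the wanted cell type.
    {'$match': {'matched_models.celltypes': 'Hippocampus CA3 pyramidal cell'}},
]
results = list(db.papers.aggregate(pipeline))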

How can I get pymongo to always return str and not unicode?

你说的曾经没有我的故事 submitted on 2019-12-22 10:38:54
Question: From the PyMongo docs: "MongoDB stores data in BSON format. BSON strings are UTF-8 encoded so PyMongo must ensure that any strings it stores contain only valid UTF-8 data. Regular strings (<type 'str'>) are validated and stored unaltered. Unicode strings (<type 'unicode'>) are encoded UTF-8 first. The reason our example string is represented in the Python shell as u'Mike' instead of 'Mike' is that PyMongo decodes each BSON string to a Python unicode string, not a regular str." It seems a bit silly to me that the …
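
PyMongo on Python 2 always decodes BSON strings to unicode; there is no driver option to get str back. If str is genuinely required, the usual pattern is to re-encode after the query. A minimal Python 2 sketch, assuming collection is a pymongo Collection:

def encode_strs(obj):
    # Recursively convert unicode back to UTF-8 encoded str (Python 2 only).
    if isinstance(obj, unicode):
        return obj.encode('utf-8')
    if isinstance(obj, dict):
        return dict((encode_strs(k), encode_strs(v)) for k, v in obj.iteritems())
    if isinstance(obj, list):
        return [encode_strs(v) for v in obj]
    return obj

doc = encode_strs(collection.find_one())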

python - sort mongodb by the value of one key

偶尔善良 submitted on 2019-12-22 09:02:50
Question: I have a collection with the data structure below:

[{name: "123", category: "A"}, {name: "456", category: "B"}, {name: "789", category: "A"}, {name: "101", category: "C"}]

I want to be able to sort the documents according to the value of category, specifying which value comes first. For example, sorting the query in the order B -> C -> A would yield:

[{name: "456", category: "B"}, {name: "101", category: "C"}, {name: "123", category: "A"}, {name: "789", category: "A"}]

Is there any good way of …
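
One way to do this server-side (a sketch; the _order field name and the collection handle are assumptions) is to compute a rank with $indexOfArray (MongoDB 3.4+) and sort on it:

order = ['B', 'C', 'A']
pipeline = [
    # Rank each document by the position of its category in `order`;
    # categories not listed get -1 and therefore sort first.
    {'$addFields': {'_order': {'$indexOfArray': [order, '$category']}}},
    {'$sort': {'_order': 1}},
    {'$project': {'_order': 0}},
]
docs = list(collection.aggregate(pipeline))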

Pymongo significantly slower than mongo shell?

独自空忆成欢 submitted on 2019-12-22 08:44:20
Question: I'm relatively new to MongoDB and I'm having a performance problem in PyMongo. I have a collection of about 39 million documents that is 50 GB uncompressed (20 GB compressed via WiredTiger). Querying it over indexed fields gives a result of about 125,000 documents and 150 MB uncompressed. When I do the following in the mongo shell, it takes about a second:

var result = db.my_collection.find(my_query).toArray()

However, when I do the same thing in PyMongo, it takes over 7 seconds: db = …
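
Much of a gap like this is usually BSON-to-dict decoding in Python rather than the query itself. Two knobs worth trying, sketched with hypothetical names: a projection to shrink the payload, and RawBSONDocument to defer decoding entirely:

from pymongo import MongoClient
from bson.raw_bson import RawBSONDocument

# Results come back as lazily-decoded RawBSONDocument instead of dict.
client = MongoClient(document_class=RawBSONDocument)
coll = client['mydb']['my_collection']

my_query = {'indexed_field': 'value'}  # stand-in for the real query
result = list(coll.find(my_query, projection={'big_field': 0}, batch_size=10000))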

Mongodb bulk insert limit in Python

ⅰ亾dé卋堺 submitted on 2019-12-22 08:29:31
Question: Is there a limit to the number of documents one can bulk insert with PyMongo? I don't mean MongoDB's 16 MB limit on document size, but the actual size of the list of documents I wish to insert in bulk through Python.

Answer 1: There is no limit on the number of documents for a bulk insert via PyMongo. According to the docs, you can provide an iterable to collection.insert, and it will insert each document in the iterable, sending only a single command to the server. The key point here is …
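
In current PyMongo the same point holds for insert_many, the successor to collection.insert: it accepts any iterable and splits it into server-sized batches behind the scenes. A sketch with made-up data:

from pymongo import MongoClient

coll = MongoClient()['mydb']['docs']

# A generator avoids materializing the whole list in Python memory;
# PyMongo chunks it to respect the server's write-batch limits.
docs = ({'n': i} for i in range(1000000))
result = coll.insert_many(docs)
print(len(result.inserted_ids))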

Update field type in mongo

此生再无相见时 submitted on 2019-12-22 08:12:41
Question: I have a huge number of records in a collection shaped like {field: [value]}. How can I efficiently update them to {field: value}? I've tried something like this (PyMongo syntax):

collection.update({"field.1": {"$exists": True}}, {"$set": {'field': "field.1"}}, multi=True)

which apparently does not work. Running through each record in a loop and removing/reinserting is not an option because of the large number of records.

Answer 1: You need to loop over the cursor and update each document using the $set update …
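
Following the answer's loop-and-$set approach, but batching the writes with bulk_write (a sketch; database and collection names are assumptions, and note the first array element is field.0, not field.1):

from pymongo import MongoClient, UpdateOne

coll = MongoClient()['mydb']['mycoll']

ops = []
for doc in coll.find({'field.0': {'$exists': True}}, {'field': 1}):
    # Unwrap the single-element array into a plain value.
    ops.append(UpdateOne({'_id': doc['_id']}, {'$set': {'field': doc['field'][0]}}))
    if len(ops) == 1000:
        coll.bulk_write(ops)
        ops = []
if ops:
    coll.bulk_write(ops)

On MongoDB 4.2+ the loop can be replaced entirely by a single pipeline-style update: coll.update_many({'field.0': {'$exists': True}}, [{'$set': {'field': {'$arrayElemAt': ['$field', 0]}}}]).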

PyMongo - Create MongoClient with connect=False, or create client after forking

限于喜欢 submitted on 2019-12-22 07:58:33
Question: I'm developing a web app in Flask with MongoDB (mLab). After deploying it to Heroku I get this error:

UserWarning: MongoClient opened before fork. Create MongoClient with connect=False, or create client after forking.

I found this documentation but have no idea how to use it with my code: http://api.mongodb.com/python/current/faq.html#using-pymongo-with-multiprocessing Here is part of my code. Can anyone show me how to create MongoClient with connect=False, or create the client after forking? …
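
In practice this is a one-argument change where the client is constructed: connect=False defers opening the connection until the first operation, which then happens inside the forked worker rather than the parent process. A minimal Flask sketch; the URI and names are placeholders:

from flask import Flask
from pymongo import MongoClient

app = Flask(__name__)

client = MongoClient('mongodb://user:password@ds012345.mlab.com:12345/mydb',
                     connect=False)
db = client['mydb']

@app.route('/')
def index():
    # The connection is established here, after the fork, on first use.
    return str(db.things.count_documents({}))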