Pymongo: iterate over all documents in the collection

孤城傲影 2021-02-13 18:26

I am using PyMongo and trying to iterate over 10 million documents in my MongoDB collection and extract just a couple of keys, "name" and "address", then output them to .

4 answers
  • 2021-02-13 18:35

    I had no luck with .find().forEach() either, but this should find what you are searching for and then print it.

    First, find all documents that match your search:

    # In Python, operator names such as "$regex" must be quoted strings.
    cursor = db.myCollection.find({"name": {"$regex": REGEX}})
    

    Then iterate over the matches:

    for document in cursor:
        print(document.get("name"))
    
  • 2021-02-13 18:38

    I cannot figure out the right syntax to do it with find().forEach()

    cursor.forEach() is not available in Python; it is a JavaScript function. You have to get a cursor and iterate over it. See the PyMongo Tutorial section on querying for more than one document, where you can do:

    for document in myCollection.find():
        print(document) # iterate the cursor
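
    Since the question only needs two keys, the same loop can be combined with a projection so that only those fields are returned. The following is a minimal sketch, not a definitive implementation: the helper names, the connection URI and the database name "mydb" are assumptions made for illustration.

```python
from typing import Any, Iterator, Optional, Tuple


def iter_name_address(collection: Any) -> Iterator[Tuple[Optional[str], Optional[str]]]:
    """Yield (name, address) pairs one document at a time."""
    # Projection: ask the server for just these two fields and drop _id.
    projection = {"_id": 0, "name": 1, "address": 1}
    for document in collection.find({}, projection):
        # .get() returns None when a document lacks the field.
        yield document.get("name"), document.get("address")


def print_all(uri: str = "mongodb://localhost:27017") -> None:
    """Hypothetical wiring; the URI and the database name are assumptions."""
    # Imported here so the helper above has no hard dependency on PyMongo.
    from pymongo import MongoClient

    client = MongoClient(uri)
    for name, address in iter_name_address(client["mydb"]["myCollection"]):
        print(name, address)
```

    Because the helper is a generator, documents are handled one by one as the cursor fetches them, rather than being collected in memory first.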
    

    where REGEX would match everything - and it resulted in "Killed".

    Unfortunately, there is not enough information here to debug why the process was 'Killed' or what that means. If you would like to match everything, though, you can just state:

    cursor = db.myCollection.find({"name": {"$regex": ".*"}})
    

    This assumes that the name field contains string values. However, using $exists to check whether the name field exists would be preferable to using a regex.

    The use of the $exists operator in your example above is also incorrect: you're missing an s in $exists. Again, unfortunately we don't have enough information about what 'didn't work' meant to help debug further.
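
    To make the difference between the two filters concrete, here is a small sketch of how each one is written as a Python dictionary. The function name build_name_filter is hypothetical; only the $exists and $regex operators come from MongoDB.

```python
from typing import Any, Dict, Optional


def build_name_filter(pattern: Optional[str] = None) -> Dict[str, Any]:
    """Build a filter on the "name" field for collection.find()."""
    if pattern is None:
        # Matches every document that has a "name" field, whatever its type.
        return {"name": {"$exists": True}}
    # Matches only documents whose "name" is a string matching the pattern.
    return {"name": {"$regex": pattern}}
```

    The result can be passed straight to find(), for example db.myCollection.find(build_name_filter()).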

    If you're writing this script as a Python exercise, I would recommend reviewing:

    • PyMongo Tutorial
    • MongoDB Tutorial: query documents

    You could also enrol in a free online course at MongoDB University for M101P: MongoDB for Python Developers.

    However, if you are just trying to export CSV from a collection, you could use MongoDB's mongoexport as an alternative. It supports:

    • Exporting specific fields via --fields "name,address"
    • Exporting in CSV via --type "csv"
    • Exporting specific values with query via --query "..."

    See mongoexport usage for more information.
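
    If you still want to drive the export from Python, the options listed above can be assembled into a command line and handed to subprocess. This is only a sketch: the helper names, the database name and the output path are assumptions, and it presumes mongoexport is installed and on your PATH.

```python
import subprocess
from typing import List, Optional, Sequence


def build_mongoexport_args(
    db: str,
    collection: str,
    fields: Sequence[str],
    out_path: str,
    query: Optional[str] = None,
) -> List[str]:
    """Assemble a mongoexport command that writes the given fields as CSV."""
    args = [
        "mongoexport",
        f"--db={db}",
        f"--collection={collection}",
        "--type=csv",
        f"--fields={','.join(fields)}",
        f"--out={out_path}",
    ]
    if query is not None:
        # The query is a JSON document passed as a single argument.
        args.append(f"--query={query}")
    return args


def run_export(args: Sequence[str]) -> None:
    """Run the command; raises CalledProcessError if mongoexport fails."""
    subprocess.run(list(args), check=True)
```

    For example, run_export(build_mongoexport_args("mydb", "myCollection", ["name", "address"], "out.csv")) would export both fields, where "mydb" and "out.csv" are placeholders.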

  • 2021-02-13 18:48

    I think I understand the question, but I don't believe there is an accurate answer yet. I had the same challenge, which is how I came across this, although I don't know how to output to a .csv file. In my situation I needed the result as JSON. Here's my solution to your question using MongoDB projections:

    your_collection = db.myCollection
    documents = list(your_collection.find({}, {"name": 1, "address": 1}))
    

    The second line returns the result as a list by using Python's list() function.

    You can then use jsonify(documents) or just print(documents) as a list.

    I believe that with the list it should be easier to figure out how to output to a .csv.
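
    For the .csv step, one possible approach is Python's standard csv module. The sketch below assumes each document behaves like a dictionary; the function names are hypothetical and not part of PyMongo.

```python
import csv
import io
from typing import Any, Iterable, Mapping, Sequence, TextIO


def write_documents_csv(
    documents: Iterable[Mapping[str, Any]],
    out: TextIO,
    fields: Sequence[str] = ("name", "address"),
) -> int:
    """Write the chosen fields of each document as a CSV row; return the row count."""
    # extrasaction="ignore" skips other keys such as _id;
    # restval="" leaves the cell empty when a document lacks a field.
    writer = csv.DictWriter(out, fieldnames=list(fields), extrasaction="ignore", restval="")
    writer.writeheader()
    count = 0
    for document in documents:
        writer.writerow(document)
        count += 1
    return count


def documents_to_csv_text(documents: Iterable[Mapping[str, Any]]) -> str:
    """Convenience wrapper for small samples: return the CSV as a string."""
    buffer = io.StringIO()
    write_documents_csv(documents, buffer)
    return buffer.getvalue()
```

    To write a file, open it with open(path, "w", newline="", encoding="utf-8") and pass it as out. With millions of documents it is probably better to pass the cursor itself rather than a list, so that rows are written as they arrive instead of all being held in memory.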

  • 2021-02-13 18:51

    The find() method returns a PyMongo cursor, which is a reference to the result set of a query.

    You have to dereference that reference (the address) somehow, for example by iterating over it or converting it to a list.

    After that, you will have a better understanding of how to manipulate and manage the cursor.

    Try the following for a start:

    result = db.collection_name.find()  # replace collection_name with your collection's name
    print(list(result))
    