How can I load data from mongodb collection into pandas' DataFrame?

独厮守ぢ 2020-12-07 17:56

I am new to pandas (well, to all things "programming"...), but have been encouraged to give it a try. I have a mongodb database - "test" - with a collection called "tw

4 answers
  • 2020-12-07 18:25

    Convert the cursor returned by MongoDB into a list of documents before passing it to DataFrame:

    import pandas as pd
    df = pd.DataFrame(list(tweets.find()))
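
For illustration, here is a minimal sketch of the same idea that runs without a MongoDB server. The generator fake_find() and its fields (user, text) are hypothetical stand-ins for tweets.find(); a real cursor likewise yields one dict per document, each carrying an _id field.

```python
import pandas as pd


def fake_find():
    """Hypothetical stand-in for tweets.find(): yields one document at a time."""
    yield {"_id": 1, "user": "alice", "text": "hello"}
    yield {"_id": 2, "user": "bob", "text": "world"}


cursor = fake_find()

# Materialize the lazy cursor into a list of dicts, one dict per row.
docs = list(cursor)
df = pd.DataFrame(docs)

# The MongoDB _id field becomes an ordinary column; drop it if it is not needed.
df = df.drop(columns="_id")
print(df)
```

Once list() has consumed the cursor, the documents live in memory, so for a large collection it may be worth filtering in the find() call first.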
    
  • 2020-12-07 18:30

    If you have data in MongoDB like this:

    [
        {
            "name": "Adam", 
            "age": 27, 
            "address":{
                "number": 4, 
                "street": "Main Road", 
                "city": "Oxford"
            }
         },
         {
            "name": "Steve", 
            "age": 32, 
            "address":{
                "number": 78, 
                "street": "High Street", 
                "city": "Cambridge"
            }
         }
    ]
    

    You can put the data straight into a dataframe like this:

    from pandas import DataFrame
    
    df = DataFrame(list(db.collection_name.find({})))
    

    And you will get this output:

    df.head()
    
    |    | name    | age  | address                                                      |
    |----|---------|------|--------------------------------------------------------------|
    | 0  | "Adam"  | 27   | {"number": 4, "street": "Main Road", "city": "Oxford"}       |
    | 1  | "Steve" | 32   | {"number": 78, "street": "High Street", "city": "Cambridge"} |
    

    However, each subdocument will appear as a single dict inside one cell. If you want to flatten the documents so that every subdocument field gets its own column, you can use json_normalize without any extra parameters.

    # pandas >= 1.0; on older versions use: from pandas.io.json import json_normalize
    from pandas import json_normalize
    
    datapoints = list(db.collection_name.find({}))
    
    df = json_normalize(datapoints)
    
    df.head()
    

    This will give the dataframe in this format:

    |    | name   | age  | address.number | address.street | address.city |
    |----|--------|------|----------------|----------------|--------------|
    | 0  | Adam   | 27   |     4          | "Main Road"    | "Oxford"     |
    | 1  | Steve  | 32   |     78         | "High Street"  | "Cambridge"  |
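
As a hedged, self-contained sketch of the flattening step, the snippet below applies pandas.json_normalize to the two sample documents from this answer, written as plain dicts instead of being read from MongoDB. The variable names (docs, flat, flat_underscore) are illustrative only. The sep argument is optional and changes the separator used in the flattened column names.

```python
import pandas as pd

# The sample documents from the answer, as plain Python dicts.
docs = [
    {"name": "Adam", "age": 27,
     "address": {"number": 4, "street": "Main Road", "city": "Oxford"}},
    {"name": "Steve", "age": 32,
     "address": {"number": 78, "street": "High Street", "city": "Cambridge"}},
]

# Default behaviour: nested keys are joined with a dot, e.g. "address.city".
flat = pd.json_normalize(docs)

# Optional: use an underscore so the columns work with attribute access.
flat_underscore = pd.json_normalize(docs, sep="_")

print(flat.columns.tolist())
print(flat_underscore.columns.tolist())
```

Underscore-separated names can be convenient because flat_underscore.address_city works as attribute access, whereas a dotted column name has to be read as flat["address.city"].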
    
  • 2020-12-07 18:30

    You can load your MongoDB data into a pandas DataFrame using this code. It works for me, and I hope it works for you too.

    import pandas as pd
    from pymongo import MongoClient
    
    # Connection was removed in PyMongo 3.0; MongoClient is its replacement
    client = MongoClient()  # defaults to localhost:27017
    db = client.database_name
    input_data = db.collection_name
    data = pd.DataFrame(list(input_data.find()))
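
The loading step can also be wrapped in a small helper. This is only a sketch under stated assumptions: collection_to_dataframe and FakeCollection are hypothetical names, not part of pandas or PyMongo. The helper relies only on the collection having a find(filter, projection) method, which is how PyMongo's Collection.find is called. FakeCollection is an in-memory stand-in that supports just equality filters and the {"_id": 0} projection, so the example runs without a server.

```python
import pandas as pd


def collection_to_dataframe(collection, query=None, drop_id=True):
    """Load documents matching `query` from a pymongo-style collection."""
    # A projection of {"_id": 0} asks MongoDB to leave the _id field out.
    projection = {"_id": 0} if drop_id else None
    cursor = collection.find(query or {}, projection)
    return pd.DataFrame(list(cursor))


class FakeCollection:
    """Hypothetical in-memory stand-in for a pymongo Collection (demo only)."""

    def __init__(self, docs):
        self._docs = docs

    def find(self, query=None, projection=None):
        for doc in self._docs:
            # Only simple equality filters are emulated here.
            if all(doc.get(k) == v for k, v in (query or {}).items()):
                if projection == {"_id": 0}:
                    doc = {k: v for k, v in doc.items() if k != "_id"}
                yield doc


people = FakeCollection([
    {"_id": 1, "name": "Adam", "age": 27},
    {"_id": 2, "name": "Steve", "age": 32},
])
df = collection_to_dataframe(people, {"age": 27})
print(df)
```

With a real database, the same helper would presumably be called as collection_to_dataframe(db.collection_name, {"age": 27}). Filtering on the server this way avoids loading the whole collection into memory.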
    
  • 2020-12-07 18:30

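    Use: df = pd.DataFrame.from_dict(collection), where collection is assumed to be the list of documents already fetched from MongoDB (for example list(db.collection_name.find())), not the pymongo collection object itself.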
