python pandas - Editing multiple DataFrames with a for loop

误落风尘 2021-01-06 18:30

Consider the following two lists, one of 3 dicts and one of 3 empty DataFrames:

dict0={'actual': {'2013-02-20 13:30:00': 0.93}}
dict1={'actual': {'2013-02-20 13:3


        
6 Answers
  • 2021-01-06 19:01

    One liner.

    >>> df_list = [df.from_dict(dikt, orient='columns', dtype=None) for (df, dikt) in zip(dfs, dicts)]
    
    >>> df_list
    [                     actual
    2013-02-20 13:30:00    0.93,
                          actual
    2013-02-20 13:30:00    0.85, 
                          actual
    2013-02-20 13:30:00    0.98]
    
    >>> df_list[0]
                         actual
    2013-02-20 13:30:00    0.93
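
A side note on why the empty frames play no part in the result: `from_dict` is a classmethod, so calling it on an existing frame builds a new frame and leaves the original alone. A minimal sketch, assuming one dict from the question; the `existing` frame and its `other` column are made up for illustration.

```python
import pandas as pd

dikt = {'actual': {'2013-02-20 13:30:00': 0.93}}

# Hypothetical frame that already holds data, to show from_dict ignores it.
existing = pd.DataFrame({'other': [1, 2, 3]})

# from_dict is a classmethod: calling it through an instance is the same
# as calling pd.DataFrame.from_dict, and it returns a brand-new frame.
via_instance = existing.from_dict(dikt, orient='columns')
via_class = pd.DataFrame.from_dict(dikt, orient='columns')

print(via_instance.equals(via_class))  # True
print(list(existing.columns))          # ['other']
```

So the list comprehension above produces new frames; the objects originally stored in `dfs` stay empty.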
    
  • 2021-01-06 19:02

    You can also do this by putting the dataframes into a dictionary:

    dfs = {
        'df0': df0,
        'df1': df1,
        'df2': df2
    }
    

    Then look up each entry by key and reassign it inside the for loop:

    for dfname, dikt in zip(dfs.keys(), dicts):
        dfs[dfname] = dfs[dfname].from_dict(dikt, orient='columns', dtype=None)
    

    This is useful if you still want to refer to the dataframes by name (instead of by an arbitrary index in a list):

    dfs['df0']
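
If the empty frames are not needed to begin with, the same name-keyed lookup can be built in one step. This is only a sketch, assuming the three dicts from the question; the `names` list is an illustrative choice, not part of the original post.

```python
import pandas as pd

dicts = [
    {'actual': {'2013-02-20 13:30:00': 0.93}},
    {'actual': {'2013-02-20 13:30:00': 0.85}},
    {'actual': {'2013-02-20 13:30:00': 0.98}},
]

# Hypothetical labels; any hashable keys would work.
names = ['df0', 'df1', 'df2']

# Build one frame per dict and key it by name.
dfs = {
    name: pd.DataFrame.from_dict(dikt, orient='columns')
    for name, dikt in zip(names, dicts)
}

print(dfs['df1'])
```

Note that, like the loop above, this creates new frames rather than filling existing ones.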
    
  • 2021-01-06 19:05

    You need to assign the result back to the list slot rather than to the loop variable, so you can try:

    for idx, dikt in enumerate(dicts):
        dfs[idx] = dfs[idx].from_dict(dikt, orient='columns', dtype=None)
    
  • 2021-01-06 19:14

    I don't have an explanation for why that is so. However, a workaround is:

    import pandas as pd

    dict0={'actual': {'2013-02-20 13:30:00': 0.93}}
    dict1={'actual': {'2013-02-20 13:30:00': 0.85}}
    dict2={'actual': {'2013-02-20 13:30:00': 0.98}}
    dicts=[dict0, dict1, dict2]
    
    dfs = []
    
    for dikt in dicts:
        # from_dict is a classmethod, so call it on the class directly
        df = pd.DataFrame.from_dict(dikt, orient='columns', dtype=None)
        dfs.append(df)
    

    Now

    dfs[0]
    

    returns

                         actual
    2013-02-20 13:30:00    0.93
    
  • 2021-01-06 19:21

    This will get it done in place!!!
    Please note the 3 exclamations

    one liner (needs pandas < 1.0, where DataFrame.set_value still exists)

    [dfs[i].set_value(r, c, v)
     for i, dn in enumerate(dicts)
     for r, dr in dn.items()
     for c, v in dr.items()];
    

    somewhat more intuitive

    for d, df in zip(dicts, dfs):
        temp = pd.DataFrame(d).stack()
        for (r, c), v in temp.items():
            # set_value was removed in pandas 1.0; .loc enlarges df in place
            df.loc[r, c] = v
    
    df0
    
                         actual
    2013-02-20 13:30:00    0.93
    

    equivalent alternative
    without the pd.DataFrame construction

    for i, dn in enumerate(dicts):
        for r, dr in dn.items():
            for c, v in dr.items():
                # set_value was removed in pandas 1.0; .loc enlarges in place
                dfs[i].loc[r, c] = v
    

    Why is this different?
    All the other answers, so far, reassign a new dataframe to the requisite position in the list of dataframes. They clobber the dataframe that was there. The original dataframe is left empty while a new non-empty one rests in the list.

    This solution edits the dataframe in place ensuring the original dataframe is updated with new information.

    Per OP:

    However, when trying to retrieve for instance 1 of the df outside of the loop, it is still empty


    timing
    It's also considerably faster


    setup

    import pandas as pd

    dict0={'actual': {'2013-02-20 13:30:00': 0.93}}
    dict1={'actual': {'2013-02-20 13:30:00': 0.85}}
    dict2={'actual': {'2013-02-20 13:30:00': 0.98}}
    dicts=[dict0, dict1, dict2]
    
    df0=pd.DataFrame()
    df1=pd.DataFrame()
    df2=pd.DataFrame()
    dfs=[df0, df1, df2]
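
The difference this answer draws between reassigning a list slot and editing a frame in place can be checked with an identity test. A minimal sketch, assuming a recent pandas in which `.loc` assignment with a new row and column label enlarges the frame in place; `df_a`, `df_b`, and `frames` are illustrative names, not from the original post.

```python
import pandas as pd

dikt = {'actual': {'2013-02-20 13:30:00': 0.93}}

# Reassignment: the list slot now points at a new frame; df_a is untouched.
df_a = pd.DataFrame()
frames = [df_a]
frames[0] = pd.DataFrame.from_dict(dikt, orient='columns')
print(frames[0] is df_a, df_a.empty)  # False True

# In-place edit: .loc enlarges the very object that df_b names.
df_b = pd.DataFrame()
frames = [df_b]
for col, rows in dikt.items():
    for row, value in rows.items():
        frames[0].loc[row, col] = value
print(frames[0] is df_b, df_b.empty)  # True False
```

Only the second variant leaves a variable such as `df0` holding the new data after the loop.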
    
  • 2021-01-06 19:24

    In your loop, df is just a loop variable: assigning to it rebinds the name and leaves the corresponding list element untouched. If you want to replace list elements while iterating, you have to assign through the list index. You can do that using Python's enumerate:

    for i, (df, dikt) in enumerate(zip(dfs, dicts)):
        dfs[i] = df.from_dict(dikt, orient='columns', dtype=None)
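
The same behaviour shows up with plain Python objects, which may make the point easier to see without pandas in the way. A small sketch with made-up data:

```python
items = ['a', 'b', 'c']

# Rebinding the loop variable does not touch the list.
for item in items:
    item = item.upper()
after_rebinding = list(items)
print(after_rebinding)  # ['a', 'b', 'c']

# Assigning through the index replaces each element.
for i, item in enumerate(items):
    items[i] = item.upper()
print(items)  # ['A', 'B', 'C']
```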
    