Overwrite columns in DataFrames of different sizes pandas

梦如初夏 2021-01-13 13:20

I have the following two DataFrames:

df1 = pd.DataFrame({'ids':[1,2,3,4,5],'cost':[0,0,1,1,0]})
df2 = pd.DataFrame({'ids':[1,5],'cost':[1,4]})


        
3 Answers
  • 2021-01-13 14:05

    You can use set_index and combine_first to give precedence to the values in df2. Calling combine_first on df2 keeps df2's values and fills in the ids it lacks from df1:

    df_result = df2.set_index('ids').combine_first(df1.set_index('ids'))
    df_result.reset_index()
    

    You get

       ids  cost
    0   1   1
    1   2   0
    2   3   1
    3   4   1
    4   5   4
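
The combine_first approach above can be sketched as a self-contained script. The astype call is an addition, not part of the original answer: depending on the pandas version, combining frames with different indexes may produce a float column, and the cast assumes no NaN remains after combining.

```python
import pandas as pd

df1 = pd.DataFrame({'ids': [1, 2, 3, 4, 5], 'cost': [0, 0, 1, 1, 0]})
df2 = pd.DataFrame({'ids': [1, 5], 'cost': [1, 4]})

# df2 is the caller, so its values win wherever an id exists in both frames;
# ids that df2 lacks are filled in from df1.
df_result = (
    df2.set_index('ids')
       .combine_first(df1.set_index('ids'))
       .astype({'cost': 'int64'})  # assumption: no NaN remains after combining
       .reset_index()
)

print(df_result)
```

Swapping the two frames (calling combine_first on df1 instead) would reverse the precedence and leave df1 unchanged, since df1 already has a value for every id.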
    
  • 2021-01-13 14:09

    Another way is to use a temporary merged DataFrame, which you can discard after use:

    import pandas as pd
    
    df1 = pd.DataFrame({'ids':[1,2,3,4,5],'cost':[0,0,1,1,0]})
    df2 = pd.DataFrame({'ids':[1,5],'cost':[1,4]})
    
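    # Left merge keeps every row of df1; ids missing from df2 get NaN in cost_r
    dftemp = df1.merge(df2, on='ids', how='left', suffixes=('', '_r'))
    print(dftemp)
    
    # Overwrite cost only in the rows where df2 supplied a value
    has_new = dftemp['cost_r'].notna()
    df1.loc[has_new, 'cost'] = dftemp.loc[has_new, 'cost_r']
    del dftemp
    
    # Put the columns in a fixed order
    df1 = df1[['ids', 'cost']]
    print(df1)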
    
    
    Output:
    dftemp:
       cost  ids  cost_r
    0     0    1     1.0
    1     0    2     NaN
    2     1    3     NaN
    3     1    4     NaN
    4     0    5     4.0
    
    df1:
       ids  cost
    0    1   1.0
    1    2   0.0
    2    3   1.0
    3    4   1.0
    4    5   4.0
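
A more compact variant of the temporary merge above, as a rough sketch: it replaces the boolean-mask assignment with fillna, which is a swapped-in technique rather than the original answer's code. It assumes df1 has the default RangeIndex and that each id appears at most once in df2, so the merged rows line up one-to-one with df1.

```python
import pandas as pd

df1 = pd.DataFrame({'ids': [1, 2, 3, 4, 5], 'cost': [0, 0, 1, 1, 0]})
df2 = pd.DataFrame({'ids': [1, 5], 'cost': [1, 4]})

# Left merge keeps every row of df1; ids absent from df2 get NaN in cost_r.
dftemp = df1.merge(df2, on='ids', how='left', suffixes=('', '_r'))

# Prefer df2's value when present, otherwise fall back to df1's value.
# The cast back to int64 assumes every row ends up with a value.
df1['cost'] = dftemp['cost_r'].fillna(dftemp['cost']).astype('int64')

print(df1)
```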
    
  • 2021-01-13 14:16

    You could do this with a left merge:

    merged = pd.merge(df1, df2, on='ids', how='left')
    merged['cost'] = merged.cost_x.where(merged.cost_y.isnull(), merged['cost_y'])
    result = merged[['ids','cost']]
    

    However, you can avoid the merge (and get better performance) if you set ids as the index; pandas can then use it to align the values for you:

    df1 = df1.set_index('ids')
    df2 = df2.set_index('ids')
    
    df1.cost.where(~df1.index.isin(df2.index), df2.cost)
    ids
    1    1.0
    2    0.0
    3    1.0
    4    1.0
    5    4.0
    Name: cost, dtype: float64
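
The index alignment described above can be carried through to a finished DataFrame. This is a minimal sketch, not the original answer's code: the names left, right, and in_right are illustrative, and the cast back to int64 assumes every id ends up with a value.

```python
import pandas as pd

df1 = pd.DataFrame({'ids': [1, 2, 3, 4, 5], 'cost': [0, 0, 1, 1, 0]})
df2 = pd.DataFrame({'ids': [1, 5], 'cost': [1, 4]})

left = df1.set_index('ids')
right = df2.set_index('ids')

# True for the ids that df2 should overwrite
in_right = left.index.isin(right.index)

# Series.where keeps left's value where the condition holds and takes the
# index-aligned value from right everywhere else.
cost = left['cost'].where(~in_right, right['cost'])

# Alignment introduces NaN (and a float dtype) for ids missing from right,
# so cast back once the overwrite is done.
left['cost'] = cost.astype('int64')
result = left.reset_index()

print(result)
```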
    