Pandas update multiple columns at once

青春惊慌失措 2021-02-05 04:55

I'm trying to update a couple of fields at once - I have two data sources and I'm trying to reconcile them. I know I could do some ugly merging and then delete columns, but was

2 Answers
  •  醉梦人生
    2021-02-05 05:08

    In the "take the hill" spirit, I offer the solution below, which yields the requested result.

    I realize this is not exactly what you are after, as I am not slicing the df (in the reasonable - but non-functional - way that you propose).

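    import numpy as np

    # Masking on np.nan directly did not work for me, so fill NaNs with an arbitrary value first.
    # Note: this assumes 'AAA' never occurs as a real value in df.
    df = df.fillna('AAA')

    # Mask to determine which rows to update
    mask = df['Col1'] == 'AAA'

    # Dict mapping each column to be updated to the column that supplies its new values
    mp = {'Col1': 'col1_v2', 'Col2': 'col2_v2', 'Col3': 'col3_v2'}

    # Update the masked rows, one column at a time
    for target, source in mp.items():
        df.loc[mask, target] = df.loc[mask, source]

    # Swap the arbitrary value back to np.nan
    df = df.replace('AAA', np.nan)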
    

    Output:

    Col1    Col2    Col3    col1_v2     col2_v2     col3_v2
    A       B       C       NaN         NaN         NaN
    D       E       F       NaN         NaN         NaN
    a       b       d       a           b           d
    d       e       f       d           e           f
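
    For comparison, here is a minimal sketch of the same update without the placeholder value. It is not part of the answer above: the sample frame is hypothetical (shaped to match the output shown), and it assumes the rows to update are exactly those where Col1 is missing. It selects those rows with isna() and assigns all three columns in one step.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, shaped like the output above
df = pd.DataFrame({
    "Col1": ["A", "D", np.nan, np.nan],
    "Col2": ["B", "E", np.nan, np.nan],
    "Col3": ["C", "F", np.nan, np.nan],
    "col1_v2": [np.nan, np.nan, "a", "d"],
    "col2_v2": [np.nan, np.nan, "b", "e"],
    "col3_v2": [np.nan, np.nan, "d", "f"],
})

# Rows to update: those where Col1 is missing (assumption)
mask = df["Col1"].isna()

targets = ["Col1", "Col2", "Col3"]
sources = ["col1_v2", "col2_v2", "col3_v2"]

# .to_numpy() drops the column labels, so values are assigned by position
# (targets[i] receives sources[i]) instead of being aligned by column name
df.loc[mask, targets] = df.loc[mask, sources].to_numpy()

print(df)
```

    Without .to_numpy(), pandas would try to align the right-hand side on column names, which do not match here.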
    

    The error I get if I do not replace the NaNs is below. I'm going to research exactly where that error stems from.

    ValueError: array is not broadcastable to correct shape
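
    One thing worth checking, offered as a guess rather than a confirmed cause: if the failing version built its mask with an equality test against np.nan (the answer does not show that code), the mask would select no rows, because NaN never compares equal to anything, including itself. The sketch below uses a hypothetical Series standing in for df['Col1'] to show the difference between == np.nan and isna().

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df['Col1']
s = pd.Series(["A", np.nan, "D", np.nan])

# Equality against NaN is always False, so this mask selects nothing
eq_mask = s == np.nan

# isna() is the supported way to detect missing values
isna_mask = s.isna()

print(int(eq_mask.sum()))    # 0
print(int(isna_mask.sum()))  # 2
```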
    
