Pandas: Conditionally replace values based on other columns' values

别那么骄傲 2021-02-06 11:52

I have a dataframe (df) that looks like this:

                    environment     event   
time                    
2017-04-28 13:08:22     NaN         add_rd  
         


        
4 Answers
  • 2021-02-06 12:04

    The question's goal: for each add_rd in the event column, the associated NaN value in the environment column should be replaced with the string RD.

    As per @Zero's comment, use pd.DataFrame.loc and Boolean indexing:

    df.loc[df['event'].eq('add_rd') & df['environment'].isnull(), 'environment'] = 'RD'
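
A minimal, self-contained sketch of this approach. The sample rows are made up for illustration (the question's data is truncated), and `mask` is just a local name, not something from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "environment": [np.nan, "test", np.nan, "prod"],
    "event": ["add_rd", "add_rd", "add_env", "add_env"],
})

# True only where event is 'add_rd' AND environment is missing.
mask = df["event"].eq("add_rd") & df["environment"].isna()

# Assign through .loc so the change is written to df itself.
df.loc[mask, "environment"] = "RD"
```

Only the first row changes here: the second row is add_rd but already has a value, and the third row is missing but belongs to a different event.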
    
  • 2021-02-06 12:04

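    You could consider using where:

    df['environment'] = df['environment'].where(
        df['environment'].notna() | (df['event'] != 'add_rd'), 'RD')
    

    where keeps each value for which the condition is True and replaces it with the second argument ('RD') where the condition is False. The condition therefore has to describe the rows to keep: those that already have an environment, or whose event is not add_rd.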
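
To make the keep/replace direction of `where` concrete, here is a small sketch that compares it with `Series.mask`, which uses the opposite convention. The sample data and the names `to_replace`, `via_where`, and `via_mask` are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    "environment": [np.nan, "test", np.nan, "prod"],
    "event": ["add_rd", "add_rd", "add_env", "add_env"],
})

# Rows that should be replaced: missing environment on an 'add_rd' event.
to_replace = df["environment"].isna() & df["event"].eq("add_rd")

# where() keeps values where the condition is True, so pass the negation.
via_where = df["environment"].where(~to_replace, "RD")

# mask() replaces values where the condition is True, so pass it directly.
via_mask = df["environment"].mask(to_replace, "RD")
```

Both calls return a new Series and leave `df` untouched; assign the result back to `df["environment"]` to keep it.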

  • 2021-02-06 12:19

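    If you only want to fill the NaN values for 'add_rd' with 'RD', this may be useful:

    keys_to_replace = {'add_rd': 'RD', 'add_env': 'simple'}
    # Restrict the fill to the add_rd rows so NaN values of other events stay untouched
    is_rd = df['event'].eq('add_rd')
    df.loc[is_rd, 'environment'] = df.loc[is_rd, 'environment'].fillna(keys_to_replace['add_rd'])
    df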
    

    output:

        environment event
    0   RD          add_rd
    1   RD          add_rd
    2   test        add_env
    3   prod        add_env
    

    If you have many values to replace depending on event, group by the 'event' column and fill each group with its own replacement:

    keys_to_replace = {'add_rd': 'RD', 'add_env': 'simple'}
    # Within transform, s.name is the group's key, i.e. the event value
    df['environment'] = df.groupby('event')['environment'].transform(
        lambda s: s.fillna(keys_to_replace[s.name]))
    

    output:

        environment event
    0   RD          add_rd
    1   RD          add_rd
    2   test        add_env
    3   prod        add_env
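
As an alternative to grouping, the per-event replacement can also be sketched with `Series.map` followed by `fillna`. This is a swapped-in technique, not the answer's own method; the sample rows and the name `defaults` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; one add_env row is also missing its environment.
df = pd.DataFrame({
    "environment": [np.nan, np.nan, "test", np.nan],
    "event": ["add_rd", "add_rd", "add_env", "add_env"],
})
keys_to_replace = {"add_rd": "RD", "add_env": "simple"}

# Per-row default looked up from the event; events missing from the dict map to NaN.
defaults = df["event"].map(keys_to_replace)

# fillna with a Series aligns on the index and only fills the missing entries.
df["environment"] = df["environment"].fillna(defaults)
```

Because `fillna` aligns on the index, no sorting or index manipulation is needed, and existing values such as 'test' are left as they are.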
    
  • 2021-02-06 12:22

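    If every NaN in environment belongs to an add_rd row, a plain fillna is enough; note that it fills all NaN values regardless of event:

    df['environment'] = df['environment'].fillna('RD')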
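
Whether an unconditional `fillna` is safe depends on the data. One way to check beforehand is sketched below, using made-up rows; `events_with_nan` and `safe_to_fill_all` are hypothetical names.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one NaN sits on an 'add_env' row.
df = pd.DataFrame({
    "environment": [np.nan, "test", np.nan],
    "event": ["add_rd", "add_rd", "add_env"],
})

# Events of the rows whose environment is missing.
events_with_nan = df.loc[df["environment"].isna(), "event"]

# True only if every missing environment sits on an 'add_rd' row.
safe_to_fill_all = bool(events_with_nan.eq("add_rd").all())
```

With this sample the check is False, so a plain `fillna('RD')` would also overwrite the missing value on the add_env row.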
    