Pandas/Python: Set value of one column based on value in another column

醉酒成梦 2020-11-28 22:57

I need to set the value of one column based on the value of another in a Pandas dataframe. This is the logic:

if df['c1'] == 'Value':
    df['c2'] = 10


        
8 Answers
  • 2020-11-28 23:36

    Try this:

    df['c2'] = df['c1'].apply(lambda x: 10 if x == 'Value' else x)
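
    For context, here is a small sketch of why the plain `if` from the question fails and what the line above does instead. The sample data is made up for illustration. Comparing a column to a value yields a whole Series of booleans, which Python cannot reduce to a single True/False:

```python
import pandas as pd

# Hypothetical sample data, not from the question
df = pd.DataFrame({'c1': ['a', 'Value', 'b']})

# The question's logic: the comparison returns one boolean per row,
# so using it in an `if` raises ValueError (ambiguous truth value).
try:
    if df['c1'] == 'Value':
        df['c2'] = 10
    raised = False
except ValueError:
    raised = True

# Element-wise version: decide row by row
df['c2'] = df['c1'].apply(lambda x: 10 if x == 'Value' else x)
```

    Note that rows that do not match keep the value from c1, because of the `else x` branch.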

  • 2020-11-28 23:43

    One way to do this is to use indexing with .loc.

    Example

    In the absence of an example dataframe, I'll make one up here:

    import numpy as np
    import pandas as pd
    
    df = pd.DataFrame({'c1': list('abcdefg')})
    df.loc[5, 'c1'] = 'Value'
    
    >>> df
          c1
    0      a
    1      b
    2      c
    3      d
    4      e
    5  Value
    6      g
    

    Suppose you want to create a new column c2 that is identical to c1, except that it holds 10 wherever c1 is 'Value'.

    First, create the new column c2 as a copy of c1, using either of the following two lines (they do essentially the same thing):

    df = df.assign(c2 = df['c1'])
    # OR:
    df['c2'] = df['c1']
    

    Then, find all the indices where c1 is equal to 'Value' using .loc, and assign your desired value in c2 at those indices:

    df.loc[df['c1'] == 'Value', 'c2'] = 10
    

    And you end up with this:

    >>> df
          c1  c2
    0      a   a
    1      b   b
    2      c   c
    3      d   d
    4      e   e
    5  Value  10
    6      g   g
    

    If, as your question suggests, you sometimes just want to replace values in the existing column rather than create a new one, skip the column creation and do the following:

    # Avoid the chained form df['c1'].loc[...] = 10: it can modify a temporary copy instead of df
    df.loc[df['c1'] == 'Value', 'c1'] = 10
    

    Giving you:

    >>> df
          c1
    0      a
    1      b
    2      c
    3      d
    4      e
    5     10
    6      g
    
  • 2020-11-28 23:46

    I suggest doing it in two steps:

    # set fixed value to 'c2' where the condition is met
    df.loc[df['c1'] == 'Value', 'c2'] = 10
    
    # copy value from 'c3' to 'c2' where the condition is NOT met
    df.loc[df['c1'] != 'Value', 'c2'] = df.loc[df['c1'] != 'Value', 'c3']
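
    As a runnable sketch of the two steps (the sample rows and the contents of 'c3' are made up for illustration), with numpy.where shown as a one-step alternative rather than as part of the approach above:

```python
import numpy as np
import pandas as pd

# Hypothetical data; 'c3' is assumed to hold the fallback values
df = pd.DataFrame({'c1': ['a', 'Value', 'b'], 'c3': [1, 2, 3]})

is_value = df['c1'] == 'Value'

# Step 1: fixed value where the condition is met
df.loc[is_value, 'c2'] = 10
# Step 2: copy from 'c3' where the condition is NOT met
df.loc[~is_value, 'c2'] = df.loc[~is_value, 'c3']

# One-step alternative: pick 10 or the 'c3' value per row
df['c2_where'] = np.where(is_value, 10, df['c3'])
```

    Because step 1 leaves NaN in the rows it does not touch, c2 ends up as floats here, while the numpy.where column stays integer.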
    
  • 2020-11-28 23:46

    I had a big dataset and .loc[] was taking too long, so I switched to a plain vectorized comparison. You can assign the result of a comparison directly to a column, so this works:

    file['Flag'] = (file['Claim_Amount'] > 0)

    This gives a Boolean column, which is what I wanted, but you can multiply it by, say, 1 to get integers instead.
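
    A minimal sketch of both variants, on made-up data (only the column names 'Claim_Amount' and 'Flag' come from the line above; 'Flag_int' and 'Flag_times_1' are names chosen for illustration):

```python
import pandas as pd

# Hypothetical claims data
file = pd.DataFrame({'Claim_Amount': [0, 250, 0, 40]})

# Boolean flag: True where the claim amount is positive
file['Flag'] = file['Claim_Amount'] > 0

# Integer versions of the same flag
file['Flag_int'] = file['Flag'].astype(int)   # explicit conversion
file['Flag_times_1'] = file['Flag'] * 1       # multiplication, as described above
```

    Both integer columns hold the same 0/1 values; astype(int) just states the intent more directly.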

  • 2020-11-28 23:48

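    You can use pandas.Series.mask to add virtually as many conditions as you need:

    import numpy as np
    import pandas as pd

    data = {'a': [1,2,3,4,5], 'b': [6,8,9,10,11]}
    
    d = pd.DataFrame.from_dict(data, orient='columns')
    # each entry pairs a value of 'a' with its replacement (a scalar or a column)
    c = {'c1': (2, 'Value1'), 'c2': (3, 'Value2'), 'c3': (5, d['b'])}
    
    # object dtype so that strings and numbers can share the column
    d['new'] = pd.Series(np.nan, index=d.index, dtype=object)
    for value in c.values():
        # reassign instead of inplace=True on d['new'], which can act on a copy
        d['new'] = d['new'].mask(d['a'] == value[0], value[1])
    
    d['new'] = d['new'].fillna('Else')
    d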
    

    Output:

        a   b   new
    0   1   6   Else
    1   2   8   Value1
    2   3   9   Value2
    3   4   10  Else
    4   5   11  11
    
  • 2020-11-28 23:52

    Try df.apply() if you have a small or medium-sized dataframe:

    df['c2'] = df.apply(lambda x: 10 if x['c1'] == 'Value' else x['c1'], axis = 1)
    

    Otherwise, if you have a big dataframe, use the indexing techniques from the answers above.
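
    To illustrate that the two routes give the same result, here is a small sketch on made-up data. The names by_apply and by_loc are only for this example:

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({'c1': ['a', 'Value', 'b']})

# Row-wise apply: calls the lambda once per row
by_apply = df.apply(lambda row: 10 if row['c1'] == 'Value' else row['c1'], axis=1)

# Boolean indexing: start from a copy of c1 (as object dtype, so it can
# hold both strings and numbers), then overwrite the matching rows at once
by_loc = df['c1'].astype(object)
by_loc.loc[df['c1'] == 'Value'] = 10

same = by_apply.tolist() == by_loc.tolist()
```

    The apply version runs Python code for every row, which is why it tends to be the slower choice as the dataframe grows.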
