How to apply a function to two columns of Pandas dataframe

名媛妹妹 2020-11-22 06:17

Suppose I have a df with columns 'ID', 'col_1', and 'col_2', and I define a function:

f = lambda x, y : my_function_expres

12 answers
  •  长情又很酷
    2020-11-22 06:51

    There is a clean, one-line way of doing this in Pandas:

    df['col_3'] = df.apply(lambda x: f(x.col_1, x.col_2), axis=1)
    

    This allows f to be a user-defined function with multiple input values, and it accesses the columns by name rather than by numeric position, so it keeps working if the column order changes.
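
    To illustrate why name-based access is the safer choice, here is a minimal sketch. The function f below is a hypothetical stand-in (it is not the function from the question), and the reordered frame is made up for illustration. Access by name gives the same result after the columns are reordered, while access by position silently swaps the arguments.

```python
import pandas as pd

df = pd.DataFrame({'ID': ['1', '2', '3'], 'col_1': [0, 2, 3], 'col_2': [1, 4, 5]})

def f(a, b):
    # Hypothetical two-argument function, used only for illustration
    return b - a

# Access by column name
by_name = df.apply(lambda x: f(x.col_1, x.col_2), axis=1)

# Same data with a different column order
reordered = df[['ID', 'col_2', 'col_1']]

# Name-based access is unaffected by the new order
by_name_reordered = reordered.apply(lambda x: f(x.col_1, x.col_2), axis=1)

# Position-based access now passes col_2 first and col_1 second
by_position = reordered.apply(lambda x: f(x.iloc[1], x.iloc[2]), axis=1)
```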

    Example with data (based on original question):

    import pandas as pd
    
    df = pd.DataFrame({'ID': ['1', '2', '3'], 'col_1': [0, 2, 3], 'col_2': [1, 4, 5]})
    mylist = ['a', 'b', 'c', 'd', 'e', 'f']
    
    def get_sublist(sta, end):
        return mylist[sta:end + 1]
    
    df['col_3'] = df.apply(lambda x: get_sublist(x.col_1, x.col_2), axis=1)
    

    Output of print(df):

      ID  col_1  col_2      col_3
    0  1      0      1     [a, b]
    1  2      2      4  [c, d, e]
    2  3      3      5  [d, e, f]
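
    For comparison, the same column can be built without apply by pairing the two columns with zip and a list comprehension. This is an alternative technique, not the method shown above, and the sketch assumes the same df, mylist, and get_sublist as in the example.

```python
import pandas as pd

df = pd.DataFrame({'ID': ['1', '2', '3'], 'col_1': [0, 2, 3], 'col_2': [1, 4, 5]})
mylist = ['a', 'b', 'c', 'd', 'e', 'f']

def get_sublist(sta, end):
    return mylist[sta:end + 1]

# Pair the two columns element by element and call the function on each pair
df['col_3'] = [get_sublist(s, e) for s, e in zip(df['col_1'], df['col_2'])]
```

    This version never builds a Series for each row, which may matter on large frames; the apply form is arguably easier to read.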
    

    If your column names contain spaces or clash with an existing attribute or method name (each row is passed to the lambda as a Series), index with square brackets instead:

    df['col_3'] = df.apply(lambda x: f(x['col 1'], x['col 2']), axis=1)
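
    A small sketch of the name-clash case, using a hypothetical frame whose second column is called 'size' (which is also a built-in Series attribute) and a hypothetical function add standing in for f:

```python
import pandas as pd

# Hypothetical frame: 'col 1' contains a space, and 'size' collides
# with the built-in Series.size attribute
df = pd.DataFrame({'col 1': [0, 2, 3], 'size': [1, 4, 5]})

def add(a, b):
    # Hypothetical stand-in for the two-argument function f
    return a + b

# Bracket indexing reads the column values from each row
df['col_3'] = df.apply(lambda x: add(x['col 1'], x['size']), axis=1)

# Attribute access returns the number of entries in the row instead
row_lengths = df[['col 1', 'size']].apply(lambda x: x.size, axis=1)
```

    Here x['size'] yields the stored value, while x.size yields 2 for every row because it reports the length of the row Series.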
    
