Python loop over values in dataframe and change to binary values

隐瞒了意图╮ 2021-01-21 06:40

I have a series y in Python with values Accepted and Rejected. I want to create a new dataframe with value 1 for Ac

2 answers
  • 2021-01-21 07:08

    You can use `apply` with a lambda:

    df['dummy'] = df.y.apply(lambda x: 1 if x == 'Accepted' else 0)
    

    If you want to use a for loop:

    new_dummy_data = []
    
    for value in df.y.values:
        if value == 'Accepted':
            new_dummy_data.append(1)
        else:
            new_dummy_data.append(0)
    
    df['dummy'] = new_dummy_data
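
As a self-contained sketch of the element-wise approach above, the snippet below builds a small hypothetical DataFrame (the sample values are made up; the column name `y` follows the question) and applies the same lambda. It also shows `Series.map` with a dictionary, which is an alternative not mentioned in the answer and is included only for comparison.

```python
import pandas as pd

# Hypothetical sample data; only the column name `y` comes from the question.
df = pd.DataFrame({"y": ["Accepted", "Rejected", "Accepted"]})

# Element-wise approach from the answer above.
df["dummy_apply"] = df["y"].apply(lambda x: 1 if x == "Accepted" else 0)

# Alternative (not in the original answer): dictionary lookup via Series.map.
# Labels missing from the mapping would become NaN, so both labels are listed.
df["dummy_map"] = df["y"].map({"Accepted": 1, "Rejected": 0})

print(df)
```

Note that the two variants differ on unexpected labels: the lambda sends anything other than 'Accepted' to 0, while the dictionary lookup leaves unknown labels as missing values.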
    
  • 2021-01-21 07:13

    A loop is not necessary here, and it is slow. It is better to build a boolean mask and convert True/False to 1/0, either by casting to integers or with numpy.where:

    df['dummy'] = (df['y'] == 'Accepted').astype(int)
    

    import numpy as np
    df['dummy'] = np.where(df['y'] == 'Accepted', 1, 0)
    

    Your loop-based solution can be made to work like this (it is correct, but slow):

    print(df)
              y
    0  Accepted
    1  Rejected
    2  Accepted
    3  Accepted
    4  Accepted
    
    out = []
    # Assumes the default RangeIndex (0, 1, 2, ...), since .loc looks up by label
    for i in range(len(df)):
        if df.loc[i, 'y'] == 'Accepted':
            out.append(1)
        else:
            out.append(0)
    
    print(out)
    [1, 0, 1, 1, 1]
    
    df['dummy'] = out
    print(df)
              y  dummy
    0  Accepted      1
    1  Rejected      0
    2  Accepted      1
    3  Accepted      1
    4  Accepted      1
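
To check the claim that the loop is slower, here is a rough timing sketch. The data is randomly generated and the helper names (`with_loop`, `with_astype`, `with_where`) are made up for this illustration. The absolute numbers depend on the machine and the pandas version, so none are quoted here.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical data: 10,000 random labels in a column named `y`.
rng = np.random.default_rng(0)
df = pd.DataFrame({"y": rng.choice(["Accepted", "Rejected"], size=10_000)})


def with_loop(frame):
    """Plain Python loop over the column values."""
    out = []
    for value in frame["y"].to_numpy():
        out.append(1 if value == "Accepted" else 0)
    return out


def with_astype(frame):
    """Boolean mask cast to integers."""
    return (frame["y"] == "Accepted").astype(int)


def with_where(frame):
    """Boolean mask passed through numpy.where."""
    return np.where(frame["y"] == "Accepted", 1, 0)


# All three approaches should produce the same 0/1 values.
assert with_loop(df) == with_astype(df).tolist() == with_where(df).tolist()

for func in (with_loop, with_astype, with_where):
    seconds = timeit.timeit(lambda: func(df), number=5)
    print(f"{func.__name__}: {seconds:.4f} s for 5 runs")
```

The equality check matters more than the timings: it confirms the vectorized versions are drop-in replacements for the loop on this data.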
    