compute columns based on multiple conditions

独厮守ぢ 2021-01-21 14:37

I was reading a blog post about condition-based computations, where a new column 'category' is inserted.

data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', ...


        
2 Answers
  • 2021-01-21 15:08

    You can use pd.cut (BTW, 40 is not an old man :-()

    # Bins are right-closed by default: (0, 20], (20, 39], (39, inf]
    pd.cut(df.age,bins=[0,20,39,np.inf],labels=['kid','young','old'])
    Out[179]: 
    0      old
    1      old
    2    young
    3    young
    4      old
    Name: age, dtype: category
    Categories (3, object): [kid < young < old]
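
Because pd.cut closes each bin on the right by default, an age of exactly 20 lands in 'kid' with the bins above. If the intended rule is "under 20 is a kid, 20 to 39 is young, 40 and up is old" (an assumption taken from the conditions in the other answer), a minimal sketch with left-closed bins could look like this. The ages are copied from the table printed in the other answer; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Names and ages copied from the table shown in the np.select answer.
df = pd.DataFrame({
    "name": ["Jason", "Molly", "Tina", "Jake", "Amy"],
    "age": [42, 52, 36, 24, 73],
})

# right=False closes each bin on the left: [0, 20), [20, 40), [40, inf).
# The cut-offs 20 and 40 are assumed, not confirmed by the question.
df["category"] = pd.cut(
    df["age"],
    bins=[0, 20, 40, np.inf],
    right=False,
    labels=["kid", "young", "old"],
)

print(df["category"].tolist())
# ['old', 'old', 'young', 'young', 'old']
```

The result is an ordered categorical column, so comparisons such as `df["category"] >= "young"` also work.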
    
  • 2021-01-21 15:23

    For multiple conditions, you can just use numpy.select instead of numpy.where

    import numpy as np
    
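    # One boolean mask per category, paired by position with `choice`
    cond = [df['age'] < 20, df['age'].between(20, 39), df['age'] >= 40]
    choice = ['kid', 'young', 'old']
    
    # Each row takes the choice belonging to the first mask that is True
    df['category'] = np.select(cond, choice)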
    #    name  age  preTestScore  postTestScore category
    #0  Jason   42             4             25      old
    #1  Molly   52            24             94      old
    #2   Tina   36            31             57    young
    #3   Jake   24             2             62    young
    #4    Amy   73             3             70      old
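
For reference, here is a self-contained variant of the same idea, offered as a sketch rather than the one right way to do it. The DataFrame is rebuilt from the table printed above. Since np.select evaluates the conditions in order, the last category can be handled by the `default` argument instead of a third mask. Passing an explicit string default also avoids relying on the implicit default of 0, which is a number and may not combine cleanly with string choices on some NumPy versions.

```python
import numpy as np
import pandas as pd

# Data copied from the output table above.
df = pd.DataFrame({
    "name": ["Jason", "Molly", "Tina", "Jake", "Amy"],
    "age": [42, 52, 36, 24, 73],
    "preTestScore": [4, 24, 31, 2, 3],
    "postTestScore": [25, 94, 57, 62, 70],
})

# Conditions are checked in order; the first True one wins for each row,
# so the second mask only applies to rows with 20 <= age < 40.
conditions = [df["age"] < 20, df["age"] < 40]
choices = ["kid", "young"]

# Rows matching no condition (age >= 40) receive the string default.
df["category"] = np.select(conditions, choices, default="old")

print(df[["name", "age", "category"]])
```

Writing the masks as open-ended thresholds also covers non-integer ages such as 39.5, which would match neither `between(20, 39)` nor `>= 40`.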
    