Pandas variable creation using multiple If-else

执念已碎 2021-02-04 22:48

I need help with multiple if-else statements in Pandas. I have a test dataset (Titanic) as follows:

ID  Survived    Pclass  Name    Sex Age
1   0   3   Braund  male          


        
1 answer
  • 2021-02-04 23:20

    Instead of looping through the rows using df.iterrows (which is relatively slow), you can assign the desired values to the Prediction column in one assignment:

    In [27]: df['Prediction'] = ((df['Sex']=='female') | ((df['Pclass']==1) & (df['Age']<18))).astype('int')
    
    In [29]: df['Prediction']
    Out[29]: 
    0    0
    1    1
    2    1
    3    1
    4    0
    5    0
    6    0
    7    0
    Name: Prediction, dtype: int32
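
    To make the comparison with a row loop concrete, here is a minimal sketch. The question's data and loop are not shown in full, so the rows below and the helper predict_row are hypothetical; only the column names and the rule come from the thread. It checks that an if/else loop over df.iterrows and the single vectorized expression give the same result.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df = pd.DataFrame({
    "Sex": ["male", "female", "male", "male"],
    "Pclass": [3, 3, 1, 1],
    "Age": [22.0, 26.0, 15.0, 54.0],
})

def predict_row(row):
    # Plain if/else rule applied to one row at a time (hypothetical helper).
    if row["Sex"] == "female":
        return 1
    elif row["Pclass"] == 1 and row["Age"] < 18:
        return 1
    else:
        return 0

# Row-by-row version: correct, but it runs Python code for every row.
looped = [predict_row(row) for _, row in df.iterrows()]

# Vectorized version: one boolean expression over whole columns.
vectorized = (
    (df["Sex"] == "female") | ((df["Pclass"] == 1) & (df["Age"] < 18))
).astype(int)

print(looped)
print(vectorized.tolist())
```

    Inside predict_row the keyword and is fine, because each comparison there yields a single True or False rather than a whole column.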
    

    For your first approach, remember that df['Prediction'] represents an entire column of df, so df['Prediction']=1 assigns the value 1 to each row in that column. Since df['Prediction']=0 was the last assignment, the entire column ended up being filled with zeros.
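
    The overwrite can be reproduced in a few lines. This is only a sketch with hypothetical rows, not the question's original code: it shows the scalar assignment filling every row, and then one way to restrict the assignment to matching rows with a boolean mask and df.loc.

```python
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df = pd.DataFrame({
    "Sex": ["male", "female", "female", "male"],
    "Pclass": [3, 1, 3, 1],
    "Age": [22.0, 38.0, 26.0, 15.0],
})

# A scalar assigned to a column is broadcast to every row,
# so the last assignment wins for the whole column.
df["Prediction"] = 1
df["Prediction"] = 0
all_zero = (df["Prediction"] == 0).all()

# Assigning through a boolean mask with .loc changes only the matching rows.
mask = (df["Sex"] == "female") | ((df["Pclass"] == 1) & (df["Age"] < 18))
df.loc[mask, "Prediction"] = 1

print(all_zero)
print(df["Prediction"].tolist())
```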

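    For your second approach, note that you need to use & rather than the and keyword to perform an elementwise logical-and on two NumPy arrays or Pandas objects such as Series and DataFrames. The parentheses around each comparison are required because & binds more tightly than == and <. Thus, you could use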

    In [32]: np.where(df['Sex']=='female', 1, np.where((df['Pclass']==1)&(df['Age']<18), 1, 0))
    Out[32]: array([0, 1, 1, 1, 0, 0, 0, 0])
    

    though I think it is much simpler to just use | for logical-or and & for logical-and:

    In [34]: ((df['Sex']=='female') | ((df['Pclass']==1) & (df['Age']<18)))
    Out[34]: 
    0    False
    1     True
    2     True
    3     True
    4    False
    5    False
    6    False
    7    False
    dtype: bool
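
    The difference between and and & can be checked directly. The sketch below uses hypothetical rows: and asks each Series for a single truth value, which Pandas refuses with a ValueError, while & combines the two masks row by row.

```python
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({"Pclass": [1, 3, 1], "Age": [15.0, 10.0, 40.0]})

first_class = df["Pclass"] == 1
minor = df["Age"] < 18

# `and` needs one True/False per operand; a Series has no single truth value.
try:
    first_class and minor
    raised = False
except ValueError:
    raised = True

# `&` is applied elementwise and returns a boolean Series.
both = first_class & minor

print(raised)
print(both.tolist())
```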
    