Conditional Logic on Pandas DataFrame

遥遥无期 2020-12-01 03:23

How do I apply conditional logic to a Pandas DataFrame?

See the DataFrame shown below:

   data desired_output
0     1          False
1     2          False


        
4 Answers
  • 2020-12-01 04:05
    In [34]: import pandas as pd
    
    In [35]: import numpy as np
    
    In [36]: df = pd.DataFrame([1,2,3,4], columns=["data"])
    
    In [37]: df
    Out[37]: 
       data
    0     1
    1     2
    2     3
    3     4
    
    In [38]: df["desired_output"] = np.where(df["data"] < 2.5, "False", "True")
    
    In [39]: df
    Out[39]: 
       data desired_output
    0     1          False
    1     2          False
    2     3           True
    3     4           True
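
    Note that np.where with string labels produces a column of text, not booleans. A minimal sketch of the difference, assuming the same four-row frame; the column names as_text and as_bool are made up for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([1, 2, 3, 4], columns=["data"])

# np.where(condition, value_if_true, value_if_false) picks element-wise.
# With string labels the resulting column holds text, not booleans.
df["as_text"] = np.where(df["data"] < 2.5, "False", "True")

# Passing real booleans instead yields a proper bool column.
df["as_bool"] = np.where(df["data"] < 2.5, False, True)

print(df)
print(df.dtypes)
```

    The distinction matters if the column is later used as a mask or in an if test, since the non-empty string "False" is truthy in Python.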
    
  • 2020-12-01 04:16
    In [1]: df
    Out[1]:
       data
    0     1
    1     2
    2     3
    3     4
    

    You want to apply a function that conditionally returns a value based on the selected dataframe column.

    In [2]: df['data'].apply(lambda x: 'true' if x > 2.5 else 'false')
    Out[2]:
    0    false
    1    false
    2     true
    3     true
    Name: data
    

    You can then assign the returned Series to a new column in your DataFrame:

    In [3]: df['desired_output'] = df['data'].apply(lambda x: 'true' if x > 2.5 else 'false')
    
    In [4]: df
    Out[4]:
       data desired_output
    0     1          false
    1     2          false
    2     3           true
    3     4           true
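
    The same apply pattern can be written as a standalone script with a named function in place of the lambda. This is only a sketch: the helper name label and the choice to map values above 2.5 to 'true' are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame([1, 2, 3, 4], columns=["data"])


def label(x):
    # Hypothetical helper: map a single cell value to a text label.
    return "true" if x > 2.5 else "false"


# Series.apply calls label() once per element and returns a new Series,
# which is then stored as a new column.
df["desired_output"] = df["data"].apply(label)

print(df)
```

    A named function is easier to extend with more branches than a lambda, though apply runs Python code per element and is typically slower than a vectorized comparison on large frames.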
    
  • 2020-12-01 04:20

    In this specific example, where the DataFrame has only one column, you can write this concisely as:

    df['desired_output'] = df.gt(2.5)
    

    gt tests whether each element is greater than 2.5; ge, lt, and le are the analogous methods for greater than or equal, less than, and less than or equal.
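
    As a quick illustration of the four comparison methods, here is a small sketch on a Series holding the same values. The Series s and the thresholds are chosen for the example only.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4], name="data")

# Each method is the element-wise counterpart of a comparison operator
# and returns a boolean Series of the same length.
print(s.lt(2.5).tolist())  # same as s < 2.5
print(s.le(2).tolist())    # same as s <= 2
print(s.gt(2.5).tolist())  # same as s > 2.5
print(s.ge(2).tolist())    # same as s >= 2
```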

  • 2020-12-01 04:23

    Just compare the column with that value:

    In [8]: import pandas
    
    In [9]: df = pandas.DataFrame([1,2,3,4], columns=["data"])
    
    In [10]: df
    Out[10]: 
       data
    0     1
    1     2
    2     3
    3     4
    
    In [11]: df["desired"] = df["data"] > 2.5
    
    In [12]: df
    Out[12]: 
       data desired
    0     1   False
    1     2   False
    2     3    True
    3     4    True
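
    Because the comparison returns a boolean Series, it can also serve as a row mask for conditional assignment. A rough sketch, assuming the same frame; the column name label and the values "low" and "high" are invented for the example:

```python
import pandas as pd

df = pd.DataFrame([1, 2, 3, 4], columns=["data"])

# The comparison yields a boolean Series with one entry per row.
mask = df["data"] > 2.5

# Start from a default value, then overwrite only the rows where mask is True.
df["label"] = "low"             # hypothetical column name and values
df.loc[mask, "label"] = "high"

print(df)
```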
    