How to replace 'any strings' with nan in pandas DataFrame using a boolean mask?

自闭症患者 2020-12-30 06:48

I have a 227x4 DataFrame of country names and numerical values that I need to clean (wrangle?).

Here's an abstraction of the DataFrame:

import pandas as pd
i         


        
3 Answers
  • 2020-12-30 06:55

    Use pd.to_numeric with errors='coerce', which turns every value that cannot be parsed as a number into NaN, i.e.:

    cols = ['Measure1','Measure2']
    df[cols] = df[cols].apply(pd.to_numeric,errors='coerce')
    
     Country Name  Measure1  Measure2
    0          PuB       7.0       6.0
    1          JHq       2.0       NaN
    2          opE       4.0       3.0
    3          pxl       3.0       6.0
    4          ouP       NaN       4.0
    5          qZR       4.0       6.0
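
A self-contained sketch of the coercion above, since the question's sample DataFrame is truncated. The data here is hypothetical and made up for illustration. One thing it is meant to show: pd.to_numeric also converts numeric-looking strings such as "4" into numbers, which the isinstance-based approaches would instead replace with NaN.

```python
import pandas as pd

# Hypothetical sample data (not the asker's DataFrame).
df = pd.DataFrame({
    "Country Name": ["AAA", "BBB", "CCC"],
    "Measure1": [7, "foo", "4"],   # "4" is a numeric-looking string
    "Measure2": [6, 3, "bar"],
})

cols = ["Measure1", "Measure2"]

# Values that cannot be parsed as numbers become NaN;
# the string "4" is parsed and kept as the number 4.
df[cols] = df[cols].apply(pd.to_numeric, errors="coerce")

print(df)
print(df.dtypes)
```

If strings like "4" should be treated as dirty data rather than parsed, one of the type-based checks is probably the better fit.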
    
  • 2020-12-30 07:09

    Build a boolean mask of the numeric cells and assign back only the columns of interest:

    cols = ['Measure1','Measure2']
    mask = df[cols].applymap(lambda x: isinstance(x, (int, float)))
    
    df[cols] = df[cols].where(mask)
    print (df)
      Country Name Measure1 Measure2
    0          uFv        7        8
    1          vCr        5      NaN
    2          qPp        2        6
    3          QIC       10       10
    4          Suy      NaN        8
    5          eFS        6        4
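
A runnable sketch of the mask-then-where idea on hypothetical data (the values and the is_number helper are assumptions for illustration). It differs from the snippet above in two ways: it uses Series.map per column, because applymap is deprecated in recent pandas releases (it was renamed to DataFrame.map in 2.1), and it excludes bool explicitly, since bool is a subclass of int and would otherwise pass the isinstance check.

```python
import pandas as pd

# Hypothetical sample data (not the asker's DataFrame).
df = pd.DataFrame({
    "Country Name": ["AAA", "BBB", "CCC"],
    "Measure1": [7, "foo", "4"],
    "Measure2": [6.5, True, "bar"],
})
cols = ["Measure1", "Measure2"]


def is_number(x):
    """Hypothetical helper: True for int/float values, False for bool and everything else."""
    return isinstance(x, (int, float)) and not isinstance(x, bool)


# Element-wise boolean mask: True where the cell holds a real number.
mask = df[cols].apply(lambda s: s.map(is_number))

# where() keeps cells whose mask is True and puts NaN everywhere else.
df[cols] = df[cols].where(mask)

print(mask)
print(df)
```

The columns keep object dtype after where(), which matches the integer-looking output above; a later astype(float) would be one way to make them numeric.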
    

    A meta-question: is it normal that it takes me more than 3 hours to formulate a question here (including research)?

    In my opinion, yes. Writing a good question is really hard.

  • 2020-12-30 07:19
    import numpy as np

    cols = ['Measure1','Measure2']
    # keep non-string values, replace every string with NaN
    df[cols] = df[cols].applymap(lambda x: x if not isinstance(x, str) else np.nan)
    

    or

    df[cols] = df[cols].applymap(lambda x: np.nan if isinstance(x, str) else x)
    

    Result:

    In [22]: df
    Out[22]:
      Country Name  Measure1  Measure2
    0          nBl      10.0       9.0
    1          Ayp       8.0       NaN
    2          diz       4.0       1.0
    3          aad       7.0       3.0
    4          JYI       NaN      10.0
    5          BJO       9.0       8.0
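
The same replacement can also be written with an explicit boolean mask, which is closer to what the question title asks for. This is a sketch on hypothetical data, not part of the answer above: DataFrame.mask is the inverse of where and replaces cells where the condition is True with NaN. The final astype(float) is an added assumption that every remaining value is numeric.

```python
import pandas as pd

# Hypothetical sample data (not the asker's DataFrame).
df = pd.DataFrame({
    "Country Name": ["AAA", "BBB", "CCC"],
    "Measure1": [10, "xyz", 4],
    "Measure2": [9, 1, "n/a"],
})
cols = ["Measure1", "Measure2"]

# Boolean mask: True where the cell holds a string.
is_str = df[cols].apply(lambda s: s.map(lambda x: isinstance(x, str)))

# mask() replaces cells where the condition is True with NaN;
# astype(float) then turns the object columns into float columns.
df[cols] = df[cols].mask(is_str).astype(float)

print(df)
```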
    