Efficiently checking if arbitrary object is NaN in Python / numpy / pandas?

Asked by 执笔经年 on 2020-12-02 10:40

My numpy arrays use np.nan to designate missing values. As I iterate over the data set, I need to detect such missing values and handle them in special ways.

2 Answers
  • 2020-12-02 11:23

Is your type really arbitrary? If you know it is just going to be an int, float, or string, you could do:

     if val.dtype == float and np.isnan(val):
    

Assuming it is wrapped in NumPy, it will always have a dtype, and only float and complex dtypes can hold NaN.
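The dtype guard above can be sketched as a small helper. This is a minimal illustration, not code from the answer: `is_nan` is a hypothetical name, and `np.issubdtype(..., np.inexact)` is used to cover both float and complex dtypes before calling `np.isnan` (which raises `TypeError` on strings and other non-numeric input).

```python
import numpy as np

def is_nan(val):
    """Return True only if val is a float/complex NaN.

    Strings, ints, and other non-numeric values return False instead
    of raising, because the dtype check short-circuits np.isnan.
    """
    return np.issubdtype(np.asarray(val).dtype, np.inexact) and bool(np.isnan(val))

print(is_nan(np.nan))    # True
print(is_nan(3.0))       # False
print(is_nan("apple"))   # False
```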

  • 2020-12-02 11:27

    pandas.isnull() (also pd.isna(), in newer versions) checks for missing values in both numeric and string/object arrays. From the documentation, it checks for:

    NaN in numeric arrays, None/NaN in object arrays

    Quick example:

    import pandas as pd
    import numpy as np
    s = pd.Series(['apple', np.nan, 'banana'])
    pd.isnull(s)
    Out[9]: 
    0    False
    1     True
    2    False
    dtype: bool
    

    The idea of using numpy.nan to represent missing values is something that pandas introduced, which is why pandas has the tools to deal with it.

    Datetimes too (if you use pd.NaT you won't need to specify the dtype)

    In [24]: s = pd.Series([pd.Timestamp('20130101'), np.nan, pd.Timestamp('20130102 9:30')], dtype='M8[ns]')
    
    In [25]: s
    Out[25]: 
    0   2013-01-01 00:00:00
    1                   NaT
    2   2013-01-02 09:30:00
    dtype: datetime64[ns]
    
    In [26]: pd.isnull(s)
    Out[26]: 
    0    False
    1     True
    2    False
    dtype: bool
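Since the question iterates over the data element by element, it may help to note that `pd.isnull()` also accepts scalars of any type, so it can be used directly inside a loop without a dtype check. A minimal sketch:

```python
import numpy as np
import pandas as pd

# pd.isnull works on scalars of heterogeneous types: it returns True
# for NaN, None, and NaT, and False for ordinary values.
for val in ['apple', np.nan, None, 3.0, pd.NaT]:
    print(repr(val), pd.isnull(val))
```

This makes `pd.isnull(val)` a reasonable per-element missing-value test when the element type is genuinely arbitrary.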
    