Weird null checking behaviour by pd.notnull

Posted by 巧了我就是萌 on 2021-01-27 16:07:00

Question


This is essentially a rehashing of the content of my answer here.

I came across some weird behaviour when trying to solve this question, using pd.notnull.

Consider

x = ('A4', np.nan)

I want to check which of these items are null. Using np.isnan directly will throw a TypeError (but I've figured out how to solve that).
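For reference, here is a minimal sketch of the TypeError and one possible way around it. The helper name null_flags is hypothetical, and this is not necessarily the workaround alluded to above; it simply tests each element as a scalar so that no array conversion takes place.

```python
import numpy as np
import pandas as pd

x = ('A4', np.nan)

# np.isnan has no implementation for string data, so the mixed tuple fails.
try:
    np.isnan(x)
    isnan_failed = False
except TypeError:
    isnan_failed = True

def null_flags(values):
    # Hypothetical helper: pd.isnull on a scalar returns a single boolean,
    # so each element is checked on its own, with no array coercion.
    return [bool(pd.isnull(v)) for v in values]

flags = null_flags(x)
print(isnan_failed, flags)
```

Checking scalars one by one is slower than a vectorised call, but it sidesteps any dtype inference on the container.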

Using pd.notnull does not work.

>>> pd.notnull(x)
True

It treats the tuple as a single value (rather than an iterable of values). Furthermore, converting this to a list and then testing also gives an incorrect answer.

>>> pd.notnull(list(x))
array([ True,  True])

Since the second value is nan, the result I'm looking for should be [True, False]. It finally works when you pre-convert to a Series:

>>> pd.Series(x).notnull() 
0     True
1    False
dtype: bool

So, the solution is to Series-ify it and then test the values.
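A small sketch of that approach wrapped in a function. The name notnull_flags is hypothetical, and passing dtype=object is my own assumption to keep the values untouched; the snippet above relies on the default dtype inference.

```python
import numpy as np
import pandas as pd

def notnull_flags(values):
    # Hypothetical helper: build a Series first, then test element-wise.
    # dtype=object is an assumption here; it keeps NaN as a float.
    return pd.Series(list(values), dtype=object).notnull().tolist()

result = notnull_flags(('A4', np.nan))
print(result)
```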

Along similar lines, another (admittedly roundabout) solution is to pre-convert to an object-dtype NumPy array, after which pd.notnull works directly (np.isnan, by contrast, still raises a TypeError on an object array):

>>> pd.notnull(np.array(x, dtype=object))
array([ True, False])

I imagine that pd.notnull directly converts x to a string array under the covers, rendering the NaN as a string "nan", so it is no longer a "null" value.

Is pd.notnull doing the same thing here? Or is there something else going on under the covers that I should be aware of?

Notes

In [156]: pd.__version__
Out[156]: '0.22.0'

Answer 1:


Here is the issue related to this behavior: https://github.com/pandas-dev/pandas/issues/20675.

In short, if the argument passed to notnull is a list, it is converted internally to a NumPy array with np.asarray. The bug occurred because, when no dtype is specified, NumPy converts np.nan to the string 'nan', which pd.isnull does not recognize as a null value:

import numpy as np

a = ['A4', np.nan]
np.asarray(a)
# array(['A4', 'nan'], dtype='<U3')

This problem was fixed in version 0.23.0, by calling np.asarray with dtype=object.
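The difference can be reproduced by doing the conversion explicitly before calling pd.notnull. This is only a sketch of the mechanism described above, not the actual pandas source; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

a = ['A4', np.nan]

# Without a dtype, NumPy picks a string dtype and NaN becomes the text 'nan'.
as_str = np.asarray(a)
# With dtype=object, each element is kept as-is, so NaN stays a float.
as_obj = np.asarray(a, dtype=object)

str_flags = pd.notnull(as_str).tolist()
obj_flags = pd.notnull(as_obj).tolist()

print(as_str.dtype.kind, str_flags)  # string array: NaN no longer detected
print(as_obj.dtype, obj_flags)       # object array: NaN detected
```

Converting explicitly like this gives the same result regardless of how a given pandas version treats lists.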



Source: https://stackoverflow.com/questions/51035790/weird-null-checking-behaviour-by-pd-notnull
