Remove entries with NaN values in a Python dictionary

[愿得一人] 2021-01-18 12:47

I have the following dictionary in Python and want to remove the entries whose values contain nan:

OrderedDict([(30, ('A1', 55.0)), (31, ('A2', 125.0)), (32, ('A3', 180.0)), (43, ('A4', nan))])

4 answers

孤街浪徒 (original poster)

2021-01-18 13:09
Since you have pandas, you can leverage pd.Series.notna (pd.Series.notnull is an alias for it) here, which works with mixed dtypes and tests each element of the value tuple.
```
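>>> from collections import OrderedDict
>>> import pandas as pd
>>> # dict_cg is assumed to be the dictionary from the question
>>> dict_cg = OrderedDict([(30, ('A1', 55.0)), (31, ('A2', 125.0)), (32, ('A3', 180.0)), (43, ('A4', float('nan')))])
>>> {k: v for k, v in dict_cg.items() if pd.Series(v).notna().all()}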
{30: ('A1', 55.0), 31: ('A2', 125.0), 32: ('A3', 180.0)}
```
This is not part of the answer, but may help you understand how I've arrived at the solution. I came across some weird behaviour when trying to solve this question, using pd.notnull directly.

Take dict_cg[43].
```
>>> dict_cg[43]
('A4', nan)
```
pd.notnull does not work.
```
>>> pd.notnull(dict_cg[43])
True
```
It treats the tuple as a single value (rather than an iterable of values). Furthermore, converting this to a list and then testing also gives an incorrect answer.
```
>>> pd.notnull(list(dict_cg[43]))
array([ True,  True])
```
Since the second value is nan, the result I'm looking for should be [True, False]. It finally works when you pre-convert to a Series:
```
>>> pd.Series(dict_cg[43]).notnull() 
0     True
1    False
dtype: bool
```
So, the solution is to Series-ify it and then test the values.
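To make the Series-ify-then-test idea reusable, here is a minimal sketch that wraps it in a function. The name `drop_nan_entries` is hypothetical (it does not come from the question), and the sketch assumes every value is a tuple or other sequence of scalars.

```python
from collections import OrderedDict

import pandas as pd


def drop_nan_entries(d):
    """Return a new mapping of the same type, skipping values that contain NaN/None.

    Hypothetical helper: assumes each value is a sequence of scalars.
    """
    # pd.Series(v) yields an object-dtype Series for mixed values,
    # so notna() is evaluated element by element.
    return type(d)((k, v) for k, v in d.items() if pd.Series(v).notna().all())


dict_cg = OrderedDict([
    (30, ("A1", 55.0)),
    (31, ("A2", 125.0)),
    (32, ("A3", 180.0)),
    (43, ("A4", float("nan"))),
])

cleaned = drop_nan_entries(dict_cg)
print(list(cleaned))  # [30, 31, 32]
```

Building a new mapping rather than deleting keys in place avoids mutating the dictionary while iterating over it, and the original `dict_cg` is left untouched.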

Along similar lines, another (admittedly roundabout) solution is to pre-convert to an object dtype numpy array, and pd.notnull will work directly:
```
>>> import numpy as np
>>> pd.notnull(np.array(dict_cg[43], dtype=object))
array([ True, False])
```
I imagine that pd.notnull converts list(dict_cg[43]) to a string array under the covers, rendering the NaN as the string "nan", so it is no longer a "null" value.
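The string-coercion effect itself is easy to reproduce with NumPy alone. The sketch below only shows what `np.array` does to a mixed string/float sequence; whether pd.notnull performs the same conversion internally is a guess and may depend on the pandas version. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

value = ("A4", float("nan"))

# Default conversion: NumPy picks a common string dtype for mixed
# str/float input, so the float NaN becomes the text "nan".
as_str = np.array(value)
print(as_str.dtype.kind)   # U
print(as_str[1] == "nan")  # True

# Forcing dtype=object keeps the float NaN intact, so pd.notnull can see it.
as_obj = np.array(value, dtype=object)
print(pd.notnull(as_obj))  # [ True False]
```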