First non-null value per row from a list of Pandas columns


If I've got a DataFrame in pandas which looks something like:

    A   B   C
0   1 NaN   2
1 NaN   3 NaN
2 NaN   4   5
3 NaN NaN NaN

How ca


9 answers

夕颜

2020-11-27 20:07

import numpy
import pandas

df = pandas.DataFrame({'A': [1, numpy.nan, numpy.nan, numpy.nan], 'B': [numpy.nan, 3, 4, numpy.nan], 'C': [2, numpy.nan, 5, numpy.nan]})

df
     A    B    C
0  1.0  NaN  2.0
1  NaN  3.0  NaN
2  NaN  4.0  5.0
3  NaN  NaN  NaN

df.apply(lambda x: numpy.nan if all(x.isnull()) else x[x.first_valid_index()], axis=1).tolist()
[1.0, 3.0, 4.0, nan]
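
Since the question asks about a list of columns, the same idea can be restricted to a chosen subset. The following is a minimal sketch, not the answerer's code: the helper name `first_valid` and the list `cols` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, np.nan, np.nan, np.nan],
    "B": [np.nan, 3, 4, np.nan],
    "C": [2, np.nan, 5, np.nan],
})

def first_valid(row):
    # first_valid_index() returns None when every entry in the row is null
    label = row.first_valid_index()
    return np.nan if label is None else row[label]

cols = ["A", "B", "C"]  # hypothetical column list; its order decides priority
result = df[cols].apply(first_valid, axis=1)
print(result.tolist())
```

Because `apply` with `axis=1` calls the function once per row, this is easy to read but can be slow on large frames.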


感动是毒

2020-11-27 20:11

This is nothing new, but it combines the best bits of @yangie's list-comprehension approach with @EdChum's df.apply approach, which I think is the easiest to understand.

First, which columns do we want to pick our values from?

In [95]: pick_cols = df.apply(pd.Series.first_valid_index, axis=1)

In [96]: pick_cols
Out[96]: 
0       A
1       B
2       B
3    None
dtype: object

Now how do we pick the values?

In [100]: [df.loc[k, v] if v is not None else None 
    ....:     for k, v in pick_cols.items()]
Out[100]: [1.0, 3.0, 4.0, None]

This is ok, but we really want the index to match that of the original DataFrame:

In [98]: pd.Series({k:df.loc[k, v] if v is not None else None
   ....:     for k, v in pick_cols.items()})
Out[98]: 
0     1
1     3
2     4
3   NaN
dtype: float64
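
Put together as a script, the two steps might look like the sketch below. It assumes a recent pandas, where `Series.items()` is the available spelling of the iteration method, and it tests the picked label with `pd.isna` rather than `is not None` so that it does not matter whether the missing label is stored as `None` or `NaN`. The name `first_values` is an assumption for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, np.nan, np.nan, np.nan],
    "B": [np.nan, 3, 4, np.nan],
    "C": [2, np.nan, 5, np.nan],
})

# Step 1: label of the first non-null column in each row (missing if none)
pick_cols = df.apply(pd.Series.first_valid_index, axis=1)

# Step 2: look each value up, keeping the original row index
first_values = pd.Series(
    [np.nan if pd.isna(col) else df.loc[row, col]
     for row, col in pick_cols.items()],
    index=df.index,
)
print(first_values)
```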


猫巷女王i

2020-11-27 20:11
groupby in axis=1

If we pass a callable that returns the same value for every column label, all columns end up in a single group. We can then call the groupby `first` method, which returns the first non-null value in each group.
```
df.groupby(lambda x: 'Z', axis=1).first()

     Z
0  1.0
1  3.0
2  4.0
3  NaN
```
This returns a DataFrame whose single column is named after the value returned by the callable.
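
Grouping along `axis=1` is deprecated in newer pandas releases, which suggest grouping the transposed frame instead. A minimal sketch of that variant, assuming the same example data (the name `first_non_null` is made up for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, np.nan, np.nan, np.nan],
    "B": [np.nan, 3, 4, np.nan],
    "C": [2, np.nan, 5, np.nan],
})

# Transpose so the original columns become the index, put every label in the
# single group 'Z', take the first non-null value per group, transpose back.
first_non_null = df.T.groupby(lambda label: "Z").first().T
print(first_non_null)
```

The result has the same shape as before: one column named `Z`, indexed like the original frame.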

lookup, notna, and idxmax
```
df.lookup(df.index, df.notna().idxmax(1))

array([ 1.,  3.,  4., nan])
```
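`DataFrame.lookup` was deprecated in pandas 1.2 and removed in 2.0, so the call above fails on current versions. The sketch below does the same label lookup with NumPy integer indexing; the variable names are assumptions, and the approach relies on `idxmax` returning the first column label when a row has no non-null value.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, np.nan, np.nan, np.nan],
    "B": [np.nan, 3, 4, np.nan],
    "C": [2, np.nan, 5, np.nan],
})

# Label of the first True in each row of the not-null mask
# (for an all-null row this is simply the first column)
first_col = df.notna().idxmax(axis=1)

# Translate labels to integer positions and index the underlying array
row_pos = np.arange(len(df))
col_pos = df.columns.get_indexer(first_col)
values = df.to_numpy()[row_pos, col_pos]
print(values)
```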
argmin and slicing
```
import numpy as np

v = df.values
v[np.arange(len(df)), np.isnan(v).argmin(1)]

array([ 1.,  3.,  4., nan])
```
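This works because `np.isnan(v)` is `False` at non-null cells and `argmin` returns the position of the first minimum, that is, the first `False`. For an all-NaN row every entry is `True`, so `argmin` returns 0 and the value picked is NaN anyway. A sketch wrapping this in a helper that takes a column list (the function name `first_non_null` and its signature are hypothetical, and numeric columns are assumed):

```python
import numpy as np
import pandas as pd

def first_non_null(frame, columns):
    """Return the first non-NaN value per row, scanning `columns` in order."""
    v = frame[columns].to_numpy(dtype=float)
    # Position of the first non-NaN entry in each row (0 if the row is all NaN)
    first = np.isnan(v).argmin(axis=1)
    return pd.Series(v[np.arange(len(v)), first], index=frame.index)

df = pd.DataFrame({
    "A": [1, np.nan, np.nan, np.nan],
    "B": [np.nan, 3, 4, np.nan],
    "C": [2, np.nan, 5, np.nan],
})

forward = first_non_null(df, ["A", "B", "C"])
reverse = first_non_null(df, ["C", "B", "A"])
print(forward.tolist(), reverse.tolist())
```

The order of the column list sets the priority, so reversing it picks values from `C` before `B` and `A`.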