Getting the index of the last non-NaN value in each row of a sorted NumPy matrix or pandas DataFrame

野的像风 2020-12-21 04:50

Given a NumPy array (or pandas DataFrame) like this:

import numpy as np

a = np.array([
[1,      1,      1,    0.5, np.nan, np.nan, np.nan],
[1,      1,              


        
5 Answers
  • 2020-12-21 05:23

    pandas.Series has a last_valid_index method:

    pd.DataFrame(a.T).apply(pd.Series.last_valid_index)
    Out: 
    0    3
    1    2
    2    6
    3    3
    4    0
    5    3
    dtype: int64
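
The same method can also be called row by row, which avoids the transpose and shows what happens when a row has no valid value. This is a minimal sketch on a hypothetical stand-in array (the array in the question is truncated), not the answerer's code:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the array in the question is truncated, so this is a stand-in.
a = np.array([
    [1.0, 0.5, 0.2, np.nan],
    [1.0, np.nan, np.nan, np.nan],
    [np.nan, np.nan, np.nan, np.nan],
])

df = pd.DataFrame(a)

# last_valid_index returns the column label of the last non-NaN value in a row,
# or None when the row contains no valid value at all.
last_valid = [row.last_valid_index() for _, row in df.iterrows()]
print(last_valid)
```

Here the all-NaN row yields None rather than an integer, so downstream code should be ready for a missing result.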
    
  • 2020-12-21 05:24

    Build a mask of non-NaN entries, reverse the column order, take argmax along axis 1 (the position of the first True in the reversed row), then subtract that from the number of columns minus one:

    a.shape[1] - (~np.isnan(a))[:, ::-1].argmax(1) - 1
    
    array([3, 2, 6, 3, 0, 3])
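
One edge case worth noting: argmax returns 0 for a row whose mask is entirely False, so an all-NaN row would be reported as the last column. The sketch below, on a hypothetical stand-in array, adds a guard that marks such rows with -1 (the sentinel is an assumption for illustration, not part of the original answer):

```python
import numpy as np

# Hypothetical data: the array in the question is truncated, so this is a stand-in.
a = np.array([
    [1.0, 0.5, 0.2, np.nan],
    [1.0, np.nan, np.nan, np.nan],
    [np.nan, np.nan, np.nan, np.nan],
])

valid = ~np.isnan(a)

# Position of the first valid entry counted from the right, mapped back to a
# left-to-right column index.
last = a.shape[1] - valid[:, ::-1].argmax(axis=1) - 1

# Rows without any valid entry get the sentinel -1 instead of a misleading index.
last = np.where(valid.any(axis=1), last, -1)
print(last)
```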
    
  • 2020-12-21 05:36

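    This solution doesn't require the array to be sorted. It returns the index of the last non-NaN item along axis 1: the running count of valid entries reaches its maximum at the last valid position, and argmax returns the first index where that maximum occurs.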

    (~np.isnan(a)).cumsum(1).argmax(1)
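
To illustrate the cumulative-sum approach on rows where NaN values are interleaved with valid ones, here is a small sketch on hypothetical data. An all-NaN row has a running count of zero everywhere, so argmax reports 0 for it; the -1 sentinel below is an added assumption to make that case distinguishable:

```python
import numpy as np

# Hypothetical data with NaN values in the middle of rows, not only at the end.
a = np.array([
    [1.0, np.nan, 3.0, np.nan],
    [np.nan, np.nan, 5.0, 6.0],
    [np.nan, np.nan, np.nan, np.nan],
])

valid = ~np.isnan(a)

# Running count of valid entries per row; it first hits its maximum at the
# last valid column, and argmax returns that first occurrence.
idx = valid.cumsum(axis=1).argmax(axis=1)

# Without a guard, the all-NaN row is indistinguishable from a row whose only
# valid entry is in column 0.
idx_guarded = np.where(valid.any(axis=1), idx, -1)
print(idx, idx_guarded)
```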
    
  • 2020-12-21 05:38

    Here is one way to do it, though probably not the most efficient:

    list(map(lambda x: [i for i, x_ in enumerate(x) if not np.isnan(x_)][-1], a))
    

    It also fails with an IndexError if any row is entirely NaN, because [-1] is then applied to an empty list.
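
A pure-Python variant that tolerates all-NaN rows could scan each row from the right and fall back to None. This is a sketch only; the helper name `last_valid` and the sample array are hypothetical:

```python
import numpy as np

def last_valid(row):
    """Return the index of the last non-NaN entry in row, or None if there is none."""
    # Walk the indices from right to left and stop at the first valid entry.
    return next(
        (i for i in range(len(row) - 1, -1, -1) if not np.isnan(row[i])),
        None,
    )

# Hypothetical data: the array in the question is truncated, so this is a stand-in.
a = np.array([
    [1.0, 0.5, 0.2, np.nan],
    [1.0, np.nan, np.nan, np.nan],
    [np.nan, np.nan, np.nan, np.nan],
])

result = [last_valid(row) for row in a]
print(result)
```

Scanning from the right also stops early when NaN values only appear in a short tail, instead of building a full index list per row.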

  • 2020-12-21 05:46

    If all NaN values sit at the end of each row, you can count the non-NaN entries per row and subtract one:

    (~np.isnan(a)).sum(axis = 1) - 1
    # array([3, 2, 6, 3, 0, 3])
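
The counting approach depends on NaN values appearing only at the end of each row. One possible sanity check, sketched here on hypothetical data, is to compare it against the cumulative-sum method, which does not need that assumption; rows where the two disagree have a NaN before their last valid entry:

```python
import numpy as np

# Hypothetical data: the second row has a NaN in the middle.
a = np.array([
    [1.0, 0.5, np.nan, np.nan],
    [1.0, np.nan, 3.0, np.nan],
    [np.nan, np.nan, np.nan, np.nan],
])

valid = ~np.isnan(a)

# Count-based index: correct only when NaN values are trailing.
# An all-NaN row naturally gives -1.
count_based = valid.sum(axis=1) - 1

# Order-independent index, with -1 for rows that have no valid entry.
general = np.where(valid.any(axis=1), valid.cumsum(axis=1).argmax(axis=1), -1)

# True where the trailing-NaN assumption holds for that row.
trailing_only = count_based == general
print(trailing_only)
```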
    