Getting the last non-nan index of a sorted numpy matrix or pandas dataframe

Given a numpy array (or pandas dataframe) like this:

import numpy as np

a = np.array([
[1,      1,      1,    0.5, np.nan, np.nan, np.nan],
[1,      1,


                      
5 answers

深忆病人 (2020-12-21 05:23)

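pandas.Series has a last_valid_index method. Transposing the array turns each original row into a column, so applying the method column-wise gives one index per row:

import pandas as pd

pd.DataFrame(a.T).apply(pd.Series.last_valid_index)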
Out: 
0    3
1    2
2    6
3    3
4    0
5    3
dtype: int64
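
As a variation on the same idea, the method can be applied along axis=1, which avoids the transpose. The sketch below uses a small hypothetical array named demo (it is not the array from the question, which is cut off above), so the printed indices apply only to this example.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only; NaNs trail each row.
demo = np.array([
    [1.0, 0.5, np.nan, np.nan],
    [1.0, 0.8, 0.6, 0.2],
    [1.0, np.nan, np.nan, np.nan],
])

# axis=1 hands each row to last_valid_index as a Series indexed by column label.
last_idx = pd.DataFrame(demo).apply(pd.Series.last_valid_index, axis=1)
print(last_idx.tolist())  # [1, 3, 0]
```

Note that last_valid_index returns None for a Series with no valid values, so rows that are entirely NaN need separate handling.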

                                                                        
                                                        
            
            
              
                
渐次进展 (2020-12-21 05:24)

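Build a mask of the non-NaN entries, reverse the column order, and take argmax along axis 1 to find the first non-NaN value counted from the right. Subtracting that offset from the last column index (the number of columns minus 1) gives the index of the last non-NaN value: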

a.shape[1] - (~np.isnan(a))[:, ::-1].argmax(1) - 1

array([3, 2, 6, 3, 0, 3])
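
The one-liner can be unpacked step by step. The following is a minimal sketch on a hypothetical array demo (not the question's data). The -1 sentinel for all-NaN rows is an assumption added here: without it, argmax on an all-False row returns 0 and the expression reports the last column.

```python
import numpy as np

# Hypothetical data for illustration only; the last row is entirely NaN.
demo = np.array([
    [1.0, 0.5, np.nan, np.nan],
    [1.0, 0.8, 0.6, 0.2],
    [np.nan, np.nan, np.nan, np.nan],
])

valid = ~np.isnan(demo)                         # True where a value is present
offset_from_right = valid[:, ::-1].argmax(axis=1)  # first True in the reversed row
last_idx = demo.shape[1] - 1 - offset_from_right

# Assumed convention: mark rows without any valid value with -1.
last_idx = np.where(valid.any(axis=1), last_idx, -1)
print(last_idx.tolist())  # [1, 3, -1]
```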

                                                                        
                                                        
            
            
              
                
轮回少年 (2020-12-21 05:36)

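This solution doesn't require the NaN values to be sorted to the end of each row. It returns the index of the last non-NaN item along axis 1: the running count of non-NaN values first reaches its maximum at that position, and argmax returns the first occurrence of the maximum.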

(~np.isnan(a)).cumsum(1).argmax(1)
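
To see why this works when NaNs are scattered through a row, here is a small sketch with a hypothetical array named unsorted (an illustration, not data from the question). The intermediate running count is printed so the argmax step is visible.

```python
import numpy as np

# Hypothetical data for illustration only; the first row has NaNs between values.
unsorted = np.array([
    [np.nan, 2.0, np.nan, 4.0, np.nan],
    [1.0, 2.0, 3.0, np.nan, np.nan],
])

valid = ~np.isnan(unsorted)
running_count = valid.cumsum(axis=1)  # number of valid values seen so far
print(running_count.tolist())         # [[0, 1, 1, 2, 2], [1, 2, 3, 3, 3]]

# argmax picks the first position where the count reaches its maximum.
last_idx = running_count.argmax(axis=1)
print(last_idx.tolist())              # [3, 2]
```

For a row that is entirely NaN the running count stays at zero and argmax returns 0, which cannot be told apart from a genuine index of 0.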

                                                                        
                                                        
            
            
              
                
情歌与酒 (2020-12-21 05:38)

Here is another way to do it, though probably not the most efficient:

list(map(lambda x: [i for i, x_ in enumerate(x) if not np.isnan(x_)][-1], a))

It will also raise an IndexError if any row is entirely NaN, because [-1] is then applied to an empty list.
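
One way to avoid the empty-list failure is to check for valid positions before indexing. This is a minimal sketch; the helper name last_valid_positions, the demo array, and the choice of None for all-NaN rows are assumptions made for illustration.

```python
import numpy as np

def last_valid_positions(rows):
    """Return the last non-NaN column index per row, or None if a row has none."""
    result = []
    for row in rows:
        positions = np.flatnonzero(~np.isnan(row))  # indices of valid entries
        result.append(int(positions[-1]) if positions.size else None)
    return result

# Hypothetical data for illustration only; the last row is entirely NaN.
demo = np.array([
    [1.0, 0.5, np.nan, np.nan],
    [1.0, 0.8, 0.6, 0.2],
    [np.nan, np.nan, np.nan, np.nan],
])
print(last_valid_positions(demo))  # [1, 3, None]
```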
                                                                        
                                                        
            
            
              
                
天命终不由人 (2020-12-21 05:46)

If all nan values have been sorted to the end of each row, you can do something like this:

(~np.isnan(a)).sum(axis=1) - 1
# array([3, 2, 6, 3, 0, 3])
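
Because the count-based result is only correct when the NaNs really do trail each row, it can help to verify that assumption first. The sketch below uses a hypothetical array demo and a hypothetical check named trailing_only; neither comes from the original answer.

```python
import numpy as np

# Hypothetical data for illustration only.
demo = np.array([
    [1.0, 0.5, np.nan, np.nan],         # NaNs trail the row
    [np.nan, 2.0, np.nan, 4.0],         # NaNs are not sorted to the end
    [np.nan, np.nan, np.nan, np.nan],   # entirely NaN
])

valid = ~np.isnan(demo)
count_based = valid.sum(axis=1) - 1
print(count_based.tolist())    # [1, 1, -1]

# A step from NaN to a valid value (difference of +1) breaks the assumption.
trailing_only = (np.diff(valid.astype(int), axis=1) <= 0).all(axis=1)
print(trailing_only.tolist())  # [True, False, True]
```

In this example the second row reports 1 although its last valid value sits at index 3, and the all-NaN row yields -1.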

                                                                        
                                                        
            
            
              
                