Get index of a row of a pandas dataframe as an integer

前端未结

关注

 4  1844

Assume an easy dataframe, for example

    A         B
0   1  0.810743
1   2  0.595866
2   3  0.154888
3   4  0.472721
4   5  0.894525
5   6  0.978174
6   7


                      
              相关标签:


      
      
        
          4条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  耶瑟儿～        
                
              
                            
                2020-12-02 10:21
              
            
            
                                                                       
The nature of wanting to include the row where A == 5 and all rows upto but not including the row where A == 8 means we will end up using iloc (loc includes both ends of slice).

In order to get the index labels we use idxmax.  This will return the first position of the maximum value.  I run this on a boolean series where A == 5 (then when A == 8) which returns the index value of when A == 5 first happens (same thing for A == 8).

Then I use searchsorted to find the ordinal position of where the index label (that I found above) occurs.  This is what I use in iloc.

i5, i8 = df.index.searchsorted([df.A.eq(5).idxmax(), df.A.eq(8).idxmax()])
df.iloc[i5:i8]






numpy  

you can further enhance this by using the underlying numpy objects the analogous numpy functions.  I wrapped it up into a handy function.

def find_between(df, col, v1, v2):
    vals = df[col].values
    mx1, mx2 = (vals == v1).argmax(), (vals == v2).argmax()
    idx = df.index.values
    i1, i2 = idx.searchsorted([mx1, mx2])
    return df.iloc[i1:i2]

find_between(df, 'A', 5, 8)






timing


                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  忘了有多久        
                
              
                            
                2020-12-02 10:22
              
            
            
                                                                       
To answer the original question on how to get the index as an integer for the desired selection, the following will work :

df[df['A']==5].index.item()

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  悲哀的现实        
                
              
                            
                2020-12-02 10:29
              
            
            
                                                                       
Little sum up for searching by row:
This can be useful if you don't know the column values or if columns have non-numeric values
if u want get index number as integer u can also do:
item = df[4:5].index.item()
print(item)
4

it also works in numpy / list:
numpy = df[4:7].index.to_numpy()[0]
lista = df[4:7].index.to_list()[0]

in [x] u pick number in range [4:7], for example if u want 6:
numpy = df[4:7].index.to_numpy()[2]
print(numpy)
6

for DataFrame:
df[4:7]

    A          B
4   5   0.894525
5   6   0.978174
6   7   0.859449

or:
df[(df.index>=4) & (df.index<7)]

    A          B
4   5   0.894525
5   6   0.978174
6   7   0.859449   

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  北恋        
                
              
                            
                2020-12-02 10:36
              
            
            
                                                                       
The easier is add [0] - select first value of list with one element:

dfb = df[df['A']==5].index.values.astype(int)[0]
dfbb = df[df['A']==8].index.values.astype(int)[0]




dfb = int(df[df['A']==5].index[0])
dfbb = int(df[df['A']==8].index[0])


But if possible some values not match, error is raised, because first value not exist.

Solution is use next with iter for get default parameetr if values not matched:

dfb = next(iter(df[df['A']==5].index), 'no match')
print (dfb)
4

dfb = next(iter(df[df['A']==50].index), 'no match')
print (dfb)
no match


Then it seems need substract 1:

print (df.loc[dfb:dfbb-1,'B'])
4    0.894525
5    0.978174
6    0.859449
Name: B, dtype: float64


Another solution with boolean indexing or query:

print (df[(df['A'] >= 5) & (df['A'] < 8)])
   A         B
4  5  0.894525
5  6  0.978174
6  7  0.859449

print (df.loc[(df['A'] >= 5) & (df['A'] < 8), 'B'])
4    0.894525
5    0.978174
6    0.859449
Name: B, dtype: float64




print (df.query('A >= 5 and A < 8'))
   A         B
4  5  0.894525
5  6  0.978174
6  7  0.859449

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复