Find matching rows in 2 dimensional numpy array


I would like to get the indices of the rows of a 2-dimensional NumPy array that match a given row. For example, my array is this:

vals = np.array([[0, 0],
                 [1


                      
4 Answers

名媛妹妹 (2020-11-30 02:09)

Using the numpy_indexed package, you can simply write:

import numpy as np
import numpy_indexed as npi
print(np.flatnonzero(npi.contains([[0, 1]], vals)))

                                                                        
                                                        
            
            
              
                
Happy的楠姐 (2020-11-30 02:20)

In [5]: np.where((vals[:,0] == 0) & (vals[:,1]==1))[0]
Out[5]: array([ 3, 15])




I'm not sure why, but this is significantly faster than np.where((vals == (0, 1)).all(axis=1)):

In [34]: vals2 = np.tile(vals, (1000,1))

In [35]: %timeit np.where((vals2 == (0, 1)).all(axis=1))[0]
1000 loops, best of 3: 808 µs per loop

In [36]: %timeit np.where((vals2[:,0] == 0) & (vals2[:,1]==1))[0]
10000 loops, best of 3: 152 µs per loop
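
The comparison above can be reproduced outside IPython with the standard timeit module. This is only a minimal sketch: the question's array is truncated on this page, so `vals` below is a small hypothetical stand-in, and the absolute timings will vary with machine and NumPy version. One possible explanation for the gap (a guess, not something measured here) is that the column-wise form combines two 1-D boolean arrays with a single `&`, while `.all(axis=1)` has to reduce a 2-D boolean array along a very short axis.

```python
import timeit

import numpy as np

# Hypothetical stand-in for the question's array (the original is truncated).
vals = np.array([[0, 0], [1, 0], [0, 1], [2, 1], [0, 1], [1, 2]])
vals2 = np.tile(vals, (1000, 1))  # repeat the rows 1000 times, like vals2 above


def match_all(arr):
    """Compare whole rows, then reduce along axis 1."""
    return np.where((arr == (0, 1)).all(axis=1))[0]


def match_columns(arr):
    """Compare each column separately and combine the results with &."""
    return np.where((arr[:, 0] == 0) & (arr[:, 1] == 1))[0]


# Both forms must select the same rows before the timings mean anything.
same = np.array_equal(match_all(vals2), match_columns(vals2))

runs = 200
t_all = timeit.timeit(lambda: match_all(vals2), number=runs)
t_cols = timeit.timeit(lambda: match_columns(vals2), number=runs)
print(f"same result: {same}")
print(f".all(axis=1): {t_all / runs * 1e6:.1f} us per call")
print(f"column-wise:  {t_cols / runs * 1e6:.1f} us per call")
```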

                                                                        
                                                        
            
            
              
                
佛祖请我去吃肉 (2020-11-30 02:29)

You need the np.where function to get the indexes:

>>> np.where((vals == (0, 1)).all(axis=1))
(array([ 3, 15]),)


Or, as the documentation states:


  If only condition is given, return condition.nonzero()


You could directly call .nonzero() on the array returned by .all:

>>> (vals == (0, 1)).all(axis=1).nonzero()
(array([ 3, 15]),)


To break that down:

>>> vals == (0, 1)
array([[ True, False],
       [False, False],
       ...
       [ True, False],
       [False, False],
       [False, False]], dtype=bool)


and calling the .all method on that array (with axis=1) gives you True where both are True:

>>> (vals == (0, 1)).all(axis=1)
array([False, False, False,  True, False, False, False, False, False,
       False, False, False, False, False, False,  True, False, False,
       False, False, False, False, False, False], dtype=bool)


and to get which indexes are True:

>>> np.where((vals == (0, 1)).all(axis=1))
(array([ 3, 15]),)


or

>>> (vals == (0, 1)).all(axis=1).nonzero()
(array([ 3, 15]),)




I find my solution a bit more readable, but as unutbu points out, the following may be faster, and returns the same value as (vals == (0, 1)).all(axis=1):

>>> (vals[:, 0] == 0) & (vals[:, 1] == 1)
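
The steps above can be wrapped in a small helper that works for rows of any width. This is a sketch only: `find_rows` is a hypothetical name, and `vals` is a small stand-in array because the question's array is truncated on this page. It uses `np.flatnonzero`, which returns the indices as a plain 1-D array rather than the one-element tuple that `np.where` and `.nonzero()` return.

```python
import numpy as np


def find_rows(arr, row):
    """Return the indices of the rows of `arr` that equal `row`.

    `find_rows` is a hypothetical helper name, not part of NumPy.
    """
    arr = np.asarray(arr)
    row = np.asarray(row)
    if arr.ndim != 2 or row.shape != (arr.shape[1],):
        raise ValueError("expected a 2-D array and a row of matching width")
    # Broadcasting compares `row` against every row of `arr`;
    # .all(axis=1) keeps only the rows where every column matched.
    mask = (arr == row).all(axis=1)
    return np.flatnonzero(mask)


# Stand-in data (the question's array is truncated on this page).
vals = np.array([[0, 0], [1, 0], [0, 1], [2, 1], [0, 1], [1, 2]])
print(find_rows(vals, (0, 1)))  # rows 2 and 4 are [0, 1]
print(find_rows(vals, (5, 5)))  # no match: empty index array
```

The comparison is exact, which suits integer data like the question's. For floating-point rows, a tolerance-based test such as `np.isclose` may be more appropriate.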

                                                                        
                                                        
            
            
              
                
半阙折子戏 (2020-11-30 02:33)

I believe the numpy_indexed package could do with clearer documentation, so that we could see how best to use it. The npi.contains example above looks like something I could use right away.
I have an M x 2 array of integers A and an N x 2 array of integers B.
I would like to create an M x 1 array C that holds, for each row of A, the index of the matching row of B, that is, the indxB for which A[indxA, 0] == B[indxB, 0] and A[indxA, 1] == B[indxB, 1]. All other elements of C should be -1.
I was thinking that npi.contains might work, but I did not find enough information on the GitHub site to determine whether it could and, if so, how I should use it.
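
For the M x 2 against N x 2 case described above, plain NumPy broadcasting is one option that avoids guessing at the numpy_indexed API. The sketch below is an illustration under stated assumptions, not the package's method: `match_row_indices` is a hypothetical name, B is assumed to be non-empty, the first matching index is returned when a row of A occurs more than once in B, and an M x N boolean array is built along the way, so it suits small to moderate inputs.

```python
import numpy as np


def match_row_indices(A, B):
    """For each row of A, return the index of the first equal row in B, or -1.

    Hypothetical helper (not part of NumPy or numpy_indexed).
    Memory use is O(M * N) because every pair of rows is compared.
    Assumes B has at least one row.
    """
    A = np.asarray(A)
    B = np.asarray(B)
    # eq[i, j] is True when row i of A equals row j of B.
    eq = (A[:, None, :] == B[None, :, :]).all(axis=2)
    # argmax gives the first True in each row of eq;
    # rows of eq with no True at all are mapped to -1.
    return np.where(eq.any(axis=1), eq.argmax(axis=1), -1)


A = np.array([[0, 1], [2, 2], [1, 0]])
B = np.array([[1, 0], [5, 5], [0, 1], [0, 1]])
C = match_row_indices(A, B)
print(C)  # index 2 for [0, 1], -1 for [2, 2] (no match), index 0 for [1, 0]
```

The result has shape (M,). If an M x 1 column is needed, `C[:, None]` reshapes it without copying.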
                                                                        
                                                        
            
            
              
                