Delete element from multi-dimensional numpy array by value


Given a numpy array

a = np.array([[0, -1, 0], [1, 0, 0], [1, 0, -1]])

What's the fastest way to delete all elements with value -1?


                      
5 answers

-上瘾入骨i, 2020-12-18 10:55

How about this?

print([[y for y in x if y != -1] for x in a.tolist()])
[[0, 0], [1, 0, 0], [1, 0]]

                                                                        
                                                        
            
            
              
                
时光说笑, 2020-12-18 10:56

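Approach #1: Using NumPy array splitting -

import numpy as np

def split_based(a, val):
    mask = a != val                      # True where the element is kept
    # Flatten the kept elements, then cut them at the row boundaries
    p = np.split(a[mask], mask.sum(1)[:-1].cumsum())
    # Rows can differ in length, so an object array is required
    out = np.array(list(map(list, p)), dtype=object)
    return out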


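Approach #2: Using a list comprehension, with minimal work inside the loop -

def loop_compr_based(a, val):
    mask = a != val
    stop = mask.sum(1).cumsum()          # end offset of each row in the flat result
    start = np.append(0, stop[:-1])      # start offset of each row
    am = a[mask].tolist()                # kept elements as a flat Python list
    # Rows can differ in length, so an object array is required
    out = np.array([am[start[i]:stop[i]] for i in range(len(start))], dtype=object)
    return out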


Sample run -

In [391]: a
Out[391]: 
array([[ 0, -1,  0],
       [ 1,  0,  0],
       [ 1,  0, -1],
       [-1, -1,  8],
       [ 3,  7,  2]])

In [392]: split_based(a, val=-1)
Out[392]: array([[0, 0], [1, 0, 0], [1, 0], [8], [3, 7, 2]], dtype=object)

In [393]: loop_compr_based(a, val=-1)
Out[393]: array([[0, 0], [1, 0, 0], [1, 0], [8], [3, 7, 2]], dtype=object)


Runtime test -

In [387]: a = np.random.randint(-2,10,(1000,1000))

In [388]: %timeit split_based(a, val=-1)
10 loops, best of 3: 161 ms per loop

In [389]: %timeit loop_compr_based(a, val=-1)
10 loops, best of 3: 29 ms per loop

                                                                        
                                                        
            
            
              
                
说谎, 2020-12-18 10:57

Another method you might consider:

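def iterative_numpy(a):
    mask = a != -1                       # True where the element is kept
    # Filter each row with its own mask; rows can differ in length,
    # so an object array is required
    out = np.array([a[i, mask[i]] for i in range(a.shape[0])], dtype=object)
    return out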


Divakar's method loop_compr_based computes the row sums of mask and then the cumulative sum of that result. This method avoids those summations but still has to iterate over the rows of a. It also returns an array of arrays, which has the annoyance that out must be indexed as out[1][2] rather than out[1,2]. Comparing the timings on random integer matrices:

In [4]: a = np.random.random_integers(-1,1, size = (3,30))

In [5]: %timeit iterative_numpy(a)
100000 loops, best of 3: 11.1 us per loop

In [6]: %timeit loop_compr_based(a)
10000 loops, best of 3: 20.2 us per loop

In [7]: a = np.random.random_integers(-1,1, size = (30,3))

In [8]: %timeit iterative_numpy(a)
10000 loops, best of 3: 59.5 us per loop

In [9]: %timeit loop_compr_based(a)
10000 loops, best of 3: 30.8 us per loop

In [10]: a = np.random.random_integers(-1,1, size = (30,30))

In [11]: %timeit iterative_numpy(a)
10000 loops, best of 3: 64.6 us per loop

In [12]: %timeit loop_compr_based(a)
10000 loops, best of 3: 36 us per loop


When there are more columns than rows, iterative_numpy wins. When there are more rows than columns, loop_compr_based wins, though transposing a first improves the performance of both methods (the result is then grouped by the original columns rather than by rows). When the dimensions are comparable, loop_compr_based is best.

Important Side Discussion

Outside of the implementation, it's important to note that a numpy array with non-uniform row lengths is not an actual array in the usual sense: the values do not occupy a contiguous section of memory, and the usual array operations will not work as expected.

As an example:

>>> a = np.array([[1,2,3],[1,2],[1]], dtype=object)
>>> a*2
array([list([1, 2, 3, 1, 2, 3]), list([1, 2, 1, 2]), list([1, 1])], dtype=object)


Notice that numpy actually informs us that this is not the usual numpy array with the note dtype=object.

Thus it might be best to just make a list of numpy arrays and use them accordingly.
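
As a rough sketch of that suggestion, the snippet below builds a plain Python list with one 1-D array per row, using the array from the question. The helper name drop_value_per_row is made up for illustration and does not come from any of the answers.

```python
import numpy as np

a = np.array([[0, -1, 0], [1, 0, 0], [1, 0, -1]])

def drop_value_per_row(arr, val):
    """Return a Python list holding one 1-D array per row, with `val` removed."""
    # Boolean indexing on each row keeps only the elements that differ from val.
    return [row[row != val] for row in arr]

rows = drop_value_per_row(a, -1)

# Each entry is an ordinary integer array, so arithmetic is element-wise again.
doubled = [row * 2 for row in rows]
for row, d in zip(rows, doubled):
    print(row, d)
```

Because every entry is a regular array, operations such as row * 2 behave numerically instead of repeating a list, at the cost of looping over the rows in Python.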
                                                                        
                                                        
            
            
              
                
旧巷少年郎, 2020-12-18 11:00

For almost everything you might want to do with such an array, you can use a masked array:

a = np.array([[0, -1, 0], [1, 0, 0], [1, 0, -1]])

b=np.ma.masked_equal(a,-1)

b
Out[5]: 
masked_array(data =
 [[0 -- 0]
 [1 0 0]
 [1 0 --]],
             mask =
 [[False  True False]
 [False False False]
 [False False  True]],
       fill_value = -1)


If you really want the ragged array, you can call .compressed() on each row (dtype=object is needed because the rows differ in length):

c=np.array([b[i].compressed() for i in range(b.shape[0])], dtype=object)

c
Out[10]: array([array([0, 0]), array([1, 0, 0]), array([1, 0])], dtype=object)
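
To illustrate why the masked array is often enough on its own, here is a minimal sketch of per-row reductions that skip the masked -1 entries. The variable names are chosen for this example only.

```python
import numpy as np

a = np.array([[0, -1, 0], [1, 0, 0], [1, 0, -1]])
b = np.ma.masked_equal(a, -1)

# Reductions ignore masked entries, so no ragged structure is needed.
row_sums = b.sum(axis=1)       # sum of the unmasked values in each row
row_counts = b.count(axis=1)   # number of unmasked values in each row
row_means = b.mean(axis=1)     # mean over the unmasked values only

print(row_sums, row_counts, row_means)
```

The array keeps its rectangular shape, so indexing such as b[1, 2] still works, which is not the case for the ragged result.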

                                                                        
                                                        
            
            
              
                
遇见更好的自我, 2020-12-18 11:08

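Use indexes = np.flatnonzero(a == -1) to get the flat indexes of the matching elements (for a 2-D input, np.where(a == -1) would instead return a tuple of row and column index arrays).
See: Find indices of elements equal to zero from numpy array

Then delete those elements by index with np.delete(your_array, indexes). Without an axis argument, np.delete works on the flattened array, so the result is 1-D rather than one row per original row.
See: How to remove specific elements in a numpy array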
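
A small sketch of the index-based approach on the array from the question, assuming a flattened 1-D result is acceptable. It uses np.flatnonzero to obtain flat positions, since np.delete without an axis operates on the flattened array.

```python
import numpy as np

a = np.array([[0, -1, 0], [1, 0, 0], [1, 0, -1]])

# Positions of the -1 entries in the flattened array.
flat_idx = np.flatnonzero(a == -1)

# With no axis given, np.delete flattens `a` and drops those positions.
flat = np.delete(a, flat_idx)

# Boolean indexing gives the same flattened result in one step.
same = a[a != -1]

print(flat_idx, flat, same)
```

Both results lose the row structure, so this only fits cases where the per-row grouping does not matter.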
                                                                        
                                                        
            
            
              
                