Is it expected for numba's efficient square euclidean distance code to be slower than numpy's efficient counterpart?

后端未结

关注

 1  713

I modify the most efficient code from (Why this numba code is 6x slower than numpy code?) so that it can handle x1 being (n, m)

@nb.njit(fastmath=True,parallel=T


                      
              相关标签:


      
      
        
          1条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  再見小時候        
                
              
                            
                2021-01-24 05:22
              
            
            
                                                                       
Yes, that is expected.

The first thing you must be aware of: dot-product is the working horse of the numpy-version, here for slightly smaller arrays:

>>> def only_dot(x1, x2):
        return - 2*np.dot(x1, x2.T)

>>> a = np.zeros((1000,512), dtype=np.float32)
>>> b = np.zeros((100, 512), dtype=np.float32)

>>> %timeit(euclidean_distance_square_einsum(a,b))
6.08 ms ± 312 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> %timeit(euclidean_only_dot(a,b))
5.25 ms ± 330 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


i.e. 85% of the time are spent in it.

When you look at your numba-code, that looks like a somewhat strange/unusual/more complicated version of matrix-matrix-multiplication - one could see for example the same three loops.

So basically, you are trying to beat one of the best optimized algorithms out there. Here is for example somebody trying to do it and failing. My installation uses Intel's MKL-version, which must be more sophisticated than the default-implementation, which can be found here. 

Sometimes, after the whole fun one had, one has to admit that the own "reinvented wheel" is not as good as the state-of-the-art wheel... but only then one can truly appreciate its performance.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复