I am trying to compare the performance of boost::multi_array to native dynamically allocated arrays, with the following test program:
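In outline, the test does something like the following (the sizes, the constant being written, and the names here are placeholders for illustration, not the exact original listing):

    #include <boost/multi_array.hpp>
    #include <ctime>
    #include <iostream>

    int main()
    {
        // Placeholder sizes and fill value; the real test uses the same kind of
        // nested loops, not necessarily these numbers.
        const int X_SIZE = 200;
        const int Y_SIZE = 200;
        const int ITERATIONS = 500;

        typedef boost::multi_array<double, 2> ImageType;
        ImageType boostMatrix(boost::extents[X_SIZE][Y_SIZE]);

        std::clock_t start = std::clock();
        for (int i = 0; i < ITERATIONS; ++i)
            for (int x = 0; x < X_SIZE; ++x)
                for (int y = 0; y < Y_SIZE; ++y)
                    boostMatrix[x][y] = 2.345;
        std::cout << "boost:  "
                  << double(std::clock() - start) / CLOCKS_PER_SEC << " s\n";

        double *nativeMatrix = new double[X_SIZE * Y_SIZE];

        start = std::clock();
        for (int i = 0; i < ITERATIONS; ++i)
            for (int x = 0; x < X_SIZE; ++x)
                for (int y = 0; y < Y_SIZE; ++y)
                    nativeMatrix[x * Y_SIZE + y] = 2.345;
        std::cout << "native: "
                  << double(std::clock() - start) / CLOCKS_PER_SEC << " s\n";

        delete[] nativeMatrix;
        return 0;
    }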
Are you building release or debug?
If you are running in debug mode, the Boost array might be really slow because its template machinery isn't inlined properly, giving a lot of function-call overhead. I'm not sure how multi_array is implemented, though, so this might be totally off :)
Perhaps there is some difference in storage order as well, so you might have your image stored column by column while writing it row by row. That would give poor cache behavior and could slow things down.
Try switching the order of the X and Y loops and see if you gain anything. There is some info on the storage ordering here: http://www.boost.org/doc/libs/1_37_0/libs/multi_array/doc/user.html
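For example, assuming the default C-style (row-major) storage order and a 2-D array of double, the last index should vary fastest in the inner loop; a minimal sketch (function and variable names made up for illustration):

    #include <boost/multi_array.hpp>
    #include <cstddef>

    // boost::multi_array uses C-style (row-major) storage by default, so the
    // last index should vary fastest for contiguous memory access.
    void fill(boost::multi_array<double, 2>& matrix)
    {
        const std::size_t X_SIZE = matrix.shape()[0];
        const std::size_t Y_SIZE = matrix.shape()[1];

        // Cache-friendly with the default ordering: y (the last index) innermost.
        for (std::size_t x = 0; x < X_SIZE; ++x)
            for (std::size_t y = 0; y < Y_SIZE; ++y)
                matrix[x][y] = 2.345;

        // Cache-unfriendly with the default ordering: every write jumps Y_SIZE doubles.
        for (std::size_t y = 0; y < Y_SIZE; ++y)
            for (std::size_t x = 0; x < X_SIZE; ++x)
                matrix[x][y] = 2.345;
    }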
EDIT: Since you seem to be using the two-dimensional array for image processing, you might be interested in checking out Boost's image processing library, GIL.
It might have array types with less overhead that work perfectly for your situation.
I've compiled the code (with slight modifications) under VC++ 2010 with optimisation turned on ("Maximize Speed" together with inlining of "Any Suitable" functions and "Favor Fast Code") and got times of 0.015/0.391. I've generated an assembly listing and, though I'm a terrible assembly noob, there's one line inside the Boost-measuring loop which doesn't look good to me:
call ??A?$multi_array_ref@N$01@boost@@QAE?AV?$sub_array@N$00@multi_array@detail@1@H@Z ; boost::multi_array_ref<double,2>::operator[]
One of the [] operators didn't get inlined! The called procedure makes another call, this time to multi_array::value_accessor_n<...>::access<...>():
call ??$access@V?$sub_array@N$00@multi_array@detail@boost@@PAN@?$value_accessor_n@N$01@multi_array@detail@boost@@IBE?AV?$sub_array@N$00@123@U?$type@V?$sub_array@N$00@multi_array@detail@boost@@@3@HPANPBIPBH3@Z ; boost::detail::multi_array::value_accessor_n<double,2>::access<boost::detail::multi_array::sub_array<double,1>,double *>
Altogether, the two procedures are quite a lot of code for simply accessing a single element in the array. My general impression is that the library is so complex and high-level that Visual Studio is unable to optimise it as much as we would like (posters using gcc apparently have got better results).
IMHO, a good compiler really should have inlined and optimised both procedures - they are pretty short and straightforward, and contain no loops. A lot of time may be wasted simply on passing their arguments and results.
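If the compiler refuses to inline the operator[] chain, one thing that might be worth trying (just a sketch on my part, not something from the original program) is to go through the contiguous block that multi_array exposes via data() and index it manually:

    #include <boost/multi_array.hpp>
    #include <cstddef>

    // Sketch: skip the nested operator[] proxies and index the contiguous block
    // that multi_array keeps internally. Only valid for the default C-style
    // storage order and a zero index base.
    void fill_via_data(boost::multi_array<double, 2>& matrix)
    {
        const std::size_t rows = matrix.shape()[0];
        const std::size_t cols = matrix.shape()[1];
        double* p = matrix.data();          // pointer to the underlying storage

        for (std::size_t x = 0; x < rows; ++x)
            for (std::size_t y = 0; y < cols; ++y)
                p[x * cols + y] = 2.345;    // same element as matrix[x][y]
    }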
I think I know what the problem is...maybe.
In order for the Boost implementation to support syntax like matrix[x][y], matrix[x] has to return a proxy object which acts like a 1-D array, at which point proxy[y] gives you your element.
The problem here is that you are iterating in row-major order (which is typical in C/C++, since native arrays are row-major, IIRC). The compiler has to re-evaluate matrix[x] for each y in this case. If you iterated in column-major order when using the Boost matrix, you might see better performance.
Just a theory.
EDIT: On my Linux system (with some minor changes) I tested my theory, and switching x and y did show some performance improvement, but it was still slower than a native array. This might simply be a matter of the compiler not being able to optimize away the temporary reference (proxy) type.
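If the repeated matrix[x] is indeed the problem, hoisting the row proxy out of the inner loop should show it; here is roughly what I mean (function and variable names are made up):

    #include <boost/multi_array.hpp>
    #include <cstddef>

    // Sketch: evaluate matrix[x] once per row instead of once per element.
    // operator[] returns a sub_array proxy, which can be kept in a local
    // for the duration of the inner loop.
    void fill_hoisted(boost::multi_array<double, 2>& matrix)
    {
        const std::size_t rows = matrix.shape()[0];
        const std::size_t cols = matrix.shape()[1];

        for (std::size_t x = 0; x < rows; ++x)
        {
            boost::multi_array<double, 2>::reference row = matrix[x];  // proxy for one row
            for (std::size_t y = 0; y < cols; ++y)
                row[y] = 2.345;
        }
    }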
A similar question was asked and answered here:
http://www.codeguru.com/forum/archive/index.php/t-300014.html
The short answer is that it is easiest for the compiler to optimize the simple arrays, and not so easy to optimize the Boost version. Hence, a particular compiler may not give the Boost version all the same optimization benefits.
Compilers can also vary in how well they will optimize vs. how conservative they will be (e.g. with templated code or other complications).
I modified the above code in Visual Studio 2008 v9.0.21022 and applied the container routines from Numerical Recipes for C and C++
(http://www.nrbook.com/nr3/), using their licensed routines dmatrix and MatDoub respectively.
dmatrix uses the outdated malloc-style allocation and is not recommended; MatDoub uses new.
The times in seconds for the Release build are:
Boost: 0.437
Native: 0.032
Numerical Recipes C: 0.031
Numerical Recipes C++: 0.031
So, from the above, Blitz looks like the best free alternative.
I am wondering two things:
1) Bounds checking: define the BOOST_DISABLE_ASSERTS preprocessor macro prior to including multi_array.hpp in your application (see the sketch after this list). This turns off bounds checking; I am not sure whether it is also disabled when NDEBUG is defined.
2) Base index: MultiArray can index arrays from bases different from 0. That means multi_array stores a base offset (for each dimension) and uses a more complicated formula to obtain the exact location in memory; I am wondering whether that accounts for the difference.
Otherwise I don't understand why multi_array should be slower than C arrays.
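For point 1), the define just has to appear before the header is pulled in (or be passed on the compiler command line); a minimal sketch of what I mean:

    // Sketch for point 1): disable multi_array's per-access range checks.
    // The define must come before multi_array.hpp is included (or be passed
    // on the compiler command line, e.g. -DBOOST_DISABLE_ASSERTS).
    #define BOOST_DISABLE_ASSERTS
    #include <boost/multi_array.hpp>

    int main()
    {
        boost::multi_array<double, 2> m(boost::extents[100][100]);
        m[5][7] = 1.0;   // no BOOST_ASSERT range check in this build
        return 0;
    }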