C++ Array vs vector

前端未结

关注

 8  998

when using C++ vector, time spent is 718 milliseconds, while when I use Array, time is almost 0 milliseconds.

Why so much performance difference?

int


                      
              相关标签:


      
      
        
          8条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  情歌与酒        
                
              
                            
                2021-01-03 06:32
              
            
            
                                                                       
When profiling code, make sure you are comparing similar things.

vector<int> v(size*size); 


initializes each element in the vector,

int arr[size*size]; 


doesn't. Try

int arr[size * size];
memset( arr, 0, size * size );


and measure again...
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  孤独总比滥情好        
                
              
                            
                2021-01-03 06:36
              
            
            
                                                                       
Your array arr is allocated on the stack, i.e., the compiler has calculated the necessary space at compile time. At the beginning of the method, the compiler will insert an assembler statement like

sub esp, 10000*10000*sizeof(int)


which means the stack pointer (esp) is decreased by 10000 * 10000 * sizeof(int) bytes to make room for an array of 10000² integers. This operation is almost instant. 

The vector is heap allocated and heap allocation is much more expensive. When the vector allocates the required memory, it has to ask the operating system for a contiguous chunk of memory and the operating system will have to perform significant work to find this chunk of memory.

As Andreas says in the comments, all your time is spent in this line:

vector<int> v(size*size); 


Accessing the vector inside the loop is just as fast as for the array.

For an additional overview see e.g.


[What and where are the stack and heap?
[http://computer.howstuffworks.com/c28.htm][2]
[http://www.cprogramming.com/tutorial/virtual_memory_and_heaps.html][3]


Edit:

After all the comments about performance optimizations and compiler settings, I did some measurements this morning. I had to set size=3000 so I did my measurements with roughly a tenth of the original entries. All measurements performed on a 2.66 GHz Xeon:


With debug settings in Visual Studio 2008 (no optimization, runtime checks, and debug runtime) the vector test took 920 ms compared to 0 ms for the array test. 

98,48 % of the total time was spent in vector::operator[], i.e., the time was indeed spent on the runtime checks. 
With full optimization, the vector test needed 56 ms (with a tenth of the original number of entries) compared to 0 ms for the array. 

The vector ctor required 61,72 % of the total application running time. 


So I guess everybody is right depending on the compiler settings used. The OP's timing suggests an optimized build or an STL without runtime checks. 

As always, the morale is: profile first, optimize second. 
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  孤城傲影        
                
              
                            
                2021-01-03 06:39
              
            
            
                                                                       
If you are compiling this with a Microsoft compiler, to make it a fair comparison you need to switch off iterator security checks and iterator debugging, by defining _SECURE_SCL=0 and _HAS_ITERATOR_DEBUGGING=0.

Secondly, the constructor you are using initialises each vector value with zero, and you are not memsetting the array to zero before filling it. So you are traversing the vector twice.

Try:

vector<int> v; 
v.reserve(size*size);

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  一个人的身影        
                
              
                            
                2021-01-03 06:40
              
            
            
                                                                       
Two things. One, operator[] is much slower for vector. Two, vector in most implementations will behave weird at times when you add in one element at a time. I don't mean just that it allocates more memory but it does some genuinely bizarre things at times.

The first one is the main issue. For a mere million bytes, even reallocating the memory a dozen times should not take long (it won't do it on every added element).

In my experiments, preallocating doesn't change its slowness much. When the contents are actual objects it basically grinds to a halt if you try to do something simple like sort it.

Conclusion, don't use stl or mfc vectors for anything large or computation heavy. They are implemented poorly/slowly and cause lots of memory fragmentation.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  伪装坚强ぢ        
                
              
                            
                2021-01-03 06:41
              
            
            
                                                                       
To get a fair comparison I think something like the following should be suitable:

#include <sys/time.h>
#include <vector>
#include <iostream>
#include <algorithm>
#include <numeric>


int main()
{
  static size_t const size = 7e6;

  timeval start, end;
  int sum;

  gettimeofday(&start, 0);
  {
    std::vector<int> v(size, 1);
    sum = std::accumulate(v.begin(), v.end(), 0);
  }
  gettimeofday(&end, 0);

  std::cout << "= vector =" << std::endl
        << "(" << end.tv_sec - start.tv_sec
        << " s, " << end.tv_usec - start.tv_usec
        << " us)" << std::endl
        << "sum = " << sum << std::endl << std::endl;

  gettimeofday(&start, 0);
  int * const arr =  new int[size];
  std::fill(arr, arr + size, 1);
  sum = std::accumulate(arr, arr + size, 0);
  delete [] arr;
  gettimeofday(&end, 0);

  std::cout << "= Simple array =" << std::endl
        << "(" << end.tv_sec - start.tv_sec
        << " s, " << end.tv_usec - start.tv_usec
        << " us)" << std::endl
        << "sum = " << sum << std::endl << std::endl;
}


In both cases, dynamic allocation and deallocation is performed, as well as accesses to elements.

On my Linux box:

$ g++ -O2 foo.cpp 
$ ./a.out 
= vector =
(0 s, 21085 us)
sum = 7000000

= Simple array =
(0 s, 21148 us)
sum = 7000000


Both the std::vector<> and array cases have comparable performance.  The point is that std::vector<> can be just as fast as a simple array if your code is structured appropriately.



On a related note switching off optimization makes a huge difference in this case:

$ g++ foo.cpp 
$ ./a.out 
= vector =
(0 s, 120357 us)
sum = 7000000

= Simple array =
(0 s, 60569 us)
sum = 7000000


Many of the optimization assertions made by folks like Neil and jalf are entirely correct.

HTH!

EDIT:  Corrected code to force vector destruction to be included in time measurement.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  慢半拍i        
                
              
                            
                2021-01-03 06:48
              
            
            
                                                                       
Change assignment to eg. arr[i*size+j] = i*j, or some other non-constant expression. I think compiler optimizes away whole loop, as assigned values are never used, or replaces array with some precalculated values, so that loop isn't even executed and you get 0 milliseconds.

Having changed 1 to i*j, i get the same timings for both vector and array, unless pass -O1 flag to gcc, then in both cases I get 0 milliseconds.

So, first of all, double-check whether your loops are actually executed.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
   
          
     1
2
下一页
           
           
        
                                  
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复