I wish to store a large vector of d-dimensional points (d fixed and small: <10).
If I define a Point as vector<int>, I think a vector<Point> will not store the coordinates contiguously. Is it better to define Point as a fixed-size type?
If you define your Point as having contiguous data storage (e.g. struct Point { int a; int b; int c; } or a std::array), then std::vector<Point> will store the Points in contiguous memory locations, so your memory layout will be:
p0.a, p0.b, p0.c, p1.a, p1.b, p1.c, ..., p(N-1).a, p(N-1).b, p(N-1).c
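For example, a minimal sketch (assuming a plain struct of three ints, which typically has no padding):

#include <vector>

struct Point { int a, b, c; };

// Verify the "no padding" assumption before treating the storage as flat:
static_assert(sizeof(Point) == 3 * sizeof(int), "unexpected padding in Point");

int main() {
    std::vector<Point> pts(4);
    // All 12 ints live in one contiguous block starting at pts.data();
    // consecutive elements are exactly sizeof(Point) bytes apart.
    const int* flat = reinterpret_cast<const int*>(pts.data());
    (void)flat; // flat[0..11] views the same bytes (a common idiom, though
                // strict aliasing rules make it formally murky)
}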
On the other hand, if you define Point as a vector<int>, then a vector<Point> has the layout of vector<vector<int>>, which is not contiguous: each inner vector stores a pointer to its own dynamically allocated memory. So you have contiguity within a single Point, but not across the whole structure.
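You can observe the indirection directly; here is a small sketch (the exact addresses are allocator-dependent):

#include <iostream>
#include <vector>

int main() {
    std::vector<std::vector<int>> pts(2, std::vector<int>(3));
    // Each inner vector owns its own heap block, so the end of one point's
    // data is generally not adjacent to the start of the next point's data.
    std::cout << (pts[0].data() + 3 == pts[1].data()) << '\n'; // almost certainly 0
}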
The first solution is much more efficient than the second (as modern CPUs love accessing contiguous memory locations).
The only way to be 100% sure how your data is structured is to fully implement your own memory handling.
However, there are many libraries that implement matrices and matrix operations that you can check out. Some document their memory layout and support operations like reshape (e.g. OpenCV's Mat).
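For instance, a sketch with OpenCV (assuming the usual cv::Mat interface; check the documentation of whichever library you pick):

#include <opencv2/core.hpp>

int main() {
    // One d-dimensional point per row, d = 3, 32-bit integer coordinates.
    cv::Mat pts(100, 3, CV_32SC1);
    if (pts.isContinuous()) {
        // Same buffer reinterpreted as a single row of 300 values;
        // no data is copied.
        cv::Mat flat = pts.reshape(1, 1);
    }
}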
Note that in general you cannot assume that the fields of consecutive Points are tightly packed: the compiler may insert padding for alignment. For example, consider
struct Point {
char x,y,z;
};
Point array_of_points[3];
Now if you try to 'reshape', that is, iterate across Point boundaries relying on the fields being adjacent in memory, this can fail whenever the struct contains trailing padding:
(char *)(&array_of_points[0].z) + 1 != (char *)(&array_of_points[1].x)
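A quick way to check this assumption on your platform (a sketch; a struct of three chars usually has no trailing padding, but the standard does not forbid it):

#include <iostream>

struct Point {
    char x, y, z;
};

int main() {
    Point a[3];
    // Array elements are spaced exactly sizeof(Point) bytes apart, so the
    // fields are tightly packed only if the struct has no trailing padding.
    bool adjacent =
        reinterpret_cast<char*>(&a[0].z) + 1 == reinterpret_cast<char*>(&a[1].x);
    std::cout << "sizeof(Point) = " << sizeof(Point)
              << ", adjacent = " << adjacent << '\n';
}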
As the dimension is fixed, I'd suggest going with a template which uses the dimension as a template param. Something like this:
#include <cstddef>
#include <type_traits>

template <typename R, std::size_t N> class ndpoint
{
public:
    // Hard error for non-arithmetic coordinate types.
    using elem_t =
        typename std::enable_if<std::is_arithmetic<R>::value, R>::type;
    static constexpr std::size_t DIM = N;

    ndpoint() = default;

    // e.g. for constructing from a list of coordinates
    template <typename... coordt>
    ndpoint(coordt... x) : elems_{static_cast<R>(x)...} {}

    ndpoint(const ndpoint& other) : elems_() { *this = other; }

    template <typename PointType>
    ndpoint(const PointType& other) : elems_() { *this = other; }

    ndpoint& operator=(const ndpoint& other) {
        for (std::size_t i = 0; i < N; i++) {
            this->elems_[i] = other.elems_[i];
        }
        return *this;
    }

    // this will allow you to assign from any source which defines the
    // [](size_t i) operator
    template <typename PointT>
    ndpoint& operator=(const PointT& other) {
        for (std::size_t i = 0; i < N; i++) {
            this->elems_[i] = static_cast<R>(other[i]);
        }
        return *this; // was missing: falling off the end here is undefined behavior
    }

    const R& operator[](std::size_t i) const { return this->elems_[i]; }
    R& operator[](std::size_t i) { return this->elems_[i]; }

private:
    R elems_[N];
};
Then use a std::vector<ndpoint<...>> for a collection of points for best performance.
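For example (point3f is just a name chosen here for illustration):

#include <vector>

int main() {
    using point3f = ndpoint<float, 3>;
    std::vector<point3f> cloud;
    cloud.reserve(1000000);
    cloud.push_back(point3f(1.0f, 2.0f, 3.0f)); // variadic constructor
    cloud[0][1] += 0.5f;                        // operator[] access
    // Elements sit back to back: sizeof(point3f) == 3 * sizeof(float)
    // on typical ABIs, so the coordinates form one contiguous block.
}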
For the said value of d (<10), defining Point as vector<int> will almost double the total memory used by std::vector<Point> and will bring almost no advantage.
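A back-of-the-envelope check (sizes are for a typical 64-bit implementation; a sketch):

#include <array>
#include <iostream>
#include <vector>

int main() {
    // Per point as vector<int>: the vector object itself (three pointers,
    // typically 24 bytes) plus a separate heap block for the coordinates
    // (3 * 4 = 12 bytes, usually rounded up by the allocator), versus just
    // 12 bytes stored inline for array<int, 3>.
    std::cout << sizeof(std::vector<int>) << '\n';   // typically 24
    std::cout << sizeof(std::array<int, 3>) << '\n'; // 12
}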
vector will store whatever your type contains in contiguous memory. So yes, if that's an array or a tuple, or, probably even better, a custom type, it will avoid indirection.
Performance-wise, as always, you have to measure it rather than speculate, at least as far as scanning the data is concerned.
However, there will definitely be a huge performance gain when you create those points in the first place, because you avoid a separate memory allocation for every vector that stores a point. And memory allocations are usually very expensive in C++.
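A minimal timing sketch of that construction cost (the numbers will vary by machine and allocator; the point is the allocation count, one block versus N + 1 blocks):

#include <array>
#include <chrono>
#include <cstddef>
#include <iostream>
#include <vector>

int main() {
    const std::size_t N = 1000000;

    auto t0 = std::chrono::steady_clock::now();
    std::vector<std::array<int, 3>> flat(N); // one allocation
    auto t1 = std::chrono::steady_clock::now();
    std::vector<std::vector<int>> nested(N, std::vector<int>(3)); // N + 1 allocations
    auto t2 = std::chrono::steady_clock::now();

    using ms = std::chrono::milliseconds;
    std::cout << "flat:   " << std::chrono::duration_cast<ms>(t1 - t0).count() << " ms\n";
    std::cout << "nested: " << std::chrono::duration_cast<ms>(t2 - t1).count() << " ms\n";
    return flat.size() + nested.size() > 0 ? 0 : 1; // keep both containers alive
}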