I wish to store a large vector of d-dimensional points (d fixed and small: <10).
If I define a Point as vector<int>, I think a vector<Point> will not store the coordinates contiguously. Is it better to define Point as a fixed-size type?
If you define your Point as having contiguous data storage (e.g. struct Point { int a; int b; int c; } or a std::array), then std::vector<Point> will store the Points in contiguous memory locations, so your memory layout will be:
p0.a, p0.b, p0.c, p1.a, p1.b, p1.c, ..., p(N-1).a, p(N-1).b, p(N-1).c
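For example, a minimal sketch (assuming a plain struct of three ints, which typically has no padding):

#include <vector>

struct Point { int a, b, c; };

// Verify the "no padding" assumption before treating the storage as flat:
static_assert(sizeof(Point) == 3 * sizeof(int), "unexpected padding in Point");

int main() {
    std::vector<Point> pts(4);
    // All 12 ints live in one contiguous block starting at pts.data();
    // consecutive elements are exactly sizeof(Point) bytes apart.
    const int* flat = reinterpret_cast<const int*>(pts.data());
    (void)flat; // flat[0..11] views the same bytes (a common idiom, though
                // strict aliasing rules make it formally murky)
}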
On the other hand, if you define Point as a vector<int>, then a vector<Point> has the layout of vector<vector<int>>, which is not contiguous: each inner vector stores a pointer to its own dynamically allocated memory. So you have contiguity within a single Point, but not across the whole structure.
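You can observe the indirection directly; here is a small sketch (the exact addresses are allocator-dependent):

#include <iostream>
#include <vector>

int main() {
    std::vector<std::vector<int>> pts(2, std::vector<int>(3));
    // Each inner vector owns its own heap block, so the end of one point's
    // data is generally not adjacent to the start of the next point's data.
    std::cout << (pts[0].data() + 3 == pts[1].data()) << '\n'; // almost certainly 0
}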
The first solution is much more efficient than the second (as modern CPUs love accessing contiguous memory locations).
The only way to be 100% sure how your data is structured is to fully implement your own memory handling.
However, there are many libraries that implement matrices and matrix operations that you can check out. Some document their memory layout and support operations like reshape (e.g. OpenCV's Mat).
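For instance, a sketch with OpenCV (assuming the usual cv::Mat interface; check the documentation of whichever library you pick):

#include <opencv2/core.hpp>

int main() {
    // One d-dimensional point per row, d = 3, 32-bit integer coordinates.
    cv::Mat pts(100, 3, CV_32SC1);
    if (pts.isContinuous()) {
        // Same buffer reinterpreted as a single row of 300 values;
        // no data is copied.
        cv::Mat flat = pts.reshape(1, 1);
    }
}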
Note that in general you cannot assume that the fields of consecutive Points are tightly packed: the compiler may insert padding for alignment. For example, consider
struct Point {
char x,y,z;
};
Point array_of_points[3];
Now if you try to 'reshape', that is, iterate across Point boundaries relying on the fields being adjacent in memory, this can fail whenever the struct contains trailing padding:
(char *)(&array_of_points[0].z) + 1 != (char *)(&array_of_points[1].x)
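A quick way to check this assumption on your platform (a sketch; a struct of three chars usually has no trailing padding, but the standard does not forbid it):

#include <iostream>

struct Point {
    char x, y, z;
};

int main() {
    Point a[3];
    // Array elements are spaced exactly sizeof(Point) bytes apart, so the
    // fields are tightly packed only if the struct has no trailing padding.
    bool adjacent =
        reinterpret_cast<char*>(&a[0].z) + 1 == reinterpret_cast<char*>(&a[1].x);
    std::cout << "sizeof(Point) = " << sizeof(Point)
              << ", adjacent = " << adjacent << '\n';
}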
As the dimension is fixed, I'd suggest going with a template which uses the dimension as a template param. Something like this:
#include <cstddef>
#include <type_traits>

template <typename R, std::size_t N> class ndpoint
{
public:
    // Hard error for non-arithmetic coordinate types.
    using elem_t =
        typename std::enable_if<std::is_arithmetic<R>::value, R>::type;
    static constexpr std::size_t DIM = N;

    ndpoint() = default;

    // e.g. for constructing from a list of coordinates
    template <typename... coordt>
    ndpoint(coordt... x) : elems_{static_cast<R>(x)...} {}

    ndpoint(const ndpoint& other) : elems_() { *this = other; }

    template <typename PointType>
    ndpoint(const PointType& other) : elems_() { *this = other; }

    ndpoint& operator=(const ndpoint& other) {
        for (std::size_t i = 0; i < N; i++) {
            this->elems_[i] = other.elems_[i];
        }
        return *this;
    }

    // this will allow you to assign from any source which defines the
    // [](size_t i) operator
    template <typename PointT>
    ndpoint& operator=(const PointT& other) {
        for (std::size_t i = 0; i < N; i++) {
            this->elems_[i] = static_cast<R>(other[i]);
        }
        return *this; // was missing: falling off the end here is undefined behavior
    }

    const R& operator[](std::size_t i) const { return this->elems_[i]; }
    R& operator[](std::size_t i) { return this->elems_[i]; }

private:
    R elems_[N];
};
Then use a std::vector<ndpoint<...>> for a collection of points for best performance.
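For example (point3f is just a name chosen here for illustration):

#include <vector>

int main() {
    using point3f = ndpoint<float, 3>;
    std::vector<point3f> cloud;
    cloud.reserve(1000000);
    cloud.push_back(point3f(1.0f, 2.0f, 3.0f)); // variadic constructor
    cloud[0][1] += 0.5f;                        // operator[] access
    // Elements sit back to back: sizeof(point3f) == 3 * sizeof(float)
    // on typical ABIs, so the coordinates form one contiguous block.
}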
For the said value of d (<10), defining Point as vector<int> will almost double the total memory used by std::vector<Point> and will bring almost no advantage.
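A back-of-the-envelope check (sizes are for a typical 64-bit implementation; a sketch):

#include <array>
#include <iostream>
#include <vector>

int main() {
    // Per point as vector<int>: the vector object itself (three pointers,
    // typically 24 bytes) plus a separate heap block for the coordinates
    // (3 * 4 = 12 bytes, usually rounded up by the allocator), versus just
    // 12 bytes stored inline for array<int, 3>.
    std::cout << sizeof(std::vector<int>) << '\n';   // typically 24
    std::cout << sizeof(std::array<int, 3>) << '\n'; // 12
}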
vector will store whatever your type contains in contiguous memory. So yes, if that's an array or a tuple, or, probably even better, a custom type, it will avoid indirection.
Performance-wise, as always, you have to measure it rather than speculate, at least as far as scanning the data is concerned.
However, there will definitely be a huge performance gain when you create those points in the first place, because you avoid a separate memory allocation for every vector that stores a point. And memory allocations are usually very expensive in C++.
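A minimal timing sketch of that construction cost (the numbers will vary by machine and allocator; the point is the allocation count, one block versus N + 1 blocks):

#include <array>
#include <chrono>
#include <cstddef>
#include <iostream>
#include <vector>

int main() {
    const std::size_t N = 1000000;

    auto t0 = std::chrono::steady_clock::now();
    std::vector<std::array<int, 3>> flat(N); // one allocation
    auto t1 = std::chrono::steady_clock::now();
    std::vector<std::vector<int>> nested(N, std::vector<int>(3)); // N + 1 allocations
    auto t2 = std::chrono::steady_clock::now();

    using ms = std::chrono::milliseconds;
    std::cout << "flat:   " << std::chrono::duration_cast<ms>(t1 - t0).count() << " ms\n";
    std::cout << "nested: " << std::chrono::duration_cast<ms>(t2 - t1).count() << " ms\n";
    return flat.size() + nested.size() > 0 ? 0 : 1; // keep both containers alive
}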