I would like to find out safe ways of implementing three dimensional arrays of integers in C++, using pointer arithmetic / dynamic memory allocation, or, alternatively using
Have a look at the Boost multi-dimensional array library. Here's an example (adapted from the Boost documentation):
#include "boost/multi_array.hpp"
int main() {
// Create a 3D array that is 20 x 30 x 4
int x = 20;
int y = 30;
int z = 4;
typedef boost::multi_array<int, 3> array_type;
typedef array_type::index index;
array_type my_array(boost::extents[x][y][z]);
// Assign values to the elements
int values = 0;
for (index i = 0; i != x; ++i) {
for (index j = 0; j != y; ++j) {
for (index k = 0; k != z; ++k) {
my_array[i][j][k] = values++;
}
}
}
}
It should be noted that, for all intents and purposes, you are dealing with only a 2D array, because the third (and least significant) dimension is known.
Using the STL or Boost is a good approach if you don't know beforehand how many entries each dimension of the array will have, because both give you dynamic memory allocation. I recommend either of them if your data set is to remain largely static, or if it is mostly going to receive new entries and few deletions.
However, if you know something about your dataset beforehand, such as roughly how many items in total will be stored, or if the arrays are to be sparsely populated, you might be better off using some kind of hash/bucket function, and use the XYZ indices as your key. In this case, assuming no more than 8192 (13 bits) entries per dimension, you could get by with a 40-bit (5-byte) key. Or, assuming there are always 4 x Z entries, you would simply use a 26-bit XY key. This is one of the more efficient trade-offs between speed, memory usage, and dynamic allocation.
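As a rough illustration of the keyed approach described above, here is a minimal sketch that packs three 13-bit indices into a single 39-bit key and stores only the cells that were actually written. The names SparseGrid3D, MakeKey, Get, Set and Size are hypothetical, and std::unordered_map (C++11) is just one possible hash container, chosen here as an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical sparse 3D container: only cells that were written use memory.
class SparseGrid3D {
public:
    static constexpr std::uint64_t kBits = 13;              // bits per index
    static constexpr std::uint64_t kLimit = 1ULL << kBits;  // 8192 entries per dimension

    // Pack three 13-bit indices into one 39-bit key: x | y | z.
    static std::uint64_t MakeKey(std::uint64_t x, std::uint64_t y, std::uint64_t z) {
        assert(x < kLimit && y < kLimit && z < kLimit);
        return (x << (2 * kBits)) | (y << kBits) | z;
    }

    void Set(std::uint64_t x, std::uint64_t y, std::uint64_t z, int value) {
        cells_[MakeKey(x, y, z)] = value;
    }

    // Returns the stored value, or `fallback` if the cell was never set.
    // Uses find() so that a lookup never inserts a new cell.
    int Get(std::uint64_t x, std::uint64_t y, std::uint64_t z, int fallback = 0) const {
        std::unordered_map<std::uint64_t, int>::const_iterator it =
            cells_.find(MakeKey(x, y, z));
        return it == cells_.end() ? fallback : it->second;
    }

    // Number of cells that currently hold a value.
    std::size_t Size() const { return cells_.size(); }

private:
    std::unordered_map<std::uint64_t, int> cells_;
};
```

Memory use grows with the number of populated cells rather than with the product of the dimensions, at the cost of a hash lookup on every access.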
There are many advantages to using the STL to manage your memory over using new/delete. The choice of how to represent your data depends on how you plan to use it. One suggestion would be a class that hides the implementation decision and provides three dimensional get/set methods to a one dimensional STL vector.
If you really believe you need to create a custom 3d vector type, investigate Boost first.
#include <algorithm>
#include <cstddef>
#include <vector>
// a class that does something in 3 dimensions
class MySimpleClass
{
public:
MySimpleClass(const size_t inWidth, const size_t inHeight, const size_t inDepth) :
mWidth(inWidth), mHeight(inHeight), mDepth(inDepth)
{
mArray.resize(mWidth * mHeight * mDepth);
}
// inline for speed
int Get(const size_t inX, const size_t inY, const size_t inZ) {
return mArray[(inZ * mWidth * mHeight) + (inY * mWidth) + inX];
}
void Set(const size_t inX, const size_t inY, const size_t inZ, const int inVal) {
mArray[(inZ * mWidth * mHeight) + (inY * mWidth) + inX] = inVal;
}
// doing something uniform with the data is easier if it's not a vector of vectors
void DoSomething()
{
std::transform(mArray.begin(), mArray.end(), mArray.begin(), MyUnaryFunc);
}
private:
// dimensions of data
size_t mWidth;
size_t mHeight;
size_t mDepth;
// data buffer
std::vector< int > mArray;
};
Each pair of square brackets is a dereferencing operation (when applied to a pointer). As an example, the following pairs of lines of code are equivalent:
x = myArray[4];
x = *(myArray+4);
x = myArray[2][7];
x = *((*(myArray+2))+7);
To use your suggested syntax you are simply dereferencing the value returned from the first dereference.
int*** myArray = (some allocation method, keep reading);
//
// All in one line:
int value = myArray[x][y][z];
//
// Separated to multiple steps:
int** deref1 = myArray[x];
int* deref2 = deref1[y];
int value = deref2[z];
To go about allocating this array, you simply need to recognise that you don't actually have a three-dimensional array of integers. You have an array of arrays of arrays of integers.
// Start by allocating an array for array of arrays
int*** myArray = new int**[X_MAXIMUM];
// Allocate an array for each element of the first array
for(int x = 0; x < X_MAXIMUM; ++x)
{
myArray[x] = new int*[Y_MAXIMUM];
// Allocate an array of integers for each element of this array
for(int y = 0; y < Y_MAXIMUM; ++y)
{
myArray[x][y] = new int[Z_MAXIMUM];
// Specify an initial value (if desired)
for(int z = 0; z < Z_MAXIMUM; ++z)
{
myArray[x][y][z] = -1;
}
}
}
Deallocating this array follows a similar process to allocating it:
for(int x = 0; x < X_MAXIMUM; ++x)
{
for(int y = 0; y < Y_MAXIMUM; ++y)
{
delete[] myArray[x][y];
}
delete[] myArray[x];
}
delete[] myArray;
With vectors:
std::vector< std::vector< std::vector< int > > > array3d;
Every element is accessible with array3d[x][y][z], provided the element has already been added (e.g. via push_back).
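Instead of growing the nested vectors with push_back, they can also be given their final size up front. The following is a minimal sketch of that idea; the names Array3D, MakeArray3D and ExampleUsage are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<std::vector<int> > > Array3D;

// Build an x-by-y-by-z nested vector with every element set to `init`.
Array3D MakeArray3D(std::size_t x, std::size_t y, std::size_t z, int init) {
    return Array3D(x, std::vector<std::vector<int> >(y, std::vector<int>(z, init)));
}

// Example usage; call this from main().
void ExampleUsage() {
    Array3D a = MakeArray3D(20, 30, 4, -1);
    a[19][29][3] = 7;                    // unchecked access
    assert(a.at(19).at(29).at(3) == 7);  // at() throws std::out_of_range on a bad index
    assert(a[0][0][0] == -1);
}
```

Each inner vector is a separate allocation, so this is convenient rather than compact; a.at(x).at(y).at(z) is the bounds-checked alternative to a[x][y][z].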
Below is a straightforward way to create 3D arrays in C or C++ using a single chunk of memory per array. There is no need to use Boost (nice as it is), or to split the allocation across several levels of indirection, which is quite bad: it usually carries a big performance penalty when accessing data, and it fragments memory.
The only thing to understand is that there is no such thing as a multidimensional array, just arrays of arrays (of arrays). The leftmost index has the largest stride in memory; elements that differ only in the rightmost index are adjacent.
#include <stdio.h>
#include <stdlib.h>
int main(){
{
// C Style Static 3D Arrays
int a[10][20][30];
a[9][19][29] = 10;
printf("a[9][19][29]=%d\n", a[9][19][29]);
}
{
// C Style dynamic 3D Arrays
int (*a)[20][30];
a = (int (*)[20][30])malloc(10*20*30*sizeof(int));
a[9][19][29] = 10;
printf("a[9][19][29]=%d\n", a[9][19][29]);
free(a);
}
{
// C++ Style dynamic 3D Arrays
int (*a)[20][30];
a = new int[10][20][30];
a[9][19][29] = 10;
printf("a[9][19][29]=%d\n", a[9][19][29]);
delete [] a;
}
}
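The "arrays of arrays" layout used above can be checked directly: in an int[10][20][30] array, element a[i][j][k] sits (i*20 + j)*30 + k ints after the start of the array. The sketch below verifies this by comparing byte addresses; the helper names FlatOffset and LayoutMatches are invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t kX = 10;
const std::size_t kY = 20;
const std::size_t kZ = 30;

// Offset, in elements, of a[i][j][k] from the start of an int[kX][kY][kZ] array.
std::size_t FlatOffset(std::size_t i, std::size_t j, std::size_t k) {
    return (i * kY + j) * kZ + k;
}

// Returns true if the address of a[i][j][k] agrees with FlatOffset(i, j, k).
bool LayoutMatches(std::size_t i, std::size_t j, std::size_t k) {
    static int a[kX][kY][kZ];
    assert(i < kX && j < kY && k < kZ);
    // Compare byte addresses inside the one array object.
    const char* base = reinterpret_cast<const char*>(&a);
    const char* elem = reinterpret_cast<const char*>(&a[i][j][k]);
    return static_cast<std::size_t>(elem - base) == FlatOffset(i, j, k) * sizeof(int);
}
```

For the element used in the examples above, FlatOffset(9, 19, 29) is 5999, the last of the 6000 ints.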
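For your actual problem, there are potentially two unknown dimensions, which is a problem for my proposal because it allows only one unknown dimension. There are several ways to manage that.
The good news is that C now accepts variables as array dimensions: the feature is called variable-length arrays (VLAs) and was introduced in C99. Note that VLAs are not part of standard C++, although some compilers accept them as an extension. Look up variable-length arrays in a C reference for details.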
int x = 100;
int y = 200;
int z = 30;
{
// C Style Static 3D Arrays
int a[x][y][z];
a[99][199][29] = 10;
printf("a[99][199][29]=%d\n", a[99][199][29]);
}
{
// C Style dynamic 3D Arrays
int (*a)[y][z];
a = (int (*)[y][z])malloc(x*y*z*sizeof(int));
a[99][199][29] = 10;
printf("a[99][199][29]=%d\n", a[99][199][29]);
free(a);
}
If using C++ the simplest way is probably to use operator overloading to stick with array syntax:
{
class ThreeDArray {
class InnerTwoDArray {
int * data;
size_t y;
size_t z;
public:
InnerTwoDArray(int * data, size_t y, size_t z)
: data(data), y(y), z(z) {}
public:
int * operator [](size_t y){ return data + y*z; }
};
int * data;
size_t x;
size_t y;
size_t z;
public:
ThreeDArray(size_t x, size_t y, size_t z) : x(x), y(y), z(z) {
data = (int*)malloc(x*y*z*sizeof *data);
}
~ThreeDArray(){ free(data); }
InnerTwoDArray operator [](size_t x){
return InnerTwoDArray(data + x*y*z, y, z);
}
};
ThreeDArray a(x, y, z);
a[99][199][29] = 10;
printf("a[99][199][29]=%d\n", a[99][199][29]);
}
The above code has some indirection cost for accessing InnerTwoDArray (though a good compiler can probably optimize it away), but it uses only one chunk of heap memory for the array, which is usually the most efficient choice.
Obviously, even though the above code is simple and straightforward, the STL or Boost already does this well, so there is no need to reinvent the wheel. I still believe it is interesting to know that it can easily be done by hand.
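One caveat with the class above: it owns memory obtained from malloc but defines no copy constructor or assignment operator, so copying a ThreeDArray would free the same buffer twice. A hedged sketch of the same operator[] idea with std::vector owning the storage follows; the names VectorBacked3D and Slice are hypothetical, and this is a variation on the code above, not a replacement the original author proposed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical variant of ThreeDArray in which std::vector owns the storage,
// so copying and destruction are handled automatically.
class VectorBacked3D {
public:
    // Proxy for one x-slice; operator[] returns a pointer to a row of z ints.
    class Slice {
    public:
        Slice(int* data, std::size_t z) : data_(data), z_(z) {}
        int* operator[](std::size_t y) { return data_ + y * z_; }
    private:
        int* data_;
        std::size_t z_;
    };

    // All x*y*z elements live in one contiguous buffer, zero-initialized.
    VectorBacked3D(std::size_t x, std::size_t y, std::size_t z)
        : y_(y), z_(z), data_(x * y * z, 0) {}

    Slice operator[](std::size_t x) {
        assert(x * y_ * z_ < data_.size());  // checks the x index only
        return Slice(data_.data() + x * y_ * z_, z_);
    }

private:
    std::size_t y_;
    std::size_t z_;
    std::vector<int> data_;
};
```

Usage keeps the same syntax as before, for example VectorBacked3D a(100, 200, 30); a[99][199][29] = 10;, and a copy of the object gets its own independent buffer.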