I am trying to understand the most memory efficient ways to store multiple images of the same size in Matlab.
Everything I write here is based on my little knowledge
cellArray is second best as it only needs pointers for each array of 8 Bytes extra (ie, 100*8 bytes more).
This is not true. Each array has a "header" (a block of memory that specifies its type, size, etc.) The header in R2017a is 104 bytes (I think it's a little larger in the latest release). A cell array holds arrays, so the difference you see in your test with the 3D array:
26225600 - 26214400 = 11200
is
100 * (104 + 8) = 11200
The cell array is an array of pointers (8 bytes each) to arrays (104 bytes + whatever their data is).
For an image, which is a fairly large block of data, this 112 byte overhead is negligible. Other considerations, such as speed of access, become more important.
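The overhead can be measured with whos. The sketch below is only an illustration: the image size (512-by-512 uint8, which matches 26214400 bytes for 100 images) is my assumption, and the per-element overhead it prints depends on your MATLAB release.

```matlab
% Assumption: 100 images of 512-by-512 uint8 pixels (262144 bytes each).
n = 100;
A = zeros(512, 512, n, 'uint8');   % one 3D array holding all images

C = cell(n, 1);                    % one cell per image
for k = 1:n
    C{k} = A(:,:,k);
end

infoA = whos('A');
infoC = whos('C');
overhead = infoC.bytes - infoA.bytes;

fprintf('3D array:   %d bytes\n', infoA.bytes);
fprintf('Cell array: %d bytes\n', infoC.bytes);
fprintf('Overhead:   %d bytes total, %g bytes per image\n', overhead, overhead / n);
```

In R2017a the per-image figure should come out as 112 (104-byte header plus 8-byte pointer). Other releases may report a different number.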
In MATLAB, two arrays can point to the same data. So doing something like
I = C{4};
doesn't create a copy of the array at C{4}; instead, the array I references it. But if you use a 3D array, then:
I = A(:,:,4);
does make a copy, because I cannot reference a subset of another array; it must reference the whole thing.
Thus, using a 3D array, processing individual images requires a lot of copying back and forth of pixel data, which would not be necessary in a cell array.
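One rough way to see the difference is to time both kinds of access with timeit. This is a sketch, not a benchmark: the array size is an arbitrary choice of mine, and the actual timings depend on your machine and MATLAB release.

```matlab
% Assumption: 10 double-precision images of 1000-by-1000 pixels.
A = rand(1000, 1000, 10);
C = squeeze(num2cell(A, [1 2]));   % 10-by-1 cell, C{k} holds the same pixels as A(:,:,k)

tCell  = timeit(@() C{4});         % returns a reference to the stored array
tArray = timeit(@() A(:,:,4));     % has to build a new 1000-by-1000 array

fprintf('C{4}:     %.3g s\n', tCell);
fprintf('A(:,:,4): %.3g s\n', tArray);

% Processing one image:
A(:,:,4) = 2 * A(:,:,4);   % slice is copied out, result is copied back into A
C{4}     = 2 * C{4};       % cell 4 simply points at the new result
```

If the explanation above holds, the time for A(:,:,4) should grow with the image size, while the time for C{4} should not.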
A struct array is not a relevant data structure here; it would be equivalent to the cell array, except the indexing is more complicated (I don't know if this translates to a runtime increase or not). That is, S(4).Image is more involved than C{4}. However, if you want to store additional information for each image, a struct array could be useful.
As you noticed, the struct array is only 64 bytes larger than the cell array. Those extra bytes store the field name Image. Again, it is not really worthwhile worrying about this amount of memory.
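For comparison, here is a small sketch that builds both containers from the same images and reports the size difference. The image size and the Label field are made up for illustration, and the exact byte difference may vary between releases.

```matlab
% Assumption: 100 images of 512-by-512 uint8 pixels.
n = 100;
C = cell(n, 1);
for k = 1:n
    C{k} = zeros(512, 512, 'uint8');
end

S = struct('Image', C);            % n-by-1 struct array, S(k).Image holds C{k}

infoC = whos('C');
infoS = whos('S');
fprintf('Struct array is %d bytes larger than the cell array\n', ...
        infoS.bytes - infoC.bytes);

% Hypothetical extra per-image information (adds to the struct's size):
[S.Label] = deal('unlabeled');
S(4).Label = 'fourth image';
```

Accessing an image is then S(4).Image instead of C{4}, with the Label field available alongside it.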
Here is a short summary of other ways to handle data in MATLAB, none of which seem reasonable to me:
Custom object type: here you are still dealing with normal arrays underneath, so there is no advantage or disadvantage here. These types are nice if you want to add methods specific to your images, but don't change the way memory is handled. They do seem to add some time overhead.
Use tall arrays, suitable for very large data that doesn't fit in memory, but I don't think anybody would consider doing image analysis with such an array.
Use memory-mapped files, useful to speed up file access, but doesn't really help in this case.
Talk to Java or Python from MATLAB, and have them do the memory handling. But then you might as well skip MATLAB altogether and go to a different environment.
So I really think that the two meaningful options for handling multiple images are either a cell array (or other heterogeneous container such as a struct or custom object), or a 3D array. I would not consider anything else.
In summary: Use a cell array.