I\'m trying to understand the memory and other overhead implications that using numpy
lists would have for arrays of dtype
o
I'm going to answer your primary question, and leave the others (performance of transpose, etc.) out. So:
I'm trying to understand the memory and other overhead implications that using numpy lists would have … Just to clarify I'm interested in the case where the numpy array type is
object
not afloat
,double
orint
A Python list is an array of pointers to Python objects wrapping whatever actual values you've stored in it—plus some extra slack to allow it to be expanded on the fly efficiently. Let's call that slack 20%, just for the sake of easy computation. For example, an list of 10000 32-bit integers takes up, say, 96000 bytes for the array, plus around 240000 bytes for the Python integer objects, plus a small overhead for the list itself, say 80 bytes again.
A NumPy array is an array of whatever actual values you've stored in it. For example, an array of 10000 32-bit integers takes up 40000 bytes, plus a small overhead for the array itself, say 80 bytes. But when you're using dtype object
, each "actual value" is just a pointer to a Python object, just as with a list
.
So, the only real difference here is the slack: The array is going to use 320080 bytes, while the list is going to use 336080 bytes. Not a huge difference, but it can matter.
Also, does one become faster than the other in 2D vs ND or with the size along a given dimension.
Yes, the nested list will increase faster… but not by a huge amount.
A multidimensional array in numpy is stored as one giant array (in either C or Fortran striding order), so whether the shape is (10000,)
, (100, 100)
, or (10, 10, 10, 10)
, it's the same size. (The overhead may get a few bytes bigger to store more information about the striding, but if we're talking about, say, 256 bytes vs. 80 out of 320K, who cares?)
A nested list, on the other hand, has more lists, with slack and overhead at every level. For example, a list of 10 lists of 10 lists of 10 lists of integers has 1+10+100+1000 arrays of 12 pointers, and 1+10+100+1000 list headers.
So, the array is still using 320080 bytes, or maybe 320256, but the list is using 435536.
If you want to know more about how list
is implemented… well, that depends on the implementation you're using. But in CPython, the C API pretty much guarantees that it's going to be storing a contiguous array of PyObject *
, and the fact that appending is amortized constant time pretty much requires that it leave slack that grows proportionately. And you can see in the headers and the source that this is exactly what it does. (Also, keep in mind that the specific sizes you get from that source will generally depend on the platform you compile it on. Most importantly, because there are pointers all over the place, 64-bit platforms tend to have somewhere between 50-100% more overhead per object for most objects than 32-bit platforms.)