I'm curious - why does the sys.getsizeof call return a smaller number for a list than the sum of its elements?
import sys
lst = ["abcde", "fghij", "klmno"]
print(sys.getsizeof(lst))                     # size of the list object itself
print(sum(sys.getsizeof(s) for s in lst))     # sum of the elements' sizes
You are getting the size of the actual list object. As the list object stores pointers to objects its memory size is bound to be different (and lower) than the sum of its elements.
By analogy, it’s like getting the size of an array of pointers in C.
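To make that concrete, here is a quick sketch (the names and values are made up for illustration, and exact byte counts depend on your Python version and 64- vs 32-bit build): a list of three short strings and a list of three huge strings report the same size, because each list only stores three pointers.

import sys

short = ["abcde", "fghij", "klmno"]
long_strings = ["a" * 10_000, "b" * 10_000, "c" * 10_000]

# Both lists hold three pointers, so the list objects themselves are the
# same size even though the strings they refer to differ enormously.
print(sys.getsizeof(short))
print(sys.getsizeof(long_strings))

# The strings' own sizes only show up when you ask for them directly.
print(sum(sys.getsizeof(s) for s in short))
print(sum(sys.getsizeof(s) for s in long_strings))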
The memory of a numpy array a can be obtained by a.nbytes.
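For example (a small sketch; the array a below is just an illustrative int64 array, not yours):

import numpy as np

a = np.arange(10, dtype=np.int64)
# nbytes is just the data buffer: 10 elements * 8 bytes each = 80.
print(a.nbytes)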
According to the documentation, sys.getsizeof shows "only the memory consumption directly attributed to the object [...], not the memory consumption of objects it refers to." In your case, the array does not hold all of the data itself, which can be seen with a.flags, which outputs:
C_CONTIGUOUS : True
F_CONTIGUOUS : False
OWNDATA : False
WRITEABLE : True
ALIGNED : True
WRITEBACKIFCOPY : False
UPDATEIFCOPY : False
For the first array, it is instead:
C_CONTIGUOUS : True
F_CONTIGUOUS : False
OWNDATA : True
WRITEABLE : True
ALIGNED : True
WRITEBACKIFCOPY : False
UPDATEIFCOPY : False
The OWNDATA field being False explains why sys.getsizeof outputs only 128 bytes.
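You can reproduce the effect with a sketch like the following (the names base and view are invented for the example; a slice is one common way to get an array that does not own its data, and the exact byte counts depend on your NumPy version and platform):

import sys
import numpy as np

base = np.arange(1000, dtype=np.int64)   # owns its 8000-byte data buffer
view = base[::2]                         # a slice is a view into base's buffer

print(base.flags.owndata)                # True
print(view.flags.owndata)                # False

# The view does not own the data it refers to, so sys.getsizeof reports
# little more than the ndarray header for it, while nbytes still reports
# the size of the data it can see (4000 bytes here).
print(sys.getsizeof(view))
print(view.nbytes)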
As per the documentation, sys.getsizeof
does the following:
Return the size of an object in bytes. The object can be any type of object. All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.
Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to.
So it is really only for primitive built-in types that you can expect the result to reflect the full memory use. Even for built-in container types, you usually need some sort of recursive function to find the "total" size of the container (list, dictionary, etc.). Keep in mind, though, that a Python list really is just a resizable array of pointers, so in that sense the number is accurate.
However, you are looking for something like this:
https://code.activestate.com/recipes/577504/
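A much-simplified sketch of that idea (nowhere near as thorough as the linked recipe; it only recurses into lists, tuples, sets and dicts):

import sys

def total_size(obj, seen=None):
    """Roughly sum sys.getsizeof over an object and everything it refers to."""
    if seen is None:
        seen = set()
    if id(obj) in seen:                    # don't count shared objects twice
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(total_size(k, seen) + total_size(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(total_size(item, seen) for item in obj)
    return size

print(total_size(["abcde", "fghij", "klmno"]))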
Also, note that:
>>> sys.getsizeof(npArrayList[0])
96
>>>
Every numpy array (or any Python object, for that matter) has some overhead, and each np.array you store as a list element is a full Python object in its own right. So really, the following only takes into account the memory of the array's contents, and not the overhead of the whole object:
>>> npArrayList[0].nbytes
32
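The gap between the two numbers is that per-object overhead (the ndarray header holding shape, strides, dtype and so on). A quick sketch, with byte counts that will differ across NumPy versions and platforms:

import sys
import numpy as np

arr = np.array([1, 2, 3, 4], dtype=np.int64)

print(arr.nbytes)          # 32: just the 4 * 8-byte elements
print(sys.getsizeof(arr))  # larger: adds the per-object header (and, on
                           # recent NumPy versions, the owned data buffer)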