I want to know how to get size of objects like a string, integer, etc. in Python.
Related question: How many bytes per element are there in a Python list (tuple)?
This can be more complicated than it looks depending on how you want to count things. For instance, if you have a list of ints, do you want the size of the list containing the references to the ints? (ie. list only, not what is contained in it), or do you want to include the actual data pointed to, in which case you need to deal with duplicate references, and how to prevent double-counting when two objects contain references to the same object.
You may want to take a look at one of the python memory profilers, such as pysizer to see if they meet your needs.
I use this trick... May won't be accurate on small objects, but I think it's much more accurate for a complex object (like pygame surface) rather than sys.getsizeof()
import pygame as pg
import os
import psutil
import time
process = psutil.Process(os.getpid())
pg.init()
vocab = ['hello', 'me', 'you', 'she', 'he', 'they', 'we',
'should', 'why?', 'necessarily', 'do', 'that']
font = pg.font.SysFont("monospace", 100, True)
dct = {}
newMem = process.memory_info().rss # don't mind this line
Str = f'store ' + f'Nothing \tsurface use about '.expandtabs(15) + \
f'0\t bytes'.expandtabs(9) # don't mind this assignment too
usedMem = process.memory_info().rss
for word in vocab:
dct[word] = font.render(word, True, pg.Color("#000000"))
time.sleep(0.1) # wait a moment
# get total used memory of this script:
newMem = process.memory_info().rss
Str = f'store ' + f'{word}\tsurface use about '.expandtabs(15) + \
f'{newMem - usedMem}\t bytes'.expandtabs(9)
print(Str)
usedMem = newMem
On my windows 10, python 3.7.3, the output is:
store hello surface use about 225280 bytes
store me surface use about 61440 bytes
store you surface use about 94208 bytes
store she surface use about 81920 bytes
store he surface use about 53248 bytes
store they surface use about 114688 bytes
store we surface use about 57344 bytes
store should surface use about 172032 bytes
store why? surface use about 110592 bytes
store necessarily surface use about 311296 bytes
store do surface use about 57344 bytes
store that surface use about 110592 bytes
Having run into this problem many times myself, I wrote up a small function (inspired by @aaron-hall's answer) & tests that does what I would have expected sys.getsizeof to do:
https://github.com/bosswissam/pysize
If you're interested in the backstory, here it is
EDIT: Attaching the code below for easy reference. To see the most up-to-date code, please check the github link.
import sys
def get_size(obj, seen=None):
"""Recursively finds size of objects"""
size = sys.getsizeof(obj)
if seen is None:
seen = set()
obj_id = id(obj)
if obj_id in seen:
return 0
# Important mark as seen *before* entering recursion to gracefully handle
# self-referential objects
seen.add(obj_id)
if isinstance(obj, dict):
size += sum([get_size(v, seen) for v in obj.values()])
size += sum([get_size(k, seen) for k in obj.keys()])
elif hasattr(obj, '__dict__'):
size += get_size(obj.__dict__, seen)
elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
size += sum([get_size(i, seen) for i in obj])
return size
Use sys.getsizeof() if you DON'T want to include sizes of linked (nested) objects.
However, if you want to count sub-objects nested in lists, dicts, sets, tuples - and usually THIS is what you're looking for - use the recursive deep sizeof() function as shown below:
import sys
def sizeof(obj):
size = sys.getsizeof(obj)
if isinstance(obj, dict): return size + sum(map(sizeof, obj.keys())) + sum(map(sizeof, obj.values()))
if isinstance(obj, (list, tuple, set, frozenset)): return size + sum(map(sizeof, obj))
return size
You can also find this function in the nifty toolbox, together with many other useful one-liners:
https://github.com/mwojnars/nifty/blob/master/util.py
The Pympler package's asizeof
module can do this.
Use as follows:
from pympler import asizeof
asizeof.asizeof(my_object)
Unlike sys.getsizeof
, it works for your self-created objects. It even works with numpy.
>>> asizeof.asizeof(tuple('bcd'))
200
>>> asizeof.asizeof({'foo': 'bar', 'baz': 'bar'})
400
>>> asizeof.asizeof({})
280
>>> asizeof.asizeof({'foo':'bar'})
360
>>> asizeof.asizeof('foo')
40
>>> asizeof.asizeof(Bar())
352
>>> asizeof.asizeof(Bar().__dict__)
280
>>> A = rand(10)
>>> B = rand(10000)
>>> asizeof.asizeof(A)
176
>>> asizeof.asizeof(B)
80096
As mentioned,
The (byte)code size of objects like classes, functions, methods, modules, etc. can be included by setting option code=True.
And if you need other view on live data, Pympler's
module muppy is used for on-line monitoring of a Python application and module Class Tracker provides off-line analysis of the lifetime of selected Python objects.
If you don't need the exact size of the object but roughly to know how big it is, one quick (and dirty) way is to let the program run, sleep for an extended period of time, and check the memory usage (ex: Mac's activity monitor) by this particular python process. This would be effective when you are trying to find the size of one single large object in a python process. For example, I recently wanted to check the memory usage of a new data structure and compare it with that of Python's set data structure. First I wrote the elements (words from a large public domain book) to a set, then checked the size of the process, and then did the same thing with the other data structure. I found out the Python process with a set is taking twice as much memory as the new data structure. Again, you wouldn't be able to exactly say the memory used by the process is equal to the size of the object. As the size of the object gets large, this becomes close as the memory consumed by the rest of the process becomes negligible compared to the size of the object you are trying to monitor.