How do I determine the size of an object in Python?

半阙折子戏 2020-11-21 21:53

I want to know how to get the size of objects like a string, an integer, etc. in Python.

Related question: How many bytes per element are there in a Python list (tuple)?

13 answers
  • 2020-11-21 22:45

    This can be more complicated than it looks, depending on how you want to count things. For instance, if you have a list of ints, do you want the size of the list containing the references to the ints (i.e. the list only, not what is contained in it)? Or do you want to include the actual data pointed to? In the latter case you need to deal with duplicate references and prevent double-counting when two objects contain references to the same object.
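
    As a minimal sketch of that distinction (the variable names are only illustrative), the following compares the shallow size of a list with a naive deep size and a deduplicated deep size when the list holds three references to one string:

```python
import sys

shared = "x" * 1000              # one string object
refs = [shared, shared, shared]  # three references to that same object

# Shallow size: the list header plus three pointers, not the string data.
shallow = sys.getsizeof(refs)

# Naive deep size: adds the string once per reference (double-counting).
naive_deep = shallow + sum(sys.getsizeof(item) for item in refs)

# Deduplicated deep size: count each distinct object once, keyed by id().
unique_items = {id(item): item for item in refs}
dedup_deep = shallow + sum(sys.getsizeof(item) for item in unique_items.values())

print(shallow, naive_deep, dedup_deep)
```

    The exact numbers depend on the Python version and platform, but the naive figure exceeds the deduplicated one by two copies of the string's size.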

    You may want to take a look at one of the Python memory profilers, such as pysizer, to see if they meet your needs.

  • 2020-11-21 22:46

    I use this trick. It may not be accurate for small objects, but I think it is much more accurate than sys.getsizeof() for a complex object (like a pygame surface).

    import pygame as pg
    import os
    import psutil
    import time
    
    
    process = psutil.Process(os.getpid())
    pg.init()    
    vocab = ['hello', 'me', 'you', 'she', 'he', 'they', 'we',
             'should', 'why?', 'necessarily', 'do', 'that']
    
    font = pg.font.SysFont("monospace", 100, True)
    
    dct = {}
    
    # Pre-assign the variables used in the loop before taking the baseline
    # reading, so that creating them is not counted against the first surface.
    newMem = process.memory_info().rss
    Str = 'store ' + 'Nothing \tsurface use about '.expandtabs(15) + \
          '0\t bytes'.expandtabs(9)
    
    usedMem = process.memory_info().rss
    
    for word in vocab:
        dct[word] = font.render(word, True, pg.Color("#000000"))
    
        time.sleep(0.1)  # wait a moment
    
        # resident memory (RSS) currently used by this process:
        newMem = process.memory_info().rss
        Str = f'store ' + f'{word}\tsurface use about '.expandtabs(15) + \
              f'{newMem - usedMem}\t bytes'.expandtabs(9)
    
        print(Str)
        usedMem = newMem
    

    On Windows 10 with Python 3.7.3, the output is:

    store hello          surface use about 225280    bytes
    store me             surface use about 61440     bytes
    store you            surface use about 94208     bytes
    store she            surface use about 81920     bytes
    store he             surface use about 53248     bytes
    store they           surface use about 114688    bytes
    store we             surface use about 57344     bytes
    store should         surface use about 172032    bytes
    store why?           surface use about 110592    bytes
    store necessarily    surface use about 311296    bytes
    store do             surface use about 57344     bytes
    store that           surface use about 110592    bytes
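
    For pure-Python objects, a similar before/after measurement can be sketched with the standard library's tracemalloc instead of psutil. This is a different technique from the one above: it counts bytes allocated through Python's own allocators rather than process RSS, so memory allocated directly by C libraries (which is likely where most of a pygame surface lives) would not show up. The helper name traced_delta is hypothetical.

```python
import tracemalloc


def traced_delta(build):
    """Call build() and return (result, growth in traced bytes while it ran)."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    result = build()  # the caller keeps this reference, so the object stays alive
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, after - before


# Illustrative workload: a list of 10,000 freshly built strings.
words, used = traced_delta(lambda: [str(n) * 10 for n in range(10_000)])
print(f"list of {len(words)} strings uses about {used} bytes")
```

    Unlike the RSS difference, this figure is not rounded to page sizes, so it stays meaningful for small objects.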
    
  • 2020-11-21 22:50

    Having run into this problem many times myself, I wrote up a small function (inspired by @aaron-hall's answer) and tests that do what I would have expected sys.getsizeof to do:

    https://github.com/bosswissam/pysize

    If you're interested in the backstory, here it is

    EDIT: Attaching the code below for easy reference. To see the most up-to-date code, please check the github link.

        import sys
    
        def get_size(obj, seen=None):
            """Recursively finds size of objects"""
            size = sys.getsizeof(obj)
            if seen is None:
                seen = set()
            obj_id = id(obj)
            if obj_id in seen:
                return 0
            # Important: mark as seen *before* entering recursion so that
            # self-referential objects are handled gracefully
            seen.add(obj_id)
            if isinstance(obj, dict):
                size += sum([get_size(v, seen) for v in obj.values()])
                size += sum([get_size(k, seen) for k in obj.keys()])
            elif hasattr(obj, '__dict__'):
                size += get_size(obj.__dict__, seen)
            elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
                size += sum([get_size(i, seen) for i in obj])
            return size
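
    A short usage sketch of how the seen set behaves with shared and self-referential objects. The function is repeated here in condensed form only so that the example runs on its own; the test objects are illustrative.

```python
import sys


def get_size(obj, seen=None):
    """Condensed copy of the function above, repeated for a standalone example."""
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(get_size(k, seen) + get_size(v, seen) for k, v in obj.items())
    elif hasattr(obj, '__dict__'):
        size += get_size(obj.__dict__, seen)
    elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
        size += sum(get_size(i, seen) for i in obj)
    return size


payload = "x" * 1000
once = [payload]
twice = [payload, payload]  # the same string object referenced twice

# The second reference adds only a pointer slot, not another copy of the string.
print(get_size(twice) - get_size(once))

cycle = []
cycle.append(cycle)  # a list that contains itself
# The recursion terminates, and the list is counted exactly once.
print(get_size(cycle) == sys.getsizeof(cycle))
```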
    
  • 2020-11-21 22:53

    Use sys.getsizeof() if you DON'T want to include sizes of linked (nested) objects.

    However, if you want to count sub-objects nested in lists, dicts, sets, tuples - and usually THIS is what you're looking for - use the recursive deep sizeof() function as shown below:

    import sys
    def sizeof(obj):
        size = sys.getsizeof(obj)
        if isinstance(obj, dict): return size + sum(map(sizeof, obj.keys())) + sum(map(sizeof, obj.values()))
        if isinstance(obj, (list, tuple, set, frozenset)): return size + sum(map(sizeof, obj))
        return size
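
    Two properties of this function are worth knowing before relying on it, shown in the sketch below (the function is repeated, reformatted, so the example is self-contained; the test objects are illustrative). It keeps no record of visited objects, so an object referenced twice is counted twice, and a container that contains itself leads to unbounded recursion.

```python
import sys


def sizeof(obj):
    # Same logic as the function above, spread over several lines.
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        return size + sum(map(sizeof, obj.keys())) + sum(map(sizeof, obj.values()))
    if isinstance(obj, (list, tuple, set, frozenset)):
        return size + sum(map(sizeof, obj))
    return size


payload = "x" * 1000
shared = [payload, payload]  # one string, referenced twice
print(sizeof(shared))        # includes the string's size twice

cycle = []
cycle.append(cycle)          # self-referential list
try:
    sizeof(cycle)
except RecursionError:
    print("sizeof() does not terminate on self-referential containers")
```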
    

    You can also find this function in the nifty toolbox, together with many other useful one-liners:

    https://github.com/mwojnars/nifty/blob/master/util.py

  • 2020-11-21 22:54

    The Pympler package's asizeof module can do this.

    Use as follows:

    from pympler import asizeof
    asizeof.asizeof(my_object)
    

    Unlike sys.getsizeof, it works for your self-created objects. It even works with numpy.

    >>> asizeof.asizeof(tuple('bcd'))
    200
    >>> asizeof.asizeof({'foo': 'bar', 'baz': 'bar'})
    400
    >>> asizeof.asizeof({})
    280
    >>> asizeof.asizeof({'foo':'bar'})
    360
    >>> asizeof.asizeof('foo')
    40
    >>> asizeof.asizeof(Bar())
    352
    >>> asizeof.asizeof(Bar().__dict__)
    280
    >>> A = rand(10)
    >>> B = rand(10000)
    >>> asizeof.asizeof(A)
    176
    >>> asizeof.asizeof(B)
    80096
    

    As mentioned,

    The (byte)code size of objects like classes, functions, methods, modules, etc. can be included by setting option code=True.

    And if you need another view of live data, Pympler's

    module muppy is used for on-line monitoring of a Python application and module Class Tracker provides off-line analysis of the lifetime of selected Python objects.

  • 2020-11-21 22:56

    If you don't need the exact size of an object and only want a rough idea of how big it is, one quick (and dirty) way is to let the program run, sleep for an extended period of time, and check the memory usage of that particular Python process (e.g. in macOS Activity Monitor). This works best when you are trying to find the size of one single large object in a Python process. For example, I recently wanted to check the memory usage of a new data structure and compare it with that of Python's set. First I wrote the elements (words from a large public domain book) to a set and checked the size of the process, then did the same thing with the other data structure. I found that the Python process with a set took twice as much memory as the one with the new data structure. Again, you cannot say that the memory used by the process is exactly equal to the size of the object. As the object gets larger, though, the estimate gets closer, because the memory consumed by the rest of the process becomes negligible compared to the object you are trying to monitor.
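
    The same rough check can be done from inside the process rather than from a system monitor. The sketch below assumes a Unix-like system (the resource module is not available on Windows) and reads the peak resident set size before and after building one large object; the helper name peak_rss_bytes and the example set are illustrative.

```python
import resource
import sys


def peak_rss_bytes():
    """Peak resident set size of this process so far (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    return peak if sys.platform == "darwin" else peak * 1024


before = peak_rss_bytes()
big = set(range(500_000))  # the single large object under study
after = peak_rss_bytes()

print(f"peak RSS grew by about {after - before} bytes")
```

    Because this is a peak value, it only moves when the new object pushes the process past its previous high-water mark, so it is as coarse as the approach described above and only useful for large objects.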
