How do I profile memory usage in Python?

无人及你 2020-11-22 09:50

I've recently become interested in algorithms and have begun exploring them by writing a naive implementation and then optimizing it in various ways.

I'm already f

8 answers
  • 2020-11-22 10:17

    Since the accepted answer and also the next highest voted answer have, in my opinion, some problems, I'd like to offer one more answer that is based closely on Ihor B.'s answer with some small but important modifications.

    This solution lets you run profiling either by wrapping a function with the profile function and calling the result, or by decorating your function/method with the @profile decorator.

    The first technique is useful when you want to profile some third-party code without messing with its source, whereas the second technique is a bit "cleaner" and works better when you don't mind modifying the source of the function/method you want to profile.

    I've also modified the output, so that you get RSS, VMS, and shared memory. I don't care much about the "before" and "after" values, but only the delta, so I removed those (if you're comparing to Ihor B.'s answer).

    Profiling code

    # profile.py
    import time
    import os
    import psutil
    import inspect
    
    
    def elapsed_since(start):
        """Return the time elapsed since `start` as a human-readable string."""
        elapsed = time.time() - start
        if elapsed < 1:
            return str(round(elapsed * 1000, 2)) + "ms"
        if elapsed < 60:
            return str(round(elapsed, 2)) + "s"
        if elapsed < 3600:
            return str(round(elapsed / 60, 2)) + "min"
        return str(round(elapsed / 3600, 2)) + "hrs"
    
    
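    def get_process_memory():
        """Return (rss, vms, shared) in bytes for the current process."""
        process = psutil.Process(os.getpid())
        mi = process.memory_info()
        # psutil reports "shared" only on Linux; fall back to 0 elsewhere.
        return mi.rss, mi.vms, getattr(mi, "shared", 0)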
    
    
    def format_bytes(num_bytes):
        """Format a (possibly negative) byte count using decimal units."""
        if abs(num_bytes) < 1000:
            return str(num_bytes) + "B"
        elif abs(num_bytes) < 1e6:
            return str(round(num_bytes / 1e3, 2)) + "kB"
        elif abs(num_bytes) < 1e9:
            return str(round(num_bytes / 1e6, 2)) + "MB"
        else:
            return str(round(num_bytes / 1e9, 2)) + "GB"
    
    
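    def profile(func, *args, **kwargs):
        """Print the RSS/VMS/shared-memory deltas and run time of func.

        Bound methods are called immediately with *args/**kwargs; any other
        callable (including the @profile decorator case) gets a wrapper back.
        """
        def wrapper(*args, **kwargs):
            rss_before, vms_before, shared_before = get_process_memory()
            start = time.time()
            result = func(*args, **kwargs)
            elapsed_time = elapsed_since(start)
            rss_after, vms_after, shared_after = get_process_memory()
            print("Profiling: {:>20}  RSS: {:>8} | VMS: {:>8} | SHR {:>8} | time: {:>8}"
                  .format("<" + func.__name__ + ">",
                          format_bytes(rss_after - rss_before),
                          format_bytes(vms_after - vms_before),
                          format_bytes(shared_after - shared_before),
                          elapsed_time))
            return result
        if inspect.ismethod(func):
            # Bound method: run it right away and return its result.
            return wrapper(*args, **kwargs)
        # Plain functions and builtins: return the wrapper so that it can be
        # called later (this is also what the decorator form relies on).
        return wrapper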
    

    Example usage, assuming the above code is saved as profile.py:

    from profile import profile
    from time import sleep
    from sklearn import datasets # Just an example of 3rd party function call
    
    
    # Method 1
    run_profiling = profile(datasets.load_digits)
    data = run_profiling()
    
    # Method 2
    @profile
    def my_function():
        # do some stuff
        a_list = []
        for i in range(1,100000):
            a_list.append(i)
        return a_list
    
    
    res = my_function()
    

    This should result in output similar to the below:

    Profiling:        <load_digits>  RSS:   5.07MB | VMS:   4.91MB | SHR  73.73kB | time:  89.99ms
    Profiling:        <my_function>  RSS:   1.06MB | VMS:   1.35MB | SHR       0B | time:   8.43ms
    

    A couple of important final notes:

    1. Keep in mind, this method of profiling is only going to be approximate, since lots of other stuff might be happening on the machine. Due to garbage collection and other factors, the deltas might even be zero.
    2. For some unknown reason, very short function calls (e.g. 1 or 2 ms) show up with zero memory usage. I suspect this is some limitation of the hardware/OS (tested on basic laptop with Linux) on how often memory statistics are updated.
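    3. To keep the examples simple, I didn't use any function arguments, but they work as one would expect: pass them to the wrapped function, i.e. profile(my_function)(arg) to profile my_function(arg). For a bound method, profile(obj.method, arg) runs the call immediately.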
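
    As a rough illustration of note 3, here is a stripped-down stand-in for the wrapper that measures only time, so it runs without psutil. The names timed and build_list are hypothetical and not part of the code above; the point is only to show how positional and keyword arguments pass through a wrapper of this shape.

```python
import time


def timed(func):
    """Hypothetical stand-in for the profile wrapper: reports elapsed time only."""
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)  # arguments are forwarded unchanged
        elapsed_ms = (time.perf_counter() - start) * 1000
        print("{}: {:.2f}ms".format(func.__name__, elapsed_ms))
        return result
    return wrapper


def build_list(n, fill=0):
    # Toy workload that takes one positional and one keyword argument.
    return [fill] * n


# Wrap first, then call the wrapper with the arguments.
result = timed(build_list)(1000, fill=7)
```

    The same wrap-then-call pattern should apply to the memory-profiling version for plain functions.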
  • 2020-11-22 10:24

    If you only want to look at the memory usage of an object (this was originally an answer to another question):

    There is a package called Pympler which contains the asizeof module.

    Use as follows:

    from pympler import asizeof
    asizeof.asizeof(my_object)
    

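    Unlike sys.getsizeof, which reports only the shallow size of an object, asizeof also follows references, so it gives meaningful results for your self-created objects.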

    >>> asizeof.asizeof(tuple('bcd'))
    200
    >>> asizeof.asizeof({'foo': 'bar', 'baz': 'bar'})
    400
    >>> asizeof.asizeof({})
    280
    >>> asizeof.asizeof({'foo':'bar'})
    360
    >>> asizeof.asizeof('foo')
    40
    >>> asizeof.asizeof(Bar())
    352
    >>> asizeof.asizeof(Bar().__dict__)
    280
    
    >>> help(asizeof.asizeof)
    Help on function asizeof in module pympler.asizeof:
    
    asizeof(*objs, **opts)
        Return the combined size in bytes of all objects passed as positional arguments.
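
    To see why a shallow size can be misleading for your own objects, here is a minimal sketch that uses only the standard library. deep_sizeof is a hypothetical helper written for this illustration and is not how Pympler computes sizes; the Bar class is likewise an assumed stand-in, and the helper ignores things like __slots__.

```python
import sys


def deep_sizeof(obj, seen=None):
    """Hypothetical helper: sum sys.getsizeof over an object and what it references."""
    if seen is None:
        seen = set()
    if id(obj) in seen:  # count each object once; also guards against cycles
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_sizeof(k, seen) + deep_sizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(deep_sizeof(item, seen) for item in obj)
    if hasattr(obj, "__dict__"):
        # Instance attributes live in __dict__, which getsizeof does not include.
        size += deep_sizeof(vars(obj), seen)
    return size


class Bar:
    def __init__(self):
        # 100 distinct strings of at least 100 characters each.
        self.payload = [str(i) * 100 for i in range(100)]


bar = Bar()
shallow = sys.getsizeof(bar)   # size of the instance header only
deep = deep_sizeof(bar)        # includes the attribute dict, the list and the strings
print(shallow, deep)
```

    The exact numbers depend on the Python version and platform, but the deep figure should be far larger than the shallow one.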
    