Is `namedtuple` really as efficient in memory usage as tuples? My test says NO

一整个雨季 2021-02-12 21:31

It is stated in the Python documentation that one of the advantages of namedtuple is that it is as memory-efficient as tuples.

To validate this, I

2 Answers
  • 2021-02-12 21:57

    I did some investigation myself (with Python 3.6.6) and came to the following conclusions:

    1. In all three cases (list of tuples, list of named tuples, list of dicts), sys.getsizeof returns the size of the list itself, which stores only references to its elements. So you get a size of 81528056 in all three cases.

    2. Sizes of elementary types are:

      sys.getsizeof((1, 2, 3))               # 72
      sys.getsizeof(point(x=1, y=2, z=3))    # 72
      sys.getsizeof(dict(x=1, y=2, z=3))     # 240

    3. Timing is very bad for named tuples:
      list of tuples: 1.8s
      list of named tuples: 10s
      list of dicts: 4.6s

    4. Looking at the system load, I became suspicious about the results from getsizeof. After measuring the memory footprint of the Python 3 process, I get:

      test_list = [(i, i+1, i+2) for i in range(10000000)]
      increase by: 1 745 564K
      that is about 175B per element

      test_list_n = [point(x=i, y=i+1, z=i+2) for i in range(10000000)]
      increase by: 1 830 740K
      that is about 183B per element

      test_list_d = [dict(x=i, y=i+1, z=i+2) for i in range(10000000)]
      increase by: 2 717 492 K
      that is about 272B per element
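The process-level measurement above can be approximated from inside Python. The following is a minimal sketch, assuming `tracemalloc` is an acceptable stand-in for watching the process footprint; the names `Point`, `builders`, `bytes_per_element` and `seconds_to_build` are hypothetical, and the list length is reduced so it runs quickly. Absolute numbers will differ from the ones quoted above and vary by Python version.

```python
import time
import tracemalloc
from collections import namedtuple

# Hypothetical stand-in for the `point` type used in this answer.
Point = namedtuple("Point", "x y z")

N = 100_000  # much smaller than 10000000 so the sketch finishes fast

# One builder per container type being compared.
builders = {
    "tuples": lambda n: [(i, i + 1, i + 2) for i in range(n)],
    "named tuples": lambda n: [Point(x=i, y=i + 1, z=i + 2) for i in range(n)],
    "dicts": lambda n: [dict(x=i, y=i + 1, z=i + 2) for i in range(n)],
}


def bytes_per_element(build, n):
    """Memory still allocated after building the list, divided by its length."""
    tracemalloc.start()
    data = build(n)
    current, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return current / len(data)


def seconds_to_build(build, n):
    """Wall-clock time to build the list, measured without tracing overhead."""
    start = time.perf_counter()
    build(n)
    return time.perf_counter() - start


memory = {name: bytes_per_element(build, N) for name, build in builders.items()}
timing = {name: seconds_to_build(build, N) for name, build in builders.items()}

for name in builders:
    print(f"{name:>13}: {memory[name]:7.1f} B/element, {timing[name]:.3f} s")
```

Unlike sys.getsizeof on the list, this counts the elements and the integers they hold, which is why the per-element figure is larger than the size of a single tuple.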

  • 2021-02-12 22:11

    A simpler metric is to check the size of equivalent tuple and namedtuple objects. Given two roughly analogous objects:

    from collections import namedtuple
    import sys
    
    point = namedtuple('point', 'x y z')
    point1 = point(1, 2, 3)
    
    point2 = (1, 2, 3)
    

    Get the size of them in memory:

    >>> sys.getsizeof(point1)
    72
    
    >>> sys.getsizeof(point2)
    72
    

    They look the same to me...
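One reason the sizes match is that the class generated by namedtuple declares empty `__slots__`, so instances carry no per-instance `__dict__`, and the field names are stored once on the class. A small sketch to check this, reusing the `point` definition above (the variable names `p`, `t` and the result names are illustrative only):

```python
import sys
from collections import namedtuple

point = namedtuple('point', 'x y z')
p = point(1, 2, 3)
t = (1, 2, 3)

# The generated class sets __slots__ = (), so no instance dictionary exists.
slots = point.__slots__
has_instance_dict = hasattr(p, '__dict__')

# Field names are kept on the class, not copied into each instance.
fields = point._fields

# With no extra per-instance storage, the two objects report the same size.
same_size = sys.getsizeof(p) == sys.getsizeof(t)

print(slots, has_instance_dict, fields, same_size)
```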


    Taking this a step further to replicate your results, notice that if you create a list of identical tuples the way you're doing it, each tuple is the exact same object:

    >>> test_list = [(1,2,3) for _ in range(10000000)]
    >>> test_list[0] is test_list[-1]
    True
    

    So in your list of tuples, each index contains a reference to the same object. There are not 10000000 tuples; there are 10000000 references to one tuple.

    On the other hand, your list of namedtuple objects actually does create 10000000 unique objects.
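The difference can be made visible by counting distinct object identities. This is a sketch that relies on a CPython implementation detail: a tuple literal made only of constants is stored once as a constant and reused, which other interpreters are not required to do. The list names are illustrative.

```python
# A tuple literal of constants: CPython reuses one stored object.
constant = [(1, 2, 3) for _ in range(1000)]

# Tuples computed from the loop variable: a new object on every iteration.
computed = [(i, i + 1, i + 2) for i in range(1000)]

# Count how many different objects each list actually references.
distinct_constant = len({id(t) for t in constant})
distinct_computed = len({id(t) for t in computed})

print(distinct_constant, distinct_computed)
```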

    A better apples-to-apples comparison would be to view the memory usage for

    >>> test_list = [(i, i+1, i+2) for i in range(10000000)]
    

    and:

    >>> test_list_n = [point(x=i, y=i+1, z=i+2) for i in range(10000000)]
    

    The two list objects themselves have the same size:

    >>> sys.getsizeof(test_list)
    81528056
    
    >>> sys.getsizeof(test_list_n)
    81528056
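Since sys.getsizeof on a list counts only the list's own array of references, a slightly fuller comparison adds the size of every element. The helper below is a rough sketch, not a complete deep-size function: `shallow_plus_items` is a hypothetical name, it does not count the integers stored inside each element, and the list length is reduced to keep it quick.

```python
import sys
from collections import namedtuple

point = namedtuple('point', 'x y z')


def shallow_plus_items(seq):
    """Size of the list object plus the size of each element it references.

    The integers inside each element are deliberately not counted.
    """
    return sys.getsizeof(seq) + sum(sys.getsizeof(item) for item in seq)


n = 100_000
test_list = [(i, i + 1, i + 2) for i in range(n)]
test_list_n = [point(x=i, y=i + 1, z=i + 2) for i in range(n)]

list_only = sys.getsizeof(test_list)
with_items = shallow_plus_items(test_list)
with_items_n = shallow_plus_items(test_list_n)

print(list_only, with_items, with_items_n)
```

If the tuple and namedtuple elements really are the same size, the last two totals should agree.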
    