nested dictionaries or tuples for key?

自闭症患者 2020-12-15 04:49

Suppose there is a structure like this:

{'key1' : { 'key2' : { .... { 'keyn' : 'value' } ... } } }

Using Python, I'm trying to det

4 answers
  •  时光说笑
    2020-12-15 05:13

    Memory Consumption Testing

    I've written a small script to test it. It has some limitations, though: the keys are linearly distributed integers (i.e. range(N)).

    With a 3-level nesting, i.e. dict[a][b][c] vs dict[a,b,c] where each sub index goes from 0 to 99, I find the following:

    With large values (list(x for x in range(100))):

    > memory.py nested 
    Memory usage: 1049.0 MB
    > memory.py flat  
    Memory usage: 1149.7 MB
    

    and with small values ([]):

    > memory.py nested
    Memory usage: 134.1 MB
    > memory.py flat
    Memory usage: 234.8 MB
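
For reference, the two layouts being compared, dict[a][b][c] and dict[a,b,c], can be sketched as below. This is a minimal illustration rather than part of the measurement script; the names `nested_dict`, `nested`, `flat`, `nested_row` and `flat_row` are made up for the example.

```python
from collections import defaultdict

def nested_dict():
    """Recursive defaultdict: a missing key creates another level on access."""
    return defaultdict(nested_dict)

nested = nested_dict()
nested[1][2][3] = "value"      # one dict lookup per level

flat = {}
flat[1, 2, 3] = "value"        # a single lookup, keyed by the tuple (1, 2, 3)

# Both layouts return the same value for the same index triple.
same = nested[1][2][3] == flat[1, 2, 3]

# Everything under the prefix (1, 2) is directly reachable in the nested layout ...
nested_row = dict(nested[1][2])
# ... while the flat layout needs a scan over all keys.
flat_row = {key[2]: value for key, value in flat.items() if key[:2] == (1, 2)}
```

The nested form holds one dict object per branch, while the flat form holds one tuple object per entry; that difference is what the memory figures above are comparing.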
    

    Open questions

    • Why is this happening?
    • Would this change with different indices, e.g. non-consecutive ones?
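
One way to start probing the first question is to count only the container overhead of each layout with `sys.getsizeof`. The sketch below is an assumption-laden estimate, not an explanation of the figures above: `sys.getsizeof` is shallow and CPython-specific, allocator overhead is ignored, and the helper names `flat_container_bytes` and `nested_container_bytes` are hypothetical.

```python
import sys
from itertools import product

def flat_container_bytes(levels):
    """Shallow size of a tuple-keyed dict plus all of its tuple key objects."""
    flat = {index: None for index in product(*(range(n) for n in levels))}
    return sys.getsizeof(flat) + sum(sys.getsizeof(key) for key in flat)

def nested_container_bytes(levels):
    """Shallow size of every dict object in the equivalent nested layout."""
    def build(depth):
        # Innermost level maps straight to the (placeholder) values.
        if depth == len(levels) - 1:
            return {i: None for i in range(levels[depth])}
        return {i: build(depth + 1) for i in range(levels[depth])}

    def total(node, depth):
        size = sys.getsizeof(node)
        if depth < len(levels) - 1:
            size += sum(total(child, depth + 1) for child in node.values())
        return size

    return total(build(0), 0)
```

Calling both functions with `[100, 100, 100]` gives two byte counts to compare. The integer keys and the `None` values are deliberately left out of the totals, so the numbers cover only dicts and tuples. I would expect the flat figure to be larger on CPython, since every entry carries its own tuple object, but that is a hypothesis to check, and it says nothing yet about non-consecutive indices.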

    Script

    #!/usr/bin/env python3
    import resource
    import itertools
    import sys
    import copy
    from os.path import basename
    from collections import defaultdict
     
    # constants
    index_levels = [100, 100, 100]
    value_size   = 100 # try values like 0
    
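    def memory_usage():
        """Return the peak resident set size of this process in bytes."""
        peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        # ru_maxrss is reported in bytes on macOS but in kilobytes on Linux
        return peak if sys.platform == "darwin" else peak * 1024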
    
    _object_mold = list(x for x in range(value_size)) # performance hack
    def create_object():
        return copy.copy(_object_mold)
    
    # automatically create nested dict
    # http://code.activestate.com/recipes/578057-recursive-defaultdict/
    f = lambda: defaultdict(f)
    my_dict = defaultdict(f)
    
    # options & usage
    if len(sys.argv) < 2 or sys.argv[1] not in ('flat', 'nested'):
        print("Usage: {} [nested | flat]".format(basename(sys.argv[0])))
        sys.exit(1)
    dict_mode = sys.argv[1]
     
    index_generator = [range(level) for level in index_levels]
    
    if dict_mode == "flat":
        for index in itertools.product(*index_generator):
            my_dict[index] = create_object()
    elif dict_mode == "nested":
        for index in itertools.product(*index_generator):
            sub_dict = my_dict
            for sub_index in index[:-1]:          # iterate into lowest dict
                sub_dict = sub_dict[sub_index]
            sub_dict[index[-1]] = create_object() # finally assign value
    
    print("Memory usage: {:.1f} MB".format(memory_usage() / 1024**2))
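
As a cross-check on the peak-RSS figure, the standard library's `tracemalloc` module can report how much memory Python itself allocated while a structure was built. This is a different technique from the script above, shown only as a sketch; the helper name `traced_bytes` and the smaller 20x20x20 index range are choices made for the example.

```python
import tracemalloc

def traced_bytes(build):
    """Return the bytes allocated during build() that are still live afterwards."""
    tracemalloc.start()
    try:
        obj = build()  # hold a reference so the structure stays alive
        current, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    del obj
    return current

flat_bytes = traced_bytes(
    lambda: {(a, b, c): [] for a in range(20) for b in range(20) for c in range(20)}
)
nested_bytes = traced_bytes(
    lambda: {a: {b: {c: [] for c in range(20)} for b in range(20)} for a in range(20)}
)
print("flat:   {:.2f} MB".format(flat_bytes / 1024**2))
print("nested: {:.2f} MB".format(nested_bytes / 1024**2))
```

Unlike `ru_maxrss`, this counts only allocations made through Python's allocator and excludes interpreter start-up, so both layouts can be measured in one process instead of two separate runs.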
    
