Flatten nested dictionaries with tuple keys

Submitted by 烈酒焚心 on 2021-02-10 17:47:34

Question


How can this question be generalized to the case where keys may be tuples?

A benefit, even when all keys are strings, is that accumulating them into a tuple removes the need for ad-hoc separators (though JSON export is another matter).
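To illustrate the JSON caveat, here is a minimal sketch. It assumes every key inside a path is a plain string, and the list-of-pairs export format is only one possible workaround, not something taken from the question.

```python
import json

# A flattened dict whose keys are tuple paths (string components only).
flat = {('xzxzxz', '432', '0b0b0b'): 231, ('abc',): 123}

# json.dumps only accepts str, int, float, bool or None as dict keys,
# so tuple keys raise a TypeError.
try:
    json.dumps(flat)
    tuple_keys_rejected = False
except TypeError:
    tuple_keys_rejected = True

# Assumed workaround: export [path, value] pairs instead of a JSON object.
encoded = json.dumps([[list(path), value] for path, value in flat.items()])

# Reading it back turns each path list into a tuple again.
restored = {tuple(path): value for path, value in json.loads(encoded)}
print(tuple_keys_rejected, restored == flat)
```

This round trip would not hold for paths that contain tuple keys such as ('hgf', 1), because JSON turns them into lists, which are not hashable.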

One approach is to base it on this answer. I tried two versions:

def flatten_keys(d,handler,prefix=[]):
    return {handler(prefix,k) if prefix else k : v
        for kk, vv in d.items()
        for k, v in flatten_keys(vv, handler, kk).items()
        } if isinstance(d, dict) else { prefix : d }

where the tuple handlers are:

def tuple_handler_1(prefix,k):
    return tuple([prefix]+[k])

def tuple_handler_2(prefix,k):
    return tuple(flatten_container((prefix,k)))

Using the utility generator:

def flatten_container(container):
    for i in container:
        if isinstance(i, (list,tuple)):
            for j in flatten_container(i):
                yield j
        else:
            yield i

Consider one of the test dicts, but with a tuple key ('hgf',1):

data =  {'abc':123, ('hgf',1):{'gh':432, 'yu':433}, 'gfd':902, 'xzxzxz':{"432":{'0b0b0b':231}, "43234":1321}}

Neither works as intended:

flatten_keys(data,tuple_handler_1)

{'abc': 123, (('hgf', 1), 'gh'): 432, (('hgf', 1), 'yu'): 433, 'gfd': 902, ('xzxzxz', ('432', '0b0b0b')): 231, ('xzxzxz', '43234'): 1321}

The first handler leaves the key ('xzxzxz', ('432', '0b0b0b')) unflattened.

The second handler also flattens the input tuple key ('hgf', 1), which should stay intact:

flatten_keys(data,tuple_handler_2)

{'abc': 123, ('hgf', 1, 'gh'): 432, ('hgf', 1, 'yu'): 433, 'gfd': 902, ('xzxzxz', '432', '0b0b0b'): 231, ('xzxzxz', '43234'): 1321}

Is there an obvious modification of the flatten method that will correctly join strings and other hashables?

EDIT

As per the comments below, key clashes with this method are inherent even in the base case of string keys, e.g. {'a_b':{'c':1}, 'a':{'b_c':2}} with '_' as the separator.

Thus each key path should be a tuple, even for key paths of length 1, to avoid key clashes, e.g. {((1,2),): 3, (1,2): 4}.
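The clash can be reproduced with a small sketch. The helper `flatten_with_separator` is a hypothetical name introduced only for illustration, and it assumes string keys joined with '_'.

```python
def flatten_with_separator(d, sep="_", prefix=""):
    """Join nested string keys with a separator (hypothetical helper)."""
    flat = {}
    for key, value in d.items():
        path = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_with_separator(value, sep, path))
        else:
            flat[path] = value
    return flat


clashing = {'a_b': {'c': 1}, 'a': {'b_c': 2}}

# Both leaves map to 'a_b_c', so the second silently overwrites the first.
joined = flatten_with_separator(clashing)
print(joined)  # {'a_b_c': 2}

# With tuple key paths the two leaves keep distinct keys.
tuple_paths = {('a_b', 'c'): 1, ('a', 'b_c'): 2}
print(len(tuple_paths))  # 2

# A length-1 path holding the tuple key (1, 2) differs from the
# length-2 path 1 -> 2, which is why length-1 paths are wrapped too.
print(((1, 2),) != (1, 2))  # True
```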


Answer 1:


Assuming you want the following input and output:

# input
{'abc': 123,
 ('hgf', 1): {'gh': 432, 'yu': 433},
 'gfd': 902,
 'xzxzxz': {'432': {'0b0b0b': 231}, '43234': 1321}}

# output
{('abc',): 123,
 (('hgf', 1), 'gh'): 432,
 (('hgf', 1), 'yu'): 433,
 ('gfd',): 902,
 ('xzxzxz', '432', '0b0b0b'): 231,
 ('xzxzxz', '43234'): 1321}

One approach is to recurse on your dictionary until you find a non-dictionary value and pass down the current key as a tuple during the recursion.

def flatten_dict(deep_dict): 
    def do_flatten(deep_dict, current_key): 
        for key, value in deep_dict.items():
            # the key will be a flattened tuple
            # but the type of `key` is not touched
            new_key = current_key + (key,)
            # if we have a dict, we recurse
            if isinstance(value, dict): 
                yield from do_flatten(value, new_key) 
            else:
                yield (new_key, value) 
    return dict(do_flatten(deep_dict, ()))
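Because every key path is a tuple and individual keys are never merged, the flattened form can be turned back into the nested one. The sketch below uses a hypothetical inverse, `unflatten_dict`, which is not part of the answer, and checks it against the input/output pair listed above.

```python
def unflatten_dict(flat_dict):
    """Rebuild a nested dict from tuple key paths (hypothetical inverse helper)."""
    nested = {}
    for path, value in flat_dict.items():
        node = nested
        # Walk down, creating intermediate dicts for all but the last key.
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = value
    return nested


flat = {('abc',): 123,
        (('hgf', 1), 'gh'): 432,
        (('hgf', 1), 'yu'): 433,
        ('gfd',): 902,
        ('xzxzxz', '432', '0b0b0b'): 231,
        ('xzxzxz', '43234'): 1321}

original = {'abc': 123,
            ('hgf', 1): {'gh': 432, 'yu': 433},
            'gfd': 902,
            'xzxzxz': {'432': {'0b0b0b': 231}, '43234': 1321}}

print(unflatten_dict(flat) == original)  # True
```

Note that flatten_dict yields nothing for an empty nested dict, so a value such as {'a': {}} would not survive the round trip.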


Source: https://stackoverflow.com/questions/56877666/flatten-nested-dictionaries-with-tuple-keys
