More Pythonic way of counting things in a heavily nested defaultdict

Submitted by China☆狼群 on 2019-12-07 02:26:45

Question


My code currently has to count items into a heavily nested dict. Each item is indexed by 3 values and then counted. So, before my loop, I initialize a nested defaultdict like so:

from collections import defaultdict

type_to_count_dic = defaultdict(
        lambda: defaultdict(
            lambda: defaultdict(int)
        )
    )

This lets me count the items within a tight loop like so:

for a in ...:
    for b in ...:
        for c in ...:
            type_to_count_dic[a][b][c] += 1

Initializing all those defaultdicts feels a lot like writing a type declaration in a language like Java. Is there a more idiomatic/Pythonic way of doing this?


Answer 1:


from collections import defaultdict

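class _defaultdict(defaultdict):
    # "tree += n" falls back to __add__, so a freshly created (empty)
    # subtree is replaced by n, as if the missing leaf had been 0.
    def __add__(self, other):
        return other

# Every missing key creates another CountTree, so any nesting depth works.
def CountTree():
    return _defaultdict(CountTree)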

>>> t = CountTree()
>>> t['a']
defaultdict(<function CountTree at 0x9e5c3ac>, {})
>>> t['a']['b']['c'] += 1
>>> print(t['a']['b']['c'])
1



Answer 2:


Since you are counting things, you should use a Counter for the innermost dict:

from collections import Counter, defaultdict

x = defaultdict(lambda: defaultdict(Counter))

for a in A:
    for b in B:
        x[a][b].update(C)

Using a Counter will give you access to useful methods such as most_common.
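
As a small illustration of most_common on this kind of structure, here is a sketch that counts a few made-up (a, b, c) triples. The sample data and the names events and counts are assumptions for the example, not part of the original code.

```python
from collections import Counter, defaultdict

# Hypothetical sample data: (a, b, c) triples to be counted.
events = [
    ("x", "p", "red"),
    ("x", "p", "red"),
    ("x", "p", "blue"),
    ("y", "q", "red"),
]

counts = defaultdict(lambda: defaultdict(Counter))
for a, b, c in events:
    counts[a][b][c] += 1  # a Counter treats missing keys as 0

# most_common(1) returns the highest-count (key, count) pair in that bucket.
top = counts["x"]["p"].most_common(1)
print(top)  # [('red', 2)]
```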

Depending on what you intend to do with this dict, you may not need the deep nesting. Instead, you could use a tuple for the key. For example,

import collections
import itertools as IT

A = range(2)
B = 'XYZ'
C = 'abc'
x = collections.Counter(IT.product(A, B, C))
print(x)

yields

Counter({(0, 'X', 'c'): 1, (0, 'Z', 'a'): 1, (1, 'Z', 'a'): 1, (1, 'X', 'c'): 1, (1, 'Z', 'b'): 1, (0, 'X', 'b'): 1, (0, 'Y', 'a'): 1, (1, 'Y', 'a'): 1, (0, 'Z', 'c'): 1, (1, 'Z', 'c'): 1, (0, 'X', 'a'): 1, (0, 'Y', 'b'): 1, (1, 'X', 'a'): 1, (1, 'Y', 'b'): 1, (0, 'Z', 'b'): 1, (1, 'Y', 'c'): 1, (1, 'X', 'b'): 1, (0, 'Y', 'c'): 1})
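
When the counts are accumulated inside a loop rather than built from a ready-made iterable, the same tuple-key idea might look like the sketch below. The sample triples and the names flat and per_a are made up for illustration; the second loop shows one way to roll the flat counts up by the first key, which is the kind of query the nested layout would otherwise give you directly.

```python
from collections import Counter

# Hypothetical sample data: (a, b, c) triples seen in the loop.
events = [(0, "X", "a"), (0, "X", "a"), (0, "Y", "b"), (1, "X", "a")]

flat = Counter()
for a, b, c in events:
    flat[a, b, c] += 1  # the key is the tuple (a, b, c)

# Roll the flat counts up to totals per value of a.
per_a = Counter()
for (a, b, c), n in flat.items():
    per_a[a] += n

print(flat[0, "X", "a"])  # 2
print(per_a[0])           # 3
```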



Answer 3:


I'm assuming you're only adding to each counter when certain conditions are met, or possibly adding different values depending on the conditions? Otherwise surely the value of each counter is always going to be 1?

That said, the simplest solution I can think of is to create a single dict keyed on a tuple of the three loop values. For example:

dict(((a,b,c),1) for a in A for b in B for c in C)

But as I said, this is just going to give you 1 in each counter. You'll need to replace the 1 in the expression above with some condition or function call that returns something more appropriate depending on the values of a, b and c.
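
To make that concrete, here is a sketch with a made-up weight function standing in for "something more appropriate". The function weight and the sample values of A, B and C are hypothetical; substitute whatever rule your data actually needs.

```python
A = range(3)
B = "XY"
C = "ab"

def weight(a, b, c):
    # Hypothetical rule: only entries with c == "a" contribute, weighted by a.
    return a if c == "a" else 0

counts = {(a, b, c): weight(a, b, c) for a in A for b in B for c in C}

print(counts[2, "X", "a"])  # 2
print(counts[2, "X", "b"])  # 0
```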




Answer 4:


I had a similar need, and created the following:

import json

from collections import defaultdict


class NestedDefaultDict(defaultdict):
    def __init__(self, depth, default=int, _root=True):
        self.root = _root
        self.depth = depth
        if depth > 1:
            cur_default = lambda: NestedDefaultDict(depth - 1,
                                                    default,
                                                    False)
        else:
            cur_default = default
        defaultdict.__init__(self, cur_default)

    def __repr__(self):
        if self.root:
            return "NestedDefaultDict(%d): {%s}" % (self.depth,
                                                    defaultdict.__repr__(self))
        else:
            return defaultdict.__repr__(self)


# Quick Example
core_data_type = lambda: [0] * 10
test = NestedDefaultDict(3, core_data_type)
test['hello']['world']['example'][5] += 100
print(test)
print json.dumps(test)

# Code without custom class.
test = defaultdict(lambda: defaultdict(lambda: defaultdict(core_data_type)))
test['hello']['world']['example'][5] += 100
print(test)
print json.dumps(test)

I've also created a gist, in case I end up updating it: https://gist.github.com/KyleJamesWalker/8573350



Source: https://stackoverflow.com/questions/16384174/more-pythonic-way-of-counting-things-in-a-heavily-nested-defaultdict
