More Pythonic way of counting things in a heavily nested defaultdict

My code currently has to count items and store the counts in a heavily nested dict. The items need to be indexed by three values and then counted, so before my loop I initialize a nested defaultdict like so:

from collections import defaultdict

type_to_count_dic = defaultdict(
        lambda: defaultdict(
            lambda: defaultdict(int)
        )
    )

This allows me to count the items within a tight loop like so:

for a in ...:
    for b in ...:
        for c in ...:
            type_to_count_dic[a][b][c] += 1

Initializing all those defaultdicts feels a lot like making a type declaration in something like Java. Is there a more idiomatic/Pythonic way of doing this?

from collections import defaultdict

class _defaultdict(defaultdict):
    def __add__(self, other):
        return other

def CountTree():
    return _defaultdict(CountTree)

>>> t = CountTree()
>>> t['a']
defaultdict(<function CountTree at 0x9e5c3ac>, {})
>>> t['a']['b']['c'] += 1
>>> print(t['a']['b']['c'])
1
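
To make the mechanism explicit, here is a minimal sketch of the same idea written for Python 3. Looking up a missing key creates an empty node, and because `+=` falls back to `__add__`, that empty node behaves like zero and is replaced by the number being added. The `to_plain` helper is hypothetical and only there to make the result easy to inspect.

```python
from collections import defaultdict


class _defaultdict(defaultdict):
    def __add__(self, other):
        # An empty node acts like zero: node + n evaluates to n.
        return other


def CountTree():
    # Every missing key produces another CountTree node.
    return _defaultdict(CountTree)


def to_plain(node):
    """Recursively convert a CountTree into ordinary dicts (hypothetical helper)."""
    if isinstance(node, dict):
        return {key: to_plain(value) for key, value in node.items()}
    return node


t = CountTree()
t['a']['b']['c'] += 1  # the leaf starts as an empty node and is replaced by 1
t['a']['b']['c'] += 1  # from here on this is ordinary int addition
t['a']['b']['d'] += 5

plain = to_plain(t)
print(plain)
```

One caveat of this approach: merely reading a leaf that was never incremented returns (and stores) an empty node rather than 0.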

Since you are counting things, you should use a Counter for the inner-most dict:

from collections import Counter, defaultdict

x = defaultdict(lambda: defaultdict(Counter))

for a in A:
    for b in B:
        x[a][b].update(C)

Using a Counter will give you access to useful methods such as most_common.
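
As a rough illustration of what that buys you, the sketch below counts a handful of made-up records (the `records` list is an assumption, not data from the question) and then asks the innermost Counter for its most frequent entry.

```python
from collections import Counter, defaultdict

# Two levels of defaultdict, with a Counter as the innermost mapping.
x = defaultdict(lambda: defaultdict(Counter))

# Hypothetical sample data: (a, b, c) triples to be counted.
records = [
    ('fruit', 'red', 'apple'),
    ('fruit', 'red', 'apple'),
    ('fruit', 'red', 'cherry'),
    ('fruit', 'green', 'apple'),
    ('veg', 'green', 'kale'),
]

for a, b, c in records:
    x[a][b][c] += 1

# most_common(n) returns the n highest counts as (key, count) pairs.
top_red_fruit = x['fruit']['red'].most_common(1)
print(top_red_fruit)
```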

Depending on what you intend to do with this dict, you may not need the deep nesting. Instead, you could use a tuple for the key. For example,

import collections
import itertools as IT

A = range(2)
B = 'XYZ'
C = 'abc'
x = collections.Counter(IT.product(A, B, C))
print(x)

yields

Counter({(0, 'X', 'c'): 1, (0, 'Z', 'a'): 1, (1, 'Z', 'a'): 1, (1, 'X', 'c'): 1, (1, 'Z', 'b'): 1, (0, 'X', 'b'): 1, (0, 'Y', 'a'): 1, (1, 'Y', 'a'): 1, (0, 'Z', 'c'): 1, (1, 'Z', 'c'): 1, (0, 'X', 'a'): 1, (0, 'Y', 'b'): 1, (1, 'X', 'a'): 1, (1, 'Y', 'b'): 1, (0, 'Z', 'b'): 1, (1, 'Y', 'c'): 1, (1, 'X', 'b'): 1, (0, 'Y', 'c'): 1})
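
The tuple-key approach also works when counting inside a loop, which is closer to the original question. The following is a small sketch under the assumption that the items arrive as (a, b, c) triples; the `records` data is made up for illustration.

```python
from collections import Counter

# Hypothetical sample data: (a, b, c) triples to be counted.
records = [
    ('fruit', 'red', 'apple'),
    ('fruit', 'red', 'apple'),
    ('fruit', 'red', 'cherry'),
    ('fruit', 'green', 'apple'),
    ('veg', 'green', 'kale'),
]

counts = Counter()
for a, b, c in records:
    counts[a, b, c] += 1  # counts[a, b, c] is the same as counts[(a, b, c)]

# A missing key reads as 0 and is not inserted into the Counter.
missing = counts['veg', 'red', 'beet']

# Aggregating over one level means filtering on part of the key.
fruit_total = sum(n for (a, b, c), n in counts.items() if a == 'fruit')
print(counts['fruit', 'red', 'apple'], missing, fruit_total)
```

The trade-off is that per-level lookups such as "everything under 'fruit'" need a scan over the keys, whereas the nested dict gives them directly.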

I'm assuming you're only adding to each counter when certain conditions are met, or possibly adding different values depending on the conditions? Otherwise surely the value of each counter is always going to be 1?

That said, the simplest solution I can think of is to create a single dict keyed on a tuple of the three loop values. For example, something like this:

dict(((a,b,c),1) for a in A for b in B for c in C)

But as I said, this is just going to give you 1 in each counter. You'll need to replace the 1 in the expression above with some condition or function call that returns something more appropriate depending on the values of a, b and c.
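
As a sketch of that replacement, the expression below calls a function instead of using a constant 1. The `score` function and its rule are purely hypothetical; substitute whatever logic fits your data.

```python
A = range(2)
B = 'XY'
C = 'ab'


def score(a, b, c):
    # Hypothetical rule: start from a and add 1 when c is 'a'.
    return a + (1 if c == 'a' else 0)


# Same shape as the dict(...) expression, written as a dict comprehension.
table = {(a, b, c): score(a, b, c) for a in A for b in B for c in C}
print(table[1, 'X', 'a'], table[0, 'Y', 'b'])
```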

I had a similar need, and created the following:

import json

from collections import defaultdict


class NestedDefaultDict(defaultdict):
    def __init__(self, depth, default=int, _root=True):
        self.root = _root
        self.depth = depth
        if depth > 1:
            cur_default = lambda: NestedDefaultDict(depth - 1,
                                                    default,
                                                    False)
        else:
            cur_default = default
        defaultdict.__init__(self, cur_default)

    def __repr__(self):
        if self.root:
            return "NestedDefaultDict(%d): {%s}" % (self.depth,
                                                    defaultdict.__repr__(self))
        else:
            return defaultdict.__repr__(self)


# Quick example: each leaf is a list of ten zeros.
core_data_type = lambda: [0] * 10
test = NestedDefaultDict(3, core_data_type)
test['hello']['world']['example'][5] += 100
print(test)
print(json.dumps(test))

# Equivalent code without the custom class.
test = defaultdict(lambda: defaultdict(lambda: defaultdict(core_data_type)))
test['hello']['world']['example'][5] += 100
print(test)
print(json.dumps(test))

I've also put this in a gist, in case I end up updating it: https://gist.github.com/KyleJamesWalker/8573350
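
If the custom `__repr__` is not needed, the same depth-limited nesting can be sketched with a plain recursive factory function instead of a subclass. The name `nested_defaultdict` is an assumption for illustration and is not part of the gist.

```python
import json
from collections import defaultdict


def nested_defaultdict(depth, default=int):
    """Return a defaultdict nested `depth` levels deep (hypothetical helper).

    The innermost level uses `default` as its factory.
    """
    if depth <= 1:
        return defaultdict(default)
    # Each missing key at this level creates a dict that is one level shallower.
    return defaultdict(lambda: nested_defaultdict(depth - 1, default))


counts = nested_defaultdict(3)
counts['a']['b']['c'] += 1
counts['a']['b']['c'] += 1

# defaultdict is a dict subclass, so json.dumps accepts it directly.
encoded = json.dumps(counts)
print(encoded)
```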

Source: https://stackoverflow.com/questions/16384174/more-pythonic-way-of-counting-things-in-a-heavily-nested-defaultdict

Tags

python

defaultdict