More Pythonic way of counting things in a heavily nested defaultdict

孤者浪人 提交于 2019-12-05 07:27:42
from collections import defaultdict

class _defaultdict(defaultdict):
    def __add__(self, other):
        return other

def CountTree():
    return _defaultdict(CountTree)

>>> t = CountTree()
>>> t['a']
defaultdict(<function CountTree at 0x9e5c3ac>, {})
>>> t['a']['b']['c'] += 1
>>> print t['a']['b']['c']
1

Since you are counting things, you should use a Counter for the inner-most dict:

import collections
defaultdict = collections.defaultdict
Counter = collections.Counter

x = defaultdict(lambda: defaultdict(Counter))

for a in A:
    for b in B:
        x[a][b].update(C)

Using a Counter will give you access to useful methods such as most_common.

Depending on what you intend to do with this dict, you may not need the deep nesting. Instead, you could use a tuple for the key. For example,

import collections
import itertools as IT

A = range(2)
B = 'XYZ'
C = 'abc'
x = collections.Counter(IT.product(A, B, C))
print(x)

yields

A = range(2)
B = 'XYZ'
C = 'abc'
x = collections.Counter(IT.product(A, B, C))
print(x)

yields

Counter({(0, 'X', 'c'): 1, (0, 'Z', 'a'): 1, (1, 'Z', 'a'): 1, (1, 'X', 'c'): 1, (1, 'Z', 'b'): 1, (0, 'X', 'b'): 1, (0, 'Y', 'a'): 1, (1, 'Y', 'a'): 1, (0, 'Z', 'c'): 1, (1, 'Z', 'c'): 1, (0, 'X', 'a'): 1, (0, 'Y', 'b'): 1, (1, 'X', 'a'): 1, (1, 'Y', 'b'): 1, (0, 'Z', 'b'): 1, (1, 'Y', 'c'): 1, (1, 'X', 'b'): 1, (0, 'Y', 'c'): 1})

I'm assuming you're only adding to each counter when certain conditions are met, or possibly adding different values depending on the conditions? Otherwise surely the value of each counter is always going to be 1?

That said, the simplest solution I can think of is to just create a single dict keyed on a tuple of the three loop values. For example something like this:

dict(((a,b,c),1) for a in A for b in B for c in C)

But as I said, this is just going to give you 1 in each counter. You'll need to replace the 1 in the expression above with some condition or function call that returns something more appropriate depending on the values of a, b and c.

I had a similar need, and created the following:

import json

from collections import defaultdict


class NestedDefaultDict(defaultdict):
    def __init__(self, depth, default=int, _root=True):
        self.root = _root
        self.depth = depth
        if depth > 1:
            cur_default = lambda: NestedDefaultDict(depth - 1,
                                                    default,
                                                    False)
        else:
            cur_default = default
        defaultdict.__init__(self, cur_default)

    def __repr__(self):
        if self.root:
            return "NestedDefaultDict(%d): {%s}" % (self.depth,
                                                    defaultdict.__repr__(self))
        else:
            return defaultdict.__repr__(self)


# Quick Example
core_data_type = lambda: [0] * 10
test = NestedDefaultDict(3, core_data_type)
test['hello']['world']['example'][5] += 100
print test
print json.dumps(test)

# Code without custom class.
test = defaultdict(lambda: defaultdict(lambda: defaultdict(core_data_type)))
test['hello']['world']['example'][5] += 100
print test
print json.dumps(test)

If I end up updating it I've also created a gist: https://gist.github.com/KyleJamesWalker/8573350

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!