My code currently has to count things into a heavily nested dict. I have items that need to be indexed by 3 values and then counted. So, before my loop, I initialize a nested defaultdict like so:

    from collections import defaultdict

    type_to_count_dic = defaultdict(
        lambda: defaultdict(
            lambda: defaultdict(int)
        )
    )

This allows me to count the items within a tight loop like so:

    for a in ...:
        for b in ...:
            for c in ...:
                type_to_count_dic[a][b][c] += 1

Initializing all those defaultdicts feels a lot like making a type declaration in something like Java. Is there a more idiomatic/Pythonic way of doing something like this?

    from collections import defaultdict

    class _defaultdict(defaultdict):
        # Adding a number to a (still empty) node replaces the node with that number.
        def __add__(self, other):
            return other

    def CountTree():
        return _defaultdict(CountTree)

    >>> t = CountTree()
    >>> t['a']
    defaultdict(<function CountTree at 0x9e5c3ac>, {})
    >>> t['a']['b']['c'] += 1
    >>> print(t['a']['b']['c'])
    1

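One thing worth knowing about the `__add__` trick above: a leaf only becomes an int after its first `+=`, so reading a key that was never incremented gives back an empty tree rather than 0. A minimal sketch of that behaviour, reusing the `CountTree` definition from this answer (the keys are just sample values I made up):

```python
from collections import defaultdict


class _defaultdict(defaultdict):
    def __add__(self, other):
        # An untouched node behaves like 0 under "+": the result is `other`.
        return other


def CountTree():
    return _defaultdict(CountTree)


t = CountTree()
t['a']['b']['c'] += 1   # the node at 'c' is replaced by the int 1
t['a']['b']['c'] += 1   # ordinary int addition from here on

leaf = t['a']['b']['c']      # an int
missing = t['a']['b']['z']   # never incremented: an empty _defaultdict, not 0

print(leaf)                       # 2
print(isinstance(missing, dict))  # True
```

So if you read counts back for keys that may never have been seen, you probably want to check the type, or use one of the `int`/`Counter`-based approaches instead.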
Since you are counting things, you should use a Counter for the inner-most dict:

    from collections import Counter, defaultdict

    # A, B and C stand for the iterables being looped over.
    x = defaultdict(lambda: defaultdict(Counter))
    for a in A:
        for b in B:
            x[a][b].update(C)

Using a Counter will give you access to useful methods such as most_common.
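As a rough illustration of what the `Counter` leaves buy you, here is a small sketch. The `samples` list and its values are invented for the example; the structure is the same `defaultdict(lambda: defaultdict(Counter))` as above.

```python
from collections import Counter, defaultdict

x = defaultdict(lambda: defaultdict(Counter))

# Hypothetical (a, b, c) observations, only for illustration.
samples = [
    ("fruit", "red", "apple"),
    ("fruit", "red", "apple"),
    ("fruit", "red", "cherry"),
    ("fruit", "yellow", "banana"),
]

for a, b, c in samples:
    x[a][b][c] += 1   # a missing Counter key starts at 0

# The innermost level is a Counter, so most_common is available there.
top = x["fruit"]["red"].most_common(1)
print(top)   # [('apple', 2)]
```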
Depending on what you intend to do with this dict, you may not need the deep nesting. Instead, you could use a tuple for the key. For example,

    import collections
    import itertools as IT

    A = range(2)
    B = 'XYZ'
    C = 'abc'
    x = collections.Counter(IT.product(A, B, C))
    print(x)

yields

    Counter({(0, 'X', 'c'): 1, (0, 'Z', 'a'): 1, (1, 'Z', 'a'): 1, (1, 'X', 'c'): 1, (1, 'Z', 'b'): 1, (0, 'X', 'b'): 1, (0, 'Y', 'a'): 1, (1, 'Y', 'a'): 1, (0, 'Z', 'c'): 1, (1, 'Z', 'c'): 1, (0, 'X', 'a'): 1, (0, 'Y', 'b'): 1, (1, 'X', 'a'): 1, (1, 'Y', 'b'): 1, (0, 'Z', 'b'): 1, (1, 'Y', 'c'): 1, (1, 'X', 'b'): 1, (0, 'Y', 'c'): 1})

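If the counting happens inside a loop rather than over a ready-made product, the tuple-keyed `Counter` can be incremented directly. This is only a sketch: the `observations` list is made-up data standing in for whatever the real loop produces.

```python
from collections import Counter

x = Counter()

# Hypothetical stream of (a, b, c) triples; the repeats are what get counted.
observations = [(0, 'X', 'a'), (0, 'X', 'a'), (1, 'Z', 'b')]

for a, b, c in observations:
    x[a, b, c] += 1   # x[a, b, c] is the same as x[(a, b, c)]

# Without nesting, a "slice" over one index is a filter over the keys.
total_a0 = sum(n for (a, _, _), n in x.items() if a == 0)

print(x[0, 'X', 'a'])   # 2
print(total_a0)         # 2
```

The trade-off is that looking up everything under a given `a` means scanning the keys, which the nested version gives you for free.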
I'm assuming you're only adding to each counter when certain conditions are met, or possibly adding different values depending on the conditions? Otherwise surely the value of each counter is always going to be 1?
That said, the simplest solution I can think of is to just create a single dict keyed on a tuple of the three loop values. For example something like this:

    dict(((a, b, c), 1) for a in A for b in B for c in C)

But as I said, this is just going to give you 1 in each counter. You'll need to replace the 1 in the expression above with some condition or function call that returns something more appropriate depending on the values of a, b and c.
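To make that concrete, here is a sketch with the constant 1 swapped for a function call. The `weight` function and its rule are entirely hypothetical, and `A`, `B`, `C` are sample iterables; substitute whatever fits your data.

```python
A = range(2)
B = 'XYZ'
C = 'abc'


def weight(a, b, c):
    # Hypothetical rule: ignore a == 0, and count 'X' entries double.
    if a != 1:
        return 0
    return 2 if b == 'X' else 1


# Same shape as the expression above, with weight(a, b, c) in place of 1.
counts = {(a, b, c): weight(a, b, c) for a in A for b in B for c in C}

print(counts[1, 'X', 'a'])   # 2
print(counts[1, 'Y', 'b'])   # 1
print(counts[0, 'X', 'a'])   # 0
```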
I had a similar need, and created the following:

    import json
    from collections import defaultdict


    class NestedDefaultDict(defaultdict):
        def __init__(self, depth, default=int, _root=True):
            self.root = _root
            self.depth = depth
            if depth > 1:
                # Inner levels are NestedDefaultDicts that are one level shallower.
                cur_default = lambda: NestedDefaultDict(depth - 1,
                                                        default,
                                                        False)
            else:
                cur_default = default
            defaultdict.__init__(self, cur_default)

        def __repr__(self):
            if self.root:
                return "NestedDefaultDict(%d): {%s}" % (self.depth,
                                                        defaultdict.__repr__(self))
            else:
                return defaultdict.__repr__(self)


    # Quick Example
    core_data_type = lambda: [0] * 10
    test = NestedDefaultDict(3, core_data_type)
    test['hello']['world']['example'][5] += 100
    print(test)
    print(json.dumps(test))

    # Code without custom class.
    test = defaultdict(lambda: defaultdict(lambda: defaultdict(core_data_type)))
    test['hello']['world']['example'][5] += 100
    print(test)
    print(json.dumps(test))

I've also created a gist, in case I end up updating it: https://gist.github.com/KyleJamesWalker/8573350
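If the custom `__repr__` isn't needed, the same depth-parameterised idea can probably be had with a small recursive factory instead of a class. This is a sketch, not the gist's code: `nested_defaultdict` is a name I made up, and the triples are sample data.

```python
import json
from collections import defaultdict
from functools import partial


def nested_defaultdict(depth, default=int):
    """Hypothetical helper: a defaultdict nested `depth` levels deep."""
    if depth <= 1:
        return defaultdict(default)
    # Each missing key creates a defaultdict that is one level shallower.
    return defaultdict(partial(nested_defaultdict, depth - 1, default))


counts = nested_defaultdict(3)
for a, b, c in [("x", "y", "z"), ("x", "y", "z"), ("x", "y", "w")]:
    counts[a][b][c] += 1

# defaultdict is a dict subclass, so json.dumps handles it directly.
as_json = json.dumps(counts, sort_keys=True)
print(as_json)   # {"x": {"y": {"w": 1, "z": 2}}}
```

Using `functools.partial` rather than a lambda is a design choice here: as far as I know, a `defaultdict` whose factory is a lambda cannot be pickled, whereas a partial of a module-level function can.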
Source: https://stackoverflow.com/questions/16384174/more-pythonic-way-of-counting-things-in-a-heavily-nested-defaultdict