One-step initialization of defaultdict that appends to list?

It would be convenient if a defaultdict could be initialized along the following lines

d = defaultdict(list, (('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 2),
   ('b', 3)))

to produce

defaultdict(<type 'list'>, {'a': [1, 2], 'c': [3], 'b': [2, 3], 'd': [4]})

Instead, I get

defaultdict(<type 'list'>, {'a': 2, 'c': 3, 'b': 3, 'd': 4})

To get what I need, I end up having to do this:

d = defaultdict(list)
for x, y in (('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 2), ('b', 3)):
    d[x].append(y)

IMO this is one step more than should be necessary. Am I missing something here?

The behavior you describe would not be consistent with defaultdict's other behaviors. It seems like what you want is a FooDict such that

>>> f = FooDict()
>>> f['a'] = 1
>>> f['a'] = 2
>>> f['a']
[1, 2]

We can do that, but not with defaultdict; let's call it AppendDict:

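import collections
try:
    from collections.abc import MutableMapping  # Python 3.3+
except ImportError:
    from collections import MutableMapping  # Python 2

class AppendDict(MutableMapping):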
    def __init__(self, container=list, append=None, pairs=()):
        self.container = collections.defaultdict(container)
        self.append = append or list.append
        for key, value in pairs:
            self[key] = value

    def __setitem__(self, key, value):
        self.append(self.container[key], value)

    def __getitem__(self, key): return self.container[key]
    def __delitem__(self, key): del self.container[key]
    def __iter__(self): return iter(self.container)
    def __len__(self): return len(self.container)
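As a rough usage sketch of the class above (assuming Python 3, where MutableMapping is imported from collections.abc, and reusing the sample pairs from the question), the pairs argument gives the one-step initialization that was asked for, and passing set with set.add collects unique values instead:

```python
from collections import defaultdict
from collections.abc import MutableMapping


class AppendDict(MutableMapping):
    """Mapping whose item assignment appends to a per-key container."""

    def __init__(self, container=list, append=None, pairs=()):
        self.container = defaultdict(container)
        self.append = append or list.append
        for key, value in pairs:
            self[key] = value  # goes through __setitem__, so it appends

    def __setitem__(self, key, value):
        self.append(self.container[key], value)

    def __getitem__(self, key):
        return self.container[key]

    def __delitem__(self, key):
        del self.container[key]

    def __iter__(self):
        return iter(self.container)

    def __len__(self):
        return len(self.container)


pairs = [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 2), ('b', 3)]

as_lists = AppendDict(pairs=pairs)
print(dict(as_lists))  # {'a': [1, 2], 'b': [2, 3], 'c': [3], 'd': [4]}

# Same class, but collecting unique values into a set per key.
as_sets = AppendDict(container=set, append=set.add, pairs=pairs + [('a', 1)])
print(as_sets['a'])    # {1, 2}
```

One thing to be aware of: because __getitem__ delegates to a defaultdict, looking up a missing key creates an empty container, so the inherited "in" test reports True for any key.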

What you're apparently missing is that defaultdict is a straightforward (not especially "magical") subclass of dict. All the first argument does is provide a factory function for missing keys. When you initialize a defaultdict, you're initializing a dict.

If you want to produce

defaultdict(<type 'list'>, {'a': [1, 2], 'c': [3], 'b': [2, 3], 'd': [4]})

you should be initializing it the way you would initialize any other dict whose values are lists:

d = defaultdict(list, (('a', [1, 2]), ('b', [2, 3]), ('c', [3]), ('d', [4])))

If your initial data has to be in the form of tuples whose 2nd element is always an integer, then just go with the for loop. You call it one extra step; I call it the clear and obvious way to do it.
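To illustrate the point about the constructor (a minimal sketch, assuming Python 3 and reusing the question's sample data): everything after the factory is handed to dict as-is, so duplicate keys simply overwrite, and the factory only runs when a missing key is looked up.

```python
from collections import defaultdict

pairs = [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 2), ('b', 3)]

d = defaultdict(list, pairs)

# The initial data is stored exactly as dict(pairs) would store it:
# the last value seen for each key wins, and nothing is wrapped in a list.
print(dict(d) == dict(pairs))  # True
print(d['a'])                  # 2

# The factory is only consulted for keys that are missing at lookup time.
print(d['z'])                  # []
print(isinstance(d, dict))     # True
```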

Sorting and itertools.groupby go a long way:

>>> import itertools
>>> from collections import defaultdict
>>> L = [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 2), ('b', 3)]
>>> L.sort(key=lambda t:t[0])
>>> d = defaultdict(list, [(tup[0], [t[1] for t in tup[1]]) for tup in itertools.groupby(L, key=lambda t: t[0])])
>>> d
defaultdict(<type 'list'>, {'a': [1, 2], 'c': [3], 'b': [2, 3], 'd': [4]})

To make this more of a one-liner:

import itertools, operator
from collections import defaultdict
L = [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 2), ('b', 3)]
d = defaultdict(list, [(key, [t[1] for t in group]) for key, group in itertools.groupby(sorted(L, key=operator.itemgetter(0)), key=operator.itemgetter(0))])

Hope this helps
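One detail worth spelling out about the groupby approach above: groupby only merges consecutive items with equal keys, which is why the data has to be sorted by key first. A small sketch (assuming Python 3; the variable names are illustrative):

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter

pairs = [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 2), ('b', 3)]
by_key = itemgetter(0)

# Without sorting, 'a' and 'b' each show up as two separate groups.
unsorted_keys = [key for key, _ in groupby(pairs, key=by_key)]
print(unsorted_keys)  # ['a', 'b', 'c', 'd', 'a', 'b']

# sorted() is stable, so values keep their original relative order per key.
grouped = defaultdict(
    list,
    ((key, [value for _, value in group])
     for key, group in groupby(sorted(pairs, key=by_key), key=by_key)),
)
print(dict(grouped))  # {'a': [1, 2], 'b': [2, 3], 'c': [3], 'd': [4]}
```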

I think most of this is a lot of smoke and mirrors to avoid a simple for loop:

di={}
for k,v in [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 2),('b', 3)]:
    di.setdefault(k,[]).append(v)
# di={'a': [1, 2], 'c': [3], 'b': [2, 3], 'd': [4]}
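For comparison, a small sketch (assuming Python 3; group_pairs is a hypothetical helper name, not something from the answers) that wraps the setdefault loop in a function. The result is a plain dict, so unlike a defaultdict it does not create entries when a missing key is looked up:

```python
def group_pairs(pairs):
    """Collect the values of (key, value) pairs into a list per key."""
    grouped = {}
    for key, value in pairs:
        # setdefault returns the existing list, or stores and returns a new one.
        grouped.setdefault(key, []).append(value)
    return grouped


di = group_pairs([('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 2), ('b', 3)])
print(di)           # {'a': [1, 2], 'b': [2, 3], 'c': [3], 'd': [4]}
print('z' in di)    # False
print(di.get('z'))  # None, and 'z' is still not added
```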

If your goal is one line and you want abusive syntax that I cannot at all endorse or support, you can use a comprehension purely for its side effects:

>>> li=[('a', 1), ('b', 2), ('c', 3), ('d', 4), ('a', 2),('b', 3)]
>>> di={};{di.setdefault(k[0],[]).append(k[1]) for k in li}
set([None])
>>> di
{'a': [1, 2], 'c': [3], 'b': [2, 3], 'd': [4]}

If you really want to go overboard into the unreadable (this relies on Python 2, where filter returns a list):

>>> {k1:[e for _,e in v1] for k1,v1 in {k:filter(lambda x: x[0]==k,li) for k,v in li}.items()}
{'a': [1, 2], 'c': [3], 'b': [2, 3], 'd': [4]}

You don't want to do that. Use the for loop, Luke!

>>> kvs = [(1,2), (2,3), (1,3)]
>>> reduce(
...   lambda d,(k,v): d[k].append(v) or d,
...   kvs,
...   defaultdict(list))
defaultdict(<type 'list'>, {1: [2, 3], 2: [3]})
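The reduce call above is written for Python 2: Python 3 removed tuple parameters in lambdas and moved reduce into functools. A possible Python 3 equivalent, as a sketch (add_pair is a hypothetical helper name):

```python
from collections import defaultdict
from functools import reduce


def add_pair(d, kv):
    """Append the value of one (key, value) pair and return the accumulator."""
    k, v = kv
    d[k].append(v)
    return d


kvs = [(1, 2), (2, 3), (1, 3)]
d = reduce(add_pair, kvs, defaultdict(list))
print(dict(d))  # {1: [2, 3], 2: [3]}
```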

Source: https://stackoverflow.com/questions/18520825/one-step-initialization-of-defaultdict-that-appends-to-list

Tags: python, defaultdict