I\'ve read the examples in python docs, but still can\'t figure out what this method means. Can somebody help? Here are two examples from the python docs
>
The defaultdict tool is a container in the collections class of Python. It's similar to the usual dictionary (dict) container, but it has one difference: The value fields' data type is specified upon initialization.
For example:
from collections import defaultdict
d = defaultdict(list)
d['python'].append("awesome")
d['something-else'].append("not relevant")
d['python'].append("language")
for i in d.items():
print i
This prints:
('python', ['awesome', 'language'])
('something-else', ['not relevant'])
In short:
defaultdict(int)
- the argument int indicates that the values will be int type.
defaultdict(list)
- the argument list indicates that the values will be list type.
The behavior of defaultdict
can be easily mimicked using dict.setdefault
instead of d[key]
in every call.
In other words, the code:
from collections import defaultdict
d = defaultdict(list)
print(d['key']) # empty list []
d['key'].append(1) # adding constant 1 to the list
print(d['key']) # list containing the constant [1]
is equivalent to:
d = dict()
print(d.setdefault('key', list())) # empty list []
d.setdefault('key', list()).append(1) # adding constant 1 to the list
print(d.setdefault('key', list())) # list containing the constant [1]
The only difference is that, using defaultdict
, the list constructor is called only once, and using dict.setdefault
the list constructor is called more often (but the code may be rewriten to avoid this, if really needed).
Some may argue there is a performance consideration, but this topic is a minefield. This post shows there isn't a big performance gain in using defaultdict, for example.
IMO, defaultdict is a collection that adds more confusion than benefits to the code. Useless for me, but others may think different.
Without defaultdict
, you can probably assign new values to unseen keys but you cannot modify it. For example:
import collections
d = collections.defaultdict(int)
for i in range(10):
d[i] += i
print(d)
# Output: defaultdict(<class 'int'>, {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9})
import collections
d = {}
for i in range(10):
d[i] += i
print(d)
# Output: Traceback (most recent call last): File "python", line 4, in <module> KeyError: 0
The documentation and the explanation are pretty much self-explanatory:
http://docs.python.org/library/collections.html#collections.defaultdict
The type function(int/str etc.) passed as an argument is used to initialize a default value for any given key where the key is not present in the dict.
The standard dictionary includes the method setdefault() for retrieving a value and establishing a default if the value does not exist. By contrast, defaultdict lets the caller specify the default up front when the container is initialized.
import collections
def default_factory():
return 'default value'
d = collections.defaultdict(default_factory, foo='bar')
print 'd:', d
print 'foo =>', d['foo']
print 'bar =>', d['bar']
This works well as long as it is appropriate for all keys to have the same default. It can be especially useful if the default is a type used for aggregating or accumulating values, such as a list, set, or even int. The standard library documentation includes several examples of using defaultdict this way.
$ python collections_defaultdict.py
d: defaultdict(<function default_factory at 0x100468c80>, {'foo': 'bar'})
foo => bar
bar => default value