How does collections.defaultdict work?

离开以前 2020-11-22 12:50

I've read the examples in the Python docs, but I still can't figure out what this method means. Can somebody help? Here are two examples from the Python docs.


  • 2020-11-22 12:56

    Note that defaultdict can still raise a KeyError in the following case:

        from collections import defaultdict
        d = defaultdict()  # no default factory given, so default_factory is None
        print(d[3])  # raises KeyError

    Always remember to pass a default factory to defaultdict, for example defaultdict(int).
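A minimal sketch of why this happens, assuming only the standard library: when no factory is given, default_factory is None and the defaultdict treats missing keys the way a plain dict does. The variable names are illustrative.

```python
from collections import defaultdict

no_factory = defaultdict()        # default_factory is None
with_factory = defaultdict(int)   # default_factory is int

try:
    no_factory[3]
    raised = False
except KeyError:
    raised = True

value = with_factory[3]           # int() is called, so the stored value is 0
print(no_factory.default_factory, raised, value)
```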

  • 2020-11-22 12:57

    I think it's best used in place of a switch-case statement. Imagine we have a switch-case statement like the one below:

    option = 1
    switch(option) {
        case 1: print '1st option'
        case 2: print '2nd option'
        case 3: print '3rd option'
        default: return 'No such option'
    }

    Python has no switch-case statement (the match statement only arrived in Python 3.10). We can achieve something similar with defaultdict.

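    from collections import defaultdict
    def default_value(): return "Default Value"
    dd = defaultdict(default_value)
    dd[1] = '1st option'
    dd[2] = '2nd option'
    dd[3] = '3rd option'
    print(dd[4])  # missing key, falls back to default_value()
    print(dd[5])  # missing key, falls back to default_value()
    print(dd[3])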

    It prints:

    Default Value
    Default Value
    3rd option

    In the snippet above, dd has no keys 4 or 5, so it prints the default value configured in the helper function. This is nicer than a raw dictionary, where a KeyError is raised if the key is not present. In this sense defaultdict behaves like a switch-case statement and lets us avoid complicated if-elif-elif-else blocks.
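One detail worth keeping in mind with this pattern: looking up a missing key on a defaultdict also stores that key. The sketch below shows the effect and, as an alternative that is not part of the answer above, a plain dict with .get(), which returns a fallback without storing anything. The names options and fallback are illustrative.

```python
from collections import defaultdict

dd = defaultdict(lambda: "Default Value")
dd[1] = "1st option"

before = len(dd)   # only key 1 is present
result = dd[4]     # missing key: the factory runs and key 4 is stored
after = len(dd)    # key 4 has been added

# A plain dict with .get() returns a fallback without inserting the key.
options = {1: "1st option"}
fallback = options.get(4, "No such option")
print(before, after, result, fallback)
```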

    One more good example from this site that impressed me a lot:

    >>> from collections import defaultdict
    >>> food_list = 'spam spam spam spam spam spam eggs spam'.split()
    >>> food_count = defaultdict(int) # default value of int is 0
    >>> for food in food_list:
    ...     food_count[food] += 1 # increment element's value by 1
    ...
    >>> food_count
    defaultdict(<type 'int'>, {'eggs': 1, 'spam': 7})

    If we try to access any item other than eggs and spam, we get a count of 0.

  • 2020-11-22 12:58


    "The standard dictionary includes the method setdefault() for retrieving a value and establishing a default if the value does not exist. By contrast, defaultdict lets the caller specify the default(value to be returned) up front when the container is initialized."

    as defined by Doug Hellmann in The Python Standard Library by Example
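To make the contrast in the quote concrete, here is a small side-by-side sketch. It groups the same pairs once with dict.setdefault() and once with defaultdict(list); the sample data and variable names are illustrative.

```python
from collections import defaultdict

pairs = [("yellow", 1), ("blue", 2), ("yellow", 3)]

# Plain dict: the default is supplied at every call site.
plain = {}
for color, n in pairs:
    plain.setdefault(color, []).append(n)

# defaultdict: the default factory is supplied once, at construction.
grouped = defaultdict(list)
for color, n in pairs:
    grouped[color].append(n)

print(plain)
print(dict(grouped))
```

Both loops build the same mapping; the difference is only where the default is specified.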

    How to use defaultdict

    Import defaultdict

    >>> from collections import defaultdict

    Initialize defaultdict

    Initialize it by passing

    a callable as its first argument (optional; if omitted, default_factory is None and missing keys raise KeyError)

    >>> d_int = defaultdict(int)
    >>> d_list = defaultdict(list)
    >>> def foo():
    ...     return 'default value'
    >>> d_foo = defaultdict(foo)
    >>> d_int
    defaultdict(<type 'int'>, {})
    >>> d_list
    defaultdict(<type 'list'>, {})
    >>> d_foo
    defaultdict(<function foo at 0x7f34a0a69578>, {})

    **kwargs after the first argument (optional); these become the initial contents

    >>> d_int = defaultdict(int, a=10, b=12, c=13)
    >>> d_int
    defaultdict(<type 'int'>, {'a': 10, 'c': 13, 'b': 12})


    >>> kwargs = {'a':10,'b':12,'c':13}
    >>> d_int = defaultdict(int, **kwargs)
    >>> d_int
    defaultdict(<type 'int'>, {'a': 10, 'c': 13, 'b': 12})

    How does it work

    As it is a subclass of the standard dictionary, it supports all the same operations.

    But when an unknown key is looked up, it returns the default value instead of raising an error, and stores that value under the key. For example:

    >>> d_int['a']
    10
    >>> d_int['d']
    0
    >>> d_int
    defaultdict(<type 'int'>, {'a': 10, 'c': 13, 'b': 12, 'd': 0})

    If you want to change the default value, assign a new callable to default_factory:

    >>> d_int.default_factory = lambda: 1
    >>> d_int['e']
    1
    >>> d_int
    defaultdict(<function <lambda> at 0x7f34a0a91578>, {'a': 10, 'c': 13, 'b': 12, 'e': 1, 'd': 0})


    >>> def foo():
    ...     return 2
    >>> d_int.default_factory = foo
    >>> d_int['f']
    2
    >>> d_int
    defaultdict(<function foo at 0x7f34a0a0a140>, {'a': 10, 'c': 13, 'b': 12, 'e': 1, 'd': 0, 'f': 2})

    Examples in the Question

    Example 1

    As int has been passed as default_factory, any unknown key will return 0 by default.

    As the loop walks through the string, it increments the count for each letter in d.

    >>> s = 'mississippi'
    >>> d = defaultdict(int)
    >>> d.default_factory
    <type 'int'>
    >>> for k in s:
    ...     d[k] += 1
    >>> d.items()
    [('i', 4), ('p', 2), ('s', 4), ('m', 1)]
    >>> d
    defaultdict(<type 'int'>, {'i': 4, 'p': 2, 's': 4, 'm': 1})

    Example 2

    As list has been passed as default_factory, any unknown (non-existent) key will return [] (i.e. a new empty list) by default.

    As the loop walks through the list of tuples, it appends each value to d[color].

    >>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
    >>> d = defaultdict(list)
    >>> d.default_factory
    <type 'list'>
    >>> for k, v in s:
    ...     d[k].append(v)
    >>> d.items()
    [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
    >>> d
    defaultdict(<type 'list'>, {'blue': [2, 4], 'red': [1], 'yellow': [1, 3]})
  • 2020-11-22 13:00

    Usually, a Python dictionary throws a KeyError if you try to get an item with a key that is not currently in the dictionary. The defaultdict in contrast will simply create any items that you try to access (provided of course they do not exist yet). To create such a "default" item, it calls the function object that you pass to the constructor (more precisely, it's an arbitrary "callable" object, which includes function and type objects). For the first example, default items are created using int(), which will return the integer object 0. For the second example, default items are created using list(), which returns a new empty list object.
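The behavior described above can be observed with a factory that records its own calls. This is a minimal sketch; make_list and calls are hypothetical names used only for illustration.

```python
from collections import defaultdict

calls = []

def make_list():
    # Hypothetical factory that records each time it is invoked.
    calls.append("called")
    return []

d = defaultdict(make_list)
d["a"].append(1)   # "a" is missing: make_list() runs and its result is stored
d["a"].append(2)   # "a" exists now: the factory is not called again
d["b"]             # "b" is missing: a second, separate list is created

print(len(calls), dict(d))
```

The factory runs once per missing key, and each key gets its own list object.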

  • 2020-11-22 13:01

    Dictionaries are a convenient way to store data for later retrieval by name (key). Keys must be unique, hashable (typically immutable) objects, and are often strings. The values in a dictionary can be anything. For many applications, the values are simple types such as integers and strings.

    It gets more interesting when the values in a dictionary are collections (lists, dicts, etc.). In this case, the value (an empty list or dict) must be initialized the first time a given key is used. While this is relatively easy to do manually, the defaultdict type automates and simplifies these kinds of operations. A defaultdict works exactly like a normal dict, but it is initialized with a function ("default factory") that takes no arguments and provides the default value for a nonexistent key.

    As long as a default factory is set, looking up d[key] on a defaultdict does not raise a KeyError. Any key that does not exist gets the value returned by the default factory.
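A short sketch of where the default factory does and does not apply. It is only triggered by d[key] lookups; membership tests and .get() behave as on a plain dict. The key "Joe" is a hypothetical name chosen for illustration.

```python
from collections import defaultdict

ice_cream = defaultdict(lambda: "Vanilla")
ice_cream["Sarah"] = "Chunky Monkey"

has_joe = "Joe" in ice_cream     # membership test does not call the factory
via_get = ice_cream.get("Joe")   # .get() does not call the factory either
size_before = len(ice_cream)

via_index = ice_cream["Joe"]     # indexing calls the factory and stores "Joe"
size_after = len(ice_cream)

print(has_joe, via_get, via_index, size_before, size_after)
```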

    from collections import defaultdict
    ice_cream = defaultdict(lambda: 'Vanilla')
    ice_cream['Sarah'] = 'Chunky Monkey'
    ice_cream['Abdul'] = 'Butter Pecan'
    print(ice_cream['Sarah'])
    >>>Chunky Monkey

    Here is another example of how using defaultdict can reduce time complexity:

    from collections import defaultdict

    # Time complexity O(n^2): ans.count() rescans the list for every element.
    def delete_nth_naive(array, n):
        ans = []
        for num in array:
            if ans.count(num) < n:
                ans.append(num)
        return ans

    # Time complexity O(n), using a hash table of counts.
    def delete_nth(array, n):
        result = []
        counts = defaultdict(int)
        for i in array:
            if counts[i] < n:
                result.append(i)
                counts[i] += 1
        return result

    x = [1, 2, 3, 1, 2, 1, 2, 3]
    print(delete_nth(x, n=2))        # [1, 2, 3, 1, 2, 3]
    print(delete_nth_naive(x, n=2))  # [1, 2, 3, 1, 2, 3]

    In conclusion, whenever you need a dictionary, and each element’s value should start with a default value, use a defaultdict.

  • 2020-11-22 13:02

    Since the question is about "how it works", some readers may want to see more nuts and bolts. Specifically, the method in question is the __missing__(key) method.

    More concretely, this answer shows how to make use of __missing__(key) in a practical way:
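As a rough sketch of that mechanism (an illustration, not the actual CPython implementation), a dict subclass can define __missing__ to get defaultdict-like behavior. SimpleDefaultDict is a hypothetical name.

```python
class SimpleDefaultDict(dict):
    """Illustrative stand-in, not the real collections.defaultdict."""

    def __init__(self, default_factory=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        # dict's item lookup calls this hook when the key is absent.
        if self.default_factory is None:
            raise KeyError(key)
        value = self.default_factory()
        self[key] = value   # store the new default so later lookups find it
        return value


counts = SimpleDefaultDict(int)
for ch in "mississippi":
    counts[ch] += 1
print(counts)
```

Because the hook is only consulted by d[key], methods such as .get() and the in operator never reach it.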

    To clarify what 'callable' means, here's an interactive session (from 2.7.6 but should work in v3 too):

    >>> x = int
    >>> x
    <type 'int'>
    >>> y = int(5)
    >>> y
    5
    >>> z = x(5)
    >>> z
    5
    >>> from collections import defaultdict
    >>> dd = defaultdict(int)
    >>> dd
    defaultdict(<type 'int'>, {})
    >>> dd = defaultdict(x)
    >>> dd
    defaultdict(<type 'int'>, {})
    >>> dd['a']
    0
    >>> dd
    defaultdict(<type 'int'>, {'a': 0})

    That was the most typical use of defaultdict (except for the pointless use of the x variable). You can get the same result with 0 as an explicit default value, but not by passing the plain value 0 directly:

    >>> dd2 = defaultdict(0)
    Traceback (most recent call last):
      File "<pyshell#7>", line 1, in <module>
        dd2 = defaultdict(0)
    TypeError: first argument must be callable
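A quick way to check in advance whether an object can serve as a default factory is the built-in callable(). This sketch assumes Python 3 and does not depend on the exact wording of the error message.

```python
from collections import defaultdict

print(callable(int))         # True: int() returns 0
print(callable(lambda: 1))   # True
print(callable(0))           # False: 0 cannot be called

try:
    defaultdict(0)
    error_message = None
except TypeError as exc:
    error_message = str(exc)  # the constructor rejects non-callable factories
```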

    Instead, the following works because it passes in a simple function (it creates, on the fly, a nameless function that takes no arguments and always returns 0):

    >>> dd2 = defaultdict(lambda: 0)
    >>> dd2
    defaultdict(<function <lambda> at 0x02C4C130>, {})
    >>> dd2['a']
    0
    >>> dd2
    defaultdict(<function <lambda> at 0x02C4C130>, {'a': 0})

    And with a different default value:

    >>> dd3 = defaultdict(lambda: 1)
    >>> dd3
    defaultdict(<function <lambda> at 0x02C4C170>, {})
    >>> dd3['a']
    1
    >>> dd3
    defaultdict(<function <lambda> at 0x02C4C170>, {'a': 1})