What should itertools.product() yield when supplied an empty list?

小鲜肉 2021-01-04 08:28

I guess it's an academic question, but the second result does not make sense to me. Shouldn't it be as thoroughly empty as the first? What is the rationale for this behavior?

2 answers
  • 2021-01-04 09:02

    As @sth already indicated, this behaviour is correct from a mathematical viewpoint. All you really need to convince yourself of is that list(itertools.product()) should have exactly one element, since once you know that it's clear what that element should be: it's got to be (for consistency) a tuple of length 0, and there's only one of those.

    But the number of elements of itertools.product(l1, l2, l3, ...) should just be the product of the lengths of l1, l2, l3, ... . So the number of elements of itertools.product() should be the size of the empty product, and there's no shortage of internet sources that should persuade you that the empty product is 1.
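    The counting argument above can be checked directly. The sketch below assumes Python 3.8 or later (for math.prod); count_product and the sample pools are hypothetical names introduced only for this illustration.

```python
import itertools
import math

def count_product(*pools):
    """Count the tuples that itertools.product yields for the given pools."""
    return sum(1 for _ in itertools.product(*pools))

# Hypothetical sample pools of sizes 2, 3 and 2.
pools = ["ab", [1, 2, 3], (True, False)]
expected = math.prod(len(p) for p in pools)   # 2 * 3 * 2

print(count_product(*pools), expected)        # 12 12
print(count_product(), math.prod([]))         # 1 1  (empty product)
print(list(itertools.product()))              # [()]  no pools: one empty tuple
print(list(itertools.product([])))            # []    one empty pool: 0 results
```

    Note the difference between the last two calls: no pools at all is the empty product (size 1), while a single empty pool is a product with a factor of 0.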

    I just wanted to point out that this is the correct practical definition as well as the correct mathematical one; that is, it's the definition that's most likely to 'just work' in boundary cases. For an example, suppose that you want to generate all strings of length n consisting of decimal digits, with the first digit nonzero. You might do something like:

    import itertools
    
    def decimal_strings(n):
        """Generate all digit strings of length n that don't start with 0."""
        for lead_digit in '123456789':
            for tail in itertools.product('0123456789', repeat=n-1):
                yield lead_digit + ''.join(tail)
    

    What should this produce when n = 1? Well, in that case, you end up calling itertools.product with an empty product (repeat = 0). If it returned nothing, then the body of the inner for loop above would never be executed, so decimal_strings(1) would be an empty iterator; almost certainly not what you want. But since itertools.product('0123456789', repeat=0) returns a single tuple, you get the expected result:

    >>> list(decimal_strings(1))
    ['1', '2', '3', '4', '5', '6', '7', '8', '9']
    

    (When n = 0, of course, this function correctly raises a ValueError.)
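    As a rough check of the other cases, the sketch below repeats decimal_strings and exercises n = 2 and n = 0. Because the function is a generator, the ValueError for n = 0 only surfaces once the generator is iterated, not when decimal_strings(0) is called.

```python
import itertools

def decimal_strings(n):
    """Generate all digit strings of length n that don't start with 0."""
    for lead_digit in '123456789':
        for tail in itertools.product('0123456789', repeat=n - 1):
            yield lead_digit + ''.join(tail)

two_digit = list(decimal_strings(2))
print(len(two_digit), two_digit[0], two_digit[-1])   # 90 10 99

gen = decimal_strings(0)          # no error yet: the generator body has not run
try:
    list(gen)                     # repeat=-1 is rejected on first iteration
except ValueError as exc:
    print("ValueError:", exc)
```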

    So in short, the definition is mathematically sound, and more often than not it's also what you want. It's definitely not a Python bug!

  • 2021-01-04 09:12

    From a mathematical point of view the product over no elements should yield the neutral element of the operation product, whatever that is.

    For example, on integers the neutral element of multiplication is 1, since 1 ⋅ a = a for all integers a. So an empty product of integers should be 1. When implementing a Python function that returns the product of a list of numbers, this happens naturally:

    def iproduct(lst):
      result = 1
      for i in lst:
        result *= i
      return result
    

    For the correct result to be calculated with this algorithm, result needs to be initialized with 1. This leads to a return value of 1 when the function is called on an empty list.

    This return value is also very reasonable for the purpose of the function. With a good product function it shouldn't matter whether you first concatenate two lists and then take the product of the elements, or first take the product of each list separately and then multiply the results:

    iproduct(xs + ys) == iproduct(xs) * iproduct(ys)
    

    If xs or ys is empty that only works if iproduct([]) == 1.
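    A quick way to see this is to test the identity on a few inputs, including empty ones. The cases list below is made up for illustration.

```python
def iproduct(lst):
    """Multiply all numbers in lst; the empty list gives 1."""
    result = 1
    for i in lst:
        result *= i
    return result

# Hypothetical test pairs; the last three include at least one empty list.
cases = [([2, 3], [4]), ([], [5, 6]), ([7], []), ([], [])]
for xs, ys in cases:
    assert iproduct(xs + ys) == iproduct(xs) * iproduct(ys)

print(iproduct([]))  # 1
```

    If result were initialized to 0 instead, every product would be 0; any other starting value would break the identity as soon as one of the lists is empty.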

    Now the more complicated product() on iterators. Here also, from a mathematical point of view, the empty product product(*[]) (that is, product() with no arguments) should return the neutral element of that operation, whatever that is. It is not [], since product([], xs) == [], whereas a neutral element e would have to satisfy product(e, xs) == xs. It turns out, though, that [()] also isn't a neutral element:

    >>> list(product([()], [1,2,3]))
    [((), 1), ((), 2), ((), 3)]
    

    In fact, product() is not really a very nice mathematical product at all, since the analogue of the equation above doesn't hold:

    product(*(xs + ys)) != product(product(*xs), product(*ys))
    

    Each application of product generates an additional layer of tuples and there is no way around that, so there can't even be a real neutral element. [()] comes pretty close, though: it doesn't add or remove any elements, it just adds an empty tuple to each.
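    The extra tuple layer is easy to see on a small input. In the sketch below, xs and ys are hypothetical lists of pools, each holding a single pool.

```python
from itertools import product

# Hypothetical inputs: each is a list containing one pool.
xs = [[1, 2]]
ys = [[3, 4]]

flat = list(product(*(xs + ys)))
nested = list(product(product(*xs), product(*ys)))

print(flat)    # [(1, 3), (1, 4), (2, 3), (2, 4)]
print(nested)  # [((1,), (3,)), ((1,), (4,)), ((2,), (3,)), ((2,), (4,))]
```

    Both results have four entries in the same order, but the nested version wraps each component in its own tuple, so the two are not equal.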

    [()] would in fact be the neutral element of this slightly adapted product function, which only operates on lists of tuples but doesn't add an additional tuple layer on each application:

    from itertools import product

    def tproduct(*xss):
      # the parameters have to be lists of tuples
      # sum(rs, ()) concatenates the tuples in rs into one flat tuple
      return (sum(rs, ()) for rs in product(*xss))
    

    For this function the above product equation holds:

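    def tup(x): return (x,)
    # xs and ys are lists of lists; wrap every element in a 1-tuple
    txs = [[tup(e) for e in x] for x in xs]
    tys = [[tup(e) for e in y] for y in ys]
    # tproduct() returns a generator, so materialize both sides before comparing
    list(tproduct(*(txs + tys))) == list(tproduct(tproduct(*txs), tproduct(*tys)))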
    

    With the additional preprocessing step of packing the elements of the input lists into tuples, tproduct() gives the same result as product(), but behaves more nicely from a mathematical point of view. Also, its neutral element is [()].

    So [()] makes some sense as the neutral element of this kind of list multiplication. Even if it doesn't exactly fit product(), it is a good choice for this function since, for example, it allows tproduct() to be defined without a special case for empty input.
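    A minimal sketch of that neutral-element behaviour, reusing the tproduct() definition from above; ts is a hypothetical list of 1-tuples chosen for illustration.

```python
from itertools import product

def tproduct(*xss):
    # Each argument must be an iterable of tuples; sum(rs, ()) concatenates
    # the tuples picked from each argument into one flat tuple.
    return (sum(rs, ()) for rs in product(*xss))

ts = [(1,), (2,), (3,)]

print(list(tproduct()))            # [()]  empty input needs no special case
print(list(tproduct([()], ts)))    # [(1,), (2,), (3,)]
print(list(tproduct(ts, [()])))    # [(1,), (2,), (3,)]
```

    Multiplying by [()] on either side leaves ts unchanged, and calling tproduct() with no arguments returns exactly that neutral element.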
