The documentation says that the cartesian product function
the actual implementation does not build up intermediate results in memory.
How can
Well, it also says:
The nested loops cycle like an odometer with the rightmost element advancing on every iteration. This pattern creates a lexicographic ordering so that if the input’s iterables are sorted, the product tuples are emitted in sorted order.
This is pretty much how it works in the implementation (Modules/itertoolsmodule.c
)
Here is the state object:
typedef struct {
PyObject_HEAD
PyObject *pools; /* tuple of pool tuples */
Py_ssize_t *indices; /* one index per pool */
PyObject *result; /* most recently returned result tuple */
int stopped; /* set to 1 when the product iterator is exhausted */
} productobject;
And the next item is returned by the function product_next
, which uses this state and the algorithm described in the quote to generate the next state. See this answer to understand the memory requirements.
For general education, you can read about how to create generators with state from C extensions here.
Looking at the module's source code, itertools.product()
actually converts every argument to a tuple:
// product_new() in itertoolsmodule.c
for (i=0; i < nargs ; ++i) {
PyObject *item = PyTuple_GET_ITEM(args, i);
PyObject *pool = PySequence_Tuple(item); //<==== Call tuple(arg)
if (pool == NULL)
goto error;
PyTuple_SET_ITEM(pools, i, pool);
indices[i] = 0;
}
In other words, itertools.product()
's memory consumption appears to be linear in the size of the input arguments.