Python's itertools product memory consumption

后端未结


The documentation says that the cartesian product function

the actual implementation does not build up intermediate results in memory.

How can


2 Answers

傲寒

2021-02-19 04:57
Well, it also says:

The nested loops cycle like an odometer with the rightmost element advancing on every iteration. This pattern creates a lexicographic ordering so that if the input’s iterables are sorted, the product tuples are emitted in sorted order.

This is pretty much how it works in the implementation (Modules/itertoolsmodule.c).

Here is the state object:
```
typedef struct {
    PyObject_HEAD
    PyObject *pools;       /* tuple of pool tuples */
    Py_ssize_t *indices;   /* one index per pool */
    PyObject *result;      /* most recently returned result tuple */
    int stopped;           /* set to 1 when the product iterator is exhausted */
} productobject;
```
And the next item is returned by the function product_next, which uses this state and the algorithm described in the quote to generate the next state. See this answer to understand the memory requirements.
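To make the odometer idea concrete, here is a rough pure-Python sketch of the same scheme. It is not the CPython code: `product_sketch` is a made-up name, and the only state it keeps mirrors the `pools` and `indices` fields of the struct above. Nothing is stored beyond the input pools and one index per pool.

```python
import itertools

def product_sketch(*iterables):
    """Pure-Python sketch of the odometer scheme (hypothetical helper, not CPython's code)."""
    pools = [tuple(it) for it in iterables]   # each input is materialized once
    if any(len(pool) == 0 for pool in pools):
        return                                # an empty pool means no results at all
    indices = [0] * len(pools)                # one index per pool
    while True:
        yield tuple(pool[i] for pool, i in zip(pools, indices))
        # Advance like an odometer: bump the rightmost index, carrying to the left.
        for pos in reversed(range(len(pools))):
            indices[pos] += 1
            if indices[pos] < len(pools[pos]):
                break
            indices[pos] = 0                  # this wheel rolled over, carry on
        else:
            return                            # every wheel rolled over: exhausted

# The sketch should agree with the real function on a small input.
same = list(product_sketch("ab", [1, 2, 3])) == list(itertools.product("ab", [1, 2, 3]))
```

The extra memory is one integer per pool plus the tuple currently being yielded, regardless of how many tuples the product contains.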

For general education, you can read about how to create generators with state from C extensions here.

情深已故

2021-02-19 05:01

Looking at the module's source code, itertools.product() actually converts every argument to a tuple:

```
// product_new() in itertoolsmodule.c
for (i=0; i < nargs ; ++i) {
    PyObject *item = PyTuple_GET_ITEM(args, i);
    PyObject *pool = PySequence_Tuple(item); //<==== Call tuple(arg)
    if (pool == NULL)
        goto error;
    PyTuple_SET_ITEM(pools, i, pool);
    indices[i] = 0;
}
```

In other words, itertools.product()'s memory consumption appears to be linear in the combined length of the input arguments, not in the number of tuples it produces.
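A quick way to see this eager conversion from Python, assuming CPython's behavior as shown in the excerpt above: pass in a generator that records what has been pulled from it, and check the record before asking for the first result. The names `noisy` and `seen` are invented for this sketch.

```python
import itertools

def noisy(n, seen):
    """Yield 0..n-1, recording each value as it is pulled from the generator."""
    for i in range(n):
        seen.append(i)
        yield i

seen = []
prod = itertools.product(noisy(3, seen), "xy")

# Snapshot taken before iterating: the generator has already been drained,
# because product() copied it into a tuple when it was constructed.
consumed_before_first_next = list(seen)

first = next(prod)
```

One consequence is that an infinite iterator such as `itertools.count()` cannot be passed to `product()`, since the conversion to a tuple would never finish.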
