Performance of the itertools library compared to pure Python code


迷失自我 2021-01-13 14:49

In answer to my question "Find the 1 based position to which two lists are the same", I got the hint to use itertools, which is implemented in C, to speed things up.

To verify I code

2 answers

野的像风 (original poster)

2021-01-13 15:07
I imagine the issue here is that your test lists are tiny. Any difference is likely to be minimal, and the cost of creating the iterators outweighs the gains they give.

In larger tests (where the performance is more likely to matter), the version using sum() will likely outperform the other version.

Also, there is the matter of style - the manual version is longer, and relies on iterating by index, making it less flexible as well.
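To illustrate the flexibility point, here is a minimal sketch. Both functions are hypothetical stand-ins written for this example (they are not the code from the question): match_by_index needs len() and indexing, so it only accepts sequences, while match_zip accepts any iterable, including a generator.

```python
def match_by_index(seq, other):
    """Hypothetical index-based version: requires len() and seq[i]."""
    count = 0
    for i in range(min(len(seq), len(other))):
        if seq[i] != other[i]:
            break
        count += 1
    return count


def match_zip(seq, other):
    """Hypothetical zip-based version: works on any pair of iterables."""
    count = 0
    for this, that in zip(seq, other):
        if this != that:
            break
        count += 1
    return count


def squares():
    """A generator: it has no length and cannot be indexed."""
    for n in range(5):
        yield n * n


# The zip-based version consumes the generator lazily.
print(match_zip(squares(), [0, 1, 4, 10]))

# The index-based version fails because generators do not support len().
try:
    match_by_index(squares(), [0, 1, 4, 10])
except TypeError as exc:
    print("index-based version failed:", exc)
```

On plain lists the two functions return the same count; the difference only shows up with inputs that are not sequences.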

I would argue the most readable solution would be something like this:
```
def while_equal(seq, other):
    for this, that in zip(seq, other):
        if this != that:
            return
        yield this

def match(seq, other):
    return sum(1 for _ in while_equal(seq, other))
```
Interestingly, on my system a slightly modified version of this:
```
def while_equal(seq, other):
    for this, that in zip(seq, other):
        if this != that:
            return
        yield 1

def match(seq, other):
    return sum(while_equal(seq, other))
```
performs better than the pure loop version:
```
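import timeit

# Tiny test lists: equal for the first five elements.
a = [0, 1, 2, 3, 4]
b = [0, 1, 2, 3, 4, 0]

# match_loop is the pure loop version being compared against; match is the sum() version above.
# timeit runs each statement 1,000,000 times by default and returns the total time in seconds.
print(timeit.timeit('match_loop(a, b)', 'from __main__ import a, b, match_loop'))
print(timeit.timeit('match(a, b)', 'from __main__ import match, a, b'))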
```
Giving (total seconds for one million calls, match_loop first):
```
1.3171300539979711
1.291257290984504
```
That said, if we improve the pure loop version to be more Pythonic:
```
def match_loop(seq, other):
    count = 0
    for this, that in zip(seq, other):
        if this != that:
            return count
        count += 1
    return count
```
Timed with the same method as above, this takes 0.8548871780512854 seconds on my system, significantly faster than the other methods while still being readable. The difference is probably due to the original version looping by index, which is generally slow in Python. I would, however, still go for the first version in this post, as I feel it is the most readable.
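To check the earlier point about larger inputs, a rough benchmark along the following lines could be used. It is only a sketch under stated assumptions: match_index is a hypothetical stand-in for the index-based loop (the question's code is not shown here), and match_takewhile is one possible itertools-based variant, not necessarily the one that was hinted at. The list size and repeat count are arbitrary, and timings will vary by machine and Python version.

```python
import timeit
from itertools import takewhile


def match_index(seq, other):
    """Hypothetical index-based loop (assumed, not the question's code)."""
    count = 0
    for i in range(min(len(seq), len(other))):
        if seq[i] != other[i]:
            break
        count += 1
    return count


def match_zip_loop(seq, other):
    """Plain loop over zip(), as in the improved loop version above."""
    count = 0
    for this, that in zip(seq, other):
        if this != that:
            return count
        count += 1
    return count


def while_equal(seq, other):
    """Yield 1 for each leading pair of equal elements."""
    for this, that in zip(seq, other):
        if this != that:
            return
        yield 1


def match_sum(seq, other):
    """Generator plus sum(), as in the modified version above."""
    return sum(while_equal(seq, other))


def match_takewhile(seq, other):
    """One possible itertools variant (an assumption for illustration)."""
    pairs = takewhile(lambda pair: pair[0] == pair[1], zip(seq, other))
    return sum(1 for _ in pairs)


# Larger inputs than the five-element lists: equal for the first n elements.
n = 10_000
a = list(range(n))
b = list(range(n)) + [0]

for func in (match_index, match_zip_loop, match_sum, match_takewhile):
    assert func(a, b) == n  # all variants must agree before timing them
    seconds = timeit.timeit(lambda: func(a, b), number=50)
    print(f"{func.__name__:16} {seconds:.4f} s")
```

Passing a callable to timeit avoids the string setup used earlier, at the cost of a small constant call overhead that is the same for every variant.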