Performance Advantages to Iterators?

后端 未结 9 1753
梦毁少年i
梦毁少年i 2020-11-28 10:45

What (if any) performance advantages are offered by using iterators. It seems like the \'Right Way\' to solve many problems, but does it create faster/more memory-conscious

相关标签:
9条回答
  • 2020-11-28 11:04

    An iterator is simply an object that provides methods to allow traversing through a collection. You could traverse all of the elements of an array or all the nodes of a tree with the same interface. Trees and arrays are very different data structures and require different methods to traverse .. but with an iterator you can loop through all elements in the same way.

    For one type of collection, there may also be different ways to traverse it and a single collection could have multiple iterators .. You could have a depth-first iterator or a breadth-first iterator traversing a tree structure and returning the nodes in different orders. Iterators are not intended for performance ... but typically for providing a consistent interface for traversing structures.

    0 讨论(0)
  • 2020-11-28 11:10

    There's actually a very good mail on the python mailing list about this: Iterators vs Lists. It's a bit dated (from 2003), but as far as I know, it's still valid.

    Here's the summary:

    For small datasets, iterator and list based approaches have similar performance. For larger datasets, iterators save both time and space.

    What I would draw from it is this: iterators are to be preferred over loading data into a list if possible. But unless you have a big dataset, don't contort your code to make something that should fit in a list to work with an iterator.

    0 讨论(0)
  • 2020-11-28 11:14

    My inference from many answers above is "Use list to code. If necessary, re-factor using iterators" The difference is not apparent unless you have a large dataset.

    Another thing to note is that, even when using lists often, the dataset we are operating upon is progressively smaller and smaller.

    0 讨论(0)
  • 2020-11-28 11:17

    There is one answer that I think confuses the concept of generator and iterator a little bit. So I decided to give a try answering this question with a metaphor example.

    I'm working at a kitchen, my boss give me a task of adding up the weight of 10 (or 100 or a million) breads. I have a scale and a calculator( magic tricks of my algorithmn). Below are the iterable object, generator, iterator, approach difference:

    1. Iterable object: Each bread is stored in one box(memory), I weigh the first (or the 0th) bread, put down its weight, and put the bread back to the box, then go to the next one, weigh it and put it back, on and on, etc, etc. In the end, I got the overall weight, and the 10 (100 or million) breads are still there in their boxes.

    2. Generator: There are not enough boxes to store all these bread, So I asked for the help of a baker(the generator), he makes the first bread, give it to me, I weigh it, put the result down, throw that bread away and ask him for another one,on and on, etc, until I got the last bread (or maybe the baker runs out of flour). In the end, I have the result, none of the bread is there. But who cares, my boss only asks me to weigh these breads, he didn't say I cannot throw them away ( what a brilliant busboy).

    3. Iterator: I ask someone(iterator) to help me move first bread onto the scale, I weigh it, put the result down. This someone would go grab the next one for measuring, on and on, etc. I actually have no idea if someone (iterator) get the bread from a box or from a baker. Eventually, I got the overall weight, it doesn't matter to me.

    Anyway, to sum up:

    1. Iterable object need some memory to store data to start with.In the end, data is still there.

    2. Generator wouldn't need memory to store data to start with, it generates data on the go.

    3. Iterator is a channel between algorithm and its data. This data may already been there and stored in memory or may be generated on the go by a generator. In the first case, that memory would be freed bit by bit as iterator keeps iterating. So I agree a lot with above answer that iterator is good because of its abstraction that enables isolation of algorithm and data.

    python doesn't exactly work like this. Hope it help clarify a little bit.

    0 讨论(0)
  • 2020-11-28 11:17

    Slightly off-topic but adds more weight to usage of lists over iterators in general: with iterators it's easier to have side effects, consider this:

    def foo(arg: Iterable[str]):
      print(list(arg)) # side effect: arg is exhausted at this point
      ...
    

    You can say testing should catch this but sometimes it doesn't. Lists do not have this problem since they're stateless (in the sense of iteration).

    0 讨论(0)
  • 2020-11-28 11:21

    Iterators will be faster and have better memory efficiency. Just think of an example of range(1000) vs xrange(1000). (This has been changed in 3.0, range is now an iterator.) With range you pre-build your list, but xrange is an iterator and yields the next item when needed instead.

    The performance difference isn't great on small things, but as soon as you start cranking them out getting larger and larger sets of information you'll notice it quite quickly. Also, not just having to generate and then step through, you will be consuming extra memory for your pre-built item whereas with the iterator, only 1 item at a time gets made.

    0 讨论(0)
提交回复
热议问题