Despite reading up on it, I still don't quite understand how __iter__ works. What would be a simple explanation?
I've seen def __iter__(self): retu
An iterator needs to define two methods: __iter__() and __next__() (next() in Python 2). Usually, the object itself defines the __next__() or next() method, so __iter__() just returns the object itself as the iterator. This creates an iterable that is also its own iterator. These methods are used by for and in statements.
Python 3 docs: docs.python.org/3/library/stdtypes.html#iterator-types
Python 2 docs: docs.python.org/2/library/stdtypes.html#iterator-types
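To make the two methods concrete, here is a minimal sketch using the Python 3 spelling. The class name Countdown and its attribute are made up for illustration; they are not from the docs.

```python
class Countdown:
    """Hypothetical iterator that counts down from `start` to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # The object is its own iterator, so it returns itself.
        return self

    def __next__(self):
        # Raise StopIteration once there is nothing left to hand out.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    next = __next__  # alias so the same class also works under Python 2


result = list(Countdown(3))    # list() drives the iterator to exhaustion
has_two = 2 in Countdown(3)    # the "in" operator uses the same protocol
```

Note that a Countdown instance can only be walked once: after it is exhausted, iterating it again yields nothing.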
A class that supports the __iter__ method returns an iterator object from it: an object that supports the next() method (__next__() in Python 3). That object can then be used in "for" and "in" statements.
In Python, an iterator is any object that supports the iterator protocol. Part of that protocol is that the object must have an __iter__() method that returns the iterator object. I suppose this gives you some flexibility, so that an object can pass on the iterator responsibilities to an internal class, or create some special object. In any case, the __iter__() method usually has only one line, and that line is often simply return self.
The other part of the protocol is the next() method (__next__() in Python 3), and this is where the real work is done. This method has to figure out or create or get the next thing, and return it. It may need to keep track of where it is so that the next time it is called, it really does return the next thing.
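As a rough illustration of passing the iterator responsibilities to another class, the sketch below splits the work in two. Playlist and PlaylistIterator are hypothetical names, and the code uses the Python 3 method name.

```python
class Playlist:
    """Hypothetical container: iterable, but not itself an iterator."""

    def __init__(self, songs):
        self.songs = list(songs)

    def __iter__(self):
        # Hand the bookkeeping to a separate object, fresh on every call.
        return PlaylistIterator(self.songs)


class PlaylistIterator:
    """Does the real work: remembers the position between calls."""

    def __init__(self, songs):
        self.songs = songs
        self.position = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.position >= len(self.songs):
            raise StopIteration
        song = self.songs[self.position]
        self.position += 1  # so the next call returns the next song
        return song


playlist = Playlist(["intro", "verse", "outro"])
first_pass = [song for song in playlist]
second_pass = [song for song in playlist]  # works again: new iterator each time
```

One consequence of this split is that the container can be looped over as many times as you like, because each loop gets its own position counter.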
Once you have an object that returns the next thing in a sequence, you can collapse a for loop that looks like this:
myname = "Fredericus"
x = []
for i in [1,2,3,4,5,6,7,8,9,10]:
    x.append(myname[i-1])
    i = i + 1 # get the next i
print(x)
into this:
myname = "Fredericus"
x = [myname[i] for i in range(10)]
print(x)
Notice that nowhere do we have code that gets the next value of i, because range(10) is an object that FOLLOWS the iterator protocol, and the list comprehension is a construct that USES the iterator protocol.
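For what it's worth, here is roughly what the comprehension does behind the scenes, written out by hand with the builtins iter() and next(). This is a sketch of the idea, not the interpreter's actual implementation.

```python
myname = "Fredericus"

x = []
iterator = iter(range(10))      # asks range(10) for an iterator
while True:
    try:
        i = next(iterator)      # asks the iterator for the next value
    except StopIteration:
        break                   # raised when the iterator is exhausted
    x.append(myname[i])

# The hand-written loop builds the same list as the comprehension.
same = (x == [myname[i] for i in range(10)])
```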
You can also USE the iterator protocol directly. For instance, when writing scripts to process CSV files, I often write this:
import csv
mydata = csv.reader(open('stuff.csv'))
next(mydata)  # skip the header row; same as mydata.next() in Python 2
for row in mydata:
    pass  # do something with the row.
I am using the iterator directly by calling next() to skip the header row, then using it indirectly via the builtin in operator in the for statement.
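If you want to try the header-skipping trick without a file on disk, a self-contained sketch might look like this. The CSV contents are made up, and io.StringIO stands in for the opened file.

```python
import csv
import io

# In-memory stand-in for a real file; the data is invented for the example.
fake_file = io.StringIO("name,age\nAda,36\nAlan,41\n")

reader = csv.reader(fake_file)
header = next(reader)             # direct use: consume the header row
rows = [row for row in reader]    # indirect use: continues after the header
```

The loop picks up where next() left off because both are pulling from the same iterator.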
As simply as I can put it:
__iter__ defines a method on a class which will return an iterator (an object that successively yields the next item contained by your object).
The iterator object that __iter__() returns can be pretty much any object, as long as it defines a next() method (__next__() in Python 3).
The next method will be called by statements like for ... in ... to yield the next item, and next() should raise the StopIteration exception when there are no more items.
What's great about this is that it lets you define how your object is iterated, and __iter__ provides a common interface that every other Python function knows how to work with.
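A small sketch of that "common interface" point: once a class defines __iter__, builtins such as list(), sorted() and str.join() can consume it. The Team class is hypothetical, and writing __iter__ as a generator function is just one convenient way to get an iterator back.

```python
class Team:
    """Hypothetical class that decides for itself how it is iterated."""

    def __init__(self, lead, members):
        self.lead = lead
        self.members = members

    def __iter__(self):
        # A generator function: calling it returns a generator object,
        # which already has __next__ and raises StopIteration at the end.
        yield self.lead
        for member in self.members:
            yield member


team = Team("Grace", ["Linus", "Guido"])
as_list = list(team)
joined = ", ".join(team)
in_order = sorted(team)
```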
The specs for def __iter__(self): are: it returns an iterator. So, if self is an iterator, return self is clearly appropriate.
"Being an iterator" means "having a __next__(self) method" (in Python 3; in Python 2, the name of the method in question is unfortunately plain next instead, clearly a name design glitch for a special method).
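One way to check the distinction in Python 3 is with the abstract base classes in collections.abc. This is a quick sketch; the variable names are only for illustration.

```python
from collections.abc import Iterable, Iterator

numbers = [1, 2, 3]
it = iter(numbers)

list_is_iterable = isinstance(numbers, Iterable)   # a list has __iter__
list_is_iterator = isinstance(numbers, Iterator)   # but it has no __next__
it_is_iterator = isinstance(it, Iterator)          # the list iterator has both
returns_self = iter(it) is it                      # its __iter__ returns self
```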
In Python 2.6 and higher, the best way to implement an iterator is generally to use the appropriate abstract base class from the collections standard library module. In Python 2.6, the code might be (in Python 3, call the method __next__ instead and import the base class from collections.abc):
import collections
class infinite23s(collections.Iterator):
    def next(self): return 23
An instance of this class will return infinitely many copies of 23 when iterated on (like itertools.repeat(23)), so the loop must be terminated otherwise. The point is that subclassing collections.Iterator adds the right __iter__ method on your behalf: not a big deal here, but a good general principle (avoid repetitive, boilerplate code like iterators' standard one-line __iter__; in repetition, there's no added value and a lot of subtracted value!).
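For Python 3, the same idea might look like the sketch below. The class is renamed Infinite23s here, and itertools.islice is just one possible way to cut the infinite stream short.

```python
from collections.abc import Iterator
from itertools import islice


class Infinite23s(Iterator):
    # Only __next__ is written by hand; __iter__ comes from the base class.
    def __next__(self):
        return 23


source = Infinite23s()
inherited_iter = iter(source) is source   # the inherited __iter__ returns self
first_three = list(islice(source, 3))     # take three items, then stop
```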