I'm trying to learn Python, and I started to play with some code:
a = [3,4,5,6,7]
for b in a:
    print(a)
    a.pop(0)
kjaquier and Felix have talked about the iterator protocol, and we can see it in action in your case:
>>> L = [1, 2, 3]
>>> iterator = iter(L)
>>> iterator
<list_iterator object at 0x101231f28>
>>> next(iterator)
1
>>> L.pop()
3
>>> L
[1, 2]
>>> next(iterator)
2
>>> next(iterator)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
StopIteration
From this we can infer that list_iterator.__next__ has code that behaves something like:
if self.i < len(self.list):
    item = self.list[self.i]
    self.i += 1
    return item
raise StopIteration
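To make that inference concrete, here is a minimal pure-Python sketch of such an iterator. The class name ListIteratorSketch and its attribute names are hypothetical; the real iterator is written in C and, unlike this sketch, also drops its reference to the list once it is exhausted.

```python
class ListIteratorSketch:
    """Hypothetical pure-Python stand-in for CPython's list iterator."""

    def __init__(self, lst):
        self.lst = lst  # a reference to the list, not a copy
        self.i = 0      # index of the next item to hand out

    def __iter__(self):
        return self

    def __next__(self):
        # The length is re-read on every call, so shrinking the list
        # simply makes this check fail earlier.
        if self.i < len(self.lst):
            item = self.lst[self.i]
            self.i += 1
            return item
        raise StopIteration


L = [1, 2, 3]
it = ListIteratorSketch(L)
first = next(it)   # index 0
L.pop()            # L is now [1, 2]
second = next(it)  # index 1
rest = list(it)    # index 2 is no longer < len(L), so nothing is left
```

The sketch reproduces the interactive session above: two items come out, and the third call ends the iteration instead of raising an IndexError.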
It does not naively get the item. That would raise an IndexError which would bubble to the top:
class FakeList(object):
    def __iter__(self):
        return self

    def __next__(self):
        raise IndexError

for i in FakeList():  # Raises `IndexError` immediately with a traceback and all
    print(i)
Indeed, looking at listiter_next in the CPython source (thanks Brian Rodriguez):
if (it->it_index < PyList_GET_SIZE(seq)) {
    item = PyList_GET_ITEM(seq, it->it_index);
    ++it->it_index;
    Py_INCREF(item);
    return item;
}

Py_DECREF(seq);
it->it_seq = NULL;
return NULL;
Although I don't know how return NULL; eventually translates into a StopIteration.
We can easily see the sequence of events by using a little helper function foo:
def foo():
    for i in l:
        l.pop()
and dis.dis(foo) to see the Python byte-code generated. Snipping away the not-so-relevant opcodes, your loop does the following:
2 LOAD_GLOBAL 0 (l)
4 GET_ITER
>> 6 FOR_ITER 12 (to 20)
8 STORE_FAST 0 (i)
10 LOAD_GLOBAL 0 (l)
12 LOAD_ATTR 1 (pop)
14 CALL_FUNCTION 0
16 POP_TOP
18 JUMP_ABSOLUTE 6
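To reproduce a listing like this on your own interpreter, something along these lines should work. This is only a sketch: the list contents are made up, and the exact opcodes and offsets differ between Python versions (newer releases no longer emit CALL_FUNCTION or JUMP_ABSOLUTE, for example), though GET_ITER and FOR_ITER should still show up.

```python
import dis

l = [1, 2, 3]  # made-up contents; foo is only disassembled, never called


def foo():
    for i in l:
        l.pop()


# Collect just the opcode names, since offsets and some names vary by version.
opnames = [ins.opname for ins in dis.get_instructions(foo)]
print(opnames)
```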
That is, it gets the iter for the given object (iter(l), a specialized iterator object for lists) and loops until FOR_ITER signals that it's time to stop. Adding the juicy parts, here's what FOR_ITER does:
PyObject *next = (*iter->ob_type->tp_iternext)(iter);
which essentially is:
list_iterator.__next__()
This (finally*) goes through to listiter_next, which performs the index check as @Alex showed, using the original sequence l during the check.
if (it->it_index < PyList_GET_SIZE(seq))
When this fails, NULL is returned, which signals that the iteration has finished. Any StopIteration exception that has been set in the meantime is silently suppressed in the FOR_ITER op-code code:
if (!PyErr_ExceptionMatches(PyExc_StopIteration))
    goto error;
else if (tstate->c_tracefunc != NULL)
    call_exc_trace(tstate->c_tracefunc, tstate->c_traceobj, tstate, f);
PyErr_Clear(); /* My comment: Suppress it! */
So whether you change the list or not, the check in listiter_next will ultimately fail and do the same thing.
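A small sketch makes the "same thing either way" point visible. The helper name count_iterations and the sample lists are mine, purely for illustration; it counts how many times the loop body runs with and without popping.

```python
def count_iterations(lst, mutate):
    """Run the loop and return how many times the body executed."""
    count = 0
    for _ in lst:
        count += 1
        if mutate:
            lst.pop()  # shrink the list from the end on every pass
    return count


untouched = [1, 2, 3, 4, 5]
shrinking = [1, 2, 3, 4, 5]

n_untouched = count_iterations(untouched, mutate=False)
n_shrinking = count_iterations(shrinking, mutate=True)
```

Both loops end quietly once the index check fails. The shrinking list just gets there sooner, after three passes instead of five.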
*For anyone wondering, listiter_next is exposed to Python code through a descriptor (__next__), so there's a little function wrapping it. In this specific case, that function is wrap_next, which makes sure to set PyExc_StopIteration as an exception when listiter_next returns NULL.
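The effect of that wrapper can be observed from Python. This is a small sketch with illustrative variable names: calling __next__ by hand on an exhausted list iterator surfaces a StopIteration, while a loop over the same iterator just runs zero times.

```python
it = iter([1])
first = next(it)  # consumes the only item

# Calling __next__ from Python code on the exhausted iterator turns the
# end-of-iteration signal into a StopIteration exception.
try:
    it.__next__()
    raised = False
except StopIteration:
    raised = True

# Looping over the same exhausted iterator does nothing and raises nothing.
leftover = [x for x in it]
```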
The reason you shouldn't modify a list while iterating over it is precisely so that you don't have to rely on how the iteration is implemented.
But back to the question. Lists in Python are array lists. They represent a contiguous chunk of allocated memory, as opposed to linked lists, in which each element is allocated independently. Thus, Python's lists, like arrays in C, are optimized for random access. In other words, the most efficient way to get from element n to element n+1 is by accessing element n+1 directly (by calling mylist.__getitem__(n+1) or mylist[n+1]).
So, the implementation of __next__ (the method called on each iteration) for lists is just what you would expect: the index of the current element is first set to 0 and then increased after each iteration.
In your code, if you also print b, you will see that happening:
a = [3,4,5,6,7]
for b in a:
    print(a, b)
    a.pop(0)
Result:
[3, 4, 5, 6, 7] 3
[4, 5, 6, 7] 5
[5, 6, 7] 7
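Because:
On the first iteration, a[0] == 3.
On the second iteration, a[1] == 5.
On the third iteration, a[2] == 7.
There is no fourth iteration, since by then len(a) < 3.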
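The same walk can be replayed with an explicit index. This is a sketch only: trace_loop is a hypothetical helper, and the while condition stands in for the check the list iterator performs.

```python
def trace_loop(a):
    """Replay `for b in a: a.pop(0)` with an explicit index."""
    steps = []
    index = 0
    while index < len(a):                  # the check the iterator performs
        b = a[index]
        steps.append((index, b, list(a)))  # snapshot taken before the pop
        index += 1
        a.pop(0)                           # every item shifts one slot left
    return steps


steps = trace_loop([3, 4, 5, 6, 7])
for index, b, snapshot in steps:
    print(index, b, snapshot)
```

Each pass advances the index by one while the pop shifts the remaining items left by one, so every other element is skipped.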
AFAIK, the for loop uses the iterator protocol. You can manually create and use the iterator as follows:
In [16]: a = [3,4,5,6,7]
    ...: it = iter(a)
    ...: while(True):
    ...:     b = next(it)
    ...:     print(b)
    ...:     print(a)
    ...:     a.pop(0)
    ...:
3
[3, 4, 5, 6, 7]
5
[4, 5, 6, 7]
7
[5, 6, 7]
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
<ipython-input-16-116cdcc742c1> in <module>()
      2 it = iter(a)
      3 while(True):
----> 4     b = next(it)
      5     print(b)
      6     print(a)
The for loop stops if the iterator is exhausted (raises StopIteration).
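To make the manual version end as quietly as the for loop does, the StopIteration can be caught and used to break out. This is a rough equivalent written for illustration, not the interpreter's actual code, and the seen list is only there to record what the loop yields.

```python
a = [3, 4, 5, 6, 7]
seen = []

it = iter(a)
while True:
    try:
        b = next(it)
    except StopIteration:
        break  # the for loop does this for you, silently
    seen.append(b)
    a.pop(0)
```

After the loop, seen holds the same three values printed above, and no traceback appears.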