What exactly are views in Python 3.1? They seem to behave much like iterators, and they can be materialized into lists too.
How are iterators and views different?
I'll rephrase the question as "what's the difference between an iterable object and an iterator?"
An iterable is an object that can be iterated over (e.g. used in a for loop).
An iterator is an object that can be passed to the next() function; that is, it implements the .next() method in Python 2 and .__next__() in Python 3. An iterator is often used to wrap an iterable and return each item of interest. All iterators are iterable, but the reverse is not necessarily true (not all iterables are iterators).
Views are iterable objects, not iterators.
Let's look at some code to see the distinction (Python 3):
The "What's new in Python 3" document is very specific about which functions return iterators. map(), filter(), and zip() return iterators, whereas dict.items(), dict.values(), and dict.keys() are said to return view objects. As for range(), although the description of what exactly it returns lacks precision, we know it's not an iterator.
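One way to check these claims yourself is with the abstract base classes in collections.abc. The following is only a minimal sketch; the names candidates and kinds are illustrative and not taken from the documentation:

```python
from collections.abc import Iterable, Iterator

d = {'a': 0, 'b': 1}

# Illustrative sample objects, one per built-in discussed above.
candidates = {
    'map': map(str, [1, 2]),
    'filter': filter(None, [0, 1]),
    'zip': zip('ab', 'cd'),
    'range': range(3),
    'dict.keys': d.keys(),
    'dict.values': d.values(),
    'dict.items': d.items(),
}

# For each object, record (is it iterable?, is it an iterator?).
kinds = {name: (isinstance(obj, Iterable), isinstance(obj, Iterator))
         for name, obj in candidates.items()}

for name, (is_iterable, is_iterator) in kinds.items():
    print(name, is_iterable, is_iterator)
```

Everything in the table should report as iterable, but only map, filter and zip should report as iterators.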
Using map() to double all numbers in a list
m = map(lambda x: x*2, [0,1,2])
hasattr(m, '__next__')
# True
next(m)
# 0
next(m)
# 2
next(m)
# 4
next(m)
# StopIteration ...
Using filter() to extract all odd numbers
f = filter(lambda x: x%2==1, [0,1,2,3,4,5,6])
hasattr(f, '__next__')
# True
next(f)
# 1
next(f)
# 3
next(f)
# 5
next(f)
# StopIteration ...
Trying to use range() in the same manner to produce a sequence of numbers
r = range(3)
hasattr(r, '__next__')
# False
next(r)
# TypeError: 'range' object is not an iterator
But it's an iterable, so we should be able to wrap it with an iterator
it = iter(r)
next(it)
# 0
next(it)
# 1
next(it)
# 2
next(it)
# StopIteration ...
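Because range() returns an iterable rather than an iterator, the same object can be traversed more than once, and each call to iter() hands back an independent iterator. A small sketch of that behaviour (variable names are illustrative):

```python
r = range(3)

first_pass = list(r)    # [0, 1, 2]
second_pass = list(r)   # [0, 1, 2] again: the range itself is not consumed

# Two iterators over the same range keep separate positions.
it1 = iter(r)
it2 = iter(r)
next(it1)               # advances it1 only
from_it1 = next(it1)    # 1
from_it2 = next(it2)    # 0

# A range also supports len(), indexing and membership tests.
print(first_pass, second_pass, from_it1, from_it2, len(r), r[1], 2 in r)
```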
dict.items() as well as dict.keys() and dict.values() also do not return iterators in Python 3
d = {'a': 0, 'b': 1, 'c': 2}
items = d.items()
hasattr(items, '__next__')
# False
it = iter(items)
next(it)
# ('a', 0)   (dicts keep insertion order since Python 3.7; older versions may order items differently)
next(it)
# ('b', 1)
next(it)
# ('c', 2)
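Two practical consequences of a view being an iterable rather than an iterator are that it can be traversed repeatedly, and that it keeps reflecting the dictionary it came from. A minimal sketch, using sorted() so the result does not depend on dict ordering (variable names are illustrative):

```python
d = {'a': 0, 'b': 1}
keys = d.keys()                 # the view is created once

first_pass = sorted(keys)       # iterating does not consume the view
second_pass = sorted(keys)      # so a second pass sees the same keys

d['c'] = 2                      # change the dict after creating the view
after_update = sorted(keys)     # the existing view reflects the new key

print(first_pass, second_pass, after_update, len(keys), 'c' in keys)
```

An iterator obtained from the view with iter() would still be single-use; it is the view itself that can be reused.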
An iterator can only be used in a single for loop, whereas an iterable can be used repeatedly in subsequent for loops. Each time an iterable is used in this context it implicitly returns a new iterator (from its __iter__() method). The following custom class demonstrates this by printing the id of both the list object and the returned iterator object:
class mylist(list):
    def __iter__(self, *a, **kw):
        print('id of iterable is still:', id(self))
        rv = super().__iter__(*a, **kw)
        print('id of iterator is now:', id(rv))
        return rv

l = mylist('abc')
A for loop can use the iterable object and will implicitly get an iterator
for c in l:
    print(c)
# id of iterable is still: 139696242511768
# id of iterator is now: 139696242308880
# a
# b
# c
A subsequent for loop can use the same iterable object, but will get another iterator
for c in l:
    print(c)
# id of iterable is still: 139696242511768
# id of iterator is now: 139696242445616
# a
# b
# c
We can also obtain an iterator explicitly
it = iter(l)
# id of iterable is still: 139696242511768
# id of iterator is now: 139696242463688
but it can then only be used once
for c in it:
    print(c)
# a
# b
# c
for c in it:
    print(c)
# (no output: the iterator is already exhausted)
for c in it:
    print(c)
# (still no output)
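The same split can be reproduced with a pair of hand-written classes. This is only an illustrative sketch: Countdown and CountdownIterator are hypothetical names, not part of the standard library. The iterable creates a fresh iterator on every __iter__() call, while the iterator carries the position and runs out after one pass:

```python
class Countdown:
    """Iterable: every call to __iter__ returns a new iterator."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


class CountdownIterator:
    """Iterator: remembers its position and is exhausted after one pass."""

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        # An iterator is itself iterable and simply returns itself.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
first = list(countdown)    # [3, 2, 1]
second = list(countdown)   # [3, 2, 1] again, from a new iterator

it = iter(countdown)
once = list(it)            # [3, 2, 1]
again = list(it)           # [] because the iterator is exhausted
print(first, second, once, again)
```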