In Python, how do I determine if an object is iterable?

太阳男子 2020-11-22 00:35

Is there a method like isiterable? The only solution I have found so far is to call

hasattr(myObj, '__iter__')

But I am not

21 answers
  • 2020-11-22 00:41

    Since Python 3.5 you can use the typing module from the standard library for type-related things:

    from typing import Iterable
    
    ...
    
    if isinstance(my_item, Iterable):
        print(True)
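For runtime checks the same ABC can also be imported from collections.abc; since Python 3.9, typing.Iterable is a deprecated alias of it. A minimal sketch, where `describe` is a hypothetical helper name used only for illustration:

```python
from collections.abc import Iterable


def describe(item: object) -> str:
    """Report whether item passes the Iterable ABC check (hypothetical helper)."""
    if isinstance(item, Iterable):
        return "iterable"
    return "not iterable"


print(describe([1, 2, 3]))  # iterable
print(describe("abc"))      # iterable
print(describe(42))         # not iterable
```

Note that this check only sees classes that define __iter__ or are registered with the ABC.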
    
  • 2020-11-22 00:41
    try:
        ...  # treat the object as iterable
    except TypeError as e:
        ...  # the object is not actually iterable
    

    Don't check in advance whether your duck really is a duck. Treat the object as if it were iterable and complain if it turns out not to be.
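One caveat with this approach: if the whole loop sits inside the try block, a TypeError raised by the loop body is caught as well. A possible way around that, sketched below, is to guard only the iter() call. The function name `total_length` and the error message are assumptions made for this example.

```python
def total_length(items):
    """Sum len() of every element; hypothetical helper for illustration."""
    try:
        iterator = iter(items)  # only this call is guarded
    except TypeError:
        raise TypeError(
            f"expected an iterable, got {type(items).__name__}"
        ) from None

    total = 0
    for element in iterator:
        total += len(element)  # a TypeError raised here propagates unchanged
    return total


print(total_length(["ab", "cde"]))  # 5

try:
    total_length(7)
except TypeError as exc:
    print(exc)  # expected an iterable, got int
```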

  • 2020-11-22 00:42

    The isiterable function in the following code returns True if the object is iterable and False if it is not.

    def isiterable(object_):
        return hasattr(type(object_), "__iter__")
    

    Example:

    fruits = ("apple", "banana", "peach")
    isiterable(fruits) # returns True
    
    num = 345
    isiterable(num) # returns False
    
    isiterable(str) # returns False because str itself is a class (an instance of type), and classes are not iterable
    
    hello = "hello dude !"
    isiterable(hello) # returns True because as you know string objects are iterable
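The reason for looking at type(object_) rather than at the object itself is that Python looks up special methods such as __iter__ on the class, not on the instance. The sketch below illustrates this with a hypothetical class `Plain` that gets an __iter__ attribute attached to a single instance:

```python
def isiterable(object_):
    return hasattr(type(object_), "__iter__")


class Plain:
    pass


p = Plain()
p.__iter__ = lambda: iter([1, 2])  # attached to the instance, not the class

print(hasattr(p, "__iter__"))  # True, the instance attribute is found
print(isiterable(p))           # False, the class has no __iter__

try:
    iter(p)
except TypeError:
    print("iter() agrees with the type-level check")
```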
    
  • 2020-11-22 00:44

    In Python <= 2.5, you can't and shouldn't - iterable was an "informal" interface.

    But since Python 2.6 and 3.0 you can leverage the ABC (abstract base class) infrastructure along with some builtin ABCs, which are available in the collections.abc module (plain collections before Python 3.3; those aliases were removed in Python 3.10):

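    from collections.abc import Iterable  # Python 2: from collections import Iterable
    
    class MyObject(object):
        pass
    
    mo = MyObject()
    print(isinstance(mo, Iterable))     # False
    Iterable.register(MyObject)
    print(isinstance(mo, Iterable))     # True, although MyObject still cannot be iterated
    
    print(isinstance("abc", Iterable))  # True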
    

    Now, whether this is desirable or actually works is just a matter of convention. As you can see, you can register a non-iterable object as Iterable, and iterating over it will then raise an exception at runtime. Hence, isinstance acquires a "new" meaning: it just checks for "declared" type compatibility, which is a good way to go in Python.

    On the other hand, if your object does not satisfy the interface you need, what are you going to do? Take the following example:

    from collections.abc import Iterable
    from traceback import print_exc
    
    def check_and_raise(x):
        if not isinstance(x, Iterable):
            raise TypeError("%s is not iterable" % x)
        else:
            for i in x:
                print(i)
    
    def just_iter(x):
        for i in x:
            print(i)
    
    
    class NotIterable(object):
        pass
    
    if __name__ == "__main__":
        try:
            check_and_raise(5)
        except Exception:
            print_exc()
            print()
    
        try:
            just_iter(5)
        except Exception:
            print_exc()
            print()
    
        try:
            # registering makes the isinstance check pass, but iteration still fails
            Iterable.register(NotIterable)
            ni = NotIterable()
            check_and_raise(ni)
        except Exception:
            print_exc()
            print()
    

    If the object doesn't satisfy what you expect, you just raise a TypeError, but if the ABC has been registered for a class that is not really iterable, your check is useless. Conversely, if a class defines the __iter__ method, Python automatically recognizes its instances as Iterable without any registration.

    So, if you just expect an iterable, iterate over it and forget it. On the other hand, if you need to do different things depending on input type, you might find the ABC infrastructure pretty useful.
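Both behaviours can be seen side by side in the following sketch. The classes `Countdown` and `Opaque` are hypothetical names introduced for this example only.

```python
from collections.abc import Iterable


class Countdown:
    """Defines __iter__, so no registration is needed."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return iter(range(self.start, 0, -1))


class Opaque:
    """Has neither __iter__ nor __getitem__."""


Iterable.register(Opaque)  # a declaration only; it adds no behaviour

print(isinstance(Countdown(3), Iterable))  # True
print(list(Countdown(3)))                  # [3, 2, 1]
print(isinstance(Opaque(), Iterable))      # True

try:
    iter(Opaque())
except TypeError as exc:
    print(exc)  # 'Opaque' object is not iterable
```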

  • 2020-11-22 00:46

    I'd like to shed a little bit more light on the interplay of iter, __iter__ and __getitem__ and what happens behind the curtains. Armed with that knowledge, you will be able to understand why the best you can do is

    try:
        iter(maybe_iterable)
        print('iteration will probably work')
    except TypeError:
        print('not iterable')
    

    I will list the facts first and then follow up with a quick reminder of what happens when you employ a for loop in Python, followed by a discussion to illustrate the facts.

    Facts

    1. You can get an iterator from any object o by calling iter(o) if at least one of the following conditions holds true:

      a) o has an __iter__ method which returns an iterator object. An iterator is any object with an __iter__ and a __next__ (Python 2: next) method.

      b) o has a __getitem__ method.

    2. Checking for an instance of Iterable or Sequence, or checking for the attribute __iter__ is not enough.

    3. If an object o implements only __getitem__, but not __iter__, iter(o) will construct an iterator that tries to fetch items from o by integer index, starting at index 0. The iterator will catch any IndexError (but no other errors) that is raised and then raises StopIteration itself.

    4. In the most general sense, there's no way to check whether the iterator returned by iter is sane other than to try it out.

    5. If an object o implements __iter__, the iter function will make sure that the object returned by __iter__ is an iterator. There is no sanity check if an object only implements __getitem__.

    6. __iter__ wins. If an object o implements both __iter__ and __getitem__, iter(o) will call __iter__.

    7. If you want to make your own objects iterable, always implement the __iter__ method.

    for loops

    In order to follow along, you need an understanding of what happens when you employ a for loop in Python. Feel free to skip right to the next section if you already know.

    When you use for item in o for some iterable object o, Python calls iter(o) and expects an iterator object as the return value. An iterator is any object which implements a __next__ (or next in Python 2) method and an __iter__ method.

    By convention, the __iter__ method of an iterator should return the object itself (i.e. return self). Python then calls next on the iterator until StopIteration is raised. All of this happens implicitly, but the following demonstration makes it visible:

    import random
    
    class DemoIterable(object):
        def __iter__(self):
            print('__iter__ called')
            return DemoIterator()
    
    class DemoIterator(object):
        def __iter__(self):
            return self
    
        def __next__(self):
            print('__next__ called')
            r = random.randint(1, 10)
            if r == 5:
                print('raising StopIteration')
                raise StopIteration
            return r
    

    Iteration over a DemoIterable:

    >>> di = DemoIterable()
    >>> for x in di:
    ...     print(x)
    ...
    __iter__ called
    __next__ called
    9
    __next__ called
    8
    __next__ called
    10
    __next__ called
    3
    __next__ called
    10
    __next__ called
    raising StopIteration
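Written out by hand, the loop above does roughly the following. This is a simplified sketch, and `manual_for` is a hypothetical helper name, not something Python provides.

```python
def manual_for(iterable, body):
    """Roughly what `for item in iterable: body(item)` does."""
    iterator = iter(iterable)      # obtain an iterator once, up front
    while True:
        try:
            item = next(iterator)  # calls iterator.__next__()
        except StopIteration:
            break                  # the loop ends silently
        body(item)


collected = []
manual_for("abc", collected.append)
print(collected)  # ['a', 'b', 'c']
```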
    

    Discussion and illustrations

    On points 1 and 2: getting an iterator and unreliable checks

    Consider the following class:

    class BasicIterable(object):
        def __getitem__(self, item):
            if item == 3:
                raise IndexError
            return item
    

    Calling iter with an instance of BasicIterable will return an iterator without any problems because BasicIterable implements __getitem__.

    >>> b = BasicIterable()
    >>> iter(b)
    <iterator object at 0x7f1ab216e320>
    

    However, it is important to note that b does not have the __iter__ attribute and is not considered an instance of Iterable or Sequence:

    >>> from collections.abc import Iterable, Sequence
    >>> hasattr(b, '__iter__')
    False
    >>> isinstance(b, Iterable)
    False
    >>> isinstance(b, Sequence)
    False
    

    This is why Fluent Python by Luciano Ramalho recommends calling iter and handling the potential TypeError as the most accurate way to check whether an object is iterable. Quoting directly from the book:

    As of Python 3.4, the most accurate way to check whether an object x is iterable is to call iter(x) and handle a TypeError exception if it isn’t. This is more accurate than using isinstance(x, abc.Iterable) , because iter(x) also considers the legacy __getitem__ method, while the Iterable ABC does not.

    On point 3: Iterating over objects which only provide __getitem__, but not __iter__

    Iterating over an instance of BasicIterable works as expected: Python constructs an iterator that tries to fetch items by index, starting at zero, until an IndexError is raised. The demo object's __getitem__ method simply returns the item which was supplied as the argument to __getitem__(self, item) by the iterator returned by iter.

    >>> b = BasicIterable()
    >>> it = iter(b)
    >>> next(it)
    0
    >>> next(it)
    1
    >>> next(it)
    2
    >>> next(it)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration
    

    Note that the iterator raises StopIteration when it cannot return the next item and that the IndexError which is raised for item == 3 is handled internally. This is why looping over a BasicIterable with a for loop works as expected:

    >>> for x in b:
    ...     print(x)
    ...
    0
    1
    2
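The fallback behaviour can be approximated in pure Python with a generator. This is only a sketch of the idea, not the actual implementation inside the interpreter, and `legacy_iter` is a hypothetical name.

```python
def legacy_iter(obj):
    """Approximate the iterator that iter() builds for __getitem__-only objects."""
    index = 0
    while True:
        try:
            item = obj[index]  # calls obj.__getitem__(index)
        except IndexError:
            return  # ends the generator, so the caller sees StopIteration
        yield item
        index += 1


class BasicIterable(object):
    def __getitem__(self, item):
        if item == 3:
            raise IndexError
        return item


print(list(legacy_iter(BasicIterable())))  # [0, 1, 2]
print(list(iter(BasicIterable())))         # [0, 1, 2]
```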
    

    Here's another example in order to drive home the concept of how the iterator returned by iter tries to access items by index. WrappedDict does not inherit from dict, which means instances won't have an __iter__ method.

    class WrappedDict(object): # note: no inheritance from dict!
        def __init__(self, dic):
            self._dict = dic
    
        def __getitem__(self, item):
            try:
                return self._dict[item] # delegate to dict.__getitem__
            except KeyError:
                raise IndexError
    

    Note that calls to __getitem__ are delegated to dict.__getitem__ for which the square bracket notation is simply a shorthand.

    >>> w = WrappedDict({-1: 'not printed',
    ...                   0: 'hi', 1: 'StackOverflow', 2: '!',
    ...                   4: 'not printed', 
    ...                   'x': 'not printed'})
    >>> for x in w:
    ...     print(x)
    ... 
    hi
    StackOverflow
    !
    

    On points 4 and 5: iter checks for an iterator when it calls __iter__:

    When iter(o) is called for an object o, iter will make sure that the return value of __iter__, if the method is present, is an iterator. This means that the returned object must implement __next__ (or next in Python 2) and __iter__. iter cannot perform any sanity checks for objects which only provide __getitem__, because it has no way to check whether the items of the object are accessible by integer index.

    class FailIterIterable(object):
        def __iter__(self):
            return object() # not an iterator
    
    class FailGetitemIterable(object):
        def __getitem__(self, item):
            raise Exception
    

    Note that constructing an iterator from FailIterIterable instances fails immediately, while constructing an iterator from FailGetitemIterable succeeds, but will throw an Exception on the first call to __next__.

    >>> fii = FailIterIterable()
    >>> iter(fii)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: iter() returned non-iterator of type 'object'
    >>>
    >>> fgi = FailGetitemIterable()
    >>> it = iter(fgi)
    >>> next(it)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/path/iterdemo.py", line 42, in __getitem__
        raise Exception
    Exception
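If "trying it out" has to happen before the real work starts, one option is to fetch the first item and then put it back. The sketch below does that with itertools.chain; `probe` is a hypothetical helper, and it only proves that the first fetch works, not that later ones will.

```python
from itertools import chain


def probe(maybe_iterable):
    """Return an iterator over all items, or raise if the first fetch fails."""
    iterator = iter(maybe_iterable)  # TypeError if not iterable at all
    try:
        first = next(iterator)       # errors from __next__/__getitem__ surface here
    except StopIteration:
        return iter(())              # empty, but perfectly valid
    return chain([first], iterator)  # put the consumed item back in front


class FailGetitemIterable(object):
    def __getitem__(self, item):
        raise Exception


print(list(probe([1, 2, 3])))  # [1, 2, 3]
print(list(probe([])))         # []

try:
    probe(FailGetitemIterable())
except Exception:
    print("iter() succeeded, but fetching the first item failed")
```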
    

    On point 6: __iter__ wins

    This one is straightforward. If an object implements __iter__ and __getitem__, iter will call __iter__. Consider the following class

    class IterWinsDemo(object):
        def __iter__(self):
            return iter(['__iter__', 'wins'])
    
        def __getitem__(self, item):
            return ['__getitem__', 'wins'][item]
    

    and the output when looping over an instance:

    >>> iwd = IterWinsDemo()
    >>> for x in iwd:
    ...     print(x)
    ...
    __iter__
    wins
    

    On point 7: your iterable classes should implement __iter__

    You might ask yourself why most builtin sequences like list implement an __iter__ method when __getitem__ would be sufficient.

    class WrappedList(object): # note: no inheritance from list!
        def __init__(self, lst):
            self._list = lst
    
        def __getitem__(self, item):
            return self._list[item]
    

    After all, iteration over instances of the class above, which delegates calls to __getitem__ to list.__getitem__ (using the square bracket notation), will work fine:

    >>> wl = WrappedList(['A', 'B', 'C'])
    >>> for x in wl:
    ...     print(x)
    ... 
    A
    B
    C
    

    The reasons your custom iterables should implement __iter__ are as follows:

    1. If you implement __iter__, instances will be considered iterables, and isinstance(o, collections.abc.Iterable) will return True.
    2. If the object returned by __iter__ is not an iterator, iter will fail immediately and raise a TypeError.
    3. The special handling of __getitem__ exists for backwards compatibility reasons. Quoting again from Fluent Python:

    That is why any Python sequence is iterable: they all implement __getitem__ . In fact, the standard sequences also implement __iter__, and yours should too, because the special handling of __getitem__ exists for backward compatibility reasons and may be gone in the future (although it is not deprecated as I write this).
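Applied to the WrappedList class from above, this could look as follows. It is a minimal sketch that simply delegates to the wrapped list's own iterator.

```python
from collections.abc import Iterable


class WrappedList(object):  # note: no inheritance from list!
    def __init__(self, lst):
        self._list = lst

    def __getitem__(self, item):
        return self._list[item]

    def __iter__(self):
        return iter(self._list)  # delegate to the wrapped list's iterator


wl = WrappedList(['A', 'B', 'C'])
print(isinstance(wl, Iterable))  # True
print(list(wl))                  # ['A', 'B', 'C']
```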

  • 2020-11-22 00:46

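    Older versions of pandas shipped a helper like that (the pandas.util.testing module was deprecated in pandas 1.0 and removed in 2.0, so this import fails on current releases):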

    from pandas.util.testing import isiterable
    