How to tell the difference between an iterator and an iterable?

醉梦人生 2021-02-01 08:13

In Python the interface of an iterable is a subset of the iterator interface. This has the advantage that in many cases they can be treated in the same way. However, there is an

4 Answers
  • 2021-02-01 08:33
    'iterator' if obj is iter(obj) else 'iterable'
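A minimal sketch of how that expression behaves, wrapped in a hypothetical helper named `kind` (the name is an assumption, not part of the answer). It assumes `obj` supports `iter()` at all; anything else raises `TypeError`.

```python
def kind(obj):
    """Classify obj using the identity check from the one-liner above."""
    # iter() on an iterator returns the iterator itself;
    # iter() on a container returns a fresh, separate iterator object.
    return 'iterator' if obj is iter(obj) else 'iterable'


numbers = [1, 2, 3]
print(kind(numbers))               # iterable
print(kind(iter(numbers)))         # iterator
print(kind(x for x in numbers))    # iterator (generators are iterators)
```

Note that both kinds of object are iterable in the broad sense; the check only tells you whether a second loop would start over or find the object already consumed.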
    
  • 2021-02-01 08:47
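    import itertools
    
    def process(iterable):
        # tee() gives two independent iterators over the same input
        work_iter, backup_iter = itertools.tee(iterable)
    
        for item in work_iter:
            # bla bla
            if need_to_startover():  # placeholder condition, defined elsewhere
                # backup_iter still starts at the first item
                for another_item in backup_iter:
                    pass  # reprocess another_item here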
    

    That damn time machine that Raymond borrowed from Guido…
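A runnable sketch of the same `itertools.tee` idea, under stated assumptions: the function name `replay_on_negative` and the "negative number means start over" condition are invented for illustration. It shows that the backup iterator replays from the first item even when the input is a one-shot iterator.

```python
import itertools


def replay_on_negative(iterable):
    """Consume items until a negative one appears, then replay from the start."""
    work_iter, backup_iter = itertools.tee(iterable)
    seen = []
    for item in work_iter:
        if item < 0:  # hypothetical "start over" condition
            # backup_iter has not been advanced, so it yields every item,
            # including the ones work_iter has not reached yet.
            return seen, list(backup_iter)
        seen.append(item)
    return seen, []


before, replay = replay_on_negative(iter([1, 2, -1, 4]))
print(before)   # [1, 2]
print(replay)   # [1, 2, -1, 4]
```

One design caveat: `tee` buffers every item that one iterator has consumed and the other has not, so a long first pass can hold much of the input in memory.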

  • 2021-02-01 08:50

    Because of Python's duck typing,

    An object is an iterator if it defines the next() method (__next__() in Python 3) and its __iter__() method returns the object itself.

    If the object itself doesn't have a next() method, it is merely iterable: its __iter__() can return any other object that has a next() method.

    You could refer to this question to read about iterability in Python.
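The two cases can be sketched with a pair of hypothetical classes (`Countdown` and `CountdownIterator` are invented for illustration). The sketch uses the Python 3 spelling `__next__()`; in Python 2 the method is named `next()`.

```python
class Countdown:
    """Iterable only: __iter__ returns a new iterator on every call."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


class CountdownIterator:
    """Iterator: defines __next__, and __iter__ returns the object itself."""

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


c = Countdown(3)
print(list(c), list(c))      # [3, 2, 1] [3, 2, 1]
it = iter(c)
print(list(it), list(it))    # [3, 2, 1] []
```

The iterable can be looped over repeatedly because each loop gets a fresh iterator; the iterator is exhausted after one pass.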

  • 2021-02-01 08:52

    "However, there is an important semantic difference between the two..."

    Not really semantic or important. They're both iterable -- they both work with a for statement.

    "The difference is for example important when one wants to loop multiple times."

    When does this ever come up? You'll have to be more specific. In the rare cases when you need to make two passes through an iterable collection, there are often better algorithms.

    For example, let's say you're processing a list. You can iterate through a list all you want. Why did you get tangled up with an iterator instead of the iterable? Okay that didn't work.

    Okay, here's one. You're reading a file in two passes, and you need to know how to reset the iterable. In this case, it's a file, and seek is required; or a close and a reopen. That feels icky. You can readlines to get a list which allows two passes with no complexity. So that's not necessary.
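A small sketch of the file options mentioned above, assuming an in-memory `io.StringIO` as a stand-in for a real file opened with `open()`. It shows that a file object is consumed by one pass, that `seek(0)` rewinds it, and that `readlines()` gives a list that can be traversed as often as needed.

```python
import io

# io.StringIO stands in for a real file object here.
f = io.StringIO("alpha\nbeta\ngamma\n")

first_pass = sum(1 for _ in f)    # consumes the file iterator
empty_pass = sum(1 for _ in f)    # nothing left to read
f.seek(0)                         # rewind to allow another pass
second_pass = sum(1 for _ in f)

f.seek(0)
lines = f.readlines()             # a list can be looped over repeatedly
print(first_pass, empty_pass, second_pass, len(lines))  # 3 0 3 3
```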

    Wait, what if we have a file so big we can't read it all into memory? And, for obscure reasons, we can't seek, either. What then?

    Now, we're down to the nitty-gritty of two passes. On the first pass, we accumulated something. An index or a summary or something. An index has all the file's data. A summary, often, is a restructuring of the data. With a small change from "summary" to "restructure", we've preserved the file's data in the new structure. In both cases, we don't need the file -- we can use the index or the summary.

    All "two-pass" algorithms can be changed to one pass of the original iterator or iterable and a second pass of a different data structure.
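One illustration of that restructuring, as a sketch rather than a general proof: computing each word's share of the total looks like it needs two passes (count everything, then divide), but the second pass can run over a summary instead of the input. The function name `word_shares` and the sample data are assumptions made for this example.

```python
from collections import Counter


def word_shares(words):
    """Return each word's share of the total, reading `words` only once."""
    counts = Counter(words)          # the single pass over the input
    total = sum(counts.values())     # second pass runs over the summary
    return {word: n / total for word, n in counts.items()}


# A one-shot iterator is enough, because the input is never re-read.
stream = iter(["a", "b", "a", "c", "a", "b", "a", "b"])
print(word_shares(stream))  # {'a': 0.5, 'b': 0.375, 'c': 0.125}
```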

    This is neither LBYL nor EAFP. This is algorithm design. You don't need to reset an iterator -- YAGNI.


    Edit

    Here's an example of an iterator/iterable issue. It's simply a poorly-designed algorithm.

    it = iter(range(3))
    for i in it: print(i, end=' ')  # prints 0 1 2
    for i in it: print(i, end=' ')  # prints nothing
    

    This is trivially fixed.

    it = range(3)
    for i in it: print(i)
    for i in it: print(i)
    

    The "multiple times in parallel" is trivially fixed. Write an API that requires an iterable. And when someone refuses to read the API documentation or refuses to follow it after having read it, their stuff breaks. As it should.

    The "nice to safeguard against the case where a user provides only an iterator when multiple passes are needed" scenario, like the parallel one, is an example of insane people writing code that breaks our simple API.

    If someone is insane enough to read most (but not all) of the API doc and provide an iterator when an iterable was required, you need to find this person and teach them (1) how to read all the API documentation and (2) how to follow the API documentation.

    The "safeguard" issue isn't very realistic. These crazy programmers are remarkably rare. And in the few cases when it does arise, you know who they are and can help them.


    Edit 2

    The "we have to read the same structure multiple times" algorithms are a fundamental problem.

    Do not do this.

    for element in someBigIterable:
        function1( element )
    for element in someBigIterable:
        function2( element )
    ...
    

    Do this, instead.

    for element in someBigIterable:
        function1( element )
        function2( element )
        ...
    

    Or, consider something like this.

    for element in someBigIterable:
        for f in ( function1, function2, function3, ... ):
            f( element )
    

    In most cases, this kind of "pivot" of your algorithms results in a program that might be easier to optimize and might be a net improvement in performance.
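A runnable sketch of the pivoted loop, assuming a hypothetical helper `run_all` that collects each function's results. Because the input is read only once, it works even when the caller passes a generator.

```python
def run_all(iterable, functions):
    """Apply every function to each element in a single pass over iterable."""
    results = [[] for _ in functions]
    for element in iterable:
        for out, f in zip(results, functions):
            out.append(f(element))
    return results


one_shot = (n for n in range(4))  # a generator: can be consumed only once
squares, negatives = run_all(one_shot, [lambda n: n * n, lambda n: -n])
print(squares)    # [0, 1, 4, 9]
print(negatives)  # [0, -1, -2, -3]
```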
