Is there a middle ground between `zip` and `zip_longest`?


Say I have these three lists:

a = [1, 2, 3, 4]
b = [5, 6, 7, 8, 9]
c = [10, 11, 12]

Is there a builtin function such that:


                      
5 answers

北海茫月, 2021-01-12 12:29

Your output seems to be bounded by the length of the first iterator, it1. So we can use it1 as is, pad it2 with an infinite None-yielding iterator, and zip the two.

>>> from itertools import repeat,izip,chain
>>> somezip = lambda it1,it2: izip(it1,chain(it2,repeat(None)))

>>> list(somezip(a,b))
[(1, 5), (2, 6), (3, 7), (4, 8)]
>>> list(somezip(a,c))
[(1, 10), (2, 11), (3, 12), (4, None)]


repeat(None) creates an iterator that yields None indefinitely.

chain glues it2 and repeat(None) together, so the padded iterator never runs out.

izip stops yielding as soon as it1 is exhausted.
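
For readers on Python 3, where izip no longer exists and the built-in zip is already lazy, the same idea can be sketched as follows. This is only an illustrative port, not the answer's original code, and the name pad_zip is made up for the example.

```python
from itertools import chain, repeat

a = [1, 2, 3, 4]
b = [5, 6, 7, 8, 9]
c = [10, 11, 12]

def pad_zip(it1, it2):
    """Zip it1 with it2, padding it2 with None; it1 decides the length."""
    # chain(it2, repeat(None)) never runs out, so zip() stops exactly
    # when it1 is exhausted.
    return zip(it1, chain(it2, repeat(None)))

print(list(pad_zip(a, b)))  # [(1, 5), (2, 6), (3, 7), (4, 8)]
print(list(pad_zip(a, c)))  # [(1, 10), (2, 11), (3, 12), (4, None)]
```

Because zip asks it1 for an item first, nothing is pulled from the padded iterator once it1 has run out.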

The other solutions have some flaws (I've left remarks in the comments). They may work well, but with some input they may fail unexpectedly.



As glglgl suggested in the comments, this function should rather accept a variable number of iterators.

So I updated the code to work this way:

from itertools import repeat,izip,chain,imap
somezip = lambda it1,*its: izip(it1,*imap(chain,its,repeat(repeat(None))))


Test:

>>> print(list(somezip(a,b)))
[(1, 5), (2, 6), (3, 7), (4, 8)]
>>> print(list(somezip(a,c)))
[(1, 10), (2, 11), (3, 12), (4, None)]
>>> print(list(somezip(b,a,c)))
[(5, 1, 10), (6, 2, 11), (7, 3, 12), (8, 4, None), (9, None, None)]


I had to use imap here even though its laziness is not needed (the result is unpacked right away, so a plain list would do). The reason is that Python 2's map does not stop at the shortest input: it pads the shorter inputs with None and keeps going until the longest one is exhausted, which never happens with an infinite repeat. imap, by contrast, stops as soon as the shortest iterator is consumed.

So, imap applies chain to every iterator except the first one, chaining each of them with repeat(None). To supply a repeat(None) for every iterator in its, I wrapped it in another repeat (note that this pattern can be very dangerous in other projects: every object the outer repeat produces is the same repeat(None) object, so all the chained iterators end up sharing it). Then I unpacked the imap object to produce the arguments to izip, which yields values until it1 is consumed (each of the chained its now produces an infinite sequence of values).

Note that repeat, chain, imap and izip are all implemented in C in CPython, so no Python-level loop runs per item and the interpreter overhead is minimal.
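
The warning about the shared inner object can be made concrete with a small experiment. The sketch below is written for Python 3 and uses made-up variable names. It shows that the outer repeat hands out the same object every time, that this is harmless for repeat(None), and that it would not be harmless for an iterator that carries state.

```python
from itertools import chain, islice, repeat

# The outer repeat() hands out one and the same inner object every time.
inner = repeat(None)
first, second = islice(repeat(inner), 2)
same_object = first is second            # True

# Sharing repeat(None) is harmless: every consumer just sees None.
padded_x = chain([1], inner)
padded_y = chain([2], inner)
x_items = list(islice(padded_x, 3))      # [1, None, None]
y_items = list(islice(padded_y, 3))      # [2, None, None]

# Sharing a stateful iterator is not: consumers take each other's items.
counter = iter([10, 20, 30])
left, right = islice(repeat(counter), 2)
stolen = (next(left), next(right))       # (10, 20), not (10, 10)

print(same_object, x_items, y_items, stolen)
```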

To clarify how it works, I'm adding this elaboration:

def somezip(it1, *its):  # its: zero or more further iterables
    # it1 -> a1,a2,a3,...,an
    # its -> (b1,b2,b3,...,bn),(c1,c2,c3,...,cn),...
    infinite_None = repeat(None)  # None,None,None,...
    infinite_Nones = repeat(infinite_None)  # infinite_None,infinite_None,... (all the same object)
    chained = imap(chain, its, infinite_Nones)  # (b1,...,bn,None,None,...),(c1,...,cn,None,None,...),...
    return izip(it1, *chained)


And the one-liner for it is just:

somezip = lambda it1,*its: izip(it1,*imap(chain,its,repeat(repeat(None))))
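
A possible Python 3 port of the variadic version is sketched below. It is an assumption-laden rewrite rather than the answer's code: it builds a fresh repeat for each iterable instead of sharing one, and the fillvalue keyword is an addition made for the example.

```python
from itertools import chain, repeat

def somezip(first, *rest, fillvalue=None):
    """Zip like zip_longest, but stop when `first` is exhausted."""
    # A fresh repeat(fillvalue) per iterable, so nothing is shared.
    padded = [chain(it, repeat(fillvalue)) for it in rest]
    return zip(first, *padded)

a = [1, 2, 3, 4]
b = [5, 6, 7, 8, 9]
c = [10, 11, 12]

print(list(somezip(b, a, c)))
# [(5, 1, 10), (6, 2, 11), (7, 3, 12), (8, 4, None), (9, None, None)]
print(list(somezip(a, c, fillvalue=0)))
# [(1, 10), (2, 11), (3, 12), (4, 0)]
```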

                                                                        
                                                        
            
            
              
                
星月不相逢, 2021-01-12 12:34

Define your own function (the first argument must support len()):

from itertools import izip_longest, islice

def myzip(*args):
    lenn = len(args[0])
    return list(izip_longest(*[islice(x, lenn) for x in args], fillvalue=None))

In [30]: myzip(a,b)
Out[30]: [(1, 5), (2, 6), (3, 7), (4, 8)]

In [31]: myzip(b,c)
Out[31]: [(5, 10), (6, 11), (7, 12), (8, None), (9, None)]

In [32]: myzip(a,c)
Out[32]: [(1, 10), (2, 11), (3, 12), (4, None)]
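
One limitation of any len()-based helper is that the first argument has to be a sized container. The Python 3 sketch below (zip_longest standing in for izip_longest) shows what happens when a plain iterator is passed first; the variable names are illustrative.

```python
from itertools import islice, zip_longest

def myzip(*args):
    n = len(args[0])  # requires a sized first argument (list, tuple, ...)
    return list(zip_longest(*(islice(x, n) for x in args)))

a = [1, 2, 3, 4]
c = [10, 11, 12]

ok = myzip(a, c)  # works: a is a list

try:
    myzip(iter(a), c)  # a bare iterator has no len()
except TypeError as exc:
    failure = type(exc).__name__
else:
    failure = None

print(ok, failure)
```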

                                                                        
                                                        
            
            
              
                
佛祖请我去吃肉, 2021-01-12 12:37

import itertools as it

somezip = lambda *x: it.islice(it.izip_longest(*x), len(x[0]))



>>> list(somezip(a,b))
[(1, 5), (2, 6), (3, 7), (4, 8)]

>>> list(somezip(a,c))
[(1, 10), (2, 11), (3, 12), (4, None)]

                                                                        
                                                        
            
            
              
                
独厮守ぢ, 2021-01-12 12:37

This is longer than the others, but relatively easy to understand, if that matters. ;-)

a = [1, 2, 3, 4]
b = [5, 6, 7, 8, 9]
c = [10, 11, 12]
def g(n): return (i for i in xrange(n))  # simple generator

def my_iter(iterable, fillvalue=None):
    for i in iterable: yield i
    while True: yield fillvalue

def somezip(*iterables, **kwds):
    fillvalue = kwds.get('fillvalue')
    # Pad only the iterables after the first one. The first one is iterated
    # exactly once, so a one-shot generator works as the first argument too.
    iters = [my_iter(i, fillvalue) for i in iterables[1:]]
    return [(first,) + tuple(next(it) for it in iters) for first in iterables[0]]

print 'somezip(a, b):', somezip(a, b)
print 'somezip(a, c):', somezip(a, c)
print 'somezip(a, g(2)):', somezip(a, g(2))
print 'somezip(g(2), a):', somezip(g(2),a)
print 'somezip(a, b, c):', somezip(a, b, c)
print 'somezip(a, b, c, g(2)):', somezip(a, b, c, g(2))
print 'somezip(g(2), a, b, c):', somezip(g(2), a, b, c)


Output:

somezip(a, b): [(1, 5), (2, 6), (3, 7), (4, 8)]
somezip(a, c): [(1, 10), (2, 11), (3, 12), (4, None)]
somezip(a, g(2)): [(1, 0), (2, 1), (3, None), (4, None)]
somezip(g(2), a): [(0, 1), (1, 2)]
somezip(a, b, c): [(1, 5, 10), (2, 6, 11), (3, 7, 12), (4, 8, None)]
somezip(a, b, c, g(2)): [(1, 5, 10, 0), (2, 6, 11, 1), (3, 7, 12, None), (4, 8, None, None)]
somezip(g(2), a, b, c): [(0, 1, 5, 10), (1, 2, 6, 11)]
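
A detail worth checking in any helper of this shape is how many times the first iterable gets advanced. The Python 3 sketch below uses hypothetical function names and contrasts a version that advances the first argument in two places with one that advances it only once. The difference shows up only when the first argument is a one-shot generator.

```python
def pad(iterable, fillvalue=None):
    """Yield the items of iterable, then fillvalue forever."""
    yield from iterable
    while True:
        yield fillvalue

def zip_double_iteration(first, second):
    # Pitfall: `first` is advanced by the for loop AND through its padded
    # wrapper, so a generator loses every other item.
    iters = [pad(first), pad(second)]
    return [tuple(next(it) for it in iters) for _ in first]

def zip_single_iteration(first, second):
    # `first` is advanced in exactly one place.
    padded = pad(second)
    return [(x, next(padded)) for x in first]

def gen():
    return (i for i in range(2))  # one-shot generator yielding 0, 1

a = [1, 2, 3, 4]
print(zip_double_iteration(gen(), a))  # [(1, 1)]
print(zip_single_iteration(gen(), a))  # [(0, 1), (1, 2)]
```

With a list as the first argument both versions agree, which is why the problem is easy to miss in quick tests.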

                                                                        
                                                        
            
            
              
                
佛祖请我去吃肉, 2021-01-12 12:39

No, there isn't, but you can easily combine takewhile and izip_longest to achieve what you want:

from itertools import takewhile, izip_longest
from operator import itemgetter
somezip = lambda *p: list(takewhile(itemgetter(0),izip_longest(*p)))


(If the first iterable may contain items that evaluate to False, such as 0 or an empty string, replace itemgetter with a lambda that tests for None explicitly; see @ovgolovin's comment.)

somezip = lambda *p: list(takewhile(lambda e: e[0] is not None, izip_longest(*p)))


Examples

>>> from itertools import takewhile, izip_longest
>>> from operator import itemgetter
>>> a = [1, 2, 3, 4]
>>> b = [5, 6, 7, 8, 9]
>>> c = [10, 11, 12]
>>> somezip(a,b)
[(1, 5), (2, 6), (3, 7), (4, 8)]
>>> somezip(a,c)
[(1, 10), (2, 11), (3, 12), (4, None)]
>>> somezip(b,c)
[(5, 10), (6, 11), (7, 12), (8, None), (9, None)]
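
Testing for None still cuts the result short if the first iterable itself contains None. One way around that, sketched here for Python 3 under the assumption that a private sentinel object is acceptable, is to let zip_longest fill with the sentinel and swap in the real fill value afterwards. The names _MISSING and fillvalue are illustrative.

```python
from itertools import takewhile, zip_longest

_MISSING = object()  # unique marker that cannot appear in user data

def somezip(first, *rest, fillvalue=None):
    rows = zip_longest(first, *rest, fillvalue=_MISSING)
    # Stop at the first row whose leading slot was filled in, i.e. when
    # `first` has run out.
    rows = takewhile(lambda row: row[0] is not _MISSING, rows)
    return [tuple(fillvalue if x is _MISSING else x for x in row)
            for row in rows]

a = [1, 2, 3, 4]
b = [5, 6, 7, 8, 9]

print(somezip(a, b))               # [(1, 5), (2, 6), (3, 7), (4, 8)]
print(somezip([0, None, 3], [1]))  # [(0, 1), (None, None), (3, None)]
```

Note that takewhile has to read one row past the end to notice that the first iterable is finished, so one extra item is consumed from each of the other iterables. That does not matter for lists, but it can for one-shot iterators.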

                                                                        
                                                        
            
            
              
                