Most pythonic way to interleave two strings

暗喜 2020-11-27 14:34

What's the most pythonic way to mesh two strings together?

For example:

Input:

u = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
l = 'abcdefghijklmnopqrstuvwxyz'


        
14 Answers
  • 2020-11-27 15:18

    For me, the most pythonic* way is the following, which does pretty much the same thing but uses the + operator to concatenate the individual characters of each pair:

    res = "".join(i + j for i, j in zip(u, l))
    print(res)
    # 'AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz'
    

    It is also faster than using two join() calls:

    In [5]: l1 = 'A' * 1000000; l2 = 'a' * 1000000
    
    In [6]: %timeit "".join("".join(item) for item in zip(l1, l2))
    1 loops, best of 3: 442 ms per loop
    
    In [7]: %timeit "".join(i + j for i, j in zip(l1, l2))
    1 loops, best of 3: 360 ms per loop
    

    Faster approaches exist, but they often obfuscate the code.

    Note: If the two input strings are not the same length then the longer one will be truncated as zip stops iterating at the end of the shorter string. In this case instead of zip one should use zip_longest (izip_longest in Python 2) from the itertools module to ensure that both strings are fully exhausted.
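
    A minimal Python 3 sketch of the zip_longest variant mentioned in the note. The helper name interleave is only illustrative, and fillvalue="" is an assumption about the desired behavior (append the leftover tail unchanged); without it the default fill value None would make i + j fail.

```python
from itertools import zip_longest


def interleave(a, b):
    # fillvalue="" keeps i + j valid once the shorter string runs out;
    # the default fill value is None, which cannot be added to a str.
    return "".join(i + j for i, j in zip_longest(a, b, fillvalue=""))


print(interleave("ABC", "abcde"))  # AaBbCcde
```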


    *To take a quote from the Zen of Python: Readability counts.
    Pythonic = readability for me; i + j is just visually parsed more easily, at least for my eyes.

  • 2020-11-27 15:19

    I would use zip() for a readable and simple approach:

    result = ''
    for cha, chb in zip(u, l):
        result += '%s%s' % (cha, chb)
    
    print(result)
    # 'AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz'
    
  • 2020-11-27 15:20

    A lot of these suggestions assume the strings are of equal length. Maybe that covers all reasonable use cases, but at least to me it seems that you might want to accommodate strings of differing lengths too. Or am I the only one thinking the mesh should work a bit like this:

    u = "foobar"
    l = "baz"
    mesh(u, l)  # "fboaozbar"
    

    One way to do this would be the following:

    def mesh(a, b):
        minlen = min(len(a), len(b))
        # interleave the common prefix, then append whichever tail is left over
        return "".join(["".join(x + y for x, y in zip(a, b)), a[minlen:], b[minlen:]])
    
  • 2020-11-27 15:21

    If you want the fastest way, you can combine itertools with operator.add:

    In [36]: from operator import add
    
    In [37]: from itertools import  starmap, izip
    
    In [38]: timeit "".join([i + j for i, j in izip(l1, l2)])
    1 loops, best of 3: 142 ms per loop
    
    In [39]: timeit "".join(starmap(add, izip(l1,l2)))
    1 loops, best of 3: 117 ms per loop
    
    In [40]: timeit "".join(["".join(item) for item in zip(l1, l2)])
    1 loops, best of 3: 196 ms per loop
    
    In [41]: "".join(starmap(add, izip(l1, l2))) == "".join([i + j for i, j in izip(l1, l2)]) == "".join(["".join(item) for item in izip(l1, l2)])
    Out[41]: True
    

    But combining izip and chain.from_iterable is faster again:

    In [2]: from itertools import  chain, izip
    
    In [3]: timeit "".join(chain.from_iterable(izip(l1, l2)))
    10 loops, best of 3: 98.7 ms per loop
    

    There is also a substantial difference between chain(*...) and chain.from_iterable(...):

    In [5]: timeit "".join(chain(*izip(l1, l2)))
    1 loops, best of 3: 212 ms per loop
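
    The sessions above are Python 2 (izip). As a rough Python 3 sketch, assuming the built-in zip as the lazy replacement for izip, the itertools candidates can be checked against each other and timed as follows. The string size and repeat count are arbitrary choices, and the numbers will differ from the ones quoted above.

```python
from itertools import chain, starmap
from operator import add
from timeit import timeit

l1 = "A" * 100_000
l2 = "a" * 100_000

# Python 3 ports of the itertools-based candidates; zip is already lazy here.
candidates = {
    "starmap(add, zip)": lambda: "".join(starmap(add, zip(l1, l2))),
    "chain.from_iterable": lambda: "".join(chain.from_iterable(zip(l1, l2))),
    "chain(*zip)": lambda: "".join(chain(*zip(l1, l2))),
}

# All candidates must produce the same interleaved string.
results = {name: fn() for name, fn in candidates.items()}
assert len(set(results.values())) == 1

for name, fn in candidates.items():
    print(f"{name:22s} {timeit(fn, number=5):.3f} s")
```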
    

    join cannot consume a generator lazily. Passing one is always going to be slower than passing a list, because Python first builds a list from its contents: join makes two passes over the data, one to figure out the size needed and one to actually do the join, which would not be possible with a generator:

    join.h:

     /* Here is the general case.  Do a pre-pass to figure out the total
      * amount of space we'll need (sz), and see whether all arguments are
      * bytes-like.
       */
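
    A small Python 3 sketch for checking the generator-versus-list claim on your own machine. The function names and input sizes are illustrative, and no particular timings are implied.

```python
from timeit import timeit

u = "A" * 100_000
l = "a" * 100_000


def join_generator():
    # join has to materialise the generator into a list before its pre-pass
    return "".join(i + j for i, j in zip(u, l))


def join_list():
    # the list is built directly by the comprehension and handed to join
    return "".join([i + j for i, j in zip(u, l)])


assert join_generator() == join_list()
print("generator:", timeit(join_generator, number=10))
print("list:     ", timeit(join_list, number=10))
```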
    

    Also, if you have strings of different lengths and you don't want to lose data, you can use izip_longest:

    In [22]: from itertools import izip_longest    
    In [23]: a,b = "hlo","elworld"
    
    In [24]:  "".join(chain.from_iterable(izip_longest(a, b,fillvalue="")))
    Out[24]: 'helloworld'
    

    For Python 3 it is called zip_longest.

    But for Python 2, veedrac's suggestion is by far the fastest:

    In [18]: %%timeit
    res = bytearray(len(u) * 2)
    res[::2] = u
    res[1::2] = l
    str(res)
       ....: 
    100 loops, best of 3: 2.68 ms per loop
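
    The bytearray trick relies on Python 2 str being a byte string. A hedged Python 3 adaptation has to encode first and decode afterwards; the sketch below assumes equal-length, ASCII-only input, and interleave_ascii is a hypothetical helper name, not part of the original suggestion.

```python
def interleave_ascii(u, l):
    # Assumption: both strings are ASCII and have the same length.
    if len(u) != len(l):
        raise ValueError("strings must have the same length")
    res = bytearray(len(u) * 2)
    res[::2] = u.encode("ascii")   # even positions take the first string
    res[1::2] = l.encode("ascii")  # odd positions take the second string
    return res.decode("ascii")


print(interleave_ascii("ABC", "abc"))  # AaBbCc
```

    Non-ASCII input raises UnicodeEncodeError in this sketch rather than producing a corrupted result.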
    
  • 2020-11-27 15:21

    You could use iteration_utilities.roundrobin [1]:

    u = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    l = 'abcdefghijklmnopqrstuvwxyz'
    
    from iteration_utilities import roundrobin
    ''.join(roundrobin(u, l))
    # returns 'AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz'
    

    or the ManyIterables class from the same package:

    from iteration_utilities import ManyIterables
    ManyIterables(u, l).roundrobin().as_string()
    # returns 'AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz'
    

    [1] This is from a third-party library I have written: iteration_utilities.
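
    If installing a third-party package is not an option, the roundrobin recipe from the itertools documentation should behave similarly for this use case. The sketch below is that standard-library recipe, not the iteration_utilities implementation.

```python
from itertools import cycle, islice


def roundrobin(*iterables):
    """Yield one item from each iterable in turn, dropping exhausted ones."""
    num_active = len(iterables)
    nexts = cycle(iter(it).__next__ for it in iterables)
    while num_active:
        try:
            for next_item in nexts:
                yield next_item()
        except StopIteration:
            # Remove the iterator that just ran out and keep cycling the rest.
            num_active -= 1
            nexts = cycle(islice(nexts, num_active))


print("".join(roundrobin("foobar", "baz")))  # fboaozbar
```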

  • 2020-11-27 15:23

    Just to add another, more basic approach:

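    st = ""
    # enumerate gives the position directly; u.index(char) would return the
    # first occurrence and mis-pair repeated characters.
    for i, char in enumerate(u):
        st = "{0}{1}{2}".format(st, char, l[i])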
    