Is there a simple way to remove multiple spaces in a string?

星月不相逢 2020-11-22 08:17

Suppose this string:

The   fox jumped   over    the log.

Turning into:

The fox jumped over the log.

29 Answers
  •  醉酒成梦
    2020-11-22 08:53

    Because @pythonlarry asked, here are the missing generator-based versions.

    The groupby join is easy. groupby groups consecutive elements that share the same key and returns pairs of a key and an iterator over the elements of that group. So when the key is a space, a single space is returned; otherwise the entire group is returned.

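    from itertools import groupby

    def group_join(string):
      # Each run of identical characters forms one group: emit a single space
      # for a run of spaces and the whole run for any other character.
      return ''.join(' ' if key == ' ' else ''.join(run) for key, run in groupby(string))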
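
To make the grouping step concrete, here is a small sketch of what groupby produces for a short string. The sample text and the names groups and collapsed are only illustrative, not part of the benchmark.

```python
from itertools import groupby

text = "a  b   c"

# Materialise each group so the (key, run) pairs can be inspected.
groups = [(key, ''.join(run)) for key, run in groupby(text)]
print(groups)
# [('a', 'a'), (' ', '  '), ('b', 'b'), (' ', '   '), ('c', 'c')]

# Replace every run of spaces with one space, keep other runs intact.
collapsed = ''.join(' ' if key == ' ' else run for key, run in groups)
print(collapsed)  # a b c
```

Note that groupby only merges neighbouring equal characters, which is exactly what is needed here, since only consecutive spaces should collapse.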
    

    The groupby variant is simple but very slow. So now for the generator variant. Here we consume an iterator, the string, and yield every character except spaces that follow a space.

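    def generator_join_generator(string):
      last = False  # True while the previous character was a space
      for c in string:
        if c == ' ':
          if not last:
            last = True
            yield ' '  # emit only the first space of a run
        else:
          last = False
          yield c  # non-space characters pass through unchanged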
    
    def generator_join(string):
      return ''.join(generator_join_generator(string))
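
Since the generator only looks at one character at a time, the same idea should work on any iterable of characters, not just a string held in memory. Below is a rough sketch under that assumption. The name collapse_spaces and the chunked input are hypothetical and only there to show a run of spaces crossing a chunk boundary.

```python
from itertools import chain

def collapse_spaces(chars):
    """Yield characters from any iterable, dropping spaces that follow a space."""
    last_was_space = False
    for c in chars:
        if c == ' ':
            if not last_was_space:
                yield c  # first space of a run
            last_was_space = True
        else:
            last_was_space = False
            yield c

# Hypothetical chunked input, e.g. pieces read from a file one after another.
chunks = ["The   fox ", "  jumped   over", "    the log."]
result = ''.join(collapse_spaces(chain.from_iterable(chunks)))
print(result)  # The fox jumped over the log.
```

The state flag survives across chunks, so the spaces at the end of the first chunk and the start of the second still collapse into one.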
    

    So I measured the timings with some other lorem ipsum text.

    • while_replace 0.015868543065153062
    • re_replace 0.22579886706080288
    • proper_join 0.40058281796518713
    • group_join 5.53206754301209
    • generator_join 1.6673167790286243

    With Hello and World separated by 64KB of spaces:

    • while_replace 2.991308711003512
    • re_replace 0.08232860406860709
    • proper_join 6.294375243945979
    • group_join 2.4320066600339487
    • generator_join 6.329648651066236

    Not to forget the original sentence:

    • while_replace 0.002160938922315836
    • re_replace 0.008620491018518806
    • proper_join 0.005650000995956361
    • group_join 0.028368217987008393
    • generator_join 0.009435956948436797

    Interestingly, for strings that consist almost entirely of spaces, group_join does not fare that badly. The timings shown are always the median of seven runs of a thousand iterations each.
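
For anyone who wants to reproduce numbers like these, a harness along the following lines should do. This is only a sketch: the bodies of while_replace, re_replace and proper_join are not shown in this answer, so the versions here are my assumptions about what they look like, and median_time is a hypothetical helper name.

```python
import re
import statistics
import timeit

# Assumed implementations; the originals are not shown in this answer.
def while_replace(string):
    while '  ' in string:
        string = string.replace('  ', ' ')
    return string

def re_replace(string):
    return re.sub(r' {2,}', ' ', string)

def proper_join(string):
    # Note: this also drops leading and trailing spaces.
    return ' '.join(x for x in string.split(' ') if x)

def median_time(func, text, repeat=7, number=1000):
    """Median over `repeat` runs, each calling func(text) `number` times."""
    timings = timeit.repeat(lambda: func(text), repeat=repeat, number=number)
    return statistics.median(timings)

sentence = "The   fox jumped   over    the log."
for func in (while_replace, re_replace, proper_join):
    # Check the result first so that only correct variants get timed.
    assert func(sentence) == "The fox jumped over the log."
    print(func.__name__, median_time(func, sentence))
```

The absolute values will differ from machine to machine, so only the ordering between the variants is worth comparing.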
