How to replace repeated instances of a character with a single instance of that character in Python

北海茫月 2020-12-31 00:29

I want to replace repeated instances of the "*" character within a string with a single instance of "*". For example if the string is "*

11 Answers
  •  时光说笑
    2020-12-31 00:39

    I timed all the methods in the current answers (with Python 3.7.2, macOS High Sierra).

    b() was the best overall; c() was the fastest when the input contains no repeats to replace.

    def b(text):
        return re.sub(r"\*\*+", "*", text)
    
    # aka squeeze()
    def c(text):
        while "*" * 2 in text:
            text = text.replace("*" * 2, "*")
        return text
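
To make the difference between the two approaches concrete, here is a minimal sketch. The names `squeeze_regex` and `squeeze_replace` are illustrative stand-ins for b() and c(), not names from the original answer. It also shows why c() needs a loop: a single `str.replace` pass only shortens each run, it does not collapse it.

```python
import re


def squeeze_regex(text):
    # Collapse each run of two or more "*" into a single "*".
    return re.sub(r"\*\*+", "*", text)


def squeeze_replace(text):
    # One replace() pass turns "****" into "**", not "*",
    # so keep going until no "**" is left.
    while "**" in text:
        text = text.replace("**", "*")
    return text


sample = "a**b*****c*"

single_pass = sample.replace("**", "*")
print(single_pass)              # a*b***c*
print(squeeze_regex(sample))    # a*b*c*
print(squeeze_replace(sample))  # a*b*c*
```

The loop version does no work beyond the `in` check when the string has no repeats, which fits it being the fastest on the no-repeats input.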
    

    Input 1, no repeats: 'a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*'

    • a) 10000 loops, best of 5: 24.5 usec per loop
    • b) 100000 loops, best of 5: 3.17 usec per loop
    • c) 500000 loops, best of 5: 508 nsec per loop
    • d) 10000 loops, best of 5: 25.4 usec per loop
    • e) 5000 loops, best of 5: 44.7 usec per loop
    • f) 500000 loops, best of 5: 522 nsec per loop

    Input 2, with repeats: 'a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*****************************************************************************************************a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*'

    • a) 5000 loops, best of 5: 46.2 usec per loop
    • b) 50000 loops, best of 5: 5.21 usec per loop
    • c) 20000 loops, best of 5: 13.4 usec per loop
    • d) 5000 loops, best of 5: 47.4 usec per loop
    • e) 2000 loops, best of 5: 103 usec per loop
    • f) 20000 loops, best of 5: 13.1 usec per loop
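
The figures above come from the `python -mtimeit` command line. As a rough sketch, similar measurements can be taken from a script with the standard `timeit` module. The helper `best_usec`, the function names, and the exact length of the star run in the second input are assumptions made for illustration, and absolute numbers will differ by machine and Python version.

```python
import re
import timeit


def squeeze_regex(text):
    # Stand-in for b(): regex that only matches runs of 2+ stars.
    return re.sub(r"\*\*+", "*", text)


def squeeze_replace(text):
    # Stand-in for c(): repeated str.replace until no "**" is left.
    while "**" in text:
        text = text.replace("**", "*")
    return text


no_repeats = "a*" * 100
# Assumption: the run of stars in "Input 2" is taken to be 100 long.
with_repeats = "a*" * 100 + "*" * 100 + "a*" * 100


def best_usec(func, text, number=1000, repeat=5):
    # timeit.repeat returns one total (in seconds) per repeat;
    # take the best total and convert to microseconds per call.
    totals = timeit.repeat(lambda: func(text), number=number, repeat=repeat)
    return min(totals) / number * 1e6


for label, text in (("no repeats", no_repeats), ("with repeats", with_repeats)):
    for func in (squeeze_regex, squeeze_replace):
        usec = best_usec(func, text)
        print(f"{label:<13}{func.__name__:<17}{usec:8.2f} usec per loop")
```

The `lambda` wrapper adds a small constant overhead to every call, so the numbers are best read as a comparison between variants, not as exact costs.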

    The methods:

    #!/usr/bin/env python
    # encoding: utf-8
    """
    See which function variants are fastest. Run like:
    python -mtimeit -s"import time_functions;t='a*'*100" "time_functions.a(t)"
    python -mtimeit -s"import time_functions;t='a*'*100" "time_functions.b(t)"
    etc.
    """
    import re
    
    
    def a(text):
        return re.sub(r"\*+", "*", text)
    
    
    def b(text):
        return re.sub(r"\*\*+", "*", text)
    
    
    # aka squeeze()
    def c(text):
        while "*" * 2 in text:
            text = text.replace("*" * 2, "*")
        return text
    
    
    regex = re.compile(r"\*+")
    
    
    # like a() but with (premature) optimisation
    def d(text):
        return re.sub(regex, "*", text)
    
    
    def e(text):
        return "".join(c for c, n in zip(text, text[1:] + " ") if c + n != "**")
    
    
    def f(text):
        while True:
            if "**" in text:  # if two stars are in the variable pattern
                text = text.replace("**", "*")  # replace two stars with one
            else:  # otherwise
                break  # break from the infinite while loop
        return text
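
All the variants above hard-code `*`. Since the question title asks about "a character" in general, here is one possible sketch of a parameterised version. The helper `squeeze(text, char)` is hypothetical and not part of the original answer. It relies on `re.escape` so that regex metacharacters such as `*` or `.` are matched literally, and on a function replacement so that a backslash is not interpreted as an escape in the replacement string.

```python
import re


def squeeze(text, char="*"):
    # Hypothetical helper: collapse runs of `char` into a single `char`.
    if len(char) != 1:
        raise ValueError("char must be a single character")
    # Escape the character so it is matched literally, then
    # require two or more consecutive occurrences.
    pattern = re.escape(char) + "{2,}"
    # A callable replacement is inserted as-is, with no escape processing.
    return re.sub(pattern, lambda match: char, text)


print(squeeze("a**b***c"))        # a*b*c
print(squeeze("x..y....z", "."))  # x.y.z
```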
    
