Replacing repeated consecutive characters in Python

抹茶落季 2021-01-23 05:02

I need to make a function that replaces repeated, consecutive characters with a single character, for example:

    'hiiii how are you??' -> 'hi how are you?'


        
5 answers
  • 2021-01-23 05:54
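    def dup_char_remover(text):
        """Collapse each run of identical consecutive characters to one."""
        output = []
        previous = None
        for c in text:
            # Keep a character only when it differs from the one before it.
            if c != previous:
                output.append(c)
            previous = c
        return "".join(output)
    
    text = "hiiii how arrrre youuu"
    print(dup_char_remover(text))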
    

    hi how are you

  • 2021-01-23 05:56

    You can use a regular expression such as (.)\1+, which matches any character followed by one or more copies of itself, and replace each match with \1, the character captured by the group.

    >>> import re
    >>> re.sub(r"(.)\1+", r"\1", 'aahhhhhhhhhh whyyyyyy')
    'ah why'
    >>> re.sub(r"(.)\1+", r"\1", 'oook. thesse aree enoughh examplles.')
    'ok. these are enough examples.'
    

    To wrap it in a function, use functools.partial (or any other way you like):

    >>> import functools
    >>> dedup = functools.partial(re.sub, r"(.)\1+", r"\1")
    >>> dedup('oook. thesse aree enoughh examplles.')
    'ok. these are enough examples.'
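
    One detail worth knowing: by default . does not match a newline, so the pattern above leaves runs of blank lines untouched. The following is a minimal sketch, assuming you also want repeated newlines collapsed; the names _REPEATS and dedup are illustrative, not part of the original answer.

```python
import re

# Precompiled pattern: any character (including a newline, because of
# re.DOTALL) followed by one or more copies of itself.
_REPEATS = re.compile(r"(.)\1+", flags=re.DOTALL)


def dedup(text):
    """Collapse each run of identical consecutive characters to one."""
    return _REPEATS.sub(r"\1", text)


print(dedup("hiiii how are you??"))  # hi how are you?
print(repr(dedup("a\n\n\nb")))       # 'a\nb'
```

    If newlines should be left alone, drop the flags argument and the behaviour matches the re.sub calls above.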
    
  • 2021-01-23 05:57
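    from collections import OrderedDict
    
    def removeDupWord(word):
        # Caveat: OrderedDict.fromkeys keeps only the first occurrence of each
        # character, so this drops every repeat in the word, not only
        # consecutive ones (for example, 'banana' becomes 'ban').
        return "".join(OrderedDict.fromkeys(word))
    
    def removeDupSentence(sentence):
        # split() also collapses runs of whitespace between words.
        return ' '.join(removeDupWord(word) for word in sentence.split())
    
    
    sentence = 'hiiii how are you??'
    print(removeDupSentence(sentence))
    
    hi how are you?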
    
  • 2021-01-23 06:00

    A solution can be expressed very compactly using itertools.groupby:

    >>> import itertools
    >>> ''.join(g[0] for g in itertools.groupby('hiiii how are you??'))
    'hi how are you?'
    

    itertools.groupby groups consecutive items of an iterable by the given key function; a group keeps growing as long as successive keys are equal. If no key function is given, each item is its own key, in this case the character itself.

    Once the characters are grouped, you can join the keys into a single string. groupby yields tuples of the key and an iterator over that group's items (an internal itertools._grouper object); for this purpose you can ignore the iterator and take only the key, g[0].

    This can be turned into a function as follows:

    def remove_repeated_characters(s):
        groups = itertools.groupby(s)
        cleaned = ''.join(g[0] for g in groups)
        return cleaned
    

    This results in the expected values:

    >>> [remove_repeated_characters(s)
    ...  for s in ['hiiii how are you??','aahhhhhhhhhh whyyyyyy',
    ...            'foo', 'oook. thesse aree enoughh examplles.']]
    ['hi how are you?', 'ah why', 'fo', 'ok. these are enough examples.']
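
    To make the (key, group) pairs described above concrete, here is a small sketch that lists each run with its length. The helper name run_lengths is hypothetical and only for illustration; it is not part of itertools.

```python
from itertools import groupby


def run_lengths(s):
    """Return (character, run length) pairs for the consecutive runs in s."""
    # Each group is an iterator over the items of one run; materialise it
    # with list() to count them.
    return [(key, len(list(group))) for key, group in groupby(s)]


print(run_lengths("hiiii??"))
# [('h', 1), ('i', 4), ('?', 2)]

# Keeping only the keys reproduces the de-duplicated string.
print("".join(key for key, _ in groupby("hiiii??")))
# hi?
```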
    
  • 2021-01-23 06:03

    Using a simple iteration.

    Demo:

    def cleanText(val):
        result = []
        for ch in val:
            # Append only when the character differs from the last one kept.
            if not result or result[-1] != ch:
                result.append(ch)
        return "".join(result)
    
    s = ['hiiii how are you??', 'aahhhhhhhhhh whyyyyyy', 'foo', 'oook. thesse aree enoughh examplles.']
    for i in s:
        print(cleanText(i))
    

    Output:

    hi how are you?
    ah why
    fo
    ok. these are enough examples.
    