Remove text between () and [] in Python

我寻月下人不归 2020-11-28 08:44

I have a very long string of text with () and [] in it. I'm trying to remove the characters between the parentheses and brackets but I cannot figu

4 Answers
  • 2020-11-28 08:57

    This finds the text inside parens. A regular expression 'consumes' the text it has matched, so it won't work for nested parens.

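    import re
    mystring = "This is a sentence. (once a day)"  # example input
    regex = re.compile(r".*?\((.*?)\)")  # raw string; group 1 captures the text inside the parens
    result = regex.findall(mystring)  # list of the captured strings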
    

    Or this finds one set of parens; loop to find more:

    start = mystring.find('(')
    end = mystring.find(')', start + 1)  # look for the closer only after the opener
    if start != -1 and end != -1:
        result = mystring[start + 1:end]  # text between the parens
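
The "loop to find more" idea can be turned into actual removal. The following is a minimal sketch, not part of the original answer: the helper name `remove_parenthesized` and its `open_ch`/`close_ch` parameters are hypothetical, and it assumes the delimiters are not nested.

```python
def remove_parenthesized(text, open_ch="(", close_ch=")"):
    """Remove each open_ch...close_ch span (non-nested), delimiters included."""
    pieces = []
    pos = 0
    while True:
        start = text.find(open_ch, pos)
        if start == -1:
            break  # no more openers
        end = text.find(close_ch, start + 1)
        if end == -1:
            break  # unmatched opener: keep the rest unchanged
        pieces.append(text[pos:start])  # keep the text before the opener
        pos = end + 1                   # resume just after the closer
    pieces.append(text[pos:])           # keep whatever follows the last span
    return "".join(pieces)


mystring = "This is a sentence. (once a day) and (twice a day) too"
print(repr(remove_parenthesized(mystring)))
# -> 'This is a sentence.  and  too'
```

Calling it a second time with `"["` and `"]"` would handle square brackets the same way.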
    
  • 2020-11-28 09:04

    Here's a solution similar to @pradyunsg's answer (it works with arbitrary nested brackets):

    def remove_text_inside_brackets(text, brackets="()[]"):
        count = [0] * (len(brackets) // 2) # count open/close brackets
        saved_chars = []
        for character in text:
            for i, b in enumerate(brackets):
                if character == b: # found bracket
                    kind, is_close = divmod(i, 2)
                    count[kind] += (-1)**is_close # `+1`: open, `-1`: close
                    if count[kind] < 0: # unbalanced bracket
                        count[kind] = 0  # keep it
                    else:  # found bracket to remove
                        break
            else: # character is not a [balanced] bracket
                if not any(count): # outside brackets
                    saved_chars.append(character)
        return ''.join(saved_chars)
    
    print(repr(remove_text_inside_brackets(
        "This is a sentence. (once a day) [twice a day]")))
    # -> 'This is a sentence.  '
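
The index arithmetic in the function above may be easier to follow in isolation. This small sketch (the helper name `classify` is hypothetical) shows how a position in the `brackets` string maps to a bracket kind and a `+1`/`-1` counter change, assuming the string lists each pair as opener followed by closer.

```python
brackets = "()[]"  # pairs laid out as open, close, open, close, ...


def classify(index):
    """Return (kind, delta) for the bracket at `index` in `brackets`."""
    kind, is_close = divmod(index, 2)  # kind: which pair; is_close: 0 or 1
    delta = (-1) ** is_close           # +1 for an opener, -1 for a closer
    return kind, delta


for i, b in enumerate(brackets):
    print(b, classify(i))
# ( (0, 1)
# ) (0, -1)
# [ (1, 1)
# ] (1, -1)
```

Because the counters are indexed by `kind`, passing a longer string such as `"()[]{}"` as `brackets` adds another pair without further changes.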
    
  • 2020-11-28 09:09

    You can use the re.sub function. This version removes the text but keeps the brackets themselves:

    >>> import re 
    >>> x = "This is a sentence. (once a day) [twice a day]"
    >>> re.sub(r"([\(\[]).*?([\)\]])", r"\g<1>\g<2>", x)
    'This is a sentence. () []'
    

    If you also want to remove the [] and the () themselves, you can use this code:

    >>> import re 
    >>> x = "This is a sentence. (once a day) [twice a day]"
    >>> re.sub(r"[\(\[].*?[\)\]]", "", x)
    'This is a sentence.  '
    

    Important: this code will not work with nested brackets.
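
One possible way to extend the regex approach to nested input is to remove only innermost pairs and repeat until nothing changes. This is a sketch rather than part of the original answer: the names `INNERMOST` and `remove_nested` are hypothetical, and it assumes the brackets are properly nested (a mis-nested span such as `(a [b) c]` is left untouched).

```python
import re

# Matches one innermost pair: an opener, no further brackets, then its closer.
INNERMOST = re.compile(r"\([^()\[\]]*\)|\[[^()\[\]]*\]")


def remove_nested(text):
    """Strip innermost (...) and [...] pairs until none are left."""
    while True:
        text, count = INNERMOST.subn("", text)  # subn also reports the number of replacements
        if count == 0:
            return text


nested = "a (b (c) d) e"
print(repr(re.sub(r"[\(\[].*?[\)\]]", "", nested)))  # single pass stops at the first closer
# -> 'a  d) e'
print(repr(remove_nested(nested)))
# -> 'a  e'
```

Each pass peels off one nesting level, so the loop runs once per level of depth plus a final pass that finds nothing.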

  • 2020-11-28 09:14

    Run this script; it works even with nested brackets.
    It uses only basic logical tests.

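    def a(test_str):
        ret = ''
        skip1c = 0  # depth of currently open '[' brackets
        skip2c = 0  # depth of currently open '(' parentheses
        for i in test_str:
            if i == '[':
                skip1c += 1
            elif i == '(':
                skip2c += 1
            elif i == ']' and skip1c > 0:
                skip1c -= 1
            elif i == ')' and skip2c > 0:
                skip2c -= 1
            elif skip1c == 0 and skip2c == 0:
                ret += i  # keep characters that are outside every bracket
        return ret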
    
    x = "ewq[a [(b] ([c))]] This is a sentence. (once a day) [twice a day]"
    x = a(x)
    print(x)
    print(repr(x))
    

    In case you don't run it, here's the output:

    >>> 
    ewq This is a sentence.  
    'ewq This is a sentence.  ' 
    