Python Regular Expression Matching: ## ##

后悔当初 2021-01-25 23:31

I'm searching a file line by line for the occurrence of ##random_string##. It works except for the case of multiple #...

pattern='##(.*?)##'
prog=re.compile(p         


        
7 Answers
  • 2021-01-25 23:43

    Try the "block comment trick": /##((?:[^#]|#[^#])+?)##/
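
A minimal sketch of how this pattern behaves in Python, where the /.../ delimiters are dropped and the pattern goes in a raw string. The names `block_comment_pattern` and `inner_text` and the sample lines are made up for illustration; this is not the answerer's own test.

```python
import re

# Same pattern as in the answer, minus the /.../ delimiters (illustrative name).
block_comment_pattern = re.compile(r'##((?:[^#]|#[^#])+?)##')

def inner_text(line):
    """Return the text captured between the hashes, or None if nothing matches."""
    m = block_comment_pattern.search(line)
    return m.group(1) if m else None

print(inner_text('lala ##a#b## there'))   # a lone # inside the text is accepted
print(inner_text('lala ###hey## there'))  # the extra leading # ends up in the capture
```

In this sketch the pattern accepts a single # inside the text, but on ###hey## it still captures #hey, so by itself it does not drop an extra leading hash.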

  • 2021-01-25 23:44

    Your problem is with your inner match. You use ., which matches any character except a line end, so it matches # as well. When the input is ###hey##, the match starts at the first two hashes and (.*?) captures #hey.

    The easy solution is to exclude the # character from the matchable set:

    prog = re.compile(r'##([^#]*)##')
    

    Protip: Use raw strings (e.g. r'') for regular expressions so you don't have to go crazy with backslash escapes.
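
To illustrate the raw-string tip, here is a small sketch. The pattern `\d+#` and the sample text are hypothetical and chosen only because they contain a backslash; the patterns in this thread happen not to need one.

```python
import re

# A raw string keeps the backslash as written; a normal string needs it doubled.
raw_pattern = r'\d+#'
escaped_pattern = '\\d+#'

same_text = raw_pattern == escaped_pattern    # both spell the same four characters
found = re.findall(raw_pattern, 'a12# b7#')   # digits followed by a hash
print(same_text, found)
```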

    Trying to allow # inside the hashes will make things much more complicated.

    EDIT: If you do not want to allow blank inner text (i.e. "####" shouldn't match with an inner text of ""), then change it to:

    prog = re.compile(r'##([^#]+)##')
    

    + means "one or more."
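
The difference between the three patterns discussed above can be checked with a short sketch. The helper `capture` and the variable names are illustrative, not part of the original answer.

```python
import re

lazy_dot = re.compile(r'##(.*?)##')        # pattern from the question
no_hash = re.compile(r'##([^#]*)##')       # inner text may be empty
no_hash_plus = re.compile(r'##([^#]+)##')  # inner text must be non-empty

def capture(prog, line):
    """Return group 1 of the first match in line, or None if there is no match."""
    m = prog.search(line)
    return m.group(1) if m else None

for prog in (lazy_dot, no_hash, no_hash_plus):
    # Show what each pattern captures for the problem input and for bare hashes.
    print(prog.pattern, repr(capture(prog, '###hey##')), repr(capture(prog, '####')))
```

Excluding # from the inner class forces the match to start at the last two hashes before the text, which is why the stray leading # is no longer captured.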

  • 2021-01-25 23:44
    >>> import re
    >>> text = 'lala ###hey## there'
    >>> matcher = re.compile(r"##[^#]+##")
    >>> print(matcher.sub("FOUND", text))
    lala #FOUND there
    
  • 2021-01-25 23:55

    To match at least two hashes at either end:

    pattern='##+(.*?)##+'
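
A brief sketch of this pattern on a few sample lines. The names `two_or_more` and `inner` and the inputs are assumptions for illustration.

```python
import re

# "##+" means a hash followed by one or more hashes, i.e. at least two.
two_or_more = re.compile(r'##+(.*?)##+')

def inner(line):
    """Return the text between the hash runs, or None if there is no match."""
    m = two_or_more.search(line)
    return m.group(1) if m else None

print(inner('lala ###hey## there'))
print(inner('lala ####hey## there'))
print(two_or_more.search('##abc#####').group(0))  # the whole match, hashes included
```

Because the leading ##+ is greedy, it swallows the entire run of opening hashes before the lazy group starts, so the capture begins at the first non-hash character in these inputs.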
    
  • 2021-01-25 23:56

    Have you considered doing it the non-regex way?

    >>> string='lala ####hey## there'
    >>> string.split("####")[1].split("#")[0]
    'hey'
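
The split above relies on the line containing exactly "####" before the text. A slightly more general non-regex sketch, assuming the goal is the text between a run of two or more hashes and the next "##", might look like this. The function name `text_between_hashes` is hypothetical.

```python
def text_between_hashes(line):
    """Return the text between the first run of hashes and the next '##', or None."""
    start = line.find('##')
    if start == -1:
        return None                    # no opening hashes at all
    rest = line[start:].lstrip('#')    # drop the whole opening run of hashes
    end = rest.find('##')
    if end == -1:
        return None                    # no closing hashes
    return rest[:end]

print(text_between_hashes('lala ####hey## there'))
print(text_between_hashes('lala ###hey## there'))
```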
    
  • 2021-01-25 23:57

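    '^#{2,}([^#]*)#{2,}' -- any number of # >= 2 on either end. Note that the leading ^ anchors the match to the start of the line; drop it to find the pattern anywhere in the line.

    Be careful with dot-based captures such as (.*) and (.*?), because . also matches #. On '##abc#####' the greedy '#{2,}(.*)#{2,}' captures 'abc###', and on '###hey##' the question's '##(.*?)##' captures '#hey'. Lazy quantifiers can also be slower than a negated class like [^#]*, since the engine retries the rest of the pattern after every character it adds.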
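
The anchoring and quantifier behaviour can be checked with a short sketch. The variable names and sample strings are illustrative assumptions.

```python
import re

line = 'lala ###hey## there'
# With ^ the pattern must start at the beginning of the line, so this finds nothing.
anchored = re.search(r'^#{2,}([^#]*)#{2,}', line)
# Without ^ the same pattern is found in the middle of the line.
unanchored = re.search(r'#{2,}([^#]*)#{2,}', line)

sample = '##abc#####'
greedy = re.search(r'#{2,}(.*)#{2,}', sample).group(1)       # dot, greedy
lazy = re.search(r'#{2,}(.*?)#{2,}', sample).group(1)        # dot, lazy
negated = re.search(r'#{2,}([^#]*)#{2,}', sample).group(1)   # negated class

print(anchored, unanchored.group(1), greedy, lazy, negated)
```

On this sample only the greedy dot pulls trailing hashes into the capture; the negated class gives the same result as the lazy dot without depending on how the engine backtracks.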
