Regular expression syntax for “match nothing”?

清酒与你 2021-01-31 00:58

I have a Python template engine that makes heavy use of regular expressions. It builds them by concatenation, like this:

re.compile(regexp1 + "|" + regexp2 + "*|" + regexp3 + "+")
6 Answers
  • 2021-01-31 01:24

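    This should fail to match any text that has content:

    re.compile('$^')

    So if you replace regexp1, regexp2 and regexp3 with '$^', it will be impossible to find a match in non-empty text (an empty string or a lone "\n" can still match at position 0), unless you are using multiline mode, where it can also match at an empty line.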


    After some tests I found a better solution:

    re.compile('a^')

    It is impossible to match and will fail earlier than the previous solution. You can replace a with any other literal character and it will still be impossible to match.
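A quick way to compare the two patterns is to run them over a few inputs. This is only a sketch: the sample strings are made up for illustration, and the check covers Python's `re` module, not other regex engines. Each row prints the input followed by whether `$^`, `$^` with `re.MULTILINE`, `a^`, and `a^` with `re.MULTILINE` found a match.

```python
import re

# Sample inputs chosen for illustration only.
samples = ["", "\n", "abc", "a\n\nb", "a\na"]

dollar_caret = re.compile(r"$^")
dollar_caret_multiline = re.compile(r"$^", re.MULTILINE)
a_caret = re.compile(r"a^")
a_caret_multiline = re.compile(r"a^", re.MULTILINE)

for text in samples:
    print(
        repr(text),
        bool(dollar_caret.search(text)),
        bool(dollar_caret_multiline.search(text)),
        bool(a_caret.search(text)),
        bool(a_caret_multiline.search(text)),
    )
```

In this check, `$^` still matches the empty string, a lone newline, and (in multiline mode) an empty line, while `a^` matches none of the samples.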

  • 2021-01-31 01:33

    (?!) should always fail to match. It is a zero-width negative lookahead: if what is inside the parentheses matches, the whole match fails. Since it has nothing inside it, and an empty pattern matches at every position, the lookahead fails for any input (including an empty one).
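A minimal check of the empty negative lookahead in Python's `re` module. The sample strings are arbitrary, and the flags are added only to show that they do not change the outcome here.

```python
import re

# Empty negative lookahead: the empty pattern always matches,
# so the negated assertion always fails.
never = re.compile(r"(?!)")
never_with_flags = re.compile(r"(?!)", re.MULTILINE | re.DOTALL)

# Arbitrary sample inputs.
samples = ["", "abc", "line1\nline2", "\n"]

for text in samples:
    print(repr(text), never.search(text), never_with_flags.search(text))
```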

  • 2021-01-31 01:39

    Or, use a list comprehension to drop the unused regexp entries, then join the remaining ones. Something like:

    re.compile('|'.join([x for x in [regexp1, regexp2, ...] if x is not None]))

    Be sure to add some comments next to that line of code though :-)
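The filtering idea can be sketched as a small helper. The name `compile_alternatives`, the choice to return `None` when every entry is disabled, and the sample patterns are all assumptions made for illustration; they are not part of the original template engine. Wrapping each entry in a non-capturing group is an extra precaution, not something the answer requires.

```python
import re


def compile_alternatives(*patterns):
    """Join the patterns that are not None into one alternation.

    Hypothetical helper: returns None when nothing is left to match.
    """
    active = [p for p in patterns if p is not None]
    if not active:
        return None
    # Wrap each pattern in a non-capturing group so a "|" inside one
    # pattern cannot leak into its neighbours.
    return re.compile("|".join("(?:%s)" % p for p in active))


# Made-up patterns; the second slot is switched off with None.
matcher = compile_alternatives(r"\{\{.*?\}\}", None, r"\{%.*?%\}")
print(matcher.pattern)
print(matcher.search("a {{ x }} b").group(0))
```

With this approach a disabled slot never reaches the regex engine, so no "match nothing" placeholder is needed; the caller has to handle the `None` return instead.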

  • 2021-01-31 01:48

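    You could use

    \z..

    This is the absolute end of the string, followed by two of any character. In Python's re module, write the anchor as \Z, i.e. \Z..

    If + or * is tacked on the end, this still works: it applies only to the last dot, so the pattern keeps refusing to match anything.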
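A short check of this idea in Python, written with `\Z` as the end-of-string anchor (an assumption about the target engine: `\Z` is the spelling Python's `re` accepts across versions). The quantifiers are appended the same way the question's concatenation would append them; the sample strings are arbitrary.

```python
import re

# End of string followed by two more characters: nothing can satisfy this.
base = r"\Z.."

# Arbitrary sample inputs.
samples = ["", "ab", "a\nb\n", "\n\n"]

for suffix in ["", "*", "+"]:
    pattern = re.compile(base + suffix)
    # The quantifier binds to the final "." only, so at least one
    # character is still required after the end of the string.
    print(pattern.pattern, [pattern.search(s) for s in samples])
```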

  • 2021-01-31 01:50

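    To match an empty string, even in multiline mode, you can use \A\Z, so:

    re.compile(r'(?:\A\Z)|(?:\A\Z)*|(?:\A\Z)+')

    (Each alternative is wrapped in a non-capturing group because Python's re rejects a quantifier placed directly after an anchor such as \Z.)

    The difference is that \A and \Z match only at the start and end of the string, whilst ^ and $ can also match at the start and end of lines, so a pattern built from $^ could potentially match a string containing newlines (if the multiline flag is enabled).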

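    And to fail to match anything (even an empty string), simply attempt to find content before the start of the string, e.g.:

    re.compile(r'.\A')

    Since no characters can come before \A (by definition), this will always fail to match. In Python, appending * or + directly to it is rejected at compile time, because the quantifier would apply to the anchor; and a * applied to a wrapping group would allow zero repetitions, which matches the empty string.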
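The behaviour of both anchored patterns can be checked with a few lines of Python. This is a sketch with arbitrary sample strings; it also shows what happens in Python's `re` when a quantifier is appended directly after `\A`, as the question's concatenation would do.

```python
import re

empty_only = re.compile(r"\A\Z", re.MULTILINE)
never = re.compile(r".\A", re.MULTILINE | re.DOTALL)

# Arbitrary sample inputs.
samples = ["", "\n", "abc", "a\n\nb"]

for text in samples:
    print(repr(text), bool(empty_only.search(text)), bool(never.search(text)))

# A quantifier placed directly after an anchor is rejected at compile time.
try:
    re.compile(r".\A*")
    quantified_anchor_compiles = True
except re.error as exc:
    quantified_anchor_compiles = False
    print("re.error:", exc)
```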

  • 2021-01-31 01:50

    Maybe '.{0}'?
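A quick experiment with this suggestion, as a sketch using Python's `re` and arbitrary inputs. It suggests that `.{0}` behaves as a zero-width match (it succeeds with an empty match at the first position) instead of a pattern that never matches.

```python
import re

zero_width = re.compile(r".{0}")

# search() succeeds immediately with an empty match at position 0.
found = zero_width.search("abc")
print(found.span())

# fullmatch() only succeeds when the whole input is empty.
print(zero_width.fullmatch("abc"))
print(zero_width.fullmatch(""))
```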
