I have a python template engine that heavily uses regexp. It uses concatenation like:
re.compile( regexp1 + \"|\" + regexp2 + \"*|\" + regexp3 + \"+\" )
<
This shouldn't match anything:
re.compile('$^')
So if you replace regexp1, regexp2 and regexp3 with '$^' it will be impossible to find a match. Unless you are using the multi line mode.
After some tests I found a better solution
re.compile('a^')
It is impossible to match and will fail earlier than the previous solution. You can replace a with any other character and it will always be impossible to match
(?!)
should always fail to match. It is the zero-width negative look-ahead. If what is in the parentheses matches then the whole match fails. Given that it has nothing in it, it will fail the match for anything (including nothing).
Or, use some list comprehension to remove the useless regexp entries and join to put them all together. Something like:
re.compile('|'.join([x for x in [regexp1, regexp2, ...] if x != None]))
Be sure to add some comments next to that line of code though :-)
You could use
\z..
This is the absolute end of string, followed by two of anything
If +
or *
is tacked on the end this still works refusing to match anything
To match an empty string - even in multiline mode - you can use \A\Z
, so:
re.compile('\A\Z|\A\Z*|\A\Z+')
The difference is that \A
and \Z
are start and end of string, whilst ^
and $
these can match start/end of lines, so $^|$^*|$^+
could potentially match a string containing newlines (if the flag is enabled).
And to fail to match anything (even an empty string), simply attempt to find content before the start of the string, e.g:
re.compile('.\A|.\A*|.\A+')
Since no characters can come before \A (by definition), this will always fail to match.
Maybe '.{0}'
?