RegEx - Exclude Matched Patterns

白昼怎懂夜的黑 submitted on 2019-11-29 11:42:34

Question


I have the following patterns that I want to exclude.

make it cheaper
make it cheapere
makeitcheaper.com.au
makeitcheaper
making it cheaper
www.make it cheaper
ww.make it cheaper.com

I've created a regex that matches any of these. However, I want to get everything other than these, and I am not sure how to invert the regex I've created.

mak(e|ing) ?it ?cheaper

The pattern above matches all the strings listed. Now I want it to match everything else. How do I do that?

From searching, it seems I need something like a negative lookahead or lookbehind, but I don't really get it. Can someone point me in the right direction?


Answer 1:


You can just put it in a negative look-ahead like so:

(?!mak(e|ing) ?it ?cheaper)

On its own, though, that isn't going to work. If you do a matches [1], it won't match, since a look-ahead only checks what follows and doesn't actually consume anything. If you do a find [1], it will match many times, since there are lots of positions in the string where the following characters don't match the pattern above.
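To see both failure modes concretely, here is a minimal sketch in Python, assuming `re.fullmatch` stands in for matches and `re.finditer` for repeated find. The sample strings and variable names are illustrative only.

```python
import re

# The bare negative look-ahead, with nothing consumed after it.
BARE_LOOKAHEAD = re.compile(r"(?!mak(e|ing) ?it ?cheaper)")

# Whole-string match on an unrelated string: the look-ahead succeeds,
# but it consumes zero characters, so it cannot cover the whole string.
whole_match = BARE_LOOKAHEAD.fullmatch("something else")

# Scanning search on a string we wanted to exclude: an empty match is
# still found at every position where the pattern does not start.
text = "make it cheaper"
empty_match_positions = [m.start() for m in BARE_LOOKAHEAD.finditer(text)]

print(whole_match)
print(empty_match_positions)
```

The only position without an empty match is index 0, where the excluded phrase actually starts, which is why a plain find cannot be used as an "is this string excluded?" test.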

To fix this, depending on what you wish to do, we have 2 choices:

  1. If you want to exclude all strings that are exactly one of those (i.e. "make it cheaperblahblah" is not excluded), check for start (^) and end ($) of string:

    ^(?!mak(e|ing) ?it ?cheaper$).*
    

    The .* (zero or more wild-cards) is the actual matching taking place. The negative look-ahead checks from the first character.

  2. If you want to exclude all strings containing one of those, you can make sure the look-ahead isn't matched before every character we match:

    ^((?!mak(e|ing) ?it ?cheaper).)*$
    

    An alternative is to add wild-cards to the beginning of your look-ahead (i.e. exclude all strings that, from the start of the string, contain anything, then your pattern), but I don't currently see any advantage to this (arbitrary length look-ahead is also less likely to be supported by any given tool):

    ^(?!.*mak(e|ing) ?it ?cheaper).*
    

Because of the ^ and $, either doing a find or a matches will work for either of the above (though, in the case of matches, the ^ is optional and, in the case of find, the .* outside the look-ahead is optional).
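The difference between the patterns above is easiest to see on a few sample inputs. The following is a rough sketch in Python, assuming `re.search` plays the role of find; the helper `survives` and the sample strings are made up for illustration.

```python
import re

# Option 1: exclude only strings that are exactly the phrase.
EXACT_ONLY = re.compile(r"^(?!mak(e|ing) ?it ?cheaper$).*")
# Option 2: exclude strings containing the phrase (look-ahead before each character).
PER_CHARACTER = re.compile(r"^((?!mak(e|ing) ?it ?cheaper).)*$")
# Alternative to option 2: wild-cards inside the look-ahead.
LEADING_WILDCARD = re.compile(r"^(?!.*mak(e|ing) ?it ?cheaper).*")


def survives(pattern: re.Pattern, text: str) -> bool:
    """Return True if the text is NOT excluded by the given pattern."""
    return pattern.search(text) is not None


samples = [
    "make it cheaper",
    "make it cheaperblahblah",
    "www.make it cheaper",
    "something else",
]

for s in samples:
    print(
        f"{s!r}: exact={survives(EXACT_ONLY, s)}, "
        f"per_char={survives(PER_CHARACTER, s)}, "
        f"wildcard={survives(LEADING_WILDCARD, s)}"
    )
```

Only the exact-match variant lets "make it cheaperblahblah" and "www.make it cheaper" through; the two "containing" variants agree with each other on all of these inputs.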


1: Although they may not be called that, many languages have functions equivalent to matches and find with regex.


The above is the strictly-regex answer to this question.

A better approach might be to stick to the original regex (mak(e|ing) ?it ?cheaper) and see if you can negate the matches directly with the tool or language you're using.

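In Java, for example, this would involve doing if (!string.matches(originalRegex)) (note the !, which negates the returned boolean) instead of if (string.matches(negLookRegex)). Keep in mind that String.matches tests the whole string, so this excludes only exact matches; to exclude strings that merely contain the pattern, negate the result of Matcher.find() instead.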
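The same idea can be sketched in Python, assuming the goal is to drop every string that contains the phrase. `re.search` is used here as the find equivalent, and the last two entries of `candidates` are hypothetical extras added so that something survives the filter.

```python
import re

# The original, un-negated regex from the question.
ORIGINAL = re.compile(r"mak(e|ing) ?it ?cheaper")

candidates = [
    "make it cheaper",
    "make it cheapere",
    "makeitcheaper.com.au",
    "makeitcheaper",
    "making it cheaper",
    "www.make it cheaper",
    "ww.make it cheaper.com",
    "cheap flights",      # hypothetical extra input
    "make it faster",     # hypothetical extra input
]

# Negate in code rather than in the regex: keep what does NOT match.
kept = [s for s in candidates if ORIGINAL.search(s) is None]
print(kept)
```

Keeping the regex positive and negating in code tends to be easier to read and to debug than a look-ahead wrapped around the whole pattern.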




Answer 2:


The negative lookahead, I believe, is what you're looking for. Maybe try:

(?!.*mak(e|ing) ?it ?cheaper)

And maybe a bit more flexible:

(?!.*mak(e|ing) *it *cheaper)

Just in case there is more than one space.
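A small sketch of the difference between the two variants, in Python. It assumes the check is anchored at the start of the string, which `re.match` does implicitly; that anchoring is an addition here, not part of the patterns as written above.

```python
import re

# " ?" allows at most one space between the words.
STRICT = re.compile(r"(?!.*mak(e|ing) ?it ?cheaper)")
# " *" allows any number of spaces.
FLEXIBLE = re.compile(r"(?!.*mak(e|ing) *it *cheaper)")

double_spaced = "make  it cheaper"  # two spaces after "make"

# re.match only tries position 0, so a match object means "not excluded".
strict_lets_it_through = STRICT.match(double_spaced) is not None
flexible_lets_it_through = FLEXIBLE.match(double_spaced) is not None

print(strict_lets_it_through, flexible_lets_it_through)
```

Without the anchoring, a scanning search would still find an empty match at a later position in an excluded string, for the same reason described in the first answer.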



Source: https://stackoverflow.com/questions/18241463/regex-exclude-matched-patterns
