Need a regex to exclude certain strings

无人及你 2021-01-02 15:18

I'm trying to get a regex that will match:

somefile_1.txt
somefile_2.txt
somefile_{anything}.txt

but not match:

somefile_16.txt


        
6 answers
  • 2021-01-02 15:34

    To follow your specification strictly, you should rather use:

    ^somefile_(?!16\.txt$).*\.txt$
    

    so that somefile_1666.txt which is {anything} can be matched ;)

    but sometimes it is just more readable to use...:

    ls | grep -e 'somefile_.*\.txt' | grep -v -e 'somefile_16\.txt'
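
The single lookahead pattern above can be checked with a short script. This is only a sketch: Python's re module is used as one example of an engine that supports lookahead, and the file names are made-up test inputs.

```python
import re

# Pattern from the answer above: reject exactly "16.txt" after the prefix.
PATTERN = re.compile(r"^somefile_(?!16\.txt$).*\.txt$")

# Hypothetical sample names, chosen only to exercise the edge cases.
names = [
    "somefile_1.txt",
    "somefile_2.txt",
    "somefile_16.txt",
    "somefile_1666.txt",
    "somefile_16",
]

# Keep only the names the pattern accepts.
matched = [n for n in names if PATTERN.search(n)]
print(matched)
```

Only somefile_16.txt and the name without the .txt suffix are dropped; somefile_1666.txt is kept because the lookahead requires the name to end right after "16.txt".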
    
  • 2021-01-02 15:35

    Sometimes it's just easier to use two regular expressions. First look for everything you want, then ignore everything you don't. I do this all the time on the command line where I pipe a regex that gets a superset into another regex that ignores stuff I don't want.

    If the goal is to get the job done rather than find the perfect regex, consider that approach. It's often much easier to write and understand than a regex that makes use of exotic features.
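
The two-step idea can be sketched outside the shell as well. The snippet below assumes Python; the names WANTED, UNWANTED and keep are hypothetical helpers, and the sample list is invented for illustration.

```python
import re

# Step 1: the superset we want. Step 2: the exceptions to drop.
WANTED = re.compile(r"^somefile_.*\.txt$")
UNWANTED = re.compile(r"^somefile_16\.txt$")


def keep(name: str) -> bool:
    """Return True if name is in the superset and is not an exception."""
    return bool(WANTED.search(name)) and not UNWANTED.search(name)


# Hypothetical sample names.
names = ["somefile_1.txt", "somefile_16.txt", "somefile_1666.txt", "other.txt"]
kept = [n for n in names if keep(n)]
print(kept)
```

Each pattern stays trivial to read, and adding another exception later only means extending the second pattern.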

  • 2021-01-02 15:38

    Without using lookahead

    somefile_(|.|[^1].+|10|11|12|13|14|15|17|18|19|.{3,})\.txt
    

    Read it as: somefile_ followed by one of:

    1. nothing.
    2. one character.
    3. any one character except 1, followed by one or more other characters.
    4. one of 10 through 19, with 16 left out.
    5. three or more characters.

    and finally followed by .txt.
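
An alternation like this is easy to get subtly wrong, so it may help to compare it against the lookahead form on a few inputs. The sketch below assumes Python, escapes the final dot, and uses fullmatch so that the whole name must match; the sample names are invented.

```python
import re

# Lookahead-free alternation, with the dot before "txt" escaped
# (an assumption made for this check).
NO_LOOKAHEAD = re.compile(
    r"somefile_(|.|[^1].+|10|11|12|13|14|15|17|18|19|.{3,})\.txt"
)
# Lookahead form used as the reference behaviour.
REFERENCE = re.compile(r"somefile_(?!16\.txt$).*\.txt")

# Hypothetical sample names covering each branch of the alternation.
samples = [
    "somefile_.txt",
    "somefile_1.txt",
    "somefile_16.txt",
    "somefile_17.txt",
    "somefile_1666.txt",
    "somefile_ab.txt",
    "somefile_1a.txt",
]

# Names on which the two patterns give different answers.
disagreements = [
    s for s in samples
    if bool(NO_LOOKAHEAD.fullmatch(s)) != bool(REFERENCE.fullmatch(s))
]
print(disagreements)
```

On this sample the two patterns differ only for somefile_1a.txt: a two-character name that starts with 1 and ends in a non-digit is not covered by any branch. Replacing the 10 through 19 list with 1[^6] would be one way to close that gap.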

  • 2021-01-02 15:40

    Some regex libraries allow lookahead:

    somefile_(?!16\.txt$).*?\.txt
    

    Otherwise, you can still use multiple character classes:

    somefile_([^1].|1[^6]|.|.{3,})\.txt
    

    or, to achieve maximum portability:

    somefile_([^1].|1[^6]|.|....*)\.txt
    

    Note that [^(16)] is a character class. It means: match any single character other than a parenthesis, 1, or 6. It does not mean "anything but 16".
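
The behaviour of that character class can be seen directly. A minimal sketch, assuming Python's re module; the test characters are arbitrary.

```python
import re

# A negated character class: one character that is not "(", "1", "6" or ")".
char_class = re.compile(r"[^(16)]")

# Each character listed inside the brackets is rejected on its own.
excluded = [c for c in "(16)" if not char_class.fullmatch(c)]

# Any other single character is accepted, for example "7" or "a".
accepted = [c for c in "7a" if char_class.fullmatch(c)]

# The class consumes exactly one character, so a two-character string
# such as "23" can never match it as a whole.
two_chars = char_class.fullmatch("23")

print(excluded, accepted, two_chars)
```

The parentheses inside the brackets are literal characters, not a group, which is why the class cannot express "not the sequence 16".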

  • 2021-01-02 15:45

    The best solution has already been mentioned:

    somefile_(?!16\.txt$).*\.txt
    

    This works, and is greedy enough to take anything that follows on the same line. If you know, however, that you want a valid file name, I'd suggest also excluding invalid characters:

    somefile_(?!16)[^?%*:|"<>]*\.txt
    

    If you're working with a regex engine that does not support lookahead, you'll have to consider how to emulate that (?!16). You can split the names into two groups: those that start with a 1 not followed by 6, and those that start with anything else:

    somefile_(1[^6]|[^1]).*\.txt
    

    If you want to allow somefile_16_stuff.txt but NOT somefile_16.txt, the regexes above are not enough. You'll need to set your limit differently:

    somefile_(16.|1[^6]|[^1]).*\.txt
    

    Combining all of this, you end up with two possibilities: one that blocks only the single name (somefile_16.txt), and one that blocks the whole family (somefile_16*.txt). I suspect you want the first one:

    somefile_((16[^?%*:|"<>]|1[^6?%*:|"<>]|[^1?%*:|"<>])[^?%*:|"<>]*|1)\.txt
    somefile_((1[^6?%*:|"<>]|[^1?%*:|"<>])[^?%*:|"<>]*|1)\.txt
    

    Here are the same two patterns without the invalid-character filtering, which makes them easier to read:

    somefile_((16.|1[^6]|[^1]).*|1)\.txt
    somefile_((1[^6]|[^1]).*|1)\.txt
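
The difference between the two readable variants shows up on a small set of names. This is a sketch assuming Python and fullmatch (so the whole name must match); SINGLE, FAMILY and the sample names are illustrative labels, not part of the answer.

```python
import re

# Blocks only somefile_16.txt; longer names starting with 16 are allowed.
SINGLE = re.compile(r"somefile_((16.|1[^6]|[^1]).*|1)\.txt")
# Blocks every name whose variable part starts with 16.
FAMILY = re.compile(r"somefile_((1[^6]|[^1]).*|1)\.txt")

# Hypothetical sample names.
samples = [
    "somefile_1.txt",
    "somefile_2.txt",
    "somefile_16.txt",
    "somefile_16_stuff.txt",
    "somefile_17.txt",
]

single_ok = [s for s in samples if SINGLE.fullmatch(s)]
family_ok = [s for s in samples if FAMILY.fullmatch(s)]
print(single_ok)
print(family_ok)
```

Both variants accept somefile_1.txt only through the trailing |1 branch, since the other branches need at least one more character before the extension.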
    
  • 2021-01-02 15:45
    somefile_(?!16).*\.txt
    

    (?!16) means: Assert that it is impossible to match the regex "16" starting at that position.
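
Because this lookahead checks only the two characters "16", it rejects every name whose variable part begins with 16, not just somefile_16.txt. A small sketch, assuming Python and invented sample names:

```python
import re

# Negative lookahead on "16" alone, as in the answer above.
PATTERN = re.compile(r"somefile_(?!16).*\.txt")

# Hypothetical sample names.
samples = [
    "somefile_16.txt",
    "somefile_1666.txt",
    "somefile_16_stuff.txt",
    "somefile_1.txt",
    "somefile_61.txt",
]

accepted = [s for s in samples if PATTERN.fullmatch(s)]
print(accepted)
```

If only the exact name should be excluded, the stricter lookahead (?!16\.txt$) is the form to use.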
