How to match “anything up until this sequence of characters” in a regular expression?

旧时难觅i 2020-11-22 11:51

Take this regular expression: /^[^abc]/. This will match any single character at the beginning of a string, except a, b, or c.

If you add a *

12 Answers
  • 2020-11-22 12:23

    I ended up at this Stack Overflow question while looking for help with my problem, but found no solution to it :(

    So I had to improvise... after some time I managed to reach the regex I needed:

    I needed to match up to one folder past the "grp-bps" folder, without including the trailing slash. At least one folder after the "grp-bps" folder was also required.

    Edit

    Text version for copy-paste (replace 'grp-bps' with your own text):

    .*\/grp-bps\/[^\/]+
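
A minimal sketch of how this pattern behaves, using Python's re module. The choice of Python and the sample path are assumptions made for illustration; neither comes from the original answer.

```python
import re

# Pattern copied from the answer. Python does not require "/" to be
# escaped, but the escaped form is valid and kept unchanged here.
PATTERN = re.compile(r".*\/grp-bps\/[^\/]+")

# Hypothetical path, invented for this example.
path = "/srv/data/grp-bps/reports/2020/summary.txt"

match = PATTERN.match(path)
result = match.group(0) if match else None
print(result)  # /srv/data/grp-bps/reports

# With nothing after "grp-bps/", [^\/]+ cannot match, so there is no result.
print(PATTERN.match("/srv/data/grp-bps/"))  # None
```

The match stops at the end of the first folder after "grp-bps" because [^\/]+ cannot consume a slash.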
    
  • 2020-11-22 12:25

    What you need is a lookaround assertion, like .+?(?=abc).

    See: Lookahead and Lookbehind Zero-Length Assertions

    Be aware that [abc] isn't the same as abc. Inside brackets it's not a string - each character is just one of the possibilities. Outside the brackets it becomes the string.
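
The difference between the string abc and the character class [abc] can be checked with a short sketch in Python (the language and the sample text are assumptions for illustration only):

```python
import re

# Hypothetical sample text: it contains "c", "a", "b" individually
# before the literal sequence "abc" appears.
text = "xyz cab abc"

# Lookahead on the *string* "abc": stops right before the sequence.
up_to_string = re.match(r".+?(?=abc)", text).group(0)

# Negated character class: stops at the first single a, b, or c.
up_to_any_char = re.match(r"[^abc]+", text).group(0)

print(repr(up_to_string))    # 'xyz cab '
print(repr(up_to_any_char))  # 'xyz '
```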

  • 2020-11-22 12:25

    For regex in Java, and I believe also in most regex engines, if you want to include the last part this will work:

    .+?(abc)
    

    For example, in this line:

    I have this very nice senabctence
    

    we want to select all characters up to "abc" and also include abc.

    Using our regex, the result will be: I have this very nice senabc

    Test this out: https://regex101.com/r/mX51ru/1
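
The answer mentions Java; as a rough equivalent, here is the same pattern in Python's re module (an assumption made for illustration, with the sample line taken from above):

```python
import re

sentence = "I have this very nice senabctence"

# .+? lazily consumes characters; (abc) is an ordinary capturing group,
# so "abc" becomes part of the overall match.
m = re.search(r".+?(abc)", sentence)

whole = m.group(0)      # everything up to and including "abc"
delimiter = m.group(1)  # the captured "abc" itself

print(whole)      # I have this very nice senabc
print(delimiter)  # abc
```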

  • 2020-11-22 12:29

    You didn't specify which flavor of regex you're using, but this will work in any of the most popular ones that can be considered "complete".

    /.+?(?=abc)/
    

    How it works

    The .+? part is the non-greedy (lazy) version of .+ (one or more of anything). With .+, the engine first matches as much as it can. Then, if something else follows in the regex, it backs off one step at a time until the following part matches. This is greedy behavior: match as much as possible while still satisfying the pattern.

    With .+?, instead of matching everything at once and backing off, the engine extends the match one character at a time until the subsequent part of the regex (if any) matches. This is non-greedy behavior: match as few characters as possible while still satisfying the pattern.

    /.+X/  ~ "abcXabcXabcX"        /.+/  ~ "abcXabcXabcX"
              ^^^^^^^^^^^^                  ^^^^^^^^^^^^
    
    /.+?X/ ~ "abcXabcXabcX"        /.+?/ ~ "abcXabcXabcX"
              ^^^^                          ^
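
The four cases in the diagram can be reproduced with a small sketch in Python (the language is an assumption; the patterns and the input string are the ones shown above):

```python
import re

s = "abcXabcXabcX"

greedy_x = re.search(r".+X", s).group(0)    # backs off only as far as the last X
lazy_x = re.search(r".+?X", s).group(0)     # stops at the first X
greedy_all = re.search(r".+", s).group(0)   # nothing follows, takes everything
lazy_one = re.search(r".+?", s).group(0)    # nothing follows, one character is enough

print(greedy_x)    # abcXabcXabcX
print(lazy_x)      # abcX
print(greedy_all)  # abcXabcXabcX
print(lazy_one)    # a
```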
    

    Following that we have (?={contents}), a zero-width assertion known as a lookahead, one kind of lookaround. This grouped construct tests whether its contents match at the current position, but the characters it checks are not added to the match (zero width). It only reports whether there is a match or not (assertion).

    Thus, in other terms the regex /.+?(?=abc)/ means:

    Match as few characters as possible until an "abc" is found, without including the "abc" in the match.
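
A minimal sketch of the zero-width behavior, written in Python as an assumption (the sample strings are hypothetical). It also shows one edge case: because .+? needs at least one character, it finds nothing when the text starts with "abc", whereas .*? returns an empty match.

```python
import re

text = "123abc456abc"  # hypothetical sample

m = re.search(r".+?(?=abc)", text)
matched = m.group(0)                  # "abc" is checked but not consumed
following = text[m.end():m.end() + 3]  # the text right after the match

print(matched)    # 123
print(following)  # abc

# Edge case: nothing precedes "abc".
none_found = re.search(r".+?(?=abc)", "abcdef")          # None
empty_match = re.search(r".*?(?=abc)", "abcdef").group(0)  # ''
print(none_found, repr(empty_match))
```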

  • 2020-11-22 12:30

    This may help make sense of the regex.

    1. The exact phrase can be obtained with the following regex:

    /("(.*?)")/g

    Here, the g flag makes the search global, so we get every phrase enclosed in double quotes. For example, if our search text is,

    This is the example for "double quoted" words

    then we will get "double quoted" from that sentence.
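
A rough Python counterpart, offered as a sketch: Python has no g flag, so re.findall is assumed here to play a comparable role. Because the pattern has two capturing groups, each result is a pair: the phrase with its quotes and the phrase without them.

```python
import re

text = 'This is the example for "double quoted" words'

# Outer group: the quotes plus the content; inner group: the content only.
matches = re.findall(r'("(.*?)")', text)

print(matches)  # [('"double quoted"', 'double quoted')]
```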

  • 2020-11-22 12:31
    .*(\s)*?(?=abc)
    

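    For those who want to include line breaks too.

    It matches everything up to abc, and the optional \s lets the match run across any line breaks (or other whitespace) sitting directly before abc. Note that . itself does not match line breaks by default in most engines, so for text spanning several lines a pattern such as [\s\S]*?(?=abc), or the engine's dot-all flag, is needed.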
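
A short check of how this pattern handles multi-line input, sketched in Python (the language, the sample text, and the use of re.DOTALL as an alternative are assumptions, not part of the original answer):

```python
import re

# Hypothetical three-line sample; "abc" starts the last line.
text = "first line\nsecond line\nabc tail"

# Pattern from the answer: "." stops at line breaks, so the match can
# only start on the line directly before "abc".
plain = re.search(r".*(\s)*?(?=abc)", text).group(0)

# Alternative: with re.DOTALL, "." also matches line breaks.
dotall = re.search(r".*?(?=abc)", text, re.DOTALL).group(0)

print(repr(plain))   # 'second line\n'
print(repr(dotall))  # 'first line\nsecond line\n'
```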
