Regex lazy quantifier behaves greedily


逝去的感伤

I have text like this:

[Some Text][1][Some Text][2][Some Text][3][Some Text][4]

I want to match [Some Text][2] with this regex:


2 answers

南旧

2020-11-29 14:27
You could try the regex below:
```
(?!^)(\[[A-Z].*?\]\[\d+\])
```
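To see what this pattern actually returns, here is a small Python sketch run against the sample text from the question. The variable names are only illustrative. The pattern relies on two properties of this particular input: the first pair sits at the very start of the string (so `(?!^)` skips it), and each text part begins with an uppercase letter (so `[1]`, `[2]`, and so on cannot start a match).

```python
import re

text = "[Some Text][1][Some Text][2][Some Text][3][Some Text][4]"

# (?!^) forbids a match at position 0; [A-Z] forbids starting at "[1]", "[2]", ...
pattern = re.compile(r"(?!^)(\[[A-Z].*?\]\[\d+\])")

matches = pattern.findall(text)
print(matches)     # ['[Some Text][2]', '[Some Text][3]', '[Some Text][4]']
print(matches[0])  # [Some Text][2]
```

Note that it does not target `[2]` specifically: it returns every pair except the first one, so it only gives the wanted result when the wanted pair is the second one.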
遇见更好的自我

2020-11-29 14:29
The \[.*?\]\[2\] pattern works like this:
- \[ - matches the leftmost [ (the regex engine scans the input string from left to right)
- .*? - matches zero or more characters other than line break characters, as few as possible, but as many as needed for the rest of the pattern to match (see below)
- \]\[2\] - matches the literal substring ][2].
So, starting from that first [, the .*? is expanded by one character after each failure until the engine reaches the leftmost ][2]. Note that lazy quantifiers do not guarantee the "shortest" match: they only minimize how far the match extends from a start position that is already fixed.
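The behaviour described above can be reproduced with a short Python sketch (Python's `re` module is used here only as a convenient way to run the pattern; the question does not say which regex flavour is in use):

```python
import re

text = "[Some Text][1][Some Text][2][Some Text][3][Some Text][4]"

# The match is anchored at the leftmost "[" and the lazy .*? keeps growing
# until "][2]" can finally be matched after it.
lazy = re.search(r"\[.*?\]\[2\]", text)

print(lazy.start())  # 0
print(lazy.group())  # [Some Text][1][Some Text][2]
```

The match starts at index 0, which is why the result includes `[Some Text][1]` even though the quantifier is lazy.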

Solution

Instead of .*? (or .*), use a negated character class that matches any character except the boundary characters.
```
\[[^\]\[]*\]\[2\]
```

Here, .*? is replaced with [^\]\[]* - 0 or more chars other than ] and [.
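A quick way to check the negated character class version is the Python sketch below, again using the sample text from the question (the variable names are only illustrative):

```python
import re

text = "[Some Text][1][Some Text][2][Some Text][3][Some Text][4]"

# [^\]\[]* cannot cross a "]" or "[", so the match cannot span several pairs.
fixed = re.search(r"\[[^\]\[]*\]\[2\]", text)

print(fixed.span())   # (14, 28)
print(fixed.group())  # [Some Text][2]
```

Attempts starting at the first `[` now fail outright, because the character class stops at the first `]` and the next pair is `[1]`, so the engine moves on until it reaches the bracket group directly followed by `[2]`.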

Other examples:
- <[^<>]*> matches <...> with no < and > inside
- \([^()]*\) matches (...) with no ( and ) inside
- "[^"]*" matches "..." with no " inside
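The three patterns in the list above can be tried out with the following Python sketch. The sample strings are made up for illustration and are not from the original question.

```python
import re

# (pattern, sample input) pairs; the inputs are hypothetical examples.
samples = [
    (r"<[^<>]*>", "a < b <tag> c"),
    (r"\([^()]*\)", "f(g(x), y)"),
    (r'"[^"]*"', 'say "hi" now'),
]

results = [re.search(pattern, s).group() for pattern, s in samples]
print(results)  # ['<tag>', '(x)', '"hi"']

# For contrast, a lazy dot on the first sample starts at the stray "<".
lazy_tag = re.search(r"<.*?>", "a < b <tag> c").group()
print(lazy_tag)  # < b <tag>
```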
In other situations, when the starting pattern is a multichar string or complex pattern, use a tempered greedy token, (?:(?!start).)*?. To match abc 1 def in abc 0 abc 1 def, use abc(?:(?!abc).)*?def.
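The `abc ... def` example can be verified with a short Python sketch that compares a plain lazy quantifier with the tempered greedy token:

```python
import re

text = "abc 0 abc 1 def"

# Plain lazy dot: the match starts at the first "abc" and runs up to "def".
lazy = re.search(r"abc.*?def", text)

# Tempered token: each consumed char must not be the start of another "abc".
tempered = re.search(r"abc(?:(?!abc).)*?def", text)

print(lazy.group())      # abc 0 abc 1 def
print(tempered.group())  # abc 1 def
```

The lookahead `(?!abc)` is checked before every consumed character, so an attempt that starts at the first `abc` fails as soon as it reaches the second one, and the engine retries from there.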