Question
Ok, so I think I've got a handle on negation - now what about only selecting a match that has a specified substring within it?
Given:
This is a random bit of information from 0 to 1.
This is a non-random bit of information I do NOT want to match
This is the end of this bit
This is a random bit of information from 0 to 1.
This is a random bit of information I do want to match
This is the end of this bit
And attempting the following regex:
/(?s)This is a random bit(?:(?=This is a random).)*?This is the end/g
Why isn't this working? What am I missing?
I'm using regexstorm.com for testing...
Answer 1:
You ruined a tempered greedy token by turning the negative lookahead into a positive one. It won't work that way because the positive lookahead requires the text to equal "This is a random" at each position after "This is a random bit".
You need to:
- Match the leading delimiter (This is a random bit)
- Match all 0+ text that is not the leading/closing delimiters and not the required random text inside this block
- Match the specific string inside (This is a random)
- Match all 0+ text that is not the leading/closing delimiters
- Match the closing delimiter (This is the end)
So, use
(?s)This is a random bit(?:(?!This is a random bit|This is the end|This is a random).)*This is a random(?:(?!This is a random bit|This is the end).)*This is the end
See the regex demo
(?s) - DOTALL mode on (. matches a newline)
This is a random bit - leading delimiter
(?:                        # Start of the tempered greedy token
  (?!This is a random bit  # Leading delimiter
  | This is the end        # Trailing delimiter
  | This is a random)      # Specific string inside
  .                        # Any character
)*                         # End of the tempered greedy token
This is a random - specified substring
(?:(?!This is a random bit|This is the end).)* - another tempered greedy token matching any text that is not the leading/closing delimiters, up to the first...
This is the end - trailing delimiter
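As a rough check of the pattern above, here is a minimal sketch that runs it against the sample text from the question. It uses Python's `re` module rather than the .NET engine the question targets, on the assumption that the lookahead and `(?s)` syntax behave the same for this pattern; the names `text`, `pattern`, and `matches` are only illustrative.

```python
import re

# Sample input from the question: two blocks, only the second contains
# the required substring "This is a random" between the delimiters.
text = (
    "This is a random bit of information from 0 to 1.\n"
    "This is a non-random bit of information I do NOT want to match\n"
    "This is the end of this bit\n"
    "This is a random bit of information from 0 to 1.\n"
    "This is a random bit of information I do want to match\n"
    "This is the end of this bit\n"
)

pattern = re.compile(
    r"(?s)This is a random bit"                                          # leading delimiter
    r"(?:(?!This is a random bit|This is the end|This is a random).)*"   # tempered greedy token
    r"This is a random"                                                  # required substring
    r"(?:(?!This is a random bit|This is the end).)*"                    # second tempered token
    r"This is the end"                                                   # trailing delimiter
)

matches = pattern.findall(text)
for m in matches:
    print(repr(m))
```

With this input, the first block is skipped because no "This is a random" appears before its "This is the end", so only the second block should be reported.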
Answer 2:
Note that (?:(?=This is a random).) can match at most once, never twice, even when it is quantified. Take a shorter lookahead, (?=Th), as an example: the text Th satisfies it. Once the T is consumed, the next character is h, and the text starting there can never satisfy the lookahead for Th. The engine then moves on to the next part of the expression and never returns to the lookahead. Use a negative lookahead instead.
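The point about the positive lookahead can be seen in a small experiment. This is a sketch using Python's `re` module (an assumption, since the question is about .NET, though the lookahead semantics involved are the same); `token`, `consumed`, `original`, and `sample` are hypothetical names.

```python
import re

# The quantified group from the question, taken on its own (greedy here
# to show the maximum it could ever consume).
token = re.compile(r"(?:(?=This is a random).)*")

# At position 0 the lookahead succeeds and "T" is consumed; at position 1
# the remaining text starts with "his is a ...", so the lookahead fails.
consumed = token.match("This is a random bit of information").group(0)
print(repr(consumed))

# The full pattern from the question, applied to a block like the sample.
original = re.compile(
    r"(?s)This is a random bit(?:(?=This is a random).)*?This is the end"
)
sample = (
    "This is a random bit of information from 0 to 1.\n"
    "This is a random bit of information I do want to match\n"
    "This is the end of this bit\n"
)
print(original.findall(sample))
```

In the question's text the group cannot consume even one character, because what follows "This is a random bit" is " of information", not "This is a random", so the overall pattern finds nothing.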
Source: https://stackoverflow.com/questions/36972013/c-sharp-regex-only-match-if-substring-exists