Question
Regex:
\b< low="" number="" low="">\b
Example string:
<b22>Aquí se muestran algunos síntomas < low="" number="" low=""> tienen el siguiente aspecto.</b22>
I'm not sure why the word boundary between síntomas and < is not being found. The same problem exists on the other side, between > and tienen.
Any suggestions on how I might match this boundary properly?
When I give it the following input, the Regex matches as expected:
Aquí se muestran algunos síntomas< low="" number="" low="">tienen el siguiente aspecto.
Removing the edge conditions (the \b in \bPHRASE\b) is not an option, because the pattern must not match parts of words.
Update
This did the trick (thanks to Igor, Mosty, DK and NickC):
new Regex(String.Format(@"(?<=[\s\.\?\!]){0}(?=[\s\.\?\!])", innerStringToMatch));
I needed to widen my boundary matching to the character class [\s\.\?\!] and turn the edge matches into a positive lookbehind and a positive lookahead.
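For readers who want to try the lookaround idea outside .NET, here is a minimal sketch in Python's re module. The helper name build_boundary_regex and the token <tag> are hypothetical stand-ins (the real token from the question is not reproduced here), and the call to re.escape is an added assumption that the inner string should be matched literally; the original C# line does not escape it.

```python
import re


def build_boundary_regex(inner: str) -> re.Pattern:
    """Wrap `inner` in a lookbehind/lookahead for whitespace or . ? !

    Hypothetical helper: re.escape is an assumption that `inner`
    is literal text rather than a regex fragment.
    """
    return re.compile(r"(?<=[\s.?!])" + re.escape(inner) + r"(?=[\s.?!])")


rx = build_boundary_regex("<tag>")

# Spaces on both sides: the lookarounds succeed without consuming them.
spaced = rx.search("Aquí se muestran algunos síntomas <tag> tienen el siguiente aspecto.")

# Letters directly adjacent: neither lookaround is satisfied.
joined = rx.search("síntomas<tag>tienen")

# Token at the very start: the lookbehind needs a preceding character.
at_start = rx.search("<tag> tienen")

print(spaced.group() if spaced else None)
print(joined, at_start)
```

Because lookarounds are zero-width, the match contains only the token itself. One consequence worth knowing is that a token at the very start or end of the input does not match, since there is no character for the lookbehind or lookahead to inspect.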
Answer 1:
\b is a zero-length match that can occur between two characters in the string, where one is a word character and the other is not. A word character is defined as [A-Za-z0-9_] (*). Neither < nor the space before it is a word character, which is why \b doesn't match there; in your second example, s is directly followed by <, so the boundary exists.
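A small sketch of this behaviour, using Python's re module for illustration. The token <tag> is a hypothetical stand-in for the bracketed token in the question.

```python
import re

# "<tag>" is a hypothetical placeholder for the token in the question.
pattern = re.compile(r"\b<tag>\b")

spaced = "síntomas <tag> tienen"  # a space on each side of the token
joined = "síntomas<tag>tienen"    # letters directly adjacent to the token

# A space and "<" are both non-word characters, so no boundary lies between them.
spaced_match = pattern.search(spaced)

# "s" is a word character and "<" is not, so \b matches between them.
joined_match = pattern.search(joined)

print(spaced_match)
print(joined_match)
```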
You can try the following regex instead ((?: ) is a non-capturing group):
(?:\b|\s+)< low="" number="" low="">(?:\b|\s+)
*) Actually, this is not correct for all regex engines. To be precise, \b matches between \w and \W, where \w matches any word character. As Tim Pietzcker pointed out in a comment on this answer, the meaning of "word character" differs between implementations, but I don't know of any where \w matches < or >.
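To see what the (?:\b|\s+) alternative from this answer actually captures, here is a short sketch in Python's re module, again with the hypothetical token <tag> standing in for the one in the question.

```python
import re

# "<tag>" is a hypothetical placeholder for the token in the question.
pattern = re.compile(r"(?:\b|\s+)<tag>(?:\b|\s+)")

# With spaces around the token, the \s+ branches match and consume them.
with_spaces = pattern.search("síntomas <tag> tienen")

# With letters adjacent, the zero-width \b branches match instead.
without_spaces = pattern.search("síntomas<tag>tienen")

print(repr(with_spaces.group()))
print(repr(without_spaces.group()))
```

Note that when the \s+ branch is taken, the surrounding whitespace becomes part of the match, which matters if the match is later replaced or extracted.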
Answer 2:
I think you're trying to do the following:
\s< low="" number="" low="">\s
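One practical difference between this pattern and a lookaround-based one is that \s consumes the surrounding whitespace. The sketch below, in Python's re module, compares the two on a substitution; <tag> and the replacement text [X] are hypothetical placeholders.

```python
import re

text = "síntomas <tag> tienen"

# \s on each side is part of the match, so the spaces are replaced too.
consuming = re.sub(r"\s<tag>\s", "[X]", text)

# Lookarounds only assert the whitespace, so the spaces survive.
preserving = re.sub(r"(?<=\s)<tag>(?=\s)", "[X]", text)

print(consuming)
print(preserving)
```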
Source: https://stackoverflow.com/questions/9087521/regex-word-boundary-issue-when-angle-brackets-are-adjacent-to-the-boundary