Regex lookahead only removing the last character

Submitted by 拥有回忆 on 2019-12-02 02:49:21

Question


I'm creating a regex that searches for a text, but only if there isn't a dash after the match. I'm using a lookahead for this:

  • Regex: Text[\s\.][0-9]*(?!-)

Input      Expected result   Result
--------   ---------------   -------
Text 11    Text 11           Text 11
Text 52-   <No Match>        Text 5

Test case: https://regex101.com/r/doklxc/1/

The lookahead only seems to affect the last character of the match, which leaves me with Text 5, while I need it to not return a match at all.

I've been checking the https://www.regular-expressions.info/ guides and tried using groups, but I can't wrap my head around this one.

How can I make the lookahead apply to the entire preceding match?

I'm using the default .NET Text.RegularExpressions library.


Answer 1:


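The [0-9]* backtracks and lets the regex engine find a match even if there is a -. When (?!-) fails right after the last digit, the engine gives back one digit and retries the lookahead, which now succeeds because the next character is a digit, not a -.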
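The backtracking described above can be reproduced outside .NET. The following is a minimal sketch in Python, assuming its re module treats this particular pattern the same way as the .NET engine does (both are backtracking engines, but they are not identical in general). The sample strings are the two inputs from the question.

```python
import re

# The pattern from the question, unchanged.
pattern = re.compile(r"Text[\s\.][0-9]*(?!-)")

for sample in ["Text 11", "Text 52-"]:
    match = pattern.search(sample)
    # For "Text 52-", [0-9]* first grabs "52", the lookahead fails on "-",
    # the engine gives back the "2", and the lookahead then sees "2" and passes.
    print(repr(sample), "->", match.group(0) if match else None)

# 'Text 11' -> Text 11
# 'Text 52-' -> Text 5
```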

There are two ways: either use atomic groups or check for a digit in the lookahead:

Text[\s.][0-9]*(?![-\d])

Or

Text(?>[\s.][0-9]*)(?!-)

See the regex demo #1 and the regex demo #2.

Details

  • Text[\s.][0-9]*(?![-\d]) matches Text, then a dot or a whitespace, then 0 or more digits, and then it checks if there is a - or a digit immediately to the right; if there is, it fails the match. Even when the engine backtracks and tries matching fewer digits than it grabbed before, the \d in the lookahead fails those attempts.
  • Text(?>[\s.][0-9]*)(?!-) matches Text, then an atomic group starts. Once the patterns inside the group have found their matching text, the engine cannot backtrack into the group. As a result, (?!-) checks for a - only once, at the position after [0-9]* has grabbed all the digits it can.
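Both patterns can be checked against the inputs from the question. This is a sketch in Python rather than .NET, under the assumption that the two engines agree on these patterns; first_match is a hypothetical helper introduced only for this illustration. Python's re module accepts the (?>...) atomic group syntax only from version 3.11, so that variant is guarded by a version check.

```python
import re
import sys

def first_match(regex, text):
    """Return the first matched substring, or None when nothing matches."""
    m = regex.search(text)
    return m.group(0) if m else None

# Fix 1: the lookahead also rejects a following digit, so giving back
# digits during backtracking can never rescue the match.
lookahead_fix = re.compile(r"Text[\s.][0-9]*(?![-\d])")

# Fix 2: the atomic group keeps whatever it matched, so the lookahead
# is only tried once, after all digits have been consumed.
atomic_fix = None
if sys.version_info >= (3, 11):
    atomic_fix = re.compile(r"Text(?>[\s.][0-9]*)(?!-)")

for sample in ["Text 11", "Text 52-"]:
    print(repr(sample), "lookahead:", first_match(lookahead_fix, sample))
    if atomic_fix is not None:
        print(repr(sample), "atomic:   ", first_match(atomic_fix, sample))
```

With both patterns, Text 11 is returned whole and Text 52- produces no match, which is the expected result from the question's table.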


Source: https://stackoverflow.com/questions/52835742/regex-lookahead-only-removing-the-last-character
