Do backreferences need to come after the group they reference?

家住魔仙堡 submitted on 2019-12-01 20:01:40

Your question got me thinking too, so I ran a few tests in RegexBuddy. To my surprise, the second regex, (?<=\1.*)(<[^>]+>), which you said didn't work, actually matched, while the others behaved exactly as you described. I then tried the same expression in C# code, and there it failed, just as it did for you.

This confused me until I noticed that my RegexBuddy version dates back to 2008, so there must have been some change in how the .NET engine works since then. It did shed light on something I had thought was rational: it seems that before 2008, lookbehinds were evaluated after the rest of the expression had matched. That behavior seemed somewhat reasonable to me for lookbehinds, since you need to match something before you can look behind it.

Nevertheless, engines these days seem to evaluate a lookaround at the point where they encounter it. I was able to confirm this with the following expression, which is the reverse of your case:

(?<=(\w))\1

As you can see, I captured a word character inside the lookbehind and referenced it outside the lookbehind. I tested this on the string hello, and it matched at the second l character as expected, which shows that the lookbehind was executed before the engine attempted to match the rest of the expression.
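The hello test can be reproduced outside RegexBuddy. The following is only a minimal sketch using Python's re module rather than the .NET engine discussed here, so it illustrates the same idea on a different engine; the helper name find_repeat is made up for this example.

```python
import re

# Group 1 is captured inside the lookbehind; \1 is used after the
# lookbehind has closed, so the group is already set when \1 is tried.
PATTERN = re.compile(r"(?<=(\w))\1")


def find_repeat(text):
    """Return (start_index, matched_text) for the first match, or None.

    Hypothetical helper for illustration only.
    """
    m = PATTERN.search(text)
    return (m.start(), m.group()) if m else None


print(find_repeat("hello"))  # (3, 'l')  -> the second l
print(find_repeat("world"))  # None      -> no doubled character
```

The match starts at index 3, the second l, because the lookbehind has to consume and capture the first l before \1 can be compared against the next character.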

Conclusion: Yes, a backreference needs to appear after the group it references; otherwise it has no captured text to match against.
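As a rough illustration of the ordering rule on another engine, here is a small sketch in Python's re module (not .NET, so the details differ). Python is stricter than what was observed above: it rejects a backreference that precedes its group when the pattern is compiled, instead of silently failing to match. The helper name compiles is made up for this example.

```python
import re


def compiles(pattern):
    """Return True if Python's re module accepts the pattern.

    Hypothetical helper for illustration only.
    """
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


print(compiles(r"(\w)\1"))  # True:  group first, then the backreference
print(compiles(r"\1(\w)"))  # False: backreference before its group
```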
