PCRE: backreferences not allowed in lookbehinds?

猫巷女王i 2021-01-18 01:44

The PCRE regex /..(?<=(.)\1)/ fails to compile: "Subpattern references are not allowed within a lookbehind assertion." Interestingly it seems to be accept

1 answer
  •  小鲜肉 (original poster)
    2021-01-18 02:28

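    With Python's re module, group references have traditionally not been supported in lookbehind, even if they match strings of some fixed length. (Since Python 3.5, re accepts fixed-length references to groups defined outside the lookbehind; a reference to a group defined inside the same lookbehind, as in the pattern above, is still rejected.)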


    Lookbehinds don't support the full PCRE pattern syntax. Concretely, when the regex engine reaches a lookbehind, it tries to determine the lookbehind's size and then jumps back that many characters to check the match.

    This size determination leaves the engine's designers with a choice:

    • allow variable size, in which case every lookbehind has to be executed before the engine can jump back
    • disallow variable size, in which case the engine can jump back directly

    While the first option would be the best for us (users), it is obviously the slowest and the hardest to develop. So PCRE settled on the second option. The Java regex engine, as another example, allows semi-variable lookbehinds: it only needs to be able to determine the maximum size.
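
    To make the two options concrete, here is a toy sketch in Python. It is only a model of the idea, not how PCRE or any real engine is implemented; the function names fixed_lookbehind and variable_lookbehind are made up for illustration, and pos is assumed to satisfy 0 <= pos <= len(text).

```python
import re


def fixed_lookbehind(text: str, pos: int, inner: str, width: int) -> bool:
    """Toy fixed-width lookbehind: jump back exactly `width` characters
    and require `inner` to match the slice that ends at `pos`."""
    start = pos - width
    if start < 0:
        # Not enough text behind the current position.
        return False
    return re.fullmatch(inner, text[start:pos]) is not None


def variable_lookbehind(text: str, pos: int, inner: str) -> bool:
    """Toy variable-width lookbehind: the width is unknown, so every
    possible start position before `pos` has to be tried."""
    return any(
        re.fullmatch(inner, text[start:pos]) is not None
        for start in range(pos, -1, -1)
    )


if __name__ == "__main__":
    print(fixed_lookbehind("xabc", 4, "abc", 3))
    print(variable_lookbehind("xaaab", 4, "a+"))
```

    The fixed-width version does a single attempt per position, while the variable-width version may do up to pos + 1 attempts, which is the cost the second option avoids.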


    That brings me to PCRE and Python's re module.
    The only relevant thing I found in the PCRE documentation is this error code:

    COMPILATION ERROR CODES
    25: lookbehind assertion is not fixed length

    But in this case, the lookbehind assertion is fixed length.
    Now, here is what we can find in the re documentation:

    The contained pattern must only match strings of some fixed length, meaning that abc or a|b are allowed, but a* and a{3,4} are not. Group references are not supported even if they match strings of some fixed length.
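
    The examples from that passage can be checked directly with the standard re module. The helper name compiles is made up for this sketch; it only reports whether re.compile accepts a pattern, not why a pattern was rejected.

```python
import re


def compiles(pattern: str) -> bool:
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


if __name__ == "__main__":
    # Fixed-length lookbehinds from the documentation's examples.
    print(compiles(r"(?<=abc)x"))
    print(compiles(r"(?<=a|b)x"))
    # Variable-length lookbehinds from the documentation's examples.
    print(compiles(r"(?<=a*)x"))
    print(compiles(r"(?<=a{3,4})x"))
    # The pattern from the question.
    print(compiles(r"..(?<=(.)\1)"))
```

    The first two patterns compile; the two variable-length ones and the pattern from the question are rejected with re.error.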

    There's our culprit... If you want, you can try Python's regex module, which seems to support variable-length lookbehind.
