Python look-behind regex issue: Invalid regular expression: look-behind requires fixed-width pattern

Submitted by 徘徊边缘 on 2019-12-10 11:08:39

Question


I need to match a linebreak in-between double quotes, as in:

<p class="calibre1">“This is the first sentence.</p>
<p class="calibre1">And this is the second!”</p>

This would match </p> <p class="calibre1">

Now, I got this working with the regex (?<=“[^”]*)</p>\s*<p[^>]*>(?!“) but I get the error described in the title: "Invalid regular expression: look-behind requires fixed-width pattern" when I try to use it non-manually. I need this regex for the eBook management/editing program, Calibre, which uses Python for its regex engine. The regex above works for manually searching a book, but when I try to include the regex as a "common option" (run on each eBook conversion) I get that error.

I don't see how it's possible to do this without a variable width look-behind, since you can't know how long it will be from the left doublequote to the linebreak. Help would be much appreciated!


Answer 1:


Python's re module, like most regex engines (.NET being a notable exception), doesn't support variable-length lookbehind.

Can't you use a capturing group instead?

“[^”]*(</p>\s*<p[^>]*>)

The paragraph break you want is then in the first capturing group.
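As a rough sketch of how the capturing-group idea could be used for a search-and-replace, the snippet below joins the two paragraphs from the question. Two things here are assumptions that go beyond the pattern above: the quoted text before the break is wrapped in its own group so it can be put back with \1, and the (?!“) lookahead from the question is kept. The sample string and variable names are only for illustration.

```python
import re

# Sample input taken from the question.
html = (
    '<p class="calibre1">“This is the first sentence.</p>\n'
    '<p class="calibre1">And this is the second!”</p>'
)

# Group 1: the opening quote plus the quoted text before the break.
# Group 2: the paragraph break itself.
# No lookbehind is needed, so the standard re module accepts this.
pattern = re.compile(r'(“[^”]*)(</p>\s*<p[^>]*>)(?!“)')

match = pattern.search(html)
paragraph_break = match.group(2)

# Put group 1 back and replace the break with a single space.
joined = pattern.sub(r'\1 ', html)

print(paragraph_break)
print(joined)
```

Because each match consumes the opening quote, a single pass replaces only one break per quotation; a quotation spanning three or more paragraphs would need the substitution repeated until nothing changes.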




Answer 2:


Lookbehinds in Python's re module need to be fixed-width, so variable-length quantifiers such as * are not allowed inside them.
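A small check, assuming the standard library re module, of which lookbehinds compile. The helper name compiles and the test patterns are made up for illustration; only re.compile and re.error come from the library.

```python
import re

def compiles(pattern):
    """Return True if the standard re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        # Raised for, among other things,
        # "look-behind requires fixed-width pattern".
        return False

# A literal lookbehind always matches the same number of characters.
fixed_literal = compiles(r'(?<=“abc)</p>')

# A quantifier with an exact count is still fixed-width.
fixed_count = compiles(r'(?<=a{3})b')

# The lookbehind from the question can match any number of characters.
variable = compiles(r'(?<=“[^”]*)</p>')

print(fixed_literal, fixed_count, variable)
```

The first two patterns compile and the third does not, which matches the error in the title: the restriction is on the width of what the lookbehind can match, not on quantifiers as such.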



Source: https://stackoverflow.com/questions/23781292/python-look-behind-regex-issue-invalid-regular-expression-look-behind-requires
