I have this regex that uses a lookahead and a lookbehind:
import re
re.compile(r"<!inc\((?=.*?\)!>)|(?<=<!inc\(.*?)\)!>")
I'm trying to port it from C# to Python but keep getting the error
look-behind requires fixed-width pattern
Is it possible to rewrite this in Python without losing meaning?
The idea is for it to match something like
<!inc(C:\My Documents\file.jpg)!>
Update
I'm using the lookarounds to parse HTTP multipart text that I've modified:
body = r"""------abc
Content-Disposition: form-data; name="upfile"; filename="file.txt"
Content-Type: text/plain
<!inc(C:\Temp\file.txt)!>
------abc
Content-Disposition: form-data; name="upfile2"; filename="pic.png"
Content-Type: image/png
<!inc(C:\Temp\pic.png)!>
------abc
Content-Disposition: form-data; name="note"
this is a note
------abc--
"""
multiparts = re.compile(...).split(body)
When I do the split, I want to get just the file paths and the other text, without having to strip the opening and closing tags afterwards.
Code brevity is important, but I'm open to changing the <!inc( format if it makes the regex doable.
To get the paths and everything else in the same list, just split on the opening and closing tags:
import re
p = re.compile(r'''<!inc\(|\)!>''')
awesome = p.split(body)
You say you're flexible on the closing tag. If )!> can occur elsewhere in the text, you may want to change the closing tag to something like )!/inc> (or anything else, as long as it's unique).
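As a quick check of what that split returns, here is a minimal sketch run against a shortened stand-in for the body from the question (the shortened body is my own, trimmed for illustration):

```python
import re

# Shortened stand-in for the multipart body in the question.
body = r"""------abc
Content-Type: text/plain
<!inc(C:\Temp\file.txt)!>
------abc--
"""

# Split on either the opening tag or the closing tag.
tag = re.compile(r"<!inc\(|\)!>")
parts = tag.split(body)

# parts[0] is the text before the tag, parts[1] the path, parts[2] the rest.
print(parts[1])
```

The tags themselves are consumed by the split, so nothing has to be stripped afterwards.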
From the documentation:
(?<!...)
Matches if the current position in the string is not preceded by a match for .... This is called a negative lookbehind assertion. Similar to positive lookbehind assertions, the contained pattern must only match strings of some fixed length. Patterns which start with negative lookbehind assertions may match at the beginning of the string being searched.
(?<=...)
Matches if the current position in the string is preceded by a match for ... that ends at the current position. This is called a positive lookbehind assertion. (?<=abc)def will find a match in abcdef, since the lookbehind will back up 3 characters and check if the contained pattern matches. The contained pattern must only match strings of some fixed length, meaning that abc or a|b are allowed, but a* and a{3,4} are not. Note that patterns which start with positive lookbehind assertions will not match at the beginning of the string being searched; you will most likely want to use the search() function rather than the match() function:
Emphasis mine. No, I don't imagine you can port it to Python in its current form.
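To see the restriction in action, here is a small sketch. The second pattern is my own illustration, not something from the question: it keeps the lookbehind fixed-width by putting only the literal opening tag inside it, and it assumes the path never contains a ")".

```python
import re

# The lookbehind from the question contains .*?, so its width is not fixed.
try:
    re.compile(r"(?<=<!inc\(.*?)\)!>")
    rejected = False
except re.error as exc:
    rejected = True
    print(exc)

# A lookbehind over the literal opening tag has a fixed width and compiles.
# [^)]* is an assumption: it only works for paths that contain no ")".
fixed = re.compile(r"(?<=<!inc\()[^)]*")
path = fixed.search(r"<!inc(C:\My Documents\file.jpg)!>").group(0)
print(path)
```

The first compile fails at compile time, before any matching happens, which is why the error shows up as soon as the pattern is built.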
import re
pat = re.compile(r"\<\!inc\((.*?)\)\!\>")
f = pat.match(r"<!inc(C:\My Documents\file.jpg)!>").group(1)
results in f == r'C:\My Documents\file.jpg'
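Note that match() only matches at the start of the string. For a whole multipart body, the same capture-group idea can be used with findall(). A minimal sketch, again with a shortened stand-in body of my own:

```python
import re

# Shortened stand-in for the multipart body in the question.
body = r"""------abc
<!inc(C:\Temp\file.txt)!>
------abc
<!inc(C:\Temp\pic.png)!>
------abc--
"""

# With one capture group, findall() returns just the captured paths.
pat = re.compile(r"<!inc\((.*?)\)!>")
paths = pat.findall(body)
print(paths)
```

This gives only the paths; if the surrounding text is needed in the same list, split() is the better fit.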
In response to Jon Clements:
print re.escape("<!inc(filename)!>")
results in
\<\!inc\(filename\)\!\>
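Conclusion: re.escape seems to think they should be escaped. (That output is from Python 2; since Python 3.7, re.escape only escapes characters that are special in a regex, so <, ! and > are left alone.)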
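Either way, the extra backslashes are harmless. A short sketch comparing the output of re.escape with a pattern that escapes only the parentheses (the variable names are mine):

```python
import re

tag = "<!inc(filename)!>"

# Whatever re.escape produces on this interpreter.
escaped = re.escape(tag)

# Hand-written pattern that escapes only the parentheses.
minimal = r"<!inc\(filename\)!>"

# Both patterns match the literal tag in full.
matches_escaped = re.fullmatch(escaped, tag) is not None
matches_minimal = re.fullmatch(minimal, tag) is not None
print(escaped, matches_escaped, matches_minimal)
```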
Source: https://stackoverflow.com/questions/11197608/python-fixed-length-regex-required