How to get the string between two points using regex or any other library in Python 3?
For example: Blah blah ABC the string to be retrieved XYZ Blah Blah
ABC and
Use ABC and XYZ as anchors with look-behind and look-ahead assertions:
(?<=ABC).*?(?=XYZ)
The (?<=...) look-behind assertion only matches at a location in the text that is preceded by ABC. Similarly, (?=XYZ) matches at a location that is followed by XYZ. Together they form two anchors that limit the .*? expression, which matches as few characters as possible (any character except a newline, unless re.DOTALL is set).
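As a minimal sketch of these assertions, here is the pattern applied to the sample sentence from the question. The variable names text, match and between are only illustrative.

```python
import re

text = "Blah blah ABC the string to be retrieved XYZ Blah Blah"

# The look-behind and look-ahead check for ABC and XYZ without consuming them,
# so the match itself contains only the text in between.
match = re.search(r"(?<=ABC).*?(?=XYZ)", text)

# re.search returns None when there is no match, so guard before using it.
between = match.group(0) if match else None
print(repr(between))  # ' the string to be retrieved '
```

The spaces next to ABC and XYZ are part of the match; call .strip() on the result if you do not want them.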
You can find all such anchored pieces of text with re.findall():
for matchedtext in re.findall(r'(?<=ABC).*?(?=XYZ)', inputtext):
    print(matchedtext)
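A small self-contained sketch of the re.findall() approach, assuming an input that contains more than one ABC ... XYZ pair (the sample string here is made up for illustration):

```python
import re

inputtext = "ABC first XYZ noise ABC second XYZ"

# The non-greedy .*? stops at the nearest XYZ, so each ABC ... XYZ pair
# yields its own result instead of one long match.
pieces = [piece.strip() for piece in re.findall(r"(?<=ABC).*?(?=XYZ)", inputtext)]
print(pieces)  # ['first', 'second']
```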
If ABC and XYZ are variable, you want to use re.escape() on them (to prevent any of their content from being interpreted as regular expression syntax) and interpolate:
re.search(r'(?<={}).*?(?={})'.format(re.escape(abc), re.escape(xyz)), inputtext)
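To illustrate why escaping matters, here is a sketch with delimiters that are themselves regex metacharacters. The helper name text_between is hypothetical, not part of the re module. Note that re.search() or re.findall() is needed here: re.match() only tries the very start of the string, where a look-behind for a non-empty delimiter can never succeed.

```python
import re

def text_between(start, end, inputtext):
    """Hypothetical helper: return all text found between literal start/end markers."""
    # re.escape() turns characters such as [ and ] into literals, so the
    # markers are matched verbatim rather than parsed as regex syntax.
    pattern = r"(?<={}).*?(?={})".format(re.escape(start), re.escape(end))
    return re.findall(pattern, inputtext)

result = text_between("[[", "]]", "see [[first]] and [[second]]")
print(result)  # ['first', 'second']
```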
I think this is what you want:
import re
match = re.search('ABC(.*)XYZ','Blah blah ABC the string to be retrieved XYZ Blah Blah')
print(match.group(1))
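One caveat with the capture-group version, sketched below on a made-up string: (.*) is greedy and runs to the last XYZ in the text, while (.*?) stops at the first one. The sketch also guards against re.search() returning None when nothing matches.

```python
import re

text = "ABC one XYZ filler ABC two XYZ"

greedy = re.search(r"ABC(.*)XYZ", text)   # captures up to the last XYZ
lazy = re.search(r"ABC(.*?)XYZ", text)    # captures up to the first XYZ

# re.search returns None on failure, so check before calling group().
greedy_text = greedy.group(1) if greedy else None
lazy_text = lazy.group(1) if lazy else None
print(repr(greedy_text))  # ' one XYZ filler ABC two '
print(repr(lazy_text))    # ' one '
```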