Let's say I have a text file with the content below:
fdsjhgjhg
fdshkjhk
Start
Good Morning
Hello World
End
dashjkhjk
dsfjkhk
Start
hgjkkl
dfghjjk
If you don't expect to get nested structures, you could do this:
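import re

# `text` holds the whole file contents as a single string
# match everything between "Start" and "End" (non-greedy, across lines)
occurences = re.findall(r"Start(.*?)End", text, re.DOTALL)
# if "Start" appears again before the matching "End", keep only the text after the last "Start"
occurences = [oc.rsplit("Start", 1)[-1] for oc in occurences]
# optionally trim leading and trailing newlines
occurences = [oc.strip("\n") for oc in occurences]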
This prints:
>>> for oc in occurences: print(oc)
Good Morning
Hello World
Good Evening
Good
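For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same three steps wrapped in a function. The name `extract_blocks` and the `sample` string are made up for illustration; the sample is only a guess at what a file producing the output above might look like, since the question shows the file only partially.

```python
import re

def extract_blocks(text):
    """Return the text found between each "Start" ... "End" pair (hypothetical helper)."""
    # non-greedy match from a "Start" to the nearest following "End", across lines
    blocks = re.findall(r"Start(.*?)End", text, re.DOTALL)
    # a "Start" without its own "End" leaks into the match; keep only what follows the last "Start"
    blocks = [b.rsplit("Start", 1)[-1] for b in blocks]
    # drop the newlines left over right after "Start" and right before "End"
    return [b.strip("\n") for b in blocks]

# Assumed sample input, not the original file
sample = """fdsjhgjhg
Start
Good Morning
Hello World
End
dsfjkhk
Start
hgjkkl
Start
Good Evening
Good
End
trailing
"""

for block in extract_blocks(sample):
    print(block)
```

Note that the pattern matches "Start" and "End" anywhere, including inside longer words, so this only behaves well if those markers do not occur in the ordinary text.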
You can add the \n as part of Start and End if you want.
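A rough sketch of that variant, assuming each marker sits on its own line and the file uses plain \n line endings. The `text` value is again an invented sample, not the original file:

```python
import re

# Assumed sample input with each marker on its own line
text = (
    "noise\nStart\nGood Morning\nHello World\nEnd\n"
    "noise\nStart\nhgjkkl\nStart\nGood Evening\nGood\nEnd\n"
)

# the newlines next to the markers are consumed by the pattern itself,
# so they do not end up in the captured group
blocks = re.findall(r"Start\n(.*?)\nEnd", text, re.DOTALL)
# same clean-up as before for a "Start" that was never closed
blocks = [b.rsplit("Start\n", 1)[-1] for b in blocks]

print(blocks)
```

One thing to watch for with this pattern: an empty block ("Start" immediately followed by "End" on the next line) is not matched as an empty block, because the pattern requires a newline on both sides of the captured text.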