I need a regex that gives me the string between ~ and ^.
I have a string like this:
~~~~ ABC ^ DEF ^ HGK > LMN ^
Without regex:
>>> "".join([x for x in target if x.isalpha() or x == ' ']).split()
['ABC', 'DEF', 'HGK', 'LMN']
This keeps only the spaces and alphabetic characters, joins them into a new string, and then splits that string into a list of words.
Here is my exact code from the Python 3 command line:
>>> target = ' ~~~~ ABC ^ DEF ^ HGK > LMN ^ '
>>> xx = "".join([x for x in target if x.isalpha() or x == ' ']).split()
>>> xx
['ABC', 'DEF', 'HGK', 'LMN']
>>>
I'm not sure exactly what result is desired, but perhaps this?
>>> import re
>>> matchObj = re.findall(r'~+(.*?)\^', target)
>>> print(matchObj)
[' ABC ']
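If the surrounding spaces are not wanted in the result, one possible variation (a sketch, assuming the padding around the text should be dropped; the name `trimmed` is only illustrative) is to let `\s*` consume the whitespace on both sides of the capture group:

```python
import re

target = ' ~~~~ ABC ^ DEF ^ HGK > LMN ^ '

# \s* on both sides of the group consumes the padding,
# so the lazy group captures only the text itself.
trimmed = re.findall(r'~+\s*(.*?)\s*\^', target)
print(trimmed)  # ['ABC']
```

Calling `.strip()` on each captured string would give the same result here.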
Here is my solution. Your input:
In [11]: import re
In [12]: target = ' ~~~~ ABC ^ DEF ^ HGK > LMN ^ '
Replace every non-word character (the symbols and delimiters) with a space, then split the result:
In [13]: b = re.sub(r'[^\w]', ' ', target).split()
In [14]: b
Out[14]: ['ABC', 'DEF', 'HGK', 'LMN']
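A roughly equivalent one-step version, sketched here under the assumption that every run of word characters should be collected regardless of the delimiters around it, is to match the words directly instead of replacing everything else (the name `words` is only illustrative):

```python
import re

target = ' ~~~~ ABC ^ DEF ^ HGK > LMN ^ '

# \w+ matches each run of word characters directly,
# so no replace-then-split step is needed.
words = re.findall(r'\w+', target)
print(words)  # ['ABC', 'DEF', 'HGK', 'LMN']
```

Note that `\w` also matches digits and the underscore, so both versions would keep those characters if they appeared in the input.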
Your idea of using a lazy quantifier is good, but that still doesn't necessarily give you the shortest possible match - only the shortest match from the current position of the regex engine. If you want to disallow the start/end separators from being part of the match, you need to explicitly exclude them from the list of valid characters. A negated character class comes in handy here.
import re

target = ' ~~~~ ABC ^ DEF ^ HGK > LMN ^ '
# [^~^]* matches any run of characters that are neither ~ nor ^
matches = re.findall(r'~([^~^]*)\^', target)
print(matches)  # [' ABC ']
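To see where the two patterns diverge, here is a small sketch using a made-up sample string (not the one from the question) that contains a stray ~ before the real start delimiter. The lazy quantifier begins at the first ~ it finds and runs to the nearest ^, while the negated character class refuses to cross another ~:

```python
import re

# Hypothetical sample with an extra ~ before the intended start delimiter
sample = '~ A ~ B ^'

# Lazy: shortest match from the first ~ the engine reaches
lazy = re.findall(r'~(.*?)\^', sample)

# Negated class: the captured text may contain neither ~ nor ^
strict = re.findall(r'~([^~^]*)\^', sample)

print(lazy)    # [' A ~ B ']
print(strict)  # [' B ']
```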