I have a string which is like this:
this is "a test"
I'm trying to write something in Python to split it up by space while ignoring spaces within quotes.
For performance reasons, re seems to be faster. Here is my solution using a non-greedy operator that preserves the outer quotes:
re.findall(r'(?:".*?"|\S)+', s)
Result:
['this', 'is', '"a test"']
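As a self-contained sketch of the call above (the helper name split_keep_quotes and the constant TOKEN_RE are made up for illustration and are not part of the original answer), the pattern can be compiled once and reused:

```python
import re

# Hypothetical names for illustration only.
# A token is one or more of: a double-quoted run (non-greedy) or a single
# non-whitespace character. The raw string keeps \S from being treated as
# a string escape.
TOKEN_RE = re.compile(r'(?:".*?"|\S)+')


def split_keep_quotes(s):
    """Split s on whitespace, keeping double-quoted parts (and their quotes) intact."""
    return TOKEN_RE.findall(s)


print(split_keep_quotes('this is "a test"'))
# ['this', 'is', '"a test"']
```

Using a raw string here avoids the invalid-escape warning that newer Python versions emit for "\S" in a normal string literal.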
It leaves constructs like aaa"bla blub"bbb together, as these tokens are not separated by spaces.
If the string contains escaped quote characters, you can match it like this:
>>> import re
>>> a = "She said \"He said, \\\"My name is Mark.\\\"\""
>>> a
'She said "He said, \\"My name is Mark.\\""'
>>> for i in re.findall(r'(?:".*?[^\\]"|\S)+', a): print(i)
...
She
said
"He said, \"My name is Mark.\""
Please note that this also matches the empty string "" by means of the \S part of the pattern, since the quoted alternative requires at least one character between the quotes.
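The behaviours described above can be checked with a short sketch. The names ESCAPED_TOKEN_RE and split_escaped are hypothetical, and the inputs are small hand-picked cases rather than an exhaustive test of the pattern:

```python
import re

# Hypothetical names for illustration only.
# Same idea as before, but a quoted run may only end at a quote that is
# not directly preceded by a backslash.
ESCAPED_TOKEN_RE = re.compile(r'(?:".*?[^\\]"|\S)+')


def split_escaped(s):
    """Split s on whitespace, keeping quoted parts with escaped quotes intact."""
    return ESCAPED_TOKEN_RE.findall(s)


# Escaped inner quotes stay inside the outer quoted token.
print(split_escaped('She said "He said, \\"My name is Mark.\\""'))
# ['She', 'said', '"He said, \\"My name is Mark.\\""']

# The empty string "" is picked up by \S, one quote character at a time.
print(split_escaped('a "" b'))
# ['a', '""', 'b']

# Text glued to a quoted part is kept as a single token.
print(split_escaped('aaa"bla blub"bbb'))
# ['aaa"bla blub"bbb']
```

Note that this is only a sketch: the pattern looks at a single preceding character, so inputs that mix empty quotes with further quoted parts, or that end a quoted part with an escaped backslash, may tokenize differently than expected.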