I need to split strings of data using each character from string.punctuation and string.whitespace as a separator.
Furthermore, I need for the
Depending on the text you are dealing with, you may be able to simplify your concept of delimiters to "anything other than letters and numbers". If that works for your data, you can use the following regex solution, which returns the delimiters as tokens alongside the words:
re.findall(r'[a-zA-Z\d]+|[^a-zA-Z\d]', text)
This assumes that you want to split on each individual delimiter character even if they occur consecutively, so 'foo..bar' would become ['foo', '.', '.', 'bar']. If instead you expect ['foo', '..', 'bar'], use [a-zA-Z\d]+|[^a-zA-Z\d]+ (the only difference is the + added at the very end).
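As a quick sketch of the two patterns side by side, the snippet below runs both on a made-up sample string (the `text` value and the variable names are assumptions for illustration only). The last part is a hypothetical variant, not part of the approach above: it builds the character class from string.punctuation and string.whitespace directly, in case "anything other than letters and numbers" turns out to be too broad for your data.

```python
import re
import string

# Sample input, invented for this illustration.
text = "foo..bar, baz 42"

# Every delimiter character becomes its own token.
single = re.findall(r'[a-zA-Z\d]+|[^a-zA-Z\d]', text)
print(single)   # ['foo', '.', '.', 'bar', ',', ' ', 'baz', ' ', '42']

# Runs of consecutive delimiters are kept together as one token.
grouped = re.findall(r'[a-zA-Z\d]+|[^a-zA-Z\d]+', text)
print(grouped)  # ['foo', '..', 'bar', ', ', 'baz', ' ', '42']

# Hypothetical variant: use exactly the characters in string.punctuation
# and string.whitespace as delimiters. re.escape makes them safe to place
# inside a character class.
delims = re.escape(string.punctuation + string.whitespace)
exact = re.findall(rf'[^{delims}]+|[{delims}]', text)
print(exact)    # ['foo', '.', '.', 'bar', ',', ' ', 'baz', ' ', '42']
```

For plain ASCII input like this the first and last results match. They can differ on text containing accented letters or other non-ASCII characters, which the [a-zA-Z\d] patterns treat as delimiters while the exact-set variant does not.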