Efficiently split a string using multiple separators and retaining each separator?

野趣味 2021-02-02 10:44

I need to split strings of data using each character from string.punctuation and string.whitespace as a separator.

Furthermore, I need for the

9 Answers
  •  闹比i (original poster)
    2021-02-02 11:36

    Depending on the text you are dealing with, you may be able to simplify your concept of delimiters to "anything other than letters and numbers". If that works for your data, you can use the following regex solution:

    import re
    re.findall(r'[a-zA-Z\d]+|[^a-zA-Z\d]', text)
    

    This assumes that you want to split on each individual delimiter character even when several occur consecutively, so 'foo..bar' would become ['foo', '.', '.', 'bar']. If you instead expect ['foo', '..', 'bar'], use [a-zA-Z\d]+|[^a-zA-Z\d]+ (the only difference is the + added at the very end).
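
To make the difference between the two patterns concrete, here is a minimal sketch. The function names are hypothetical and only for illustration. The third pattern is not part of the answer above: it is an assumed variant that builds the separator set directly from string.punctuation and string.whitespace, as the question asks, for cases where "anything other than letters and numbers" is too broad.

```python
import re
import string

# The two patterns from the answer above, precompiled.
PER_CHAR = re.compile(r'[a-zA-Z\d]+|[^a-zA-Z\d]')   # one token per separator character
GROUPED = re.compile(r'[a-zA-Z\d]+|[^a-zA-Z\d]+')   # runs of separators kept together

# Assumed variant (not from the answer): separators are exactly the characters
# in string.punctuation and string.whitespace. re.escape makes them safe to
# place inside a character class.
SEPARATORS = re.escape(string.punctuation + string.whitespace)
EXACT = re.compile('[^' + SEPARATORS + ']+|[' + SEPARATORS + ']')


def split_keep_each(text):
    """Split on every non-alphanumeric character, keeping each one."""
    return PER_CHAR.findall(text)


def split_keep_runs(text):
    """Split on runs of non-alphanumeric characters, keeping each run."""
    return GROUPED.findall(text)


def split_exact(text):
    """Split only on punctuation/whitespace characters, keeping each one."""
    return EXACT.findall(text)


print(split_keep_each('foo..bar'))   # ['foo', '.', '.', 'bar']
print(split_keep_runs('foo..bar'))   # ['foo', '..', 'bar']
print(split_exact('a, b'))           # ['a', ',', ' ', 'b']
```

One thing to watch for: the [a-zA-Z\d] patterns treat accented letters such as é as separators, so a word containing them is broken apart, whereas the assumed EXACT variant leaves them inside the word.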
