tokenize a string keeping delimiters in Python

无人共我 2021-02-05 10:32

Is there any equivalent to str.split in Python that also returns the delimiters?

I need to preserve the whitespace layout for my output after processing som

5 answers
  •  情书的邮戳
    2021-02-05 11:25

    Thanks, guys, for pointing me to the re module. I'm still trying to decide between that and my own generator function, which yields the pieces one at a time:

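    def split_keep_delimiters(s, delims="\t\n\r "):
        """Yield alternating runs of delimiter and non-delimiter characters."""
        if not s:
            return  # empty input: nothing to yield (s[0] would raise IndexError)
        delim_group = s[0] in delims  # is the current run made of delimiters?
        start = 0
        for index, char in enumerate(s):
            if delim_group != (char in delims):
                # the run type changed: emit the finished run, start a new one
                delim_group = not delim_group
                yield s[start:index]
                start = index
        yield s[start:]  # emit the final run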
    

    If I had time I'd benchmark them xD
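
For comparison, here is a minimal sketch of the re-based alternative mentioned above. It relies on the documented behavior that re.split also returns the separators when the pattern contains a capturing group. The name split_keep_delimiters_re is made up for illustration, and the character class is assumed to mirror the default delims string of the generator version.

```python
import re

def split_keep_delimiters_re(s, delims="\t\n\r "):
    # Hypothetical helper name; builds a pattern such as ([\t\n\r ]+).
    # The capturing group makes re.split keep the delimiter runs in the result.
    pattern = "([" + re.escape(delims) + "]+)"
    # re.split yields empty strings at the ends when s starts or ends with a
    # delimiter (and for an empty s), so filter those out.
    return [piece for piece in re.split(pattern, s) if piece]

text = "  foo\tbar  baz\n"
pieces = split_keep_delimiters_re(text)
print(pieces)
# Joining the pieces reproduces the input, so the whitespace layout is preserved.
assert "".join(pieces) == text
```

Unlike the generator, this builds the whole list at once, which is usually fine for short strings but worth keeping in mind for very large inputs.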
