Tokenize a string keeping delimiters in Python

无人共我 2021-02-05 10:32

Is there any equivalent to str.split in Python that also returns the delimiters?

I need to preserve the whitespace layout for my output after processing som

5 answers
  •  粉色の甜心
    2021-02-05 11:29

    Have you looked at pyparsing? Example borrowed from the pyparsing wiki:

    >>> from pyparsing import Word, alphas
    >>> greet = Word(alphas) + "," + Word(alphas) + "!"
    >>> hello1 = 'Hello, World!'
    >>> hello2 = 'Greetings, Earthlings!'
    >>> for hello in hello1, hello2:
    ...     print('%s \u2192 %r' % (hello, greet.parseString(hello).asList()))
    ... 
    Hello, World! → ['Hello', ',', 'World', '!']
    Greetings, Earthlings! → ['Greetings', ',', 'Earthlings', '!']
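
Note that pyparsing skips whitespace between tokens by default, so the example above does not by itself retain the original layout. If the goal is only to keep the whitespace delimiters, one possible approach is the standard library's `re.split` with a capturing group, which includes the matched delimiters in the result. The following is a minimal sketch, not part of the original answer; the helper name `tokenize_keep_delimiters` and its default pattern are illustrative assumptions.

```python
import re


def tokenize_keep_delimiters(text, pattern=r"(\s+)"):
    """Split text on pattern and keep the delimiters as separate tokens.

    Hypothetical helper for illustration. The pattern must contain a
    capturing group: re.split returns the text matched by the group
    alongside the pieces between matches.
    """
    # Drop the empty strings re.split emits when the text starts or
    # ends with a delimiter.
    return [token for token in re.split(pattern, text) if token]


tokens = tokenize_keep_delimiters("Hello,  World!\n")
print(tokens)

# Joining the tokens reproduces the input, whitespace included.
assert "".join(tokens) == "Hello,  World!\n"
```

Because every character of the input ends up in exactly one token, the non-whitespace tokens can be processed individually and the list joined again to restore the original spacing.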
    
