Python Untokenize a sentence

名媛妹妹 2021-02-01 15:46

There are so many guides on how to tokenize a sentence, but I didn't find any on how to do the opposite.

 import nltk
 words = nltk.word_tokenize("I've found


        
10 answers
  •  北海茫月
    2021-02-01 16:18

    tokenize.untokenize does not work here because it needs more information than just the words: it expects the token tuples produced by the tokenizer, not plain strings. Here is an example program using tokenize.untokenize:

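    from io import StringIO
    import tokenize

    # tokenize parses Python source code, so untokenize needs the full
    # token tuples (type, string, positions), not just a list of words.
    # Note: an English sentence is not valid Python; newer Python 3 versions
    # may raise tokenize.TokenError on the apostrophe in "I've".
    sentence = "I've found a medicine for my disease.\n"
    tokens = tokenize.generate_tokens(StringIO(sentence).readline)
    print(tokenize.untokenize(tokens))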
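
To make the "more information" point concrete, here is a minimal sketch of what tokenize.untokenize receives in each case. It assumes Python 3 and uses a short line of valid Python source (an illustrative choice, since the tokenize module is built for Python code rather than English text). The variable names are hypothetical.

```python
import io
import tokenize

# Valid Python source is used because tokenize parses Python code, not English.
source = "x = 1 + 2\n"

# Full tokens are 5-tuples: (type, string, start, end, line).
full_tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# With start/end positions available, untokenize can restore the spacing.
restored = tokenize.untokenize(full_tokens)

# With only (type, string) pairs the positions are lost, so the spacing
# in the result is approximate and may differ from the original.
pairs = [(tok.type, tok.string) for tok in full_tokens]
approximate = tokenize.untokenize(pairs)

print(repr(restored))
print(repr(approximate))
```

A bare list of word strings carries neither token types nor positions, which is why passing it to tokenize.untokenize fails.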
    


    Additional Help: Tokenize - Python Docs | Potential Problem
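
For the original goal of joining word tokens back into a sentence, a small heuristic may be enough. The sketch below is not part of NLTK or the tokenize module: the function name untokenize_words and its regex rules are assumptions made for illustration, and the token list is hard-coded on the assumption that a word tokenizer splits "I've" into "I" and "'ve". It will not cover every case (quotes, brackets, and hyphens need more rules).

```python
import re

def untokenize_words(tokens):
    """Join word tokens into a sentence using simple spacing heuristics."""
    text = " ".join(tokens)
    # Remove the space before closing punctuation such as "." or ",".
    text = re.sub(r"\s+([.,!?;:%)\]}])", r"\1", text)
    # Reattach English clitics such as "'ve", "'s", and "n't".
    text = re.sub(r"\s+(n't|'(?:s|ve|re|ll|d|m))\b", r"\1", text)
    return text

# Assumed tokenizer output for the example sentence.
words = ["I", "'ve", "found", "a", "medicine", "for", "my", "disease", "."]
print(untokenize_words(words))
```

The exact original spacing cannot be recovered in general, because the token list no longer records where the whitespace was.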
