Python Untokenize a sentence

名媛妹妹 2021-02-01 15:46

There are so many guides on how to tokenize a sentence, but I didn't find any on how to do the opposite.

 import nltk
 words = nltk.word_tokenize("I've found


        
10 answers
  •  既然无缘
    2021-02-01 16:00

    For me, it worked once I had NLTK 3.2.5 installed:

    pip install nltk==3.2.5
    

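    Then download the required resource and create the detokenizer:

    import nltk
    nltk.download('perluniprops')  # Unicode property data the Moses tools rely on

    from nltk.tokenize.moses import MosesDetokenizer
    detokenizer = MosesDetokenizer()  # instance used in the pandas example below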
    

    If you are using it inside a pandas DataFrame, then:

    df['detoken'] = df['token_column'].apply(lambda x: detokenizer.detokenize(x, return_str=True))
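
If you cannot install that NLTK version, a rough dependency-free approximation may be enough for simple English text. The sketch below is not the Moses algorithm; `simple_detokenize` is a hypothetical helper name, and it assumes the tokens look like `nltk.word_tokenize` output (contractions split as `'ve`, `n't`, and so on). It does not handle quotation marks or other languages.

```python
import re

def simple_detokenize(tokens):
    """Join word_tokenize-style tokens into a sentence (rough heuristic only)."""
    text = " ".join(tokens)
    # Re-attach contraction pieces: "I 've" -> "I've", "did n't" -> "didn't"
    text = re.sub(r" (n't|'s|'m|'re|'ve|'ll|'d)\b", r"\1", text)
    # Drop the space before closing punctuation and brackets
    text = re.sub(r" ([.,!?;:%)\]}])", r"\1", text)
    # Drop the space after opening brackets and the dollar sign
    text = re.sub(r"([(\[{$]) ", r"\1", text)
    return text

tokens = ["I", "'ve", "found", "a", "solution", ",", "did", "n't", "I", "?"]
print(simple_detokenize(tokens))  # I've found a solution, didn't I?
```

The same function can be passed to `df['token_column'].apply(...)` in place of the lambda above, since it takes a list of tokens and returns a string.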
    
