Splitting sentences with nltk while preserving quotes

囚心锁ツ 2021-02-15 13:50

I am using nltk to split a text into sentence units. However, I need the sentences that contain quotes to be extracted as a single unit. Right now each sentence, even if it is w

2 answers
  •  再見小時候
    2021-02-15 14:34

    Just change your print statement to this:

    print(' '.join(tokenizer.tokenize(text, realign_boundaries=True)))
    

    This will join the sentences with a space instead of \n-----\n.
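
If the aim is to keep each quoted passage as its own unit rather than joining everything, one possible post-processing step is to merge consecutive sentences while a quotation is still open. This is only a rough sketch: `merge_quoted` is a hypothetical helper and not part of nltk, the input list stands in for whatever `tokenizer.tokenize(text)` returns, and it assumes straight double quotes (`"`) that are balanced across the text.

```python
def merge_quoted(sentences, quote='"'):
    """Merge consecutive sentences while a quotation is still open.

    `sentences` is assumed to be a list of strings such as the output
    of a sentence tokenizer. Hypothetical helper, not an nltk function.
    """
    merged = []
    buffer = []
    open_quote = False
    for sentence in sentences:
        buffer.append(sentence)
        # An odd number of quote marks flips the open/closed state.
        if sentence.count(quote) % 2 == 1:
            open_quote = not open_quote
        if not open_quote:
            merged.append(' '.join(buffer))
            buffer = []
    if buffer:
        # Unbalanced quote: keep the leftover text instead of dropping it.
        merged.append(' '.join(buffer))
    return merged


# Made-up example of tokenizer output that splits inside a quotation.
sentences = ['He said, "Stop.', 'Wait here."', 'Then he left.']
result = merge_quoted(sentences)
print('\n-----\n'.join(result))
```

Curly quotes or apostrophes used as quote marks would need different handling, since this sketch only counts one quote character.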
