I am using nltk to split a text into sentence units. However, I need the sentences that contain quotes to be extracted as a single unit. Right now each sentence, even if it is w
Just change your print statement to this:
print(' '.join(tokenizer.tokenize(text, realign_boundaries=True)))
This will join the sentences with a space instead of \n-----\n.
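To see the difference between the two separators without running the tokenizer, here is a minimal sketch. The `sentences` list is a made-up stand-in for whatever `tokenizer.tokenize(text)` returns on your input; the actual split points depend on your text and tokenizer.

```python
# Hypothetical stand-in for the list returned by tokenizer.tokenize(text).
sentences = [
    'He said, "Stop.',
    'Come back here."',
    'Then he left.',
]

# Joining with the separator line keeps each tokenized unit visibly apart.
separated = '\n-----\n'.join(sentences)

# Joining with a single space produces one continuous string.
joined = ' '.join(sentences)

print(separated)
print(joined)
```

Keep in mind that the space-joined result is a single string, so the quoted sentences end up together only because everything is printed together.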