How can I split a text into sentences?

前端未结

关注

 13  1031

傲寒

I have a text file. I need to get a list of sentences.

How can this be implemented? There are a lot of subtleties, such as a dot being used in abbreviations.

相关标签:

13条回答

渐次进展

2020-11-22 07:02

Instead of using regex for spliting the text into sentences, you can also use nltk library.

>>> from nltk import tokenize
>>> p = "Good morning Dr. Adams. The patient is waiting for you in room number 3."

>>> tokenize.sent_tokenize(p)
['Good morning Dr. Adams.', 'The patient is waiting for you in room number 3.']

ref: https://stackoverflow.com/a/9474645/2877052

0 讨论(0)

上一页 1 2 3