I have a text file. I need to get a list of sentences.
How can this be implemented? There are a lot of subtleties, such as a dot in an abbreviation not marking the end of a sentence.
Instead of using a regex to split the text into sentences, you can use the NLTK library:
>>> from nltk import tokenize
>>> p = "Good morning Dr. Adams. The patient is waiting for you in room number 3."
>>> tokenize.sent_tokenize(p)
['Good morning Dr. Adams.', 'The patient is waiting for you in room number 3.']
ref: https://stackoverflow.com/a/9474645/2877052
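Since the question starts from a text file rather than a string, here is a minimal sketch of how the same call might be wrapped to read a file first. The function name `sentences_from_file` and its `splitter` and `encoding` parameters are made up for illustration and are not part of NLTK; the sketch also assumes the file is small enough to read into memory and that collapsing line breaks into spaces is acceptable for your text.

```python
from pathlib import Path
from typing import Callable, List, Optional


def sentences_from_file(
    path: str,
    splitter: Optional[Callable[[str], List[str]]] = None,
    encoding: str = "utf-8",
) -> List[str]:
    """Read a text file and return its sentences as a list of strings.

    Hypothetical helper: `splitter` is any function mapping a string to a
    list of sentences. It defaults to nltk.tokenize.sent_tokenize.
    """
    if splitter is None:
        # Imported lazily so a different splitter can be passed in
        # on machines where NLTK is not installed.
        from nltk.tokenize import sent_tokenize
        splitter = sent_tokenize

    text = Path(path).read_text(encoding=encoding)
    # Hard-wrapped files break sentences across lines, so collapse all
    # runs of whitespace (including newlines) into single spaces first.
    text = " ".join(text.split())
    return splitter(text)
```

Usage would look like `sentences_from_file("notes.txt")`, where `notes.txt` is a placeholder path. Note that `sent_tokenize` relies on NLTK's Punkt tokenizer data; if it raises a `LookupError`, the data has to be downloaded once, typically with `nltk.download("punkt")` (recent NLTK releases may ask for `"punkt_tab"` instead; the error message names the missing resource).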