I am trying to use the NLTK toolkit to extract place, date and time from text messages. I just installed the toolkit on my machine and wrote this quick snippet to test it out:
Named entity recognition is not an easy problem, so do not expect any library to be 100% accurate. You shouldn't draw any conclusions about NLTK's performance based on one sentence. Here's another example:
sentence = "I went to New York to meet John Smith";
I get
(S
I/PRP
went/VBD
to/TO
(NE New/NNP York/NNP)
to/TO
meet/VB
(NE John/NNP Smith/NNP))
As you can see, NLTK does very well here. However, I couldn't get NLTK to recognise "today" or "tomorrow" as temporal expressions. You can try Stanford SUTime, which is part of Stanford CoreNLP. I have used it before and it works quite well (it is in Java, though).
The default NE chunker in nltk is a maximum entropy chunker trained on the ACE corpus (http://catalog.ldc.upenn.edu/LDC2005T09). It has not been trained to recognise dates and times, so you need to train your own classifier if you want to do that.
Have a look at http://mattshomepage.com/articles/2016/May/23/nltk_nec/, where the whole process is explained very well.
Also, there is a module called timex in nltk_contrib which might help you with your needs. https://github.com/nltk/nltk_contrib/blob/master/nltk_contrib/timex.py
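For simple messages, a rule-based tagger in the same spirit can be sketched with plain regular expressions. This is only a minimal sketch and not the nltk_contrib timex implementation: the pattern list and the function name tag_temporal are assumptions made up for illustration, and the patterns cover just a few relative dates, weekday names and clock times.

```python
import re

# Hypothetical, deliberately small pattern set (an assumption for this sketch).
# Each entry pairs a label with a compiled, case-insensitive pattern.
TEMPORAL_PATTERNS = [
    ("DATE", re.compile(r"\b(?:today|tomorrow|yesterday|tonight)\b", re.IGNORECASE)),
    ("DATE", re.compile(r"\b(?:mon|tues|wednes|thurs|fri|satur|sun)day\b", re.IGNORECASE)),
    # 12-hour times such as "9am" or "9:30 pm", or 24-hour times such as "14:30".
    ("TIME", re.compile(
        r"\b(?:1[0-2]|0?[1-9])(?::[0-5]\d)?\s?(?:am|pm)\b"
        r"|\b(?:[01]?\d|2[0-3]):[0-5]\d\b",
        re.IGNORECASE,
    )),
]


def tag_temporal(text):
    """Return (matched_text, label) pairs in the order they appear in text."""
    matches = []
    for label, pattern in TEMPORAL_PATTERNS:
        for m in pattern.finditer(text):
            matches.append((m.start(), m.group(0), label))
    matches.sort()  # order by start offset
    return [(token, label) for _, token, label in matches]


print(tag_temporal("Let's meet on wednesday at 9am."))
print(tag_temporal("See you tomorrow, not today"))
```

A hand-written pattern list like this is easy to debug but brittle: it will miss anything it was not written for (month names, "next week", "half past nine"), which is where a trained tagger or SUTime becomes worthwhile.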
If you wish to correctly identify the date or time from the text messages you can use Stanford's NER.
It uses a CRF (Conditional Random Field) classifier. A CRF is a sequence classifier, so it takes the surrounding sequence of words into consideration.
The labels you get therefore depend on how the sentence is phrased.
If your input sentence had been "Let's meet on wednesday at 9am.", then Stanford NER would have correctly identified "wednesday" as a date and "9am" as a time.
NLTK supports Stanford NER. Try using it.
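Once you have tagged tokens, you usually want to merge consecutive tokens with the same label into one entity. The sketch below assumes the tagger output is a list of (token, label) pairs with "O" marking tokens outside any entity; that shape, the label names, and the helper name group_entities are assumptions for illustration. The sample list is written by hand to mirror the example above and is not real tagger output.

```python
from itertools import groupby

# Hand-written sample in the assumed (token, label) shape; NOT actual tagger output.
tagged = [
    ("Let", "O"), ("'s", "O"), ("meet", "O"), ("on", "O"),
    ("wednesday", "DATE"), ("at", "O"), ("9am", "TIME"), (".", "O"),
]


def group_entities(tagged_tokens, outside_label="O"):
    """Merge runs of consecutive tokens sharing a label into (text, label) pairs."""
    entities = []
    for label, run in groupby(tagged_tokens, key=lambda pair: pair[1]):
        if label != outside_label:
            entities.append((" ".join(token for token, _ in run), label))
    return entities


print(group_entities(tagged))
```

Note that this simple grouping also merges two distinct entities of the same type when they sit directly next to each other, so treat it as a starting point rather than a complete solution.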