How do I classify the words of a text into categories like names, numbers, money, dates, etc.?
Question: I asked a few questions about text mining a week ago, when I was still a bit confused, but now I know what I want to do. The situation: I have a lot of downloaded pages with HTML content. Some of them can be text from a blog, for example. They are not structured and come from different sites. What I want to do: I will split the text on whitespace and classify each word, or group of words, into predefined categories such as names, numbers, phone, email, URL, date, money,
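To make the goal concrete, here is a minimal sketch of the whitespace-split-and-classify idea, assuming simple regular expressions are good enough for the structured categories (email, URL, date, money, phone, number). The patterns and the names `PATTERNS`, `classify_token`, and `classify_text` are hypothetical illustrations, not a complete solution. Names are not handled here, since they would probably need a dictionary or a trained named-entity recognizer.

```python
import re

# Hypothetical, deliberately simple patterns. Order matters: the first
# pattern that matches wins, so a date such as 12/05/2010 is tested
# before the more general "number" pattern.
PATTERNS = [
    ("email", re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")),
    ("url", re.compile(r"^(https?://|www\.)\S+$", re.IGNORECASE)),
    ("date", re.compile(r"^\d{1,2}[/-]\d{1,2}[/-]\d{2,4}$")),
    ("money", re.compile(r"^[$€£]\d[\d.,]*$")),
    ("phone", re.compile(r"^\+?\d[\d-]{6,}\d$")),
    ("number", re.compile(r"^\d+([.,]\d+)*$")),
]

# Punctuation stripped from both ends of a token before matching.
EDGE_PUNCTUATION = ".,;:!?()[]\"'"


def classify_token(token):
    """Return the label of the first matching pattern, or "other"."""
    for label, pattern in PATTERNS:
        if pattern.match(token):
            return label
    return "other"


def classify_text(text):
    """Split on whitespace and label each token after trimming punctuation."""
    results = []
    for raw in text.split():
        token = raw.strip(EDGE_PUNCTUATION)
        if token:
            results.append((token, classify_token(token)))
    return results


if __name__ == "__main__":
    sample = "Mail john@example.com before 12/05/2010, price $19.99"
    for token, label in classify_text(sample):
        print(token, "->", label)
```

This only labels single tokens. Anything that spans several words (a full name, or a date written as "May 12, 2010") would need a second pass that looks at neighbouring tokens. The HTML markup would also have to be stripped before splitting.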