What's the best way to use the words themselves as features in a machine learning algorithm?
The problem: I have to extract word-related features from a particular p
The standard approach is the "bag-of-words" representation, where you have one feature per word: "1" if the word occurs in the document and "0" if it doesn't.
This gives you a lot of features (one per distinct word in the vocabulary), but with a simple learner like Naive Bayes that's still OK.
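"Index in the dictionary" is a useless feature, so I wouldn't use it: the index is an arbitrary number, and a learner would treat its numeric ordering as meaningful when it isn't.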
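
As a rough illustration of the binary bag-of-words idea, here is a minimal sketch in plain Python. The function names (`tokenize`, `build_vocabulary`, `binary_bag_of_words`), the simple regex tokenizer, and the two toy documents are all assumptions made up for this example, not part of any particular library.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into simple word tokens (an assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())

def build_vocabulary(documents):
    """Map each distinct word in the corpus to a column position."""
    words = sorted({w for doc in documents for w in tokenize(doc)})
    return {word: i for i, word in enumerate(words)}

def binary_bag_of_words(document, vocabulary):
    """Return a 0/1 vector: 1 if the vocabulary word occurs in the document."""
    vector = [0] * len(vocabulary)
    for word in set(tokenize(document)):
        col = vocabulary.get(word)
        if col is not None:  # words not in the vocabulary are ignored
            vector[col] = 1
    return vector

# Toy corpus, invented for illustration only.
docs = ["the cat sat on the mat", "the dog chased the cat"]
vocab = build_vocabulary(docs)
features = [binary_bag_of_words(d, vocab) for d in docs]
```

Note that the dictionary index is used here only to decide which column a word's 0/1 value goes into; it is never fed to the learner as a feature value itself. The resulting vectors could then be passed to something like a Naive Bayes classifier.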