We need to decide between Support Vector Machines and Fast Artificial Neural Network for some text processing project.
It includes Contextual Spelling Correction and the
You might want to also take a look at maxent classifiers (/log linear models).
They're really popular for NLP problems. Modern implementations, which use quasi-newton methods for optimization rather than the slower iterative scaling algorithms, train more quickly than SVMs. They also seem to be less sensitive to the exact value of the regularization hyperparameter. You should probably only prefer SVMs over maxent, if you'd like to use a kernel to get feature conjunctions for free.
As for SVMs vs. neural networks, using SVMs would probably be better than using ANNs. Like maxent models, training SVMs is a convex optimization problem. This means, given a data set and a particular classifier configuration, SVMs will consistently find the same solution. When training multilayer neural networks, the system can converge to various local minima. So, you'll get better or worse solutions depending on what weights you use to initialize the model. With ANNs, you'll need to perform multiple training runs in order to evaluate how good or bad a given model configuration is.
You can use Convolution Neural Network (CNN) or Recurrent Neural Network (RNN) to train NLP. I think CNN has achieved state-of-the-art now.
This question is very old. Lot of developments were happened in NLP area in last 7 years.
Convolutional_neural_network and Recurrent_neural_network evolved during this time.
Word Embeddings: Words appearing within similar context possess similar meaning. Word embeddings are pre-trained on a task where the objective is to predict a word based on its context.
CNN for NLP:
Sentences are first tokenized into words, which are further transformed into a word embedding matrix (i.e., input embedding layer) of d dimension.
Convolutional filters are applied on this input embedding layer to produce a feature map.
A max-pooling operation on each filter obtain a fixed length output and reduce the dimensionality of the output.
Since CNN had a short-coming of not preserving long-distance contextual information, RNNs have been introduced.
RNNs are specialized neural-based approaches that are effective at processing sequential information.
RNN memorizes the result of previous computations and use it in current computation.
There are few variations in RNN - Long Short Term Memory Unit (LSTM) and Gated recurrent units (GRUs)
Have a look at below resources:
deep-learning-for-nlp
Recent trends in deep learning paper
I think you'll get a competitive results from both of the algorithms, so you should aggregate the results... think about ensemble learning.
Update:
I don't know if this is specific enough: use Bayes Optimal Classifier to combine the prediction from each algorithm. You have to train both of your algorithms, then you have to train the Bayes Optimal Classifier to use your algorithms and make optimal predictions based on the input of the algorithms.
Separate your training data in 3:
Update 2.0:
Another way to create an ensemble of the algorithms is to use 10-fold (or more generally, k-fold) cross-validation:
Remember that you can generally combine many the classifiers and validation methods in order to produce better results. It's just a matter of finding what works best for your domain.