Training models using openNLP maxent

妖精的绣舞 posted on 2019-12-19 04:02:00

Question


I have gold data in which I annotated all room numbers in several documents. I want to use OpenNLP to train a model on this data to classify room numbers, but I am stuck on where to start. I have read the OpenNLP maxent documentation and looked at the examples in opennlp.tools, and I am now looking at opennlp.tools.ml.maxent. It seems to be what I should use, but I still have no idea how to use it. Can somebody give me a basic idea of how to use OpenNLP maxent and where to start? Any help will be appreciated.


Answer 1:


This is a minimal working example that demonstrates how to use the OpenNLP Maxent API.

It includes the following:

  • Training a maxent model from data stored in a file.
  • Storing the trained model into a file.
  • Loading the trained model from a file.
  • Using the model for classification.
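  • NOTE: The outcome is the first element of each training sample, i.e. of each line in the training file.
  • NOTE: The remaining elements are features. Their values can be arbitrary strings, e.g. xyz=s0methIng, as long as they contain no spaces, because spaces separate the elements.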
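
To make the training-file format concrete, here is a minimal sketch of how annotated tokens could be turned into training lines. It uses plain Java only. The class TrainingLines, the feature names (prev, len, hasDigit) and the outcome labels (room, other) are hypothetical choices for the room-number task, not part of OpenNLP, and the sketch assumes that elements on a line are separated by whitespace, as in the `result=1 a=1 b=1` sample.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; it is not part of OpenNLP.
final class TrainingLines {

    /** Builds one training line: the outcome first, then the features, separated by spaces. */
    static String formatLine(String outcome, List<String> features) {
        StringBuilder line = new StringBuilder(requireSingleToken(outcome));
        for (String feature : features) {
            line.append(' ').append(requireSingleToken(feature));
        }
        return line.toString();
    }

    /** Rejects empty tokens and tokens containing whitespace, which would break the line format. */
    private static String requireSingleToken(String token) {
        if (token.isEmpty() || token.chars().anyMatch(Character::isWhitespace)) {
            throw new IllegalArgumentException("Token must be non-empty and free of whitespace: '" + token + "'");
        }
        return token;
    }

    /** Illustrative features for a token and the token to its left (assumed feature set). */
    static List<String> features(String previous, String token) {
        List<String> features = new ArrayList<>();
        features.add("prev=" + previous.toLowerCase(Locale.ROOT));
        features.add("len=" + token.length());
        features.add("hasDigit=" + token.chars().anyMatch(Character::isDigit));
        return features;
    }

    public static void main(String[] args) {
        // Two example lines, one per annotated token.
        System.out.println(formatLine("room", features("Room", "B-214")));
        System.out.println(formatLine("other", features("the", "meeting")));
    }
}
```

Writing one such line per annotated token into training-file.txt would give a file in the format the example expects. Which features actually help to recognise room numbers is something to find out by experiment.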

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

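// These package names match the standalone opennlp-maxent library (OpenNLP 1.5.x).
// Later releases moved the classes under opennlp.tools.ml, so the imports may need adjusting.
import opennlp.maxent.GIS;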
import opennlp.maxent.io.GISModelReader;
import opennlp.maxent.io.SuffixSensitiveGISModelWriter;
import opennlp.model.AbstractModel;
import opennlp.model.AbstractModelWriter;
import opennlp.model.DataIndexer;
import opennlp.model.DataReader;
import opennlp.model.FileEventStream;
import opennlp.model.MaxentModel;
import opennlp.model.OnePassDataIndexer;
import opennlp.model.PlainTextFileDataReader;

...

String trainingFileName = "training-file.txt";
String modelFileName = "trained-model.maxent.gz";

// Training a model from data stored in a file.
// The training file contains one training sample per line.
// Outcome (result) is the first element on each line.
// Example:
// result=1 a=1 b=1
// result=0 a=0 b=1
// ...
DataIndexer indexer = new OnePassDataIndexer(new FileEventStream(trainingFileName));
MaxentModel trainedMaxentModel = GIS.trainModel(100, indexer); // 100 iterations

// Storing the trained model into a file for later use (gzipped)
File outFile = new File(modelFileName);
AbstractModelWriter writer = new SuffixSensitiveGISModelWriter((AbstractModel) trainedMaxentModel, outFile);
writer.persist();

// Loading the gzipped model from a file.
// try-with-resources closes the streams once the model has been read.
MaxentModel loadedMaxentModel;
try (InputStream decodedInputStream = new GZIPInputStream(new FileInputStream(modelFileName))) {
    DataReader modelReader = new PlainTextFileDataReader(decodedInputStream);
    loadedMaxentModel = new GISModelReader(modelReader).getModel();
}

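// Predicting the outcome using the loaded model.
// The context holds the features of the sample to classify.
String[] context = {"a=1", "b=0"};
double[] outcomeProbs = loadedMaxentModel.eval(context); // one probability per outcome
String outcome = loadedMaxentModel.getBestOutcome(outcomeProbs); // most probable outcome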
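
The array returned by eval holds one probability per outcome, and getBestOutcome picks the label with the highest one. The sketch below mimics that selection in plain Java, so it runs without OpenNLP, and adds a confidence threshold. The class BestOutcome, the fallback label and the threshold value are hypothetical. The labels array stands in for the model's outcome names, which with a real model can be obtained through getOutcome(i).

```java
// Hypothetical helper for illustration only; it is not part of OpenNLP.
final class BestOutcome {

    /** Returns the index of the largest probability (the first one wins on ties). */
    static int argMax(double[] probs) {
        if (probs.length == 0) {
            throw new IllegalArgumentException("probs must not be empty");
        }
        int best = 0;
        for (int i = 1; i < probs.length; i++) {
            if (probs[i] > probs[best]) {
                best = i;
            }
        }
        return best;
    }

    /** Returns the most probable label, or the fallback if the model is not confident enough. */
    static String bestOrFallback(String[] labels, double[] probs, double minProb, String fallback) {
        if (labels.length != probs.length) {
            throw new IllegalArgumentException("labels and probs must have the same length");
        }
        int best = argMax(probs);
        return probs[best] >= minProb ? labels[best] : fallback;
    }

    public static void main(String[] args) {
        // Stand-in values; with a real model they would come from eval(context) and getOutcome(i).
        String[] labels = {"result=0", "result=1"};
        double[] probs = {0.3, 0.7};
        System.out.println(bestOrFallback(labels, probs, 0.6, "undecided")); // prints "result=1"
    }
}
```

A threshold like this can be useful when a wrong room number is worse than none, but the right value depends on the data and has to be tuned.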


Source: https://stackoverflow.com/questions/24897358/training-models-using-opennlp-maxent
