Training models using OpenNLP maxent

一个人的身影 2021-01-06 16:10

I have gold data in which I annotated all room numbers across several documents. I want to use OpenNLP to train a model on this data and then use it to classify room numbers, but I am stuck.

1 answer
  • 2021-01-06 16:57

    This is a minimal working example that demonstrates how to use the OpenNLP Maxent API.

    It includes the following:

    • Training a maxent model from data stored in a file.
    • Storing the trained model into a file.
    • Loading the trained model from a file.
    • Using the model for classification.
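    • NOTE: the outcome is the first element of each training sample; the remaining elements are its features.
    • NOTE: feature values can be arbitrary strings, e.g. xyz=s0methIng (without spaces, since elements are space-separated).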
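
To make the training file format concrete for the room-number case, here is a minimal sketch of how annotated tokens could be turned into training lines. The class name, the feature names (w=, hasDigit=, prev=, next=) and the outcome labels "room" and "other" are all assumptions chosen for illustration, not part of OpenNLP; the only thing taken from the format above is that the outcome comes first and the features follow, separated by spaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper (not part of OpenNLP): builds one training line per token.
final class RoomNumberTrainingLines {

    // Context features for the token at index i. Feature names are illustrative.
    // Tokens must not contain spaces, because elements are space-separated.
    static List<String> features(String[] tokens, int i) {
        List<String> feats = new ArrayList<>();
        String tok = tokens[i];
        feats.add("w=" + tok);
        feats.add("hasDigit=" + tok.chars().anyMatch(Character::isDigit));
        feats.add("prev=" + (i > 0 ? tokens[i - 1].toLowerCase(Locale.ROOT) : "BOS"));
        feats.add("next=" + (i < tokens.length - 1 ? tokens[i + 1].toLowerCase(Locale.ROOT) : "EOS"));
        return feats;
    }

    // Outcome first, then the features, all separated by single spaces.
    static String toTrainingLine(String outcome, List<String> feats) {
        return outcome + " " + String.join(" ", feats);
    }

    public static void main(String[] args) {
        String[] tokens = {"Meet", "in", "room", "B204", "today"};
        boolean[] isRoom = {false, false, false, true, false}; // gold annotation (made up)
        for (int i = 0; i < tokens.length; i++) {
            String outcome = isRoom[i] ? "room" : "other";
            System.out.println(toTrainingLine(outcome, features(tokens, i)));
        }
    }
}
```

Writing these lines to training-file.txt, one per token, would give a file in the layout the example below expects. At prediction time the same feature function would have to be used to build the context array.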

    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.util.zip.GZIPInputStream;
    
    import opennlp.maxent.GIS;
    import opennlp.maxent.io.GISModelReader;
    import opennlp.maxent.io.SuffixSensitiveGISModelWriter;
    import opennlp.model.AbstractModel;
    import opennlp.model.AbstractModelWriter;
    import opennlp.model.DataIndexer;
    import opennlp.model.DataReader;
    import opennlp.model.FileEventStream;
    import opennlp.model.MaxentModel;
    import opennlp.model.OnePassDataIndexer;
    import opennlp.model.PlainTextFileDataReader;
    
    ...
    
    String trainingFileName = "training-file.txt";
    String modelFileName = "trained-model.maxent.gz";
    
    // Training a model from data stored in a file.
    // The training file contains one training sample per line.
    // Outcome (result) is the first element on each line.
    // Example:
    // result=1 a=1 b=1
    // result=0 a=0 b=1
    // ...
    DataIndexer indexer = new OnePassDataIndexer(new FileEventStream(trainingFileName));
    MaxentModel trainedMaxentModel = GIS.trainModel(100, indexer); // 100 iterations
    
    // Storing the trained model into a file for later use (gzipped)
    File outFile = new File(modelFileName);
    AbstractModelWriter writer = new SuffixSensitiveGISModelWriter((AbstractModel) trainedMaxentModel, outFile);
    writer.persist();
    
    // Loading the gzipped model from a file
    FileInputStream inputStream = new FileInputStream(modelFileName);
    InputStream decodedInputStream = new GZIPInputStream(inputStream);
    DataReader modelReader = new PlainTextFileDataReader(decodedInputStream);
    MaxentModel loadedMaxentModel = new GISModelReader(modelReader).getModel();
    decodedInputStream.close(); // release the file handle once the model is loaded
    
    // Now predicting the outcome using the loaded model
    String[] context = {"a=1", "b=0"};
    double[] outcomeProbs = loadedMaxentModel.eval(context);
    String outcome = loadedMaxentModel.getBestOutcome(outcomeProbs);
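
The array returned by eval holds one probability per outcome, and getBestOutcome picks the label with the highest one. If you want to look at all the probabilities (for example to apply your own threshold), the idea can be sketched as below. To keep it runnable without the library, the sketch uses plain arrays as stand-ins: the labels and numbers are made up, and with a real model the label for index i would come from the model's getOutcome(i) rather than from a hand-written array.

```java
import java.util.Locale;

// Stand-in for inspecting the array returned by eval(); labels and numbers are made up.
final class OutcomeRanking {

    // Index of the largest probability (the first one wins on ties).
    static int argMax(double[] probs) {
        int best = 0;
        for (int i = 1; i < probs.length; i++) {
            if (probs[i] > probs[best]) {
                best = i;
            }
        }
        return best;
    }

    // One "label=probability" entry per outcome, separated by spaces.
    static String describe(String[] labels, double[] probs) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < probs.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(labels[i]).append('=')
              .append(String.format(Locale.ROOT, "%.4f", probs[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] labels = {"other", "room"};   // hypothetical outcome labels
        double[] probs = {0.2, 0.8};           // hypothetical result of eval(context)
        System.out.println(describe(labels, probs));
        System.out.println("best: " + labels[argMax(probs)]);
    }
}
```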
    