Questions about creating Stanford CoreNLP training models

2021-01-16 12:00

I've been working with Stanford's CoreNLP to perform sentiment analysis on some data I have, and I'm working on creating a training model. I know we can create a training …

2 Answers
  • 2021-01-16 12:21
    1. dev.txt should have the same format as train.txt, just with a different set of sentences. Note that the same sentence should not appear in both dev.txt and train.txt. The development set is used to evaluate the quality of the model you train on the training data (see the format example after this list).

    2. We don't distribute a tool for tagging sentiment data. This class could be helpful in building data: http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/sentiment/BuildBinarizedDataset.html

    3. Here are the sizes of the train, dev, and test sets used for the sentiment model: train=8544, dev=1101, test=2210
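
    For concreteness, here is a minimal, made-up example of the format train.txt and dev.txt use: one binarized tree per line, with a sentiment label from 0 (very negative) to 4 (very positive) on every node. The sentence and labels below are illustrative, not taken from the actual treebank.

    (3 (2 (2 This) (2 movie)) (4 (3 (2 was) (4 great)) (2 .)))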

  • 2021-01-16 12:38

    Here is some sample code for evaluating a model:

    // imports needed (all of these classes ship in the Stanford CoreNLP jar)
    import java.util.List;

    import edu.stanford.nlp.sentiment.Evaluate;
    import edu.stanford.nlp.sentiment.SentimentModel;
    import edu.stanford.nlp.sentiment.SentimentUtils;
    import edu.stanford.nlp.trees.Tree;

    // load a serialized sentiment model (e.g. a *.ser.gz file)
    SentimentModel model = SentimentModel.loadSerialized(modelPath);

    // load the dev trees, which carry gold sentiment labels
    List<Tree> devTrees = SentimentUtils.readTreesWithGoldLabels(devPath);

    // evaluate on devTrees and print the accuracy summary
    Evaluate eval = new Evaluate(model);
    eval.eval(devTrees);
    eval.printSummary();


    You can find what you need to import, which options to set, etc. by looking at:

    edu/stanford/nlp/sentiment/SentimentTraining.java
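
    For completeness, training itself is normally run from the command line through that class's main method. The invocation below is a sketch based on the standard SentimentTraining options; the heap size, hidden-layer size (-numHid 25), file names, and output path are illustrative values rather than requirements, so check the class's usage message for the authoritative flag list.

    java -mx8g edu.stanford.nlp.sentiment.SentimentTraining -numHid 25 -trainPath train.txt -devPath dev.txt -train -model model.ser.gz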
