How to get topic vector of new documents and compare with pre-defined topic model in Mallet?

梦毁少年i 2021-01-20 02:59

I'm trying to compare a single document's topic distribution (using LDA) with other files and their topic distributions within a previously created topic model, usi

1 Answer
  • 2021-01-20 03:50

    First, take a look at these:

    • Developer's guide
    • Tutorial slides after slide 97
    • Code examples in the source directory: src/cc/mallet/examples

    These examples cover the basic functionality, but they don't show how to save and load the model when training and testing run separately. You need to save both the model and the instances after training, because new documents must pass through the same pipeline that was used for training, and then load both before testing.

    Save model and pipeline after training:

    // Serialize the trained model, then the instance list (which carries the pipeline).
    model.write(new File("model.dat"));
    instances.save(new File("pipeline.dat"));
    

    Load model and pipeline before testing:

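    // ParallelTopicModel.read declares "throws Exception", so call it from a method that handles or rethrows it.
    ParallelTopicModel model = ParallelTopicModel.read(new File("model.dat"));
    InstanceList instances = InstanceList.load(new File("pipeline.dat"));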
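
Once the model and pipeline are loaded, the comparison itself comes down to measuring how close two topic vectors are. The following is a minimal sketch in plain Java, assuming both vectors are `double[]` arrays of the same length that sum to 1 (for example, one inferred for the new document and one taken from the trained model). The class name `TopicVectorComparison` and its method names are made up for illustration and are not part of Mallet. Cosine similarity and Jensen-Shannon divergence are two common choices here, not something the answer above prescribes.

```java
/** Illustrative helper (hypothetical, not a Mallet class) for comparing topic distributions. */
class TopicVectorComparison {

    /** Cosine similarity of two equal-length topic vectors; returns 0 if either is all zeros. */
    static double cosineSimilarity(double[] p, double[] q) {
        checkSameLength(p, q);
        double dot = 0.0, normP = 0.0, normQ = 0.0;
        for (int i = 0; i < p.length; i++) {
            dot += p[i] * q[i];
            normP += p[i] * p[i];
            normQ += q[i] * q[i];
        }
        if (normP == 0.0 || normQ == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normP) * Math.sqrt(normQ));
    }

    /**
     * Jensen-Shannon divergence with natural logarithms.
     * Assumes p and q are probability distributions (non-negative, summing to 1).
     * It is 0 for identical distributions and at most ln(2).
     */
    static double jensenShannonDivergence(double[] p, double[] q) {
        checkSameLength(p, q);
        double divergence = 0.0;
        for (int i = 0; i < p.length; i++) {
            double mean = 0.5 * (p[i] + q[i]);
            // Terms with a zero probability contribute nothing (0 * log 0 is taken as 0).
            if (p[i] > 0.0) {
                divergence += 0.5 * p[i] * Math.log(p[i] / mean);
            }
            if (q[i] > 0.0) {
                divergence += 0.5 * q[i] * Math.log(q[i] / mean);
            }
        }
        return divergence;
    }

    private static void checkSameLength(double[] p, double[] q) {
        if (p.length != q.length) {
            throw new IllegalArgumentException("Topic vectors must have the same number of topics");
        }
    }

    public static void main(String[] args) {
        // Made-up example vectors over three topics.
        double[] newDocument = {0.70, 0.20, 0.10};
        double[] trainingDocument = {0.10, 0.20, 0.70};
        System.out.println("cosine similarity: " + cosineSimilarity(newDocument, trainingDocument));
        System.out.println("JS divergence:     " + jensenShannonDivergence(newDocument, trainingDocument));
    }
}
```

A higher cosine similarity or a lower Jensen-Shannon divergence means the two documents have a more similar topic mix. To rank training documents against a new one, compute either measure for each training vector and sort by the result.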
    

    Hope this helps.
