Change text in reusable pipeline in DKPro

拈花ヽ惹草 提交于 2019-12-04 21:39:22

The sofa data in the CAS can only be set once. It cannot be modified after it has been set.

In order to re-use a CAS, call the reset() method on it. This clears all annotations and allows you to set the sofa/text again.

To build a CAS incrementally, a common strategies is to add annotations to the CAS while adding text to a string buffer and setting the text only at the end of the process.

An uimaFIT-based example could look something like this:

Strings[] texts = {
    "Hello world.",
    "This is a test." };

// Create empty CAS/JCas initialized using uimaFIT typesystem auto-detection
JCas jcas = JCasFactory.createJCas();

// Instantiate some analysis engine
AnalysisEngine engine = AnalysisEngineFactory.createEngine(...);

// Process texts re-using the previously created CAS/JCas instance
for (String t : texts) {
    jcas.reset();
    jcas.setDocumentText(t);
    jcas.setDocumentLanguage("en");
    engine.process(jcas);
}

engine.collectionProcessComplete();
engine.destroy();

Disclosure: I am working on the Apache UIMA project.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!