Change text in reusable pipeline in DKPro

Submitted by 你离开我真会死。 on 2019-12-06 15:18:38

Question


This question describes how to reuse a pipeline in DKPro, but if I create only one JCas and then try to change its text, I get the following exception:

org.apache.uima.cas.CASRuntimeException: Data for Sofa feature setLocalSofaData() has already been set.

How do I get around this?


Answer 1:


The sofa data in the CAS can only be set once. It cannot be modified after it has been set.

In order to re-use a CAS, call the reset() method on it. This clears all annotations and allows you to set the sofa/text again.

To build a CAS incrementally, a common strategy is to add annotations to the CAS while appending the text to a string buffer, and to set the text only once, at the end of the process.
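The bookkeeping behind that incremental strategy can be sketched in plain Java. This is only an illustration under stated assumptions: IncrementalDocumentSketch, Span, and appendToken are hypothetical stand-ins and not part of UIMA or uimaFIT. In a real pipeline the recorded offsets would become annotations on the CAS, and the buffered string would be passed to jcas.setDocumentText(...) exactly once at the end.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-in, NOT the UIMA API: it only illustrates the bookkeeping.
class IncrementalDocumentSketch {

    // Hypothetical stand-in for an annotation: a [begin, end) character span.
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    private final StringBuilder buffer = new StringBuilder();
    private final List<Span> spans = new ArrayList<>();

    // Appends one token and records its offsets while the text is still growing.
    void appendToken(String token) {
        if (buffer.length() > 0) {
            buffer.append(' ');
        }
        int begin = buffer.length();
        buffer.append(token);
        spans.add(new Span(begin, buffer.length()));
    }

    // The complete text; in UIMA this is what would be passed, once,
    // to jcas.setDocumentText(...) at the end of the process.
    String text() {
        return buffer.toString();
    }

    List<Span> spans() {
        return spans;
    }

    public static void main(String[] args) {
        IncrementalDocumentSketch doc = new IncrementalDocumentSketch();
        for (String token : new String[] {"Hello", "world."}) {
            doc.appendToken(token);
        }
        String text = doc.text();
        for (Span s : doc.spans()) {
            System.out.println(s.begin + "-" + s.end + ": " + text.substring(s.begin, s.end));
        }
    }
}
```

Because the offsets are computed against the buffer as it grows, they remain valid for the final string, which is why the text itself never has to be modified after it has been set.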

A uimaFIT-based example of re-using a CAS with reset() could look something like this:

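import org.apache.uima.analysis_engine.AnalysisEngine;
import org.apache.uima.fit.factory.AnalysisEngineFactory;
import org.apache.uima.fit.factory.JCasFactory;
import org.apache.uima.jcas.JCas;

String[] texts = {
    "Hello world.",
    "This is a test." };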

// Create empty CAS/JCas initialized using uimaFIT typesystem auto-detection
JCas jcas = JCasFactory.createJCas();

// Instantiate some analysis engine
AnalysisEngine engine = AnalysisEngineFactory.createEngine(...);

// Process texts re-using the previously created CAS/JCas instance
for (String t : texts) {
    jcas.reset();
    jcas.setDocumentText(t);
    jcas.setDocumentLanguage("en");
    engine.process(jcas);
}

engine.collectionProcessComplete();
engine.destroy();

Disclosure: I am working on the Apache UIMA project.



Source: https://stackoverflow.com/questions/37771028/change-text-in-reusable-pipeline-in-dkpro
