This question describes how to reuse a pipeline in DKPro, but if I create only one JCas and then try to change its text, I get the exception
org.apache.uima.cas.CASRuntimeException: Data for Sofa feature setLocalSofaData() has already been set.
How do I get around this?
The sofa data in the CAS can only be set once. It cannot be modified after it has been set.
To re-use a CAS, call the reset() method on it. This clears all annotations and allows you to set the sofa/text again.
To build a CAS incrementally, a common strategy is to add annotations to the CAS while appending text to a string buffer, and to set the text only once, at the end of the process.
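The incremental strategy can be sketched roughly as follows. This is only an illustration in plain Java, not UIMA code: the class IncrementalTextSketch, its Span type, and the appendToken method are hypothetical names made up for this example. In a real pipeline, each recorded offset pair would instead become an annotation added to the JCas, and the finished string would be passed to jcas.setDocumentText(...) exactly once, after the loop.

```java
import java.util.ArrayList;
import java.util.List;

public class IncrementalTextSketch {

    /** Hypothetical stand-in for a UIMA annotation: just a pair of offsets. */
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    private final StringBuilder text = new StringBuilder();
    private final List<Span> spans = new ArrayList<>();

    /** Appends a token (space-separated) and records the offsets it occupies. */
    Span appendToken(String token) {
        if (text.length() > 0) {
            text.append(' ');
        }
        int begin = text.length();
        text.append(token);
        Span span = new Span(begin, text.length());
        spans.add(span);
        return span;
    }

    /** Returns the accumulated text; call this only when building is done. */
    String finishedText() {
        return text.toString();
    }

    List<Span> spans() {
        return spans;
    }

    public static void main(String[] args) {
        IncrementalTextSketch doc = new IncrementalTextSketch();
        for (String token : new String[] {"Hello", "world", "."}) {
            doc.appendToken(token);
        }
        // The text is final only at this point. With UIMA, this is where
        // the document text would be set on the CAS, a single time.
        String finished = doc.finishedText();
        for (Span s : doc.spans()) {
            System.out.println(s.begin + "-" + s.end + ": "
                    + finished.substring(s.begin, s.end));
        }
    }
}
```

Because the offsets are computed against the buffer as it grows, they remain valid for the final string, so the text never has to be changed after it has been set.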
A uimaFIT-based example of re-using a single CAS could look something like this:
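import org.apache.uima.analysis_engine.AnalysisEngine;
import org.apache.uima.fit.factory.AnalysisEngineFactory;
import org.apache.uima.fit.factory.JCasFactory;
import org.apache.uima.jcas.JCas;

String[] texts = {
    "Hello world.",
    "This is a test." };
// Create empty CAS/JCas initialized using uimaFIT typesystem auto-detection
JCas jcas = JCasFactory.createJCas();
// Instantiate some analysis engine
AnalysisEngine engine = AnalysisEngineFactory.createEngine(...);
// Process texts re-using the previously created CAS/JCas instance
for (String t : texts) {
    // Clear the previous text and annotations so the text can be set again
    jcas.reset();
    jcas.setDocumentText(t);
    jcas.setDocumentLanguage("en");
    engine.process(jcas);
}
// Signal the end of the collection, then release the engine's resources
engine.collectionProcessComplete();
engine.destroy();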
Disclosure: I am working on the Apache UIMA project.
Source: https://stackoverflow.com/questions/37771028/change-text-in-reusable-pipeline-in-dkpro