I'm working on a Maven project that dynamically executes some Ruta scripts to annotate tags and then processes the output in Java.
Now I want to run NLP components (mostly DKPro) first, then pass their output to the Ruta scripts as one pipeline and process the result further. How can I achieve this?
Edit:
Below is my new pipeline code:
    AnalysisEngineDescription pipeline = createEngineDescription(
        createEngineDescription(OpenNlpSegmenter.class),
        createEngineDescription(OpenNlpPosTagger.class),
        AnalysisEngineFactory.createEngineDescription(
            RutaEngine.class,
            RutaEngine.PARAM_MAIN_SCRIPT, "com.textjuicer.ruta.date.Author_updated"),
        createEngineDescription(ConsoleWriter.class));
Error:
Not able to resolve type: Reference
May 25, 2016 6:45:43 PM org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl processAndOutputNewCASes(273)
SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.
at org.apache.uima.ruta.engine.RutaEngine.process(RutaEngine.java:563)
at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:378)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:298)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:568)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:410)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:343)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:265)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:568)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:410)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:343)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:265)
at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:170)
at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:191)
at com.textjuicer.ruta.date.ArtifactAnnotator.runNLP(ArtifactAnnotator.java:225)
at com.textjuicer.ruta.date.ArtifactAnnotator.getAllAnnotations(ArtifactAnnotator.java:70)
at com.textjuicer.ruta.date.ArtifactAnnotator.main(ArtifactAnnotator.java:38)
Caused by: java.lang.IllegalArgumentException: Not able to resolve type: Reference
at org.apache.uima.ruta.expression.type.SimpleTypeExpression.getType(SimpleTypeExpression.java:48)
at org.apache.uima.ruta.rule.RegExpRule.getGroup2Types(RegExpRule.java:148)
at org.apache.uima.ruta.rule.RegExpRule.apply(RegExpRule.java:80)
at org.apache.uima.ruta.RutaScriptBlock.apply(RutaScriptBlock.java:63)
at org.apache.uima.ruta.RutaModule.apply(RutaModule.java:48)
at org.apache.uima.ruta.engine.RutaEngine.process(RutaEngine.java:561)
... 17 more
Exception in thread "main" org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator processing failed.
at org.apache.uima.ruta.engine.RutaEngine.process(RutaEngine.java:563)
at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:378)
at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:298)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:568)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:410)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:343)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:265)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:568)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:410)
at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:343)
at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:265)
at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:267)
at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:170)
at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:191)
at com.textjuicer.ruta.date.ArtifactAnnotator.runNLP(ArtifactAnnotator.java:225)
at com.textjuicer.ruta.date.ArtifactAnnotator.getAllAnnotations(ArtifactAnnotator.java:70)
at com.textjuicer.ruta.date.ArtifactAnnotator.main(ArtifactAnnotator.java:38)
Caused by: java.lang.IllegalArgumentException: Not able to resolve type: Reference
at org.apache.uima.ruta.expression.type.SimpleTypeExpression.getType(SimpleTypeExpression.java:48)
at org.apache.uima.ruta.rule.RegExpRule.getGroup2Types(RegExpRule.java:148)
at org.apache.uima.ruta.rule.RegExpRule.apply(RegExpRule.java:80)
at org.apache.uima.ruta.RutaScriptBlock.apply(RutaScriptBlock.java:63)
at org.apache.uima.ruta.RutaModule.apply(RutaModule.java:48)
at org.apache.uima.ruta.engine.RutaEngine.process(RutaEngine.java:561)
... 17 more
You can simply add the Ruta script as an analysis engine at the end of your DKPro pipeline. The exact code mainly depends on how you build and run your pipeline.
Adapted from the uimaFIT documentation:
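    // UIMA, uimaFIT, and Ruta imports. TextReader, Tokenizer, and XmiWriter are
    // placeholders: import your own reader, DKPro Core component, and writer.
    import org.apache.uima.analysis_engine.AnalysisEngineDescription;
    import org.apache.uima.collection.CollectionReaderDescription;
    import org.apache.uima.fit.factory.AnalysisEngineFactory;
    import org.apache.uima.fit.factory.CollectionReaderFactory;
    import org.apache.uima.fit.pipeline.SimplePipeline;
    import org.apache.uima.ruta.engine.RutaEngine;

    // your collection reader
    CollectionReaderDescription reader =
        CollectionReaderFactory.createReaderDescription(
            TextReader.class,
            TextReader.PARAM_INPUT, "/home/uimafit/documents");

    // some DKPro Core component
    AnalysisEngineDescription dkpro =
        AnalysisEngineFactory.createEngineDescription(
            Tokenizer.class);

    // the Ruta script; mainScript takes the script name in package notation
    // without the .ruta extension (e.g. "my.package.Main")
    AnalysisEngineDescription ruta =
        AnalysisEngineFactory.createEngineDescription(
            RutaEngine.class,
            RutaEngine.PARAM_MAIN_SCRIPT, "Main");

    // some writer
    AnalysisEngineDescription writer =
        AnalysisEngineFactory.createEngineDescription(
            XmiWriter.class,
            XmiWriter.PARAM_OUTPUT, "/home/uimafit/output");

    // run the components in order: reader, DKPro, Ruta, writer
    SimplePipeline.runPipeline(reader, dkpro, ruta, writer);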
You can create an analysis engine for your Ruta script with the uimaFIT factories, either by specifying the mainScript parameter or by passing the rules directly with PARAM_RULES. You can also use the XML descriptor of the Ruta script to create the analysis engine.
If the Ruta script declares new types, then either the XML descriptor has to be used to create the analysis engine, or the types.txt file of uimaFIT needs to be extended with the type system generated for the script (or the type system needs to be included in some other way).
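A type that the script declares but that is missing from the pipeline's type system is a plausible cause of an error like "Not able to resolve type: Reference". As a rough sketch of the types.txt route: uimaFIT scans the classpath for META-INF/org.apache.uima.fit/types.txt and loads every type system descriptor matching the patterns listed there. The folder and file name below are assumptions about where the generated descriptor for Author_updated ends up on the classpath, so adjust them to your build.

```text
# src/main/resources/META-INF/org.apache.uima.fit/types.txt
# Assumed location of the type system generated for the Ruta script:
classpath*:com/textjuicer/ruta/date/Author_updatedTypeSystem.xml
```

With this file on the classpath, the uimaFIT factories can pick up the declared types automatically when they create the engine descriptions.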
If the Ruta script imports and calls other scripts, then the generated descriptor needs to be used, or the corresponding parameters need to be set correctly, e.g., additionalScripts. The same is true for imported analysis engines.
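To illustrate setting those parameters by hand, here is a minimal sketch that builds the name/value pairs uimaFIT factories accept as varargs. It uses the plain parameter names (mainScript, additionalScripts, scriptPaths) instead of the RutaEngine constants so that it compiles without any dependencies; the class RutaParams, the helper script name, and the script folder are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class RutaParams {
    // Plain parameter names; RutaEngine.PARAM_MAIN_SCRIPT, PARAM_ADDITIONAL_SCRIPTS
    // and PARAM_SCRIPT_PATHS hold the same strings.
    static final String MAIN_SCRIPT = "mainScript";
    static final String ADDITIONAL_SCRIPTS = "additionalScripts";
    static final String SCRIPT_PATHS = "scriptPaths";

    /** Builds alternating name/value pairs; empty arrays are left out. */
    static Object[] build(String mainScript, String[] additionalScripts, String[] scriptPaths) {
        List<Object> params = new ArrayList<>();
        params.add(MAIN_SCRIPT);
        params.add(mainScript);
        if (additionalScripts.length > 0) {
            params.add(ADDITIONAL_SCRIPTS);
            params.add(additionalScripts);
        }
        if (scriptPaths.length > 0) {
            params.add(SCRIPT_PATHS);
            params.add(scriptPaths);
        }
        return params.toArray();
    }

    public static void main(String[] args) {
        Object[] params = build(
            "com.textjuicer.ruta.date.Author_updated",
            new String[] {"com.textjuicer.ruta.date.Helper"}, // hypothetical imported script
            new String[] {"src/main/resources/ruta"});        // hypothetical script folder
        // With uimaFIT and Ruta on the classpath, the array would be passed as:
        // AnalysisEngineFactory.createEngineDescription(RutaEngine.class, params);
        for (int i = 0; i < params.length; i += 2) {
            System.out.println(params[i]); // prints each parameter name
        }
    }
}
```

Every script that the main script imports would need an entry in additionalScripts. Whether scriptPaths is needed at all depends on where the scripts live and on the Ruta version, so treat that part as an assumption to check against your setup.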
If you import the NLP/DKPro type system in your Ruta script, you can simply write rules that use the DKPro annotations.
(I am a developer of UIMA Ruta)
Source: https://stackoverflow.com/questions/37404738/how-to-create-pipeline-of-java-nlp-and-ruta-scripts