How to create a pipeline of Java NLP components and Ruta scripts?

Submitted by ぃ、小莉子 on 2019-12-06 07:02:23

You can add a Ruta script simply as an analysis engine at the end of your DKPro pipeline. The exact code mainly depends on how you build and run your pipeline.

Adapted from the uimaFIT documentation:

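import org.apache.uima.analysis_engine.AnalysisEngineDescription;
import org.apache.uima.collection.CollectionReaderDescription;
import org.apache.uima.fit.factory.AnalysisEngineFactory;
import org.apache.uima.fit.factory.CollectionReaderFactory;
import org.apache.uima.fit.pipeline.SimplePipeline;
import org.apache.uima.ruta.engine.RutaEngine;

// TextReader, Tokenizer and XmiWriter are placeholders: substitute the reader,
// DKPro Core component and writer you actually use (parameter names may differ).

// your collection reader
CollectionReaderDescription reader =
  CollectionReaderFactory.createReaderDescription(
    TextReader.class,
    TextReader.PARAM_INPUT, "/home/uimafit/documents");

// some DKPro Core component
AnalysisEngineDescription dkpro =
  AnalysisEngineFactory.createEngineDescription(
    Tokenizer.class);

// the Ruta script, wrapped as an ordinary analysis engine
AnalysisEngineDescription ruta =
  AnalysisEngineFactory.createEngineDescription(
    RutaEngine.class,
    RutaEngine.PARAM_MAIN_SCRIPT, "Main.ruta");

// some writer
AnalysisEngineDescription writer =
  AnalysisEngineFactory.createEngineDescription(
    XmiWriter.class,
    XmiWriter.PARAM_OUTPUT, "/home/uimafit/output");

// run the components in order: reader, DKPro, Ruta, writer
SimplePipeline.runPipeline(reader, dkpro, ruta, writer);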

With the uimaFIT factories, you can create an analysis engine for your Ruta script either by setting the mainScript parameter (PARAM_MAIN_SCRIPT) or by passing the rules directly with PARAM_RULES. You can also create the analysis engine from the XML descriptor of the Ruta script.

If the Ruta script declares new types, then either the XML descriptor has to be used to create the analysis engine, or the types.txt file of uimaFIT needs to be extended with the generated type system of the script (or the type system needs to be included in some other way).
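As a rough illustration of the types.txt route: uimaFIT scans the classpath for files named META-INF/org.apache.uima.fit/types.txt and loads every type system descriptor matched by the patterns listed there. The entry below assumes the type system generated for Main.ruta is called MainTypeSystem.xml and sits under desc/type on the classpath; both the file name and the location are assumptions, so adjust them to wherever your build puts the generated descriptor.

```text
classpath*:desc/type/MainTypeSystem.xml
```

With such an entry in place, factory calls that detect the type system automatically should also see the types declared by the script.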

If the Ruta script imports and calls other scripts, then either the generated descriptor needs to be used, or the corresponding parameters need to be set correctly, e.g., additionalScripts. The same is true for imported analysis engines.
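A minimal sketch of the parameter-based route, assuming a main script that calls one helper script. The script names (myproject.Main, myproject.Helper) and the script folder (src/main/ruta) are hypothetical, and the names are assumed to be qualified script names without the .ruta extension. The sketch only builds the engine description, so the scripts are not loaded until the engine is actually instantiated.

```java
import java.util.Arrays;

import org.apache.uima.analysis_engine.AnalysisEngineDescription;
import org.apache.uima.fit.factory.AnalysisEngineFactory;
import org.apache.uima.resource.ResourceInitializationException;
import org.apache.uima.ruta.engine.RutaEngine;

public class RutaWithImportedScripts {

  // Builds (but does not instantiate) a description for a Ruta engine
  // whose main script calls another script.
  static AnalysisEngineDescription buildRutaDescription()
      throws ResourceInitializationException {
    return AnalysisEngineFactory.createEngineDescription(
        RutaEngine.class,
        // hypothetical folder in which the .ruta files are looked up
        RutaEngine.PARAM_SCRIPT_PATHS, new String[] { "src/main/ruta" },
        // hypothetical main script
        RutaEngine.PARAM_MAIN_SCRIPT, "myproject.Main",
        // hypothetical script that the main script imports and calls
        RutaEngine.PARAM_ADDITIONAL_SCRIPTS, new String[] { "myproject.Helper" });
  }

  public static void main(String[] args) throws Exception {
    AnalysisEngineDescription ruta = buildRutaDescription();
    // Read the setting back to check what was configured.
    Object scripts = ruta.getAnalysisEngineMetaData()
        .getConfigurationParameterSettings()
        .getParameterValue(RutaEngine.PARAM_ADDITIONAL_SCRIPTS);
    System.out.println(Arrays.toString((Object[]) scripts));
  }
}
```

The resulting description can be passed to SimplePipeline.runPipeline in place of a description that only sets the main script. Imported analysis engines would presumably be listed in the same way through RutaEngine.PARAM_ADDITIONAL_ENGINES.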

If you import the NLP/DKPro type system in your Ruta script, then you can simply write rules that use the DKPro annotations.

(I am a developer of UIMA Ruta)
