Are some extra settings in RUTA script needed to detect annotations with the same begin and end attributes?

Posted by 半腔热情 on 2019-12-24 11:09:02

Question


I have XMI output from the Tika UIMA Annotator, which is passed to a UIMA Ruta script for further processing. I was able to import the corresponding type system and detect any MarkupAnnotations that cover some fragment of text.

However, the input also contains MarkupAnnotations that have the same value for begin and end (so they do not cover any text). Those annotations are not recognized by the Ruta engine.

For example, the following rule is not fired:

MarkupAnnotation.name=="img"{-> MARK(IMAGE)};

However, in the CAS Viewer I see a lot of MarkupAnnotations whose name feature equals "img", and all of them have equal begin and end attributes.

Should I make some extra specifications in the script to catch such annotations?


Answer 1:


Matching on annotations of length 0 (begin == end) is not supported by UIMA Ruta (2.6.1).

There are various reasons for this. For example, sequential matching becomes problematic, since a zero-length annotation can both precede and follow itself.
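
To illustrate the ordering problem, here is a minimal Java sketch. It does not use the UIMA or Ruta API; the `Span` class and the `precedes` rule (one span precedes another if it ends at or before the other begins) are assumptions made purely for illustration and are not taken from Ruta's actual matching code. Under that assumed rule, a normal annotation never precedes itself, but a zero-length one does.

```java
public class ZeroLengthSpanDemo {

    // Hypothetical stand-in for an annotation: only begin and end offsets.
    public static final class Span {
        final int begin;
        final int end;

        public Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    // Assumed adjacency rule (not Ruta's implementation):
    // a precedes b if a ends at or before the offset where b begins.
    public static boolean precedes(Span a, Span b) {
        return a.end <= b.begin;
    }

    public static void main(String[] args) {
        Span word = new Span(0, 5); // covers five characters
        Span img = new Span(5, 5);  // zero-length, like the "img" MarkupAnnotations

        System.out.println(precedes(word, word)); // false
        System.out.println(precedes(img, img));   // true: it precedes (and follows) itself
    }
}
```

With such a rule, a sequential matcher that steps from one annotation to "the next" could step from the zero-length annotation back to itself, which is one way to picture why this case is hard to support.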

DISCLAIMER: I am a developer of UIMA Ruta



Source: https://stackoverflow.com/questions/52468812/are-some-extra-settings-in-ruta-script-needed-to-detect-annotations-with-the-sam
