Question
I have XMI output from the Tika UIMA Annotator, which is passed to a UIMA Ruta script for further processing. I was able to import the corresponding type system and detect any MarkupAnnotations that cover some fragment of text.
However, the input also has some MarkupAnnotations that have the same value for begin and end (so they do not cover any text). Those annotations are not recognized by the Ruta engine.
For example, the following rule is not fired:
MarkupAnnotation.name=="img" {->MARK(IMAGE)};
However, in the CAS Viewer I see a lot of MarkupAnnotations with the feature name equal to "img", and all of them have equal begin and end attributes.
Do I need any extra settings in the script to catch such annotations?
Answer 1:
Matching on annotations with length 0 (begin == end) is not supported by UIMA Ruta (2.6.1).
There are various reasons. For example, sequential matching is problematic, since a zero-length annotation can both precede and follow itself.
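The ordering problem can be illustrated outside Ruta with a small Java sketch. Note that `Span` and `precedes` are hypothetical names, not part of the UIMA or Ruta API, and the adjacency test (one span ends no later than the next begins) is only an assumption about how a sequential matcher might compare offsets, not Ruta's actual implementation.

```java
public class ZeroLengthDemo {

    // Hypothetical stand-in for an annotation: only the offsets matter here.
    public static final class Span {
        final int begin;
        final int end;

        public Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    // Assumed adjacency test: a precedes b when a ends no later than b begins.
    public static boolean precedes(Span a, Span b) {
        return a.end <= b.begin;
    }

    public static void main(String[] args) {
        Span word = new Span(0, 5); // covers five characters
        Span img = new Span(5, 5);  // zero-length, like the "img" MarkupAnnotations

        System.out.println(precedes(word, word)); // false: a normal span cannot precede itself
        System.out.println(precedes(img, img));   // true: a zero-length span precedes itself
    }
}
```

Under this assumed test, a zero-length span satisfies the "precedes" relation with itself, so a rule that matches two elements in sequence could match the same annotation twice.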
DISCLAIMER: I am a developer of UIMA Ruta
Source: https://stackoverflow.com/questions/52468812/are-some-extra-settings-in-ruta-script-needed-to-detect-annotations-with-the-sam