uima

Setting feature value to the count of containing annotation in UIMA Ruta

白昼怎懂夜的黑 submitted on 2019-12-22 09:55:54
Question: I've got a RUTA script where all the sentences have been annotated with a Sentence annotation, and various words and phrases have been annotated with their own specific annotations. That all works as expected. Each of those annotations has a feature for the index of the sentence that contains it. So, in a contrived example, given the text "Jack and Jill went up the hill. Jack fell down.", I have a "down" annotation whose sentence index I want to set to 2, indicating that it is in the second
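To make the goal in this question concrete, here is a minimal sketch in plain Java of the lookup being asked for: finding the 1-based index of the sentence that contains a given span. It is not Ruta code and does not use the UIMA API. The class name SentenceIndexDemo, the method sentenceIndexOf, and the use of int pairs as stand-ins for annotations are all assumptions made for illustration.

```java
public class SentenceIndexDemo {

    /**
     * Returns the 1-based index of the first sentence whose half-open span
     * [begin, end) fully covers the target span, or 0 if no sentence does.
     * Each sentence is a two-element array {begin, end} (illustrative stand-in
     * for a Sentence annotation).
     */
    static int sentenceIndexOf(int[][] sentences, int begin, int end) {
        for (int i = 0; i < sentences.length; i++) {
            if (sentences[i][0] <= begin && end <= sentences[i][1]) {
                return i + 1;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        String text = "Jack and Jill went up the hill. Jack fell down.";

        // Derive the two sentence spans from the example text.
        int firstEnd = text.indexOf('.') + 1;
        int[][] sentences = {
            {0, firstEnd},
            {firstEnd + 1, text.length()}
        };

        // Span of the word "down" in the text.
        int begin = text.indexOf("down");
        int end = begin + "down".length();

        System.out.println(sentenceIndexOf(sentences, begin, end));
    }
}
```

In an actual pipeline the spans would come from the Sentence annotations and the result would be written into the feature; how to express that in Ruta is what the question asks and is not shown here.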

UIMA RUTA - how to do find & replace using regular expression and groups

独自空忆成欢 submitted on 2019-12-22 09:24:24
Question: RUTA newbie here. I'm processing a document using RUTA and have a lot of normalization to do before I can start annotating. I'm trying to find the best way to do a find-and-replace on a sequence of characters in the original document, using regular expressions and groups, in RUTA. In essence, I'm trying to see how to do something similar to String.replaceAll in RUTA. For example, in Java: inputString = inputString.replaceAll("(?i)7\\s*\\(SEVEN\\)", "7"); But I can't figure out a simple way to
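For reference, the Java behaviour the question wants to reproduce can be sketched as follows. Only the first pattern comes from the question. The class name ReplaceAllDemo, the helper names, and the second, group-based pattern are assumptions added to show what "regular expressions and groups" could look like on the Java side. This is not a Ruta solution.

```java
import java.util.regex.Pattern;

public class ReplaceAllDemo {

    // Pattern from the question: "7", optional whitespace, then "(SEVEN)" in any letter case.
    private static final Pattern SEVEN = Pattern.compile("(?i)7\\s*\\(SEVEN\\)");

    // Hypothetical generalisation: capture the digits and drop a following parenthesised word.
    private static final Pattern DIGITS_WITH_WORD = Pattern.compile("(\\d+)\\s*\\([A-Za-z]+\\)");

    /** Equivalent to input.replaceAll("(?i)7\\s*\\(SEVEN\\)", "7"), with the pattern precompiled. */
    static String normalizeSeven(String input) {
        return SEVEN.matcher(input).replaceAll("7");
    }

    /** Replaces each match with the contents of capture group 1, i.e. the digits alone. */
    static String keepDigits(String input) {
        return DIGITS_WITH_WORD.matcher(input).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(normalizeSeven("within 7 (seven) days"));   // within 7 days
        System.out.println(keepDigits("within 30 (thirty) days"));     // within 30 days
    }
}
```

Note that replacing text changes character offsets, so a normalization step like this would normally run before any annotations are created on the document.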

UIMA RUTA - how to do find & replace using regular expression and groups

£可爱£侵袭症+ submitted on 2019-12-22 09:23:21
Question: RUTA newbie here. I'm processing a document using RUTA and have a lot of normalization to do before I can start annotating. I'm trying to find the best way to do a find-and-replace on a sequence of characters in the original document, using regular expressions and groups, in RUTA. In essence, I'm trying to see how to do something similar to String.replaceAll in RUTA. For example, in Java: inputString = inputString.replaceAll("(?i)7\\s*\\(SEVEN\\)", "7"); But I can't figure out a simple way to

How to reconfigure uima ruta analysis engine (change the parameter values) programmatically?

北城余情 submitted on 2019-12-14 00:25:23
Question: This is a follow-up to the question: How to run external ruta scripts from a maven project without placing the script or its typesystem in the classpath? How can I reconfigure the analysis engine (by changing its parameter values) programmatically? Answer 1: Situation: you have a correct XML descriptor of a UIMA Ruta analysis engine and you want to reconfigure it so that the paths point to the folder of the descriptor. java url to file The following code illustrates that by changing the

UIMA Ruta: Editor could not be initialized

情到浓时终转凉″ submitted on 2019-12-13 02:22:25
Question: I am new to UIMA Ruta and I am currently trying to get a simple HelloWorld script to run. I followed the instructions here to set up my HelloWorld project. The first error that occurred was java.lang.NoClassDefFoundError: org/slf4j/event/Logger, which I resolved by converting my project to a Maven project and adding the slf4j-api 2.0.0-alpha1 and ruta-core 2.7.0 dependencies to pom.xml. Now my HelloWorld script generates an output file in the output folder. But when I try to open it with the

UIMA-Ducc vs UIMA-AS

徘徊边缘 submitted on 2019-12-11 11:50:08
Question: I use UIMA in a process for analyzing and extracting information from text. The pipeline fails with 6 simultaneous processes. I think I need a scale-out tool such as UIMA-DUCC or UIMA-AS, but it isn't clear to me which one. When should each be used? What are the differences between them? Answer 1: UIMA-AS provides mechanisms for deploying a UIMA pipeline. Essentially, UIMA-AS allows users to put a queue in front of a UIMA component so that it can run in a different thread or in a different process. UIMA

CAS consumer not working as expected

南楼画角 submitted on 2019-12-11 11:39:01
Question: I have a CAS consumer AE which is expected to iterate over the CAS objects in a pipeline, serialize them, and add the serialized CASes to an XML file. public class DataWriter extends JCasConsumer_ImplBase { private File outputDirectory; public static final String PARAM_OUTPUT_DIRECTORY = "outputDir"; @ConfigurationParameter(name=PARAM_OUTPUT_DIRECTORY, defaultValue=".") private String outputDir; CasToInlineXml cas2xml; public void initialize(UimaContext context) throws

UIMA Ruta Creating annotation with features separated by some text

 ̄綄美尐妖づ submitted on 2019-12-07 22:29:01
Question: I have some text with annotations created like the following: wewf.werwfwef. wewfwefwwew. wefewefwff AnnotationA asdfawece aefae eafewfaefa aefafe ceaewfae adfcaecae acaeaet aegaegageg caeacdaefa AnnotationB sadaeceaee aef aewfaegg rresf ceeaefaeaeaf adfcaecae acaeaet aegaegageg caeacdaefa AnnotationA adfcaecae acaeaet aegaegageg caeacdaefa adfcaecae acaeaet aegaegageg caeacdaefa AnnotationB adfcaecae acaeaet aegaegageg caeacdaefa adfcaecae acaeaet aegaegageg caeacdaefa I want to create an

How to create pipeline of java nlp and ruta scripts?

半腔热情 submitted on 2019-12-07 20:03:11
Question: I'm working on a Maven project which dynamically executes some Ruta scripts to annotate some tags and processes the output in Java. Now I want to run NLP components (mostly DKPro) first and then pass their output to the Ruta scripts (as a pipeline) for further processing. How can I achieve this? Edited: Below is my new script: AnalysisEngineDescription pipeline = createEngineDescription(createEngineDescription(OpenNlpSegmenter.class), createEngineDescription(OpenNlpPosTagger.class), AnalysisEngineFactory

Reusable version of DKPro Core pipeline

让人想犯罪 __ submitted on 2019-12-06 16:31:27
I have set up DKPro Core as a web service to take an input and provide a tokenised output. The service itself is set up as a Jersey resource: @Path("/") public class MyResource { public MyResource() { // Nothing here } @GET public String generate(@QueryParam("q") final String input) { try { final JCasIterable en = iteratePipeline( createReaderDescription(StringReader.class, StringReader.PARAM_DOCUMENT_TEXT, input, StringReader.PARAM_LANGUAGE, "en") ,createEngineDescription(StanfordSegmenter.class) ,createEngineDescription(StanfordPosTagger.class) ,createEngineDescription(StanfordParser.class)