How to use HeidelTime temporal tagger inside a Java project?

Submitted by 会有一股神秘感。 on 2019-12-12 10:57:59

Question


I would like to automatically identify dates in a stream of documents, and for this I would like to use the open-source project HeidelTime, available here (https://code.google.com/p/heideltime/). I have installed the HeidelTime kit (not the standalone version), and now I am wondering how to reference it and call it from my Java project. I have already added a dependency on HeidelTime to my pom.xml:

    <dependency>
        <groupId>de.unihd.dbs</groupId>
        <artifactId>heideltime</artifactId>
        <version>1.7</version>
    </dependency>

However, I am not sure how to call the classes from that project in my own project. I am using Maven for both. Could anyone who has used it before give me a suggestion or some advice? Many thanks!


Answer 1:


heideltime-kit is itself a Maven project, so you can add it as a dependency. (In NetBeans, right-click Dependencies --> Add Dependency --> Open Projects --> HeidelTime; make sure the project is open first.)

Then move the config.props file into your project's src/main/resources folder. Set the path to treetagger within config.props.
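One caveat: the constructor expects a file path, while anything under src/main/resources ends up on the classpath (possibly inside a JAR) once the project is packaged. A minimal sketch of one way around a hardcoded path is to copy the resource to a temporary file at startup and pass that path along. The class ConfigLocator and its methods are hypothetical helpers written for illustration; they are not part of HeidelTime, and the resource name /config.props assumes the file sits directly in src/main/resources.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper for illustration; not part of HeidelTime. */
final class ConfigLocator {

    private ConfigLocator() {}

    /** Copies a stream (for example a classpath resource) to a temp file and returns its absolute path. */
    static String copyToTempFile(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("heideltime-config", ".props");
        tmp.toFile().deleteOnExit(); // clean up when the JVM exits
        Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp.toAbsolutePath().toString();
    }

    /** Looks up /config.props on the classpath (assumed location: src/main/resources). */
    static String configPathFromClasspath() throws IOException {
        try (InputStream in = ConfigLocator.class.getResourceAsStream("/config.props")) {
            if (in == null) {
                throw new IOException("config.props not found on the classpath");
            }
            return copyToTempFile(in);
        }
    }
}
```

The string returned by configPathFromClasspath() could then be used wherever a config path is expected. If you only ever run from the IDE or an unpacked directory, a plain relative path may be enough and this extra step is unnecessary.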

As far as using the classes goes, you'll want to create an instance of HeidelTimeStandalone (see de.unihd.dbs.heideltime.standalone.HeidelTimeStandalone.java) using POSTagger.TREETAGGER as the posTagger parameter and a hardcoded path to your src/main/resources/config.props file as the configPath parameter. For example,

    HeidelTimeStandalone heidelTime = new HeidelTimeStandalone(Language.ENGLISH,
                                          DocumentType.COLLOQUIAL,
                                          OutputType.TIMEML,
                                          "path/to/config.props",
                                          POSTagger.TREETAGGER, true); // last argument: enable interval tagging

Then to use HeidelTime to process text, you can simply call the process function:

    String result = heidelTime.process(text, date); // date: the document creation time (java.util.Date)
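With OutputType.TIMEML the result is the input text with inline TIMEX3 tags around each temporal expression. If all you need are the tagged dates, a rough sketch like the following pulls them out with regular expressions from the standard library. TimexExtractor and Timex are hypothetical names made up for this example, not HeidelTime classes, and the code assumes the tags look like `<TIMEX3 tid="t1" type="DATE" value="2014-12-06">...</TIMEX3>` with double-quoted attributes and no nesting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of HeidelTime. */
final class TimexExtractor {

    /** One tagged expression: TIMEX3 type, normalized value, and the covered text. */
    static final class Timex {
        final String type;
        final String value;
        final String text;

        Timex(String type, String value, String text) {
            this.type = type;
            this.value = value;
            this.text = text;
        }
    }

    // Matches <TIMEX3 ...attributes...>covered text</TIMEX3>; attributes are parsed separately.
    private static final Pattern TAG =
            Pattern.compile("<TIMEX3\\s+([^>]*)>(.*?)</TIMEX3>", Pattern.DOTALL);
    // Matches name="value" pairs inside the opening tag.
    private static final Pattern ATTR = Pattern.compile("(\\w+)=\"([^\"]*)\"");

    private TimexExtractor() {}

    /** Returns every TIMEX3 element found in the given TimeML string, in document order. */
    static List<Timex> extract(String timeMl) {
        List<Timex> result = new ArrayList<>();
        Matcher tag = TAG.matcher(timeMl);
        while (tag.find()) {
            String type = null;
            String value = null;
            Matcher attr = ATTR.matcher(tag.group(1));
            while (attr.find()) {
                if (attr.group(1).equals("type")) {
                    type = attr.group(2);
                } else if (attr.group(1).equals("value")) {
                    value = attr.group(2);
                }
            }
            result.add(new Timex(type, value, tag.group(2)));
        }
        return result;
    }
}
```

Calling TimexExtractor.extract(result) gives a list you can loop over, reading type, value and text from each entry. Regular expressions are a shortcut here; for anything beyond quick extraction, a proper XML parser or the XMI output type is the more robust route.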



Answer 2:


This library is not in the Maven Central repository yet. (You can check this on the search.maven.org site.)

To use the library in your project, download the JAR file and install it locally. See the answers to this question: How to add local jar files in maven project?

Then you can import the packages and use the functionality in your project.




Answer 3:


Adding to the reply from jgloves, you might want to parse the HeidelTime result string into a Java object representation. The following code transforms the UIMA XMI representation into Timex3 objects.

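    HeidelTimeStandalone time = new HeidelTimeStandalone(Language.GERMAN, DocumentType.SCIENTIFIC, OutputType.XMI, "config.props", POSTagger.STANFORDPOSTAGGER);
    String xmiRepresentation = time.process(document, documentCreationTime); // apply HeidelTime and get the UIMA XMI representation
    JCas cas = jcasFactory.createJCas(); // jcasFactory is not shown here; it must create a JCas that knows the HeidelTime type system

    // Load the XMI string into the CAS (org.apache.uima.cas.impl.XmiCasDeserializer); without this step the CAS stays empty
    XmiCasDeserializer.deserialize(new ByteArrayInputStream(xmiRepresentation.getBytes(StandardCharsets.UTF_8)), cas.getCas());

    // Print every Timex3 annotation found in the document
    for (FSIterator<Annotation> it = cas.getAnnotationIndex(Timex3.type).iterator(); it.hasNext(); ) {
        System.out.println(it.next());
    }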


Source: https://stackoverflow.com/questions/27337268/how-to-use-heideltime-temporal-tagger-inside-a-java-project
