Parse raw text with MaltParser in Java

℡╲_俬逩灬. submitted on 2019-12-07 18:43:39

Question


I found that NLTK in Python does this via its *raw_parse* function, but I need to use Java. I found that ClearTK has a MaltParser wrapper, but there is no documentation for it. I'm looking for a function or a project that first converts raw English text into a CoNLL file that MaltParser can read and then parses it with MaltParser. Any help is appreciated.


Answer 1:


The MaltParser 1.7.2 distribution ships with examples in the folder examples/apiexamples/srcex.

However, these examples only show how to run MaltParser programmatically after tokenization and POS tagging have already been performed (and after the output of these steps has been converted to a CoNLL-like format).
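
To make the conversion step concrete, here is a minimal sketch in plain Java of turning an already tokenized and tagged sentence into CoNLL-X style rows (ten tab-separated columns, with "_" for unknown fields and a blank line closing the sentence). The class name ConllWriter and both method names are hypothetical and are not part of MaltParser. The regex tokenizer is only a naive stand-in, and the POS tags in main are hard-coded where a real tagger such as OpenNLP would supply them. Whether a given parser model expects all ten columns or only the first six depends on its configured input format, so treat the column layout as an assumption to verify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; not part of MaltParser or any other library.
class ConllWriter {

    // Runs of letters/digits form one token; any other non-space character is its own token.
    private static final Pattern TOKEN =
            Pattern.compile("[\\p{L}\\p{N}]+|[^\\s\\p{L}\\p{N}]");

    /** Naive tokenizer, used here only as a stand-in for a real one. */
    static List<String> tokenize(String sentence) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(sentence);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    /**
     * Formats one sentence as CoNLL-X style rows:
     * ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL.
     * The same tag is written to CPOSTAG and POSTAG (a simplification);
     * fields that are unknown before parsing are written as "_".
     */
    static String toConll(List<String> tokens, List<String> posTags) {
        if (tokens.size() != posTags.size()) {
            throw new IllegalArgumentException("tokens and posTags differ in length");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tokens.size(); i++) {
            String tag = posTags.get(i);
            sb.append(i + 1).append('\t')      // ID, 1-based
              .append(tokens.get(i)).append('\t') // FORM
              .append("_\t")                   // LEMMA (unknown)
              .append(tag).append('\t')        // CPOSTAG
              .append(tag).append('\t')        // POSTAG
              .append("_\t_\t_\t_\t_")         // FEATS, HEAD, DEPREL, PHEAD, PDEPREL
              .append('\n');
        }
        sb.append('\n'); // a blank line terminates the sentence
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("The dog barks.");
        // Hard-coded Penn Treebank tags; a real POS tagger would produce these.
        List<String> tags = Arrays.asList("DT", "NN", "VBZ", ".");
        System.out.print(toConll(tokens, tags));
    }
}
```

The resulting text can be written to a file and passed to MaltParser as input, or split into one string per token for the programmatic API shown in the bundled examples.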

Since I currently cannot offer a better (simpler/shorter) alternative, I can at least share a link to a Groovy script that performs tokenization, part-of-speech tagging (using OpenNLP), and dependency parsing (using MaltParser). The tools are made interoperable using UIMA. If you are familiar with Maven, it should be quite straightforward to derive a Java version of that script.

Mind you, this is not the best answer, but at this point it is possibly better than nothing.

Note: I'm a developer on both Apache UIMA and DKPro Core (the project to which the link points).



Source: https://stackoverflow.com/questions/17392790/parse-raw-text-with-maltparser-in-java
