opennlp

Number of parameters must be always be even : opennlp

给你一囗甜甜゛ posted on 2019-12-19 10:32:08
Question: I've been trying to use the command-line interface to train my model like this:
    opennlp TokenNameFinderTrainer -model en-ner-pincode.bin -iterations 500 \
        -lang en -data en-ner-pincode.train -encoding UTF-8
The console output is:
    Number of parameters must be always be even
    Usage: opennlp TokenNameFinderTrainer[.evalita|.ad|.conll03|.bionlp2004|.conll02|.muc6|.ontonotes|.brat] [-factory factoryName] [-resources resourcesDir] [-type modelType] [-featuregen featuregenFile] [-nameTypes types] [
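
One plausible reading of the error, offered as a hedged sketch rather than a confirmed diagnosis: the OpenNLP command-line tools appear to expect their options as `-name value` pairs, so an odd number of arguments is rejected. If the shell does not treat the `\` as a line continuation (for example when the command is pasted onto a single line, or run from Windows `cmd`), the backslash arrives as one extra argument. The class `ArgPairCheck` and the method `hasEvenCount` below are hypothetical names used only to illustrate the counting; this is not OpenNLP's own code.

```java
import java.util.Arrays;

// Hypothetical illustration only: mimics a "parameters must come in pairs" check.
final class ArgPairCheck {

    // True when the arguments can be split into -name value pairs.
    static boolean hasEvenCount(String[] args) {
        return args.length % 2 == 0;
    }

    public static void main(String[] ignored) {
        // The arguments from the question after the tool name,
        // with the stray backslash passed through as its own argument.
        String[] withBackslash = {
            "-model", "en-ner-pincode.bin", "-iterations", "500", "\\",
            "-lang", "en", "-data", "en-ner-pincode.train", "-encoding", "UTF-8"
        };

        // The same arguments with the backslash removed.
        String[] cleaned = Arrays.stream(withBackslash)
                .filter(a -> !a.equals("\\"))
                .toArray(String[]::new);

        // 11 arguments, even: false
        System.out.println(withBackslash.length + " arguments, even: " + hasEvenCount(withBackslash));
        // 10 arguments, even: true
        System.out.println(cleaned.length + " arguments, even: " + hasEvenCount(cleaned));
    }
}
```

If this is the cause, typing the whole command on one line without the backslash should get past the argument check.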

Training models using openNLP maxent

妖精的绣舞 posted on 2019-12-19 04:02:00
Question: I have gold data in which I annotated all room numbers from several documents. I want to use openNLP to train a model on this data and classify room numbers. I am stuck on where to start. I read the openNLP maxent documentation, looked at the examples in opennlp.tools, and am now looking at opennlp.tools.ml.maxent. It seems like what I should be using, but I still have no idea how to use it. Can somebody give me a basic idea of how to use openNLP maxent and where to start?
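
One possible starting point, rather than the low-level maxent package, is the OpenNLP name finder, which is trained from plain text with one whitespace-tokenized sentence per line and entities wrapped in `<START:type> ... <END>` markers. The sketch below only shows how gold annotations could be turned into such lines. The class `NameSampleLine`, the method `annotate`, the entity type `room`, and the example sentence are all made up for illustration.

```java
// Hypothetical helper: renders one tokenized sentence with a single annotated span
// in the <START:type> ... <END> training format used by the OpenNLP name finder.
final class NameSampleLine {

    // start is the index of the first entity token, end is one past the last.
    static String annotate(String[] tokens, int start, int end, String type) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            if (i == start) {
                sb.append("<START:").append(type).append("> ");
            }
            sb.append(tokens[i]);
            if (i == end - 1) {
                sb.append(" <END>");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Invented example sentence; token 5 is the room number.
        String[] tokens = {"The", "meeting", "is", "in", "room", "B-204", "."};
        // prints: The meeting is in room <START:room> B-204 <END> .
        System.out.println(annotate(tokens, 5, 6, "room"));
    }
}
```

A file of such lines could then be passed to the `TokenNameFinderTrainer` command-line tool via its `-data` option.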

Visualize Parse Tree Structure

萝らか妹 posted on 2019-12-18 16:46:08
Question: I would like to display the parse (POS tagging) from openNLP as a tree structure visualization. Below I provide the parse tree from openNLP, but I cannot plot it as a visual tree of the kind common in Python parsing.
    install.packages(
        "http://datacube.wu.ac.at/src/contrib/openNLPmodels.en_1.5-1.tar.gz",
        repos=NULL, type="source"
    )
    library(NLP)
    library(openNLP)
    x <- 'Scroll bar does not work the best either.'
    s <- as.String(x)
    ## Annotators
    sent_token_annotator <- Maxent_Sent_Token_Annotator()
    word_token

How do I train a Named Entity Recognizer identifier in OpenNLP?

*爱你&永不变心* posted on 2019-12-18 10:54:08
Question: OK, I have the following code to train the NER identifier from OpenNLP:
    FileReader fileReader = new FileReader("train.txt");
    ObjectStream fileStream = new PlainTextByLineStream(fileReader);
    ObjectStream sampleStream = new NameSampleDataStream(fileStream);
    TokenNameFinderModel model = NameFinderME.train("pt-br", "train", sampleStream, Collections.<String, Object>emptyMap());
    nfm = new NameFinderME(model);
I don't know if I'm doing something wrong or if something is missing, but the classifying

OpenNLP: foreign names do not get recognized

霸气de小男生 posted on 2019-12-18 08:27:31
Question: I just started using openNLP to recognize names. I am using the model (en-ner-person.bin) that comes with OpenNLP. I noticed that while it recognizes US, UK, and European names, it fails to recognize Indian or Japanese names. My questions are: (1) Are there already models available that I can use to recognize foreign names? (2) If not, then I believe I will need to generate new models. In that case, is there a corpus available that I can use? Answer 1: You can make your own model with your data

How many lines and documents should there be in the training data for the OpenNLP categorizer

无人久伴 posted on 2019-12-14 03:59:13
Question: I am following the documentation for Apache OpenNLP. I was able to understand the sentence detector, tokenizer, and name finder, but I got stuck on the Categorizer. The reason is that I cannot understand how to create a model for categorization. I do understand that I need to create a file. The format is very clear: each line needs to contain a category, a space, and a document. Save the file with a .train extension. So I created the following file:
    Refund What is the refund status for my order #342 ?
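
The one-line-per-document format described above can be sketched as a small helper that joins a category and a document with a single space and collapses any line breaks inside the document. The class `DoccatLine` and the method `toTrainingLine` are hypothetical names for illustration and are not part of OpenNLP. The sample text is the line from the question.

```java
// Hypothetical helper: builds one line of categorizer training data
// in the "category<space>document" layout described in the question.
final class DoccatLine {

    static String toTrainingLine(String category, String document) {
        // The category is the first whitespace-delimited field, so it must not contain whitespace.
        if (category.isEmpty() || category.chars().anyMatch(Character::isWhitespace)) {
            throw new IllegalArgumentException("category must be a single non-empty token");
        }
        // Each document must fit on one line, so collapse newlines and repeated spaces.
        String oneLine = document.trim().replaceAll("\\s+", " ");
        return category + " " + oneLine;
    }

    public static void main(String[] args) {
        // prints: Refund What is the refund status for my order #342 ?
        System.out.println(toTrainingLine("Refund", "What is the refund status\nfor my order #342 ?"));
    }
}
```

Writing one such line per training document, with many documents per category, yields a file in the layout the question describes.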

Java OpenNLP extract all nouns from a sentence

冷暖自知 posted on 2019-12-13 07:22:33
Question: I am using Java 8 and OpenNLP. I am trying to extract all noun words from sentences. I have tried this example, but it extracts all noun phrases ("NP"). Does anyone know how I can extract just the individual noun words? Thanks. Answer 1: What have you tried so far? I haven't looked at the example you link to in much detail, but I'm pretty sure you could get where you want by modifying that example. In any case, it's not very difficult:
    InputStream modelIn = null;
    POSModel POSModel =
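
Independently of how the tags are produced, the filtering step itself can be sketched as follows, assuming the tagger returns Penn Treebank tags in an array parallel to the tokens (noun tags are NN, NNS, NNP, and NNPS). The class `NounFilter`, the method `nouns`, and the hard-coded tokens and tags are assumptions for illustration; in real use the tags would come from a POS tagger rather than being written by hand.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: keeps the tokens whose Penn Treebank tag marks a noun.
final class NounFilter {

    // tokens and tags are parallel arrays: tags[i] is the POS tag of tokens[i].
    static List<String> nouns(String[] tokens, String[] tags) {
        if (tokens.length != tags.length) {
            throw new IllegalArgumentException("tokens and tags must have the same length");
        }
        List<String> result = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            // NN, NNS, NNP and NNPS all start with "NN".
            if (tags[i].startsWith("NN")) {
                result.add(tokens[i]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hand-written tags standing in for tagger output.
        String[] tokens = {"The", "quick", "fox", "jumps", "over", "dogs"};
        String[] tags = {"DT", "JJ", "NN", "VBZ", "IN", "NNS"};
        // prints: [fox, dogs]
        System.out.println(nouns(tokens, tags));
    }
}
```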

How to extract elements from NLP Tree?

删除回忆录丶 posted on 2019-12-12 09:57:31
Question: I am using the NLP package to parse sentences. How can I extract an element from the Tree output that is created? For example, I'd like to grab the noun phrases (NP) from the example below:
    library(NLP)
    library(openNLP)
    s <- c(
        "Really, I like chocolate because it is good.",
        "Robots are rather evil and most are devoid of decency"
    )
    s <- as.String(s)
    sent_token_annotator <- Maxent_Sent_Token_Annotator()
    word_token_annotator <- Maxent_Word_Token_Annotator()
    a2 <- annotate(s, list(sent_token

How can I view the content of a .bin file in opennlp

拜拜、爱过 posted on 2019-12-11 22:31:37
Question: I am trying to use OpenNLP in a project I am working on, and I am very new to it. I tried out Named Entity Recognition with the training data available at http://opennlp.sourceforge.net/models-1.5/ However, I want to see the training data that was used, i.e. to actually open the .bin file and see its content in English. Can someone please point me in the right direction? I have tried to use UltraISO to read the .bin file, but I was not successful. Please help! Thanks :) Answer 1: Use

Manipulating query using Custom Query Parser in Solr

心已入冬 posted on 2019-12-11 15:48:59
Question: I have tried to create a CustomQueryParser in which I also make use of the OpenNLP libraries. My objective is this: if I have a query "How many defective rims are causing failure in ABC tyres in China", I want the final query to be something like "defective rims failure tyres China", which would then go to the Analyzer for further processing. This is my code for QueryParserPlugin:
    package com.mycompany.lucene.search;
    import org.apache.solr.common.params.SolrParams;
    import org.apache.solr.request