I am a newbie to natural language processing. I need to extract the noun phrases from the text. So far I have used OpenNLP's chunking parser for parsing my text to get the Tre
If you only want noun phrases, use the sentence chunker rather than the tree parser. The code is something like this (you need to get the chunker model from the same place you got the parser model):
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import opennlp.tools.chunker.ChunkerME;
import opennlp.tools.chunker.ChunkerModel;

public void chunk() throws IOException {
    ChunkerModel model;
    // try-with-resources closes the stream even if model loading fails
    try (InputStream modelIn = new FileInputStream("en-chunker.bin")) {
        model = new ChunkerModel(modelIn);
    }
    // After the model is loaded a Chunker can be instantiated.
    ChunkerME chunker = new ChunkerME(model);
    // Tokens of the sentence, one per array element
    String[] sent = new String[]{"Rockwell", "International", "Corp.", "'s",
        "Tulsa", "unit", "said", "it", "signed", "a", "tentative", "agreement",
        "extending", "its", "contract", "with", "Boeing", "Co.", "to",
        "provide", "structural", "parts", "for", "Boeing", "'s", "747",
        "jetliners", "."};
    // POS tag for each token, in the same order as sent
    String[] pos = new String[]{"NNP", "NNP", "NNP", "POS", "NNP", "NN",
        "VBD", "PRP", "VBD", "DT", "JJ", "NN", "VBG", "PRP$", "NN", "IN",
        "NNP", "NNP", "TO", "VB", "JJ", "NNS", "IN", "NNP", "POS", "CD", "NNS",
        "."};
    // One chunk tag per token
    String[] tag = chunker.chunk(sent, pos);
}
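Then look at the tag array for the types you want. For noun phrases these are B-NP (the first token of a phrase) and I-NP (a token that continues it).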
http://opennlp.apache.org/documentation/1.5.3/manual/opennlp.html#tools.parser.chunking.api
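To go from the tag array to actual phrases, you have to group each B-NP token with the I-NP tokens that follow it. Here is a minimal sketch of that grouping in plain Java, with no OpenNLP dependency. The class name NounPhraseGrouper and the method nounPhrases are made up for illustration, and the tags in main are hand-written for the example, not real chunker output.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of OpenNLP.
class NounPhraseGrouper {

    // Joins each B-NP token with the I-NP tokens that directly follow it.
    static List<String> nounPhrases(String[] tokens, String[] tags) {
        List<String> phrases = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            if (tags[i].equals("B-NP")) {
                // A new phrase starts: flush the previous one, if any
                if (current.length() > 0) {
                    phrases.add(current.toString());
                }
                current.setLength(0);
                current.append(tokens[i]);
            } else if (tags[i].equals("I-NP") && current.length() > 0) {
                // Continuation of the phrase currently being built
                current.append(' ').append(tokens[i]);
            } else {
                // Any other tag ends the current phrase
                if (current.length() > 0) {
                    phrases.add(current.toString());
                }
                current.setLength(0);
            }
        }
        // Flush a phrase that runs to the end of the sentence
        if (current.length() > 0) {
            phrases.add(current.toString());
        }
        return phrases;
    }

    public static void main(String[] args) {
        String[] tokens = {"it", "signed", "a", "tentative", "agreement"};
        // Assumed tags, written by hand for this example
        String[] tags = {"B-NP", "B-VP", "B-NP", "I-NP", "I-NP"};
        System.out.println(nounPhrases(tokens, tags)); // prints [it, a tentative agreement]
    }
}
```

In the chunk() method above you would pass sent and tag to this helper. As far as I know ChunkerME also has a chunkAsSpans(sent, pos) method that returns the grouped chunks as Span objects, which may save you the manual loop; check the API docs for your version.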
Continuing from your own code: this block prints all the nouns in the sentence (tokens tagged NN, NNP, or NNS), not full noun phrases. Use the getTagNodes() method to get the tokens and their types.
Parse[] topParses = ParserTool.parseLine(line, parser, 1);
// Loop through the parses to get the tag nodes (one node per token)
for (Parse parse : topParses) {
    for (Parse word : parse.getTagNodes()) {
        // Change the types according to your desired types
        String type = word.getType();
        if (type.equals("NN") || type.equals("NNP") || type.equals("NNS")) {
            System.out.println(word);
        }
    }
}
The Parse object is a tree; you can use getParent(), getChildren(), and getType() to navigate the tree.
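// Must be initialized, otherwise add() throws a NullPointerException
List<Parse> nounPhrases = new ArrayList<>();

// Recursively collects every node of type NP below (and including) p
public void getNounPhrases(Parse p) {
    if (p.getType().equals("NP")) {
        nounPhrases.add(p);
    }
    for (Parse child : p.getChildren()) {
        getNounPhrases(child);
    }
}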