Question
I am going to use Stanford CoreNLP 2013 to find phrase heads. I saw this thread.
But the answer was not clear to me, and I couldn't add a comment to continue that thread, so I apologize for the duplicate.
What I have at the moment is the parse tree of a sentence, produced by Stanford CoreNLP (I also tried the CoNLL format that Stanford CoreNLP produces). What I need is exactly the heads of the noun phrases.
I don't know how to use the dependencies and the parse tree to extract the heads of noun phrases.
What I know is that if I have nsubj(x, y), y is the head of the subject. If I have dobj(x, y), y is the head of the direct object. If I have iobj(x, y), y is the head of the indirect object.
However, I am not sure whether this is the correct way to find all phrase heads. If it is, which rules should I add to get the heads of all noun phrases?
It may be worth saying that I need the heads of noun phrases in Java code.
Answer 1:
Since I couldn't comment on the answer given by Chaitanya, I am adding more to his answer here.
The Stanford CoreNLP suite implements the Collins head finder heuristics and a semantic head finder heuristic in the following classes:
- CollinsHeadFinder
- ModCollinsHeadFinder
- SemanticHeadFinder
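All you need to do is instantiate one of the three and do the following, where out is a PrintWriter or PrintStream such as System.out:
    HeadFinder headFinder = new CollinsHeadFinder(); // or ModCollinsHeadFinder / SemanticHeadFinder
    Tree tree = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
    headFinder.determineHead(tree).pennPrint(out);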
You can iterate through the nodes of the tree and determine head words wherever required.
PS: My answer is based on the Stanford CoreNLP suite as released on 20140104.
Here is a simple DFS that extracts the head words of all noun phrases in a sentence:
    import java.util.List;
    import edu.stanford.nlp.trees.HeadFinder;
    import edu.stanford.nlp.trees.Tree;

    public static void dfs(Tree node, Tree parent, HeadFinder headFinder) {
        if (node == null || node.isLeaf()) {
            return;
        }
        // If the node is an NP, collect its terminal nodes to get the words in the NP
        if (node.value().equals("NP")) {
            System.out.println(" Noun Phrase is ");
            List<Tree> leaves = node.getLeaves();
            for (Tree leaf : leaves) {
                System.out.print(leaf.toString() + " ");
            }
            System.out.println();
            System.out.println(" Head string is ");
            System.out.println(node.headTerminal(headFinder, parent));
        }
        // Recurse into the children, passing the current node as their parent
        for (Tree child : node.children()) {
            dfs(child, node, headFinder);
        }
    }
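For intuition about what a head finder does with an NP, here is a library-free toy sketch. It is not CoreNLP's implementation: the Node class, the NOUN_LIKE list and the fallback rule are all hypothetical simplifications, only loosely modeled on the "take the rightmost noun-like child" step of Collins-style rules. The real classes use a much fuller rule table per category.

```java
import java.util.Arrays;
import java.util.List;

final class ToyNpHeadRule {

    /** Minimal constituency node (hypothetical): a label plus children; a leaf has none. */
    static final class Node {
        final String label;
        final List<Node> children;

        Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }

        boolean isLeaf() {
            return children.isEmpty();
        }
    }

    // Assumption: a deliberately short list of labels treated as noun-like.
    private static final List<String> NOUN_LIKE =
            Arrays.asList("NN", "NNS", "NNP", "NNPS", "NP");

    /** Picks the head child: the rightmost noun-like child, else the rightmost child. */
    static Node headChild(Node node) {
        for (int i = node.children.size() - 1; i >= 0; i--) {
            Node child = node.children.get(i);
            if (NOUN_LIKE.contains(child.label)) {
                return child;
            }
        }
        return node.children.get(node.children.size() - 1);
    }

    /** Follows head children down to a leaf and returns that leaf's word. */
    static String headWord(Node node) {
        Node current = node;
        while (!current.isLeaf()) {
            current = headChild(current);
        }
        return current.label;
    }

    public static void main(String[] args) {
        Node np = new Node("NP",
                new Node("DT", new Node("the")),
                new Node("JJ", new Node("quick")),
                new Node("NN", new Node("fox")));
        System.out.println(headWord(np)); // prints "fox"
    }
}
```

The point is only that the head is found by repeatedly choosing one child per node until a word is reached, which is also why headTerminal in the DFS above returns a leaf.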
Answer 2:
You could extract the phrase of interest as an object of the class Tree. You can then use the determineHead(Tree t) method of any class that implements the HeadFinder interface.
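A minimal sketch of that approach, assuming Stanford CoreNLP is on the classpath. The class name NpHeadExample, the helper npHeadWords and the hand-written bracketed tree are illustrative only. In real use the tree would come from the parser rather than from Tree.valueOf. Note that determineHead returns the head child of a node, so the sketch uses headTerminal to walk down to the head word.

```java
import edu.stanford.nlp.trees.CollinsHeadFinder;
import edu.stanford.nlp.trees.HeadFinder;
import edu.stanford.nlp.trees.Tree;

import java.util.ArrayList;
import java.util.List;

class NpHeadExample {

    /** Returns the head word of every NP subtree, visiting subtrees in pre-order. */
    static List<String> npHeadWords(Tree root, HeadFinder headFinder) {
        List<String> heads = new ArrayList<>();
        for (Tree subtree : root) { // a Tree iterates over all of its subtrees
            if (!subtree.isLeaf() && "NP".equals(subtree.value())) {
                Tree head = subtree.headTerminal(headFinder);
                if (head != null) {
                    heads.add(head.value());
                }
            }
        }
        return heads;
    }

    public static void main(String[] args) {
        // Hand-written parse used only for illustration.
        Tree tree = Tree.valueOf(
                "(ROOT (S (NP (DT The) (JJ quick) (NN fox))"
                + " (VP (VBD chased) (NP (DT the) (NN dog)))))");
        System.out.println(npHeadWords(tree, new CollinsHeadFinder()));
    }
}
```

Swapping in ModCollinsHeadFinder or SemanticHeadFinder only changes the constructor call. The comparison against "NP" is exact, so labels carrying function tags such as NP-SBJ would need a looser check.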
Source: https://stackoverflow.com/questions/19431754/using-stanford-parsercorenlp-to-find-phrase-heads