Question
I am trying to extract Arabic proper names from a text using the Stanford Parser.
For example, if I have the input sentence:
تكريم سعد الدين الشاذلى
then, using the Arabic Stanford parser, the parse tree is:
(ROOT (NP (NN تكريم) (NP (NNP سعد) (DTNNP الدين) (NNP الشاذلى))))
I want to extract the proper name:
سعد الدين الشاذلى
which has the sub-tree:
(NP (NNP سعد) (DTNNP الدين) (NNP الشاذلى))
I tried the approach from a similar question,
but something is wrong in this line:
List<TaggedWord> taggedWords = (Tree) lp.apply(str);
The error comes from assigning a Tree to a List<TaggedWord>.
I also did not understand where to use the suggested taggedYield() function.
Any ideas, please?
Answer 1:
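This is basic Java rather than anything specific to the library: lp.apply(str) returns a Tree, so store the result in a Tree and call taggedYield() on it to get the tagged words. What you want is: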
Tree tree = lp.apply(str);
List<TaggedWord> taggedWords = tree.taggedYield();
for (TaggedWord tw : taggedWords) {
    if (tw.tag().contains("NNP")) {
        System.err.println(tw.word());
    }
}
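Note that the loop above prints each proper-noun token on its own line. If the goal is the full name as one string, as in the question, one option is to join consecutive tokens whose tag contains "NNP". The following is only a sketch in plain Java, with no parser dependency: the class name ProperNameExtractor and the method extractNames are hypothetical, and each {word, tag} array stands in for a TaggedWord, with the pairs hard-coded from the tree in the question rather than produced by the parser.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ProperNameExtractor {

    // Joins runs of consecutive tokens whose tag contains "NNP"
    // (this also matches DTNNP) into single space-separated names.
    // Each element is a {word, tag} pair standing in for a TaggedWord.
    static List<String> extractNames(List<String[]> tagged) {
        List<String> names = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String[] tw : tagged) {
            if (tw[1].contains("NNP")) {
                if (current.length() > 0) {
                    current.append(' ');
                }
                current.append(tw[0]);
            } else if (current.length() > 0) {
                // A non-proper-noun token ends the current name.
                names.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            names.add(current.toString());
        }
        return names;
    }

    public static void main(String[] args) {
        // Word/tag pairs copied by hand from the parse tree in the question.
        List<String[]> tagged = Arrays.asList(
            new String[] {"تكريم", "NN"},
            new String[] {"سعد", "NNP"},
            new String[] {"الدين", "DTNNP"},
            new String[] {"الشاذلى", "NNP"});
        for (String name : extractNames(tagged)) {
            System.out.println(name);
        }
    }
}
```

With the real parser, the same grouping could be applied to the list returned by tree.taggedYield(), reading tw.word() and tw.tag() instead of the array elements. Grouping by adjacency is a simplification: two distinct names that appear side by side would be merged, so walking the NP sub-trees may be more accurate.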
Source: https://stackoverflow.com/questions/6505569/extracting-arabic-proper-names-from-a-text-using-stanford-parser