I'm trying to extract (subject, predicate, object) triplets from a sentence. I need more references on how to do this.
I'm working on a similar problem, in Visual Basic. First, I keep a list of subjects (nouns). Second, to extract the predicate, I take the phrase between two known nouns:
(A cat) (sat on) (the mat)
Once the subject list is built from nouns and noun phrases, their positions in a sentence can be replaced with placeholders to form a learned pattern. Then, if the subjects in a new sentence are not detected, a predicate that was learned earlier may still be detected.
Perhaps this is similar to the Snowball algorithm.
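The between-phrase idea above can be sketched roughly as follows. This is in Python rather than Visual Basic and is only an illustration: the function names `extract_between` and `extract_with_learned`, the noun list, and the rule of taking the first two known noun phrases are all assumptions, not the poster's actual code.

```python
import re

def extract_between(sentence, nouns):
    """Return (subject, predicate, object), where the predicate is the
    text between the first two known noun phrases, or None."""
    lowered = sentence.lower()
    hits = []
    # Try longer phrases first so "the mat" wins over a shorter overlap.
    for noun in sorted(nouns, key=len, reverse=True):
        pattern = r"\b" + re.escape(noun.lower()) + r"\b"
        for m in re.finditer(pattern, lowered):
            # Skip matches that overlap a phrase we already accepted.
            if all(m.end() <= s or m.start() >= e for s, e, _ in hits):
                hits.append((m.start(), m.end(), sentence[m.start():m.end()]))
    hits.sort()
    if len(hits) < 2:
        return None
    (_, end1, subj), (start2, _, obj) = hits[0], hits[1]
    predicate = sentence[end1:start2].strip()
    return (subj, predicate, obj) if predicate else None

def extract_with_learned(sentence, predicates):
    """Fallback (hypothetical): when the nouns are unknown, split the
    sentence around a predicate that was learned earlier."""
    lowered = sentence.lower()
    for pred in sorted(predicates, key=len, reverse=True):
        m = re.search(r"\b" + re.escape(pred.lower()) + r"\b", lowered)
        if m:
            left = sentence[:m.start()].strip()
            right = sentence[m.end():].strip()
            if left and right:
                return (left, sentence[m.start():m.end()], right)
    return None

# Learn a predicate from a sentence whose nouns are known...
learned = set()
triple = extract_between("A cat sat on the mat", ["a cat", "the mat"])
if triple:
    learned.add(triple[1])
# ...then reuse it on a sentence whose nouns are not in the list.
fallback = extract_with_learned("The dog sat on the sofa", learned)
```

Here the first call learns the predicate "sat on", and the second call uses it to split a sentence whose nouns were never listed. A real system would need to weigh or filter learned predicates, since a single occurrence is weak evidence.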
The most basic way to do this with acceptable results is to do shallow parsing and then extract NOUN-VERB-NOUN triples. This should work for all SVO (subject–verb–object) languages, such as English. Some tuning may be required, for example to extract only the first triple from a sentence, or to skip extraction when the sentence contains commas. It is a very fast solution: shallow POS tagging is usually O(n), roughly 0.01 sec per sentence, whereas deep parsing (OpenNLP, Stanford Parser) is O(n^3), roughly 0.4 sec per sentence.
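A minimal sketch of the NOUN-VERB-NOUN idea, assuming the sentence has already been POS-tagged with Penn Treebank tags (for instance by a tagger such as NLTK's `pos_tag`). The function name `extract_nvn` and the comma rule used here (give up if a comma appears before the triple is complete) are illustrative assumptions, one possible reading of the tuning mentioned above.

```python
def extract_nvn(tagged):
    """Return the first (noun, verb, noun) triple from a list of
    (word, tag) pairs, or None if no such triple is found."""
    subj = None
    verb = None
    for word, tag in tagged:
        if tag == ",":
            # Assumed rule: a comma before the triple is complete
            # suggests a more complex sentence, so do not extract.
            return None
        if tag.startswith("NN"):
            if verb is None:
                # Keep the latest noun before the verb as the subject.
                subj = word
            else:
                # First noun after the verb closes the triple.
                return (subj, verb, word)
        elif tag.startswith("VB") and subj is not None and verb is None:
            verb = word
    return None

# Hand-tagged input, so the example does not depend on any tagger model.
tagged_sentence = [
    ("The", "DT"), ("cat", "NN"), ("sat", "VBD"),
    ("on", "IN"), ("the", "DT"), ("mat", "NN"),
]
result = extract_nvn(tagged_sentence)
```

The loop is a single pass over the tokens, which is why this approach stays linear in sentence length. It keeps only head words, so "sat on" is reduced to "sat"; collecting the tokens between the verb and the object would be one way to keep the preposition.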
You can use the Stanford Parser API or OpenNLP for part-of-speech tagging and other NLP operations.
For the triplet extraction itself, you can implement one of the techniques described in papers available online. I know a good one to implement: http://ailab.ijs.si/delia_rusu/Papers/is_2007.pdf