Generally, the head of a noun phrase is its rightmost noun. In the example below, "tree" is the head of the parent NP.
NLTK has a built-in way to read a bracketed parse string into a Tree object, Tree.fromstring:
>>> from nltk.tree import Tree
>>> parsestr='(ROOT (S (NP (NP (DT The) (JJ old) (NN oak) (NN tree)) (PP (IN from) (NP (NNP India)))) (VP (VBD fell) (PRT (RP down)))))'
>>> for i in Tree.fromstring(parsestr).subtrees():
...     if i.label() == 'NP':
...         print(i)
...
(NP
  (NP (DT The) (JJ old) (NN oak) (NN tree))
  (PP (IN from) (NP (NNP India))))
(NP (DT The) (JJ old) (NN oak) (NN tree))
(NP (NNP India))
>>> for i in Tree.fromstring(parsestr).subtrees():
...     if i.label() == 'NP':
...         print(i.leaves())
...
['The', 'old', 'oak', 'tree', 'from', 'India']
['The', 'old', 'oak', 'tree']
['India']
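The outer NP's leaves end in "India", so taking the last leaf of every NP does not give "tree" for the parent NP. One possible refinement, offered only as a rough sketch, is to look at an NP's direct children and take the rightmost one tagged as a noun. The helper name rightmost_noun_child is hypothetical (it is not part of NLTK), and the tag test assumes Penn Treebank tags beginning with NN.

```python
from nltk.tree import Tree

def rightmost_noun_child(np):
    """Return the word under the rightmost direct child tagged NN*, or None."""
    for child in reversed(np):
        # Only inspect direct children; nested phrases such as PP are skipped.
        if isinstance(child, Tree) and child.label().startswith('NN'):
            return child[0]
    return None

parsestr = '(ROOT (S (NP (NP (DT The) (JJ old) (NN oak) (NN tree)) (PP (IN from) (NP (NNP India)))) (VP (VBD fell) (PRT (RP down)))))'
tree = Tree.fromstring(parsestr)

# Pair the naive "last leaf" guess with the direct-child guess for each NP.
results = [(np.leaves()[-1], rightmost_noun_child(np))
           for np in tree.subtrees(lambda t: t.label() == 'NP')]
for last_leaf, noun in results:
    print(last_leaf, noun)
```

For the outer NP the helper returns None rather than "India", because that NP has no noun among its direct children; its head has to be found inside its NP child.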
Note that the rightmost noun is not always the head noun of an NP, e.g.
>>> s = '(ROOT (S (NP (NN Carnac) (DT the) (NN Magnificent)) (VP (VBD gave) (NP ((DT a) (NN talk))))))'
>>> Tree.fromstring(s)
Tree('ROOT', [Tree('S', [Tree('NP', [Tree('NN', ['Carnac']), Tree('DT', ['the']), Tree('NN', ['Magnificent'])]), Tree('VP', [Tree('VBD', ['gave']), Tree('NP', [Tree('', [Tree('DT', ['a']), Tree('NN', ['talk'])])])])])])
>>> for i in Tree.fromstring(s).subtrees():
...     if i.label() == 'NP':
...         print(i.leaves()[-1])
...
Magnificent
talk
Arguably, "Magnificent" can still be the head noun. Another example is when the NP includes a relative clause:
(NP (NP the person) that gave (NP the talk)) went home
The head noun of the subject is "person", but the last leaf node of the NP "the person that gave the talk" is "talk".
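The relative-clause case can be checked with a small sketch. The bracketing below is a hand-written parse (an assumption; a real parser may label it differently), and head_noun is a hypothetical helper implementing a much simplified head rule: take the rightmost direct child tagged NN*, otherwise descend into the first NP child. It is not an NLTK function and will not cover every construction.

```python
from nltk.tree import Tree

def head_noun(np):
    """Simplified NP head rule: rightmost NN* child, else head of first NP child."""
    for child in reversed(np):
        if isinstance(child, Tree) and child.label().startswith('NN'):
            return child[0]
    for child in np:
        if isinstance(child, Tree) and child.label() == 'NP':
            return head_noun(child)
    return None  # no noun found by this simple rule

# Hand-written parse of "the person that gave the talk went home" (assumed).
s = ('(ROOT (S (NP (NP (DT the) (NN person)) '
     '(SBAR (WHNP (WDT that)) (S (VP (VBD gave) (NP (DT the) (NN talk)))))) '
     '(VP (VBD went) (ADVP (RB home)))))')

subject = Tree.fromstring(s)[0][0]   # ROOT -> S -> subject NP
last_leaf = subject.leaves()[-1]     # naive guess: last word of the NP
head = head_noun(subject)            # guess from the simplified head rule
print(last_leaf, head)
```

Here the last leaf of the subject NP is "talk", while the simplified rule picks "person" because it never looks inside the SBAR.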