NLTK Named Entity recognition to a Python list

再見小時候 2020-11-28 08:14

I used NLTK's ne_chunk to extract named entities from a text:

my_sent = "WASHINGTON -- In the wake of a string of abuses by New York police of


        
7 Answers
  • 2020-11-28 09:04

    An nltk.Tree is a list subclass. Chunks are subtrees; non-chunked words are plain (word, tag) tuples. So let's go down the list, extract the words from each chunk, and join them.

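    >>> import nltk
    >>> # ne_chunk expects POS-tagged tokens, not a raw string
    >>> chunked = nltk.ne_chunk(nltk.pos_tag(nltk.word_tokenize(my_sent)))
    >>> [" ".join(w for w, t in elt) for elt in chunked if isinstance(elt, nltk.Tree)]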
    ['WASHINGTON', 'New York', 'Loretta E. Lynch', 'Brooklyn']
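
If you also want the entity type, the same walk can return (text, label) pairs. Below is a minimal sketch, not a definitive implementation. `named_entities` and `DemoTree` are hypothetical names introduced here: `DemoTree` is a hand-built stand-in that mimics the list behaviour and `label()` method of `nltk.Tree`, so the example runs without the NLTK models. The sample tree and its labels are written by hand for illustration and are not real tagger output.

```python
class DemoTree(list):
    """Hypothetical stand-in for nltk.Tree: a list that also carries a label."""

    def __init__(self, label, children):
        super().__init__(children)
        self._label = label

    def label(self):
        return self._label


def named_entities(chunked):
    """Return (entity_text, label) pairs from an ne_chunk-style tree."""
    entities = []
    for elt in chunked:
        # Chunks are subtrees exposing .label(); other tokens are (word, tag) tuples.
        if hasattr(elt, "label"):
            text = " ".join(word for word, tag in elt)
            entities.append((text, elt.label()))
    return entities


# Hand-built sample shaped like ne_chunk output (labels are illustrative).
sample = DemoTree("S", [
    DemoTree("GPE", [("WASHINGTON", "NNP")]),
    ("--", ":"),
    ("In", "IN"),
    DemoTree("GPE", [("New", "NNP"), ("York", "NNP")]),
    ("police", "NN"),
])

print(named_entities(sample))
```

With NLTK installed and its models downloaded, the same function should accept the tree returned by `nltk.ne_chunk(nltk.pos_tag(nltk.word_tokenize(text)))`. Passing `binary=True` to `ne_chunk` labels every chunk as `NE` instead of a specific type.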
    