python nltk keyword extraction from sentence

情话喂你 2021-02-10 03:09

"First thing we do, let's kill all the lawyers." - William Shakespeare

Given the quote above, I would like to pull out "k

3 answers
  •  孤城傲影
    2021-02-10 03:42

    One simple approach is to keep a stop word list for each part-of-speech tag (NN, VB, etc.). These lists hold high-frequency words that usually add little semantic content to a sentence.

    The snippet below shows distinct lists for each type of word token, but you could just as well employ a single stop word list for both verbs and nouns (such as this one).

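    # Stop words per POS tag: frequent words that carry little meaning.
    stop_words = dict(
        NNP=['first', 'second'],
        NN=['thing'],
        VBP=['do', 'done'],
        VB=[],
        NNS=['lets', 'things'],
    )


    def filter_stop_words(pos_list):
        # pos_list holds (token, tag) pairs, e.g. from nltk.pos_tag.
        # Tags with no stop word list fall back to an empty list.
        return [[token, token_type]
                for token, token_type in pos_list
                if token.lower() not in stop_words.get(token_type, [])]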
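
As a rough usage sketch, the filter can be tried on the quote from the question. The tagged list here is hand-written to stand in for the output of a tagger such as `nltk.pos_tag`, so the tags a real tagger assigns may differ. The lookup uses `.get` with an empty-list default so that tags without a stop word list do not raise a `KeyError`. The helper `keep_nouns_and_verbs` is a hypothetical addition for illustration and is not part of the original answer.

```python
# Per-tag stop word lists, mirroring the ones in the answer.
STOP_WORDS = {
    "NNP": ["first", "second"],
    "NN": ["thing"],
    "VBP": ["do", "done"],
    "VB": [],
    "NNS": ["lets", "things"],
}


def filter_stop_words(pos_list, stop_words=STOP_WORDS):
    """Drop tokens listed as stop words for their own POS tag."""
    return [
        (token, tag)
        for token, tag in pos_list
        if token.lower() not in stop_words.get(tag, [])
    ]


def keep_nouns_and_verbs(pos_list):
    """Hypothetical helper: keep tokens whose tag starts with NN or VB."""
    return [token for token, tag in pos_list if tag.startswith(("NN", "VB"))]


# Hand-written (token, tag) pairs; assumed, not produced by a real tagger.
tagged = [
    ("First", "NNP"), ("thing", "NN"), ("we", "PRP"), ("do", "VBP"),
    ("let", "VB"), ("'s", "PRP"), ("kill", "VB"), ("all", "DT"),
    ("the", "DT"), ("lawyers", "NNS"), (".", "."),
]

filtered = filter_stop_words(tagged)
keywords = keep_nouns_and_verbs(filtered)
print(keywords)
```

With these assumed tags, "First", "thing" and "do" are removed by the stop word lists, and the noun/verb filter then leaves "let", "kill" and "lawyers". Adding "let" to the `VB` list would narrow the result further.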
    
