Negation handling in sentiment analysis

走了就别回头了 2021-02-09 04:32

I need a little help here: I need to identify negated phrases such as "not good" and "not bad", and then determine the polarity (negative or positive) of the sentiment.

3 answers
  •  心在旅途
    2021-02-09 05:02

    Negation handling is a broad field with many possible implementations. Here is sample code that marks negation in a piece of text and stores the negated unigrams, bigrams, and trigrams in not_ form. Note that nltk isn't used here; plain string processing is enough.

    # negate_sequence(text)
    #   text: sentence to process (unigrams, bigrams, and trigrams
    #    are built here)
    #
    # Detects negations and rewrites negated words in 'not_' form.
    # Returns one flat list containing the unigrams, bigrams, and trigrams.
    #
    def negate_sequence(text):
        negation = False
        delims = "?.,!:;"
        result = []
        words = text.split()
        prev = None
        pprev = None
        for word in words:
            stripped = word.strip(delims).lower()
            negated = "not_" + stripped if negation else stripped
            result.append(negated)
            if prev:
                bigram = prev + " " + negated
                result.append(bigram)
                if pprev:
                    trigram = pprev + " " + bigram
                    result.append(trigram)
                pprev = prev
            prev = negated
    
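            # Toggle negation on a negation word. Compare whole tokens (plus the
            # "n't" suffix) so that words such as "know" or "now" don't trigger it.
            if stripped in ("not", "no") or stripped.endswith("n't"):
                negation = not negation
    
            # Punctuation ends the scope of the current negation.
            if any(c in word for c in delims):
                negation = False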
    
        return result
    

    If we run this program on a sample input text = "I am not happy today, and I am not feeling well", we obtain the following sequences of unigrams, bigrams, and trigrams:

    [   'i',
        'am',
        'i am',
        'not',
        'am not',
        'i am not',
        'not_happy',
        'not not_happy',
        'am not not_happy',
        'not_today',
        'not_happy not_today',
        'not not_happy not_today',
        'and',
        'not_today and',
        'not_happy not_today and',
        'i',
        'and i',
        'not_today and i',
        'am',
        'i am',
        'and i am',
        'not',
        'am not',
        'i am not',
        'not_feeling',
        'not not_feeling',
        'am not not_feeling',
        'not_well',
        'not_feeling not_well',
        'not not_feeling not_well']
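
    The list above interleaves unigrams, bigrams, and trigrams in the order they were produced. If they need to be looked up separately later, one option is to bucket them by word count. This is only a minimal sketch: group_by_ngram_size is a hypothetical helper name, not part of the code above, and the sample list is just the first few entries of the output.

```python
from collections import defaultdict


def group_by_ngram_size(tokens):
    """Map n -> list of n-grams, where n is the number of space-separated words.

    Hypothetical helper: assumes each n-gram is a single string whose words
    are separated by one space, as in the output shown above.
    """
    groups = defaultdict(list)
    for token in tokens:
        groups[len(token.split())].append(token)
    return dict(groups)


# First few entries of the sample output, used here as illustrative input.
sample = ["i", "am", "i am", "not", "am not", "i am not", "not_happy"]
grouped = group_by_ngram_size(sample)
print(grouped[1])  # unigrams
print(grouped[2])  # bigrams
print(grouped[3])  # trigrams
```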
    

    We can then store these n-grams in an array for later retrieval and analysis. Treat each not_ word as carrying the opposite of the [sentiment, polarity] that you have defined for its un-negated counterpart.
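
    One possible reading of that last step, as a rough sketch rather than a definitive scoring method: look up each unigram in a polarity lexicon and flip the sign when it carries the not_ prefix. The LEXICON values and the function names token_polarity and sequence_polarity are assumptions made up for illustration; in practice you would substitute whatever lexicon you have defined.

```python
# Hypothetical toy lexicon: word -> polarity score.
# The values are assumed for illustration, not taken from any real resource.
LEXICON = {"good": 1.0, "happy": 1.0, "well": 0.5, "bad": -1.0, "sad": -1.0}

NEGATION_PREFIX = "not_"


def token_polarity(token, lexicon=LEXICON):
    """Return the polarity of one unigram, with the sign flipped for not_ forms."""
    if token.startswith(NEGATION_PREFIX):
        base = token[len(NEGATION_PREFIX):]
        return -lexicon.get(base, 0.0)
    return lexicon.get(token, 0.0)


def sequence_polarity(tokens, lexicon=LEXICON):
    """Sum unigram polarities; bigrams and trigrams (containing a space) are skipped."""
    return sum(token_polarity(t, lexicon) for t in tokens if " " not in t)


# A few tokens in the shape produced by the negation step above.
tokens = ["i", "am", "i am", "not", "am not", "not_happy", "not not_happy", "not_well"]
print(token_polarity("not_bad"))   # 1.0
print(token_polarity("not_good"))  # -1.0
print(sequence_polarity(tokens))   # -1.5
```

    Simple sign flipping treats "not bad" as exactly as positive as "good", which is a crude approximation; scaling the flipped score down is one alternative if that turns out to be too strong.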
