Negation handling in sentiment analysis

走了就别回头了 2021-02-09 04:32

I need a little help here: I need to identify negated phrases such as "not good" and "not bad", and then determine the polarity (negative or positive) of the sentiment. I

3 answers
  • 2021-02-09 05:02

    Negation handling is a broad field with many possible implementations. Here is sample code that negates a sequence of text and stores the negated unigrams, bigrams, and trigrams in not_ form. Note that nltk is not used here; simple text processing is used instead.

    # negate_sequence(text)
    #   text: sentence to process (creation of uni/bi/trigrams
    #    is handled here)
    #
    # Detects negations and transforms negated words into 'not_' form
    #
    def negate_sequence(text):
        negation = False
        delims = "?.,!:;"
        result = []
        words = text.split()
        prev = None
        pprev = None
        for word in words:
            stripped = word.strip(delims).lower()
            negated = "not_" + stripped if negation else stripped
            result.append(negated)
            if prev:
                bigram = prev + " " + negated
                result.append(bigram)
                if pprev:
                    trigram = pprev + " " + bigram
                    result.append(trigram)
                pprev = prev
            prev = negated
    
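            # Match whole tokens: a substring test would also fire on
            # unrelated words such as "know" or "note".
            if stripped in ("not", "no") or stripped.endswith("n't"):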
                negation = not negation
    
            if any(c in word for c in delims):
                negation = False
    
        return result
    

    If we run this program on a sample input text = "I am not happy today, and I am not feeling well", we obtain the following sequences of unigrams, bigrams, and trigrams:

    [   'i',
        'am',
        'i am',
        'not',
        'am not',
        'i am not',
        'not_happy',
        'not not_happy',
        'am not not_happy',
        'not_today',
        'not_happy not_today',
        'not not_happy not_today',
        'and',
        'not_today and',
        'not_happy not_today and',
        'i',
        'and i',
        'not_today and i',
        'am',
        'i am',
        'and i am',
        'not',
        'am not',
        'i am not',
        'not_feeling',
        'not not_feeling',
        'am not not_feeling',
        'not_well',
        'not_feeling not_well',
        'not not_feeling not_well']
    

    We may subsequently store these n-grams in an array for future retrieval and analysis. Treat each not_ word as having the opposite [sentiment, polarity] of the one you have defined for its un-negated counterpart.
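
    As a rough sketch of that last step, the snippet below flips the polarity of not_ tokens when looking them up. The score_tokens function and the toy lexicon are hypothetical and only for illustration; they assume each word has a single numeric polarity, and only unigrams are scored.

```python
def score_tokens(tokens, lexicon):
    """Sum lexicon polarities over tokens, flipping the sign for 'not_' forms."""
    total = 0.0
    for token in tokens:
        if " " in token:
            continue  # skip bigrams/trigrams; this sketch scores unigrams only
        if token.startswith("not_"):
            # Negated word: subtract the polarity of its plain counterpart.
            total -= lexicon.get(token[len("not_"):], 0.0)
        else:
            total += lexicon.get(token, 0.0)
    return total


# Hypothetical toy lexicon (assumption, not from the answer above).
lexicon = {"happy": 1.0, "well": 0.5, "bad": -1.0}
tokens = ["i", "am", "i am", "not", "am not", "not_happy", "not not_happy"]
print(score_tokens(tokens, lexicon))  # -1.0
```

    Words missing from the lexicon contribute nothing, so "not" itself is neutral here and only the not_ prefix carries the negation.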

  • 2021-02-09 05:09

    It's been a while since I worked on sentiment analysis, so I'm not sure what the current state of this area is, and in any case I have never used nltk for it, so I can't point you to anything there. In general, though, I think it's safe to say that this is an active area of research and an essential part of NLP, and that it certainly isn't a 'solved' problem yet. It's one of the finer, more interesting fields of NLP, involving irony, sarcasm, and the scope of negations. Coming up with a correct analysis often means interpreting a lot of context, domain, and discourse information, which isn't straightforward at all. You may want to look at this topic: Can an algorithm detect sarcasm. Some googling will probably give you a lot more information.

    In short, your question is far too broad for a specific answer.

    Also, I wonder what you mean by "I did everything except handling the negations". Do you mean you identified 'negative' words? Have you considered that this information can be conveyed by much more than the words not, no, etc.? Consider for example "Your solution was not good" vs. "Your solution was suboptimal". What exactly you are looking for, and what will suffice in your situation, obviously depends on the context and domain of application. This probably wasn't the answer you were hoping for, but I'd suggest you do a bit more research (a lot of smart things have been done by smart people in this field).

  • 2021-02-09 05:24

    This seems to work decently well as a poor man's word negation in Python. It's definitely not perfect, but it may be useful in some cases. It takes a spaCy sentence object (anything that iterates over tokens, such as a Doc or Span, works).

    def word_is_negated(word):
        """Return True if the token has a 'neg' dependent, or is a verb below a verb that has one."""
    
        for child in word.children:
            if child.dep_ == 'neg':
                return True
    
        if word.pos_ in {'VERB'}:
            for ancestor in word.ancestors:
                if ancestor.pos_ in {'VERB'}:
                    for child2 in ancestor.children:
                        if child2.dep_ == 'neg':
                            return True
    
        return False
    
    def find_negated_wordSentIdxs_in_sent(sent, idxs_of_interest=None):
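        """Return the set of positions in sent whose tokens are negated, optionally limited to idxs_of_interest."""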
    
        negated_word_idxs = set()
        for word_sent_idx, word in enumerate(sent):
            if idxs_of_interest:
                if word_sent_idx not in idxs_of_interest:
                    continue
            if word_is_negated(word):
                negated_word_idxs.add(word_sent_idx)
    
        return negated_word_idxs
    

    Call it like this:

    import spacy
    nlp = spacy.load('en_core_web_lg')
    find_negated_wordSentIdxs_in_sent(nlp("I have hope, but I do not like summer"))
    

    EDIT: As @Amandeep pointed out, depending on your use case, you may also want to include nouns, adjectives, and adverbs (the 'NOUN', 'ADJ', and 'ADV' tags) in the line: if word.pos_ in {'VERB'}:.
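
    One possible way to make that set configurable is sketched below. The function name word_is_negated_pos and the negatable_pos parameter are hypothetical, not part of the code above. The sketch widens only the check on the word itself, as the edit describes, and leaves the ancestor check restricted to verbs.

```python
def word_is_negated_pos(word, negatable_pos=frozenset({"VERB", "NOUN", "ADJ", "ADV"})):
    """Variant of word_is_negated with a configurable part-of-speech set.

    `word` is expected to behave like a spaCy Token (children, ancestors,
    pos_, dep_). `negatable_pos` is a hypothetical parameter.
    """
    # A direct 'neg' dependent (e.g. "not" attached to "like") negates the word.
    if any(child.dep_ == "neg" for child in word.children):
        return True

    # Words in the allowed set inherit negation from a negated verb ancestor.
    if word.pos_ in negatable_pos:
        for ancestor in word.ancestors:
            if ancestor.pos_ == "VERB":
                if any(child.dep_ == "neg" for child in ancestor.children):
                    return True

    return False
```

    Passing negatable_pos={"VERB"} should reproduce the behaviour of the original function.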
