Python extracting sentence containing 2 words

逝去的感伤 2021-01-28 22:24

I have the same problem that was discussed in this link Python extract sentence containing word, but the difference is that I want to find 2 words in the same sentence. I need t

3 Answers
  • 2021-01-28 23:18

    If this is what you mean:

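    import re
    txt = "I like to eat apple. Me too. Let's go buy some apples."
    define_words = 'some apple'
    # re.escape keeps any special characters in the phrase from being read as regex syntax
    print(re.findall(r"([^.]*?%s[^.]*\.)" % re.escape(define_words), txt))
    
    Output: [" Let's go buy some apples."]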
    

    You can also try with:

    define_words = input("Enter string: ")
    

    Check if the sentence contains the defined words:

    import re
    txt = "I like to eat apple. Me too. Let's go buy some apples."
    words = 'go apples'.split(' ')
    
    # Split the text into sentences, each ending at a period
    sentences = re.findall(r"([^.]*\.)", txt)
    for sentence in sentences:
        # Keep the sentence only if every word occurs in it (substring match)
        if all(word in sentence for word in words):
            print(sentence)
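
Note that a substring test such as `word in sentence` also matches inside longer words ('apple' is found in 'apples'). If whole-word matches are wanted, one possible sketch is a single regex with one lookahead per word. The function name `sentences_with_all_words` is made up for illustration, and sentences are still assumed to end at a period:

```python
import re

def sentences_with_all_words(text, words):
    """Return period-terminated sentences that contain every word as a whole word."""
    # One lookahead per word: each asserts the word occurs before the next period.
    lookaheads = "".join(r"(?=[^.]*\b%s\b)" % re.escape(w) for w in words)
    pattern = re.compile(lookaheads + r"[^.]*\.")
    return [m.group().strip() for m in pattern.finditer(text)]

txt = "I like to eat apple. Me too. Let's go buy some apples."
print(sentences_with_all_words(txt, ["go", "apples"]))   # ["Let's go buy some apples."]
print(sentences_with_all_words(txt, ["like", "apple"]))  # ['I like to eat apple.']
print(sentences_with_all_words(txt, ["eat", "apples"]))  # []
```

The words may appear in any order, because each lookahead scans the sentence independently.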
    
  • 2021-01-28 23:18

    This would be simple using the TextBlob package together with Python's built-in sets.

    Basically, iterate through the sentences of your text and check whether the set of words in each sentence contains all of your search words. This is a subset test; a plain intersection would also accept sentences that contain only one of the words.

    from textblob import TextBlob
    
    search_words = {"buy", "apples"}
    blob = TextBlob("I like to eat apple. Me too. Let's go buy some apples.")
    matches = []
    for sentence in blob.sentences:
        words = set(sentence.words)
        if search_words <= words:  # subset: every search word is in the sentence
            matches.append(str(sentence))
    print(matches)
    # ["Let's go buy some apples."]
    

    Update: Or, more Pythonically,

    from textblob import TextBlob
    
    search_words = {"buy", "apples"}
    blob = TextBlob("I like to eat apple. Me too. Let's go buy some apples.")
    matches = [str(s) for s in blob.sentences if search_words <= set(s.words)]
    print(matches)
    # ["Let's go buy some apples."]
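
If installing a third-party package is not an option, the same set idea can be roughly sketched with the standard library alone. The sentence splitter and tokenizer below are naive stand-ins (assumptions, not TextBlob's behavior), and `matching_sentences` is a made-up name. The subset comparison `<=` requires every search word to be present, and lowercasing makes the match case-insensitive:

```python
import re

def matching_sentences(text, search_words):
    """Return sentences whose word set contains every search word (case-insensitive)."""
    wanted = {w.lower() for w in search_words}
    matches = []
    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        # Naive tokenizer: runs of letters and apostrophes, lowercased.
        tokens = set(re.findall(r"[a-z']+", sentence.lower()))
        if wanted <= tokens:  # subset test: all search words present
            matches.append(sentence)
    return matches

text = "I like to eat apple. Me too. Let's go buy some apples."
print(matching_sentences(text, ["Buy", "APPLES"]))  # ["Let's go buy some apples."]
print(matching_sentences(text, ["buy", "apple"]))   # []
```

Because whole tokens are compared, 'apple' does not match 'apples' here.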
    
  • 2021-01-28 23:29

    I think you want an answer using NLTK. And I guess that those 2 words don't need to be consecutive, right?

    >>> from nltk.tokenize import sent_tokenize, word_tokenize
    >>> text = "I like to eat apple. Me too. Let's go buy some apples."
    >>> words = ['like', 'apple']
    >>> sentences = sent_tokenize(text)
    >>> for sentence in sentences:
    ...   if all(word in sentence for word in words):
    ...      print(sentence)
    ...
    I like to eat apple.
    