Text search using Python

Posted by 雨燕双飞 on 2019-12-11 02:26:10

Question


I am working on a text search project and am using TextBlob to search for sentences in a text. TextBlob efficiently pulls all the sentences that contain the keywords. However, for effective research I also want to pull out one sentence before and one sentence after each match, which I am unable to figure out.

Below is the code I am using:

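from textblob import TextBlob

def extraxt_sents(Text, word):
    # Comma-separated keywords become a set for quick intersection tests.
    search_words = set(word.split(','))
    # Lowercase the text so that matching is case-insensitive.
    sents = ''.join([s.lower() for s in Text])
    blob = TextBlob(sents)
    # Keep every sentence that shares at least one word with the keywords.
    matches = [str(s) for s in blob.sentences if search_words & set(s.words)]
    print(search_words)
    print(matches)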

Answer 1:


If you want to get the sentences before and after the match, you can either write a loop that remembers the previous sentence, or use slices, like [from:to], on the blob.sentences list.

The best way might be to use the enumerate built-in function.

match_region = [list(map(str, blob.sentences[i-1:i+2]))  # from previous to next sentence
                for i, s in enumerate(blob.sentences)    # i is the index, s is the element
                if search_words & set(s.words)]          # same as your condition

Here, blob.sentences[i-1:i+2] will extract the sublist spanning from index i-1 (inclusive) to index i+2 (exclusive), and map turns the elements in this list into strings.
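To make the slice bounds concrete, here is a small sketch that uses a plain list of strings as a stand-in for blob.sentences (the list contents and the index are made up for illustration):

```python
# Hypothetical stand-in for blob.sentences: five placeholder "sentences".
sentences = ["s0", "s1", "s2", "s3", "s4"]

i = 2  # pretend the match was found at index 2

# The slice starts at i-1 (inclusive) and stops before i+2 (exclusive),
# so it covers the previous element, the match, and the next element.
window = sentences[i - 1:i + 2]

# map(str, ...) is lazy in Python 3, so wrap it in list() to get a list.
window_as_str = list(map(str, window))

print(window_as_str)  # ['s1', 's2', 's3']
```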

Note: You might want to replace i-1 with max(0, i-1); otherwise, for a match in the first sentence, i-1 is -1, which Python interprets as the last element, so the slice [-1:2] typically comes back empty and the match itself is lost. If i+2 is higher than the list's length, on the other hand, this is not a problem, because a slice is simply truncated at the end of the list.
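A minimal sketch of the clamped version follows. So that it runs without TextBlob, it assumes a naive regex sentence splitter and word tokenizer in place of blob.sentences and s.words; the function name extract_with_context and the sample text are hypothetical.

```python
import re

def extract_with_context(text, words):
    """Return, for each matching sentence, [previous, match, next]."""
    search_words = {w.strip().lower() for w in words.split(",")}
    # Assumption: a naive split after ., ! or ? stands in for blob.sentences.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    regions = []
    for i, sentence in enumerate(sentences):
        # Assumption: a simple \w+ tokenizer stands in for s.words.
        tokens = set(re.findall(r"\w+", sentence.lower()))
        if search_words & tokens:
            # max(0, i - 1) keeps the start index from wrapping around to -1.
            regions.append(sentences[max(0, i - 1):i + 2])
    return regions

text = "Cats sleep a lot. Dogs bark loudly. Birds sing at dawn. Fish swim."
regions = extract_with_context(text, "cats,fish")
for region in regions:
    print(region)
```

With this sample text, the match in the first sentence keeps its following sentence, and the match in the last sentence keeps its preceding one; neither window is padded or wrapped.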



Source: https://stackoverflow.com/questions/24865739/text-search-using-python
