Filter strings in a for loop by list of words for reddit bot

删除回忆录丶 submitted on 2019-12-25 08:49:38

Question


So I'm trying to write a reddit bot to find articles with certain words in the title. Here's what I have so far:

top_posts = page.hot(limit=20)
for post in top_posts:
    title = post.title
    if title.lower() in ['word1',  'word2', 'word3']:
        print(title)

If I replace the last 2 lines with...

    if 'word1' in title.lower():
        print(title)

then it prints the titles that contain word1, but when I put the words into a list it prints nothing. I want to use a list so I can match different spellings of the same word. What am I doing wrong here?


Answer 1:


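The operands are in the wrong order: title.lower() in [...] asks whether the entire lowercased title is equal to one of the list items, not whether the title contains one of them.

Use any to check whether any of the words in the list is contained in the title: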

if any(wd in title.lower() for wd in ['word1',  'word2', 'word3']):
    print(title)

To check if all of the words are contained in title, use all instead.
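
As a small sketch of the difference between the two, the snippet below runs both checks against the same title. The title and word list are made-up placeholders, not data from the question.

```python
# Hypothetical sample data: placeholder words and a made-up title.
words = ['word1', 'word2', 'word3']
title = 'Word1 and WORD2 in one title'

# Lowercase once so the comparison is case-insensitive.
lowered = title.lower()

# any(): True as soon as one word is found in the title.
has_any = any(wd in lowered for wd in words)

# all(): True only if every word is found in the title.
has_all = all(wd in lowered for wd in words)

print(has_any)  # True: 'word1' and 'word2' are present
print(has_all)  # False: 'word3' is missing
```

Both functions stop early: any stops at the first match and all stops at the first miss, so neither scans the whole list when it doesn't need to.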




Answer 2:


title.lower() in ['word1',  'word2', 'word3']

This checks exactly what it says: Whether title.lower(), the lowercase title, is in the list of words.

This will work in cases where title is a single word, for example:

>>> title = 'Word1'
>>> title.lower() in ['word1', 'word2', 'word3']
True

But of course, this will not work when title is an actual sentence that contains multiple words. title = 'Word1 foo bar' will never be an element of that single-word list.

So you have to check, for every word in your word list, whether it is contained in the title string:

>>> title = 'Word1 foo bar'
>>> 'word1' in title.lower()
True
>>> 'word2' in title.lower()
False
>>> 'word3' in title.lower()
False

You could do that in a loop and break out of it as soon as you hit a positive result:

>>> def titleContainsWords(title, words):
...     for word in words:
...         if word in title:
...             return True
...     return False

>>> wordlist = ['word1', 'word2', 'word3']
>>> titleContainsWords(title.lower(), wordlist)
True

This is such a common task that there is also a shorter way to do it, combining the any() function with a generator expression:

>>> any(word in title.lower() for word in wordlist)
True
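
One thing to keep in mind is that word in title.lower() is a substring test, so 'word1' also matches inside a longer word such as 'password1'. If that matters for the bot, one possible variation is to match whole words with a regular expression. This is only a sketch: the helper name title_matches and the sample titles are hypothetical, and it assumes the words are plain text rather than patterns.

```python
import re

def title_matches(title, words):
    """Return True if any of the words appears in title as a whole word."""
    if not words:
        # An empty alternation would match at any word boundary, so bail out.
        return False
    # re.escape keeps characters like '.' or '+' in a word literal.
    alternatives = '|'.join(re.escape(w) for w in words)
    # \b anchors the match at word boundaries on both sides.
    pattern = r'\b(?:' + alternatives + r')\b'
    return re.search(pattern, title, flags=re.IGNORECASE) is not None

wordlist = ['word1', 'word2', 'word3']
print(title_matches('Word1 foo bar', wordlist))     # True
print(title_matches('password1 leaked', wordlist))  # False, not a whole word
print('word1' in 'password1 leaked')                # True with the substring test
```

Whether whole-word matching is the better choice depends on the spellings you want to catch; the plain substring test is simpler and is enough when partial matches are acceptable.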


Source: https://stackoverflow.com/questions/46285510/filter-strings-in-a-for-loop-by-list-of-words-for-reddit-bot
