Question
Say I want to stream posts from the subreddit "news". Posts arrive very frequently, and not every post is worth reading, so I would like to keep only the good ones by streaming the "hot" list. I am not sure whether that, or something similar, is possible.
Normally, this is what I do to stream posts:
    for submission in subreddit.stream.submissions():
        if not submission.stickied:
            print(str(submission.title) + " " + str(submission.url) + "\n")
And this would filter the posts, but not stream them:
    for submission in subreddit.hot(limit=10):
        print(str(submission.title) + " " + str(submission.url) + "\n")
So, any ideas about how I could stream and filter posts at the same time?
Thanks
Answer 1:
Streaming hot posts is an incongruous idea.
The point of a stream in PRAW is to get each post or comment (almost) immediately after it is submitted to Reddit. The hot listing, on the other hand, contains the items which are deemed to be currently interesting, ordered by a score which is somewhat proportional to points divided by age.
"However the posts are very frequent and we can't say that every post is worthy."
Because it takes time for Reddit users to see posts and vote on them, it doesn't make much sense to evaluate whether a post is worthy, as measured by score, immediately after it is posted.
If your goal is to perform some action on every post that makes it into the top n of a subreddit, you could check the front page at a regular interval, performing your action for any post you haven't already seen. As an example:
    import praw
    import time
    reddit = praw.Reddit()  # must be edited to properly authenticate
    subreddit = reddit.subreddit('news')
    seen_submissions = set()
    while True:
        for submission in subreddit.hot(limit=10):
            if submission.fullname not in seen_submissions:
                seen_submissions.add(submission.fullname)
                print('{} {}\n'.format(submission.title, submission.url))
        time.sleep(60)  # sleep for a minute (60 seconds)
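The polling loop above can also be wrapped in a generator so that the calling code reads like a stream. This is only a minimal sketch: `poll_unseen` is a hypothetical helper, not part of PRAW, and the `max_polls` and `sleep` parameters are assumptions added so the loop can be stopped and run without waiting. Stand-in `Post` objects replace real submissions so the sketch runs without Reddit credentials.

```python
import time
from collections import namedtuple


def poll_unseen(fetch_listing, interval=60, max_polls=None, sleep=time.sleep):
    """Yield each item the first time it appears in a polled listing.

    fetch_listing: zero-argument callable returning an iterable of objects
        with a `fullname` attribute, e.g. lambda: subreddit.hot(limit=10).
    max_polls: stop after this many polls; None means poll forever.
    sleep: injected so the wait can be skipped when experimenting.
    """
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for item in fetch_listing():
            if item.fullname not in seen:
                seen.add(item.fullname)
                yield item
        polls += 1
        # Only wait if another poll is going to happen.
        if max_polls is None or polls < max_polls:
            sleep(interval)


# Stand-in data: two successive snapshots of a "hot" listing.
Post = namedtuple("Post", "fullname title")
snapshots = iter([
    [Post("t3_a", "first"), Post("t3_b", "second")],
    [Post("t3_b", "second"), Post("t3_c", "third")],
])

new_titles = [
    post.title
    for post in poll_unseen(lambda: next(snapshots), max_polls=2,
                            sleep=lambda seconds: None)
]
print(new_titles)
```

With PRAW, the same helper could presumably be driven by `poll_unseen(lambda: subreddit.hot(limit=10))`. Note that the `seen` set grows for as long as the loop runs, so a long-lived script may want to cap its size.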
Answer 2:
To add to jarhill0's answer, you can also page through the listing by passing "after" in the params.
    import praw
    import time
    reddit = praw.Reddit()  # must be edited to properly authenticate
    subreddit = reddit.subreddit('news')
    seen_submissions = set()
    while True:
        params = None
        for _ in range(10):  # get first 10 pages of 'hot'
            for submission in subreddit.hot(limit=10, params=params):
                if submission.fullname not in seen_submissions:
                    seen_submissions.add(submission.fullname)
                    print('{} {}\n'.format(submission.title, submission.url))
            params = {"after": submission.fullname}
        time.sleep(60)  # sleep for a minute (60 seconds)
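One detail of the "after" approach is what happens when a page comes back empty: the loop variable then still points at the last item of the previous page, so the same page would be requested again. The sketch below shows one way to guard against that. `iter_pages` and `fake_fetch_page` are hypothetical names introduced for illustration, and the fake listing stands in for Reddit so the example runs without credentials.

```python
from collections import namedtuple


def iter_pages(fetch_page, pages=10):
    """Yield items from up to `pages` consecutive pages of a listing.

    fetch_page: callable taking a params dict (or None for the first page)
        and returning an iterable of objects with a `fullname` attribute,
        e.g. lambda params: subreddit.hot(limit=10, params=params).
    """
    params = None
    for _ in range(pages):
        last = None
        for item in fetch_page(params):
            last = item
            yield item
        if last is None:
            break  # empty page: nothing further to request
        params = {"after": last.fullname}


# Stand-in listing of five posts, served two per page.
Post = namedtuple("Post", "fullname")
ALL_POSTS = [Post("t3_{}".format(i)) for i in range(5)]


def fake_fetch_page(params, page_size=2):
    start = 0
    if params is not None:
        names = [post.fullname for post in ALL_POSTS]
        start = names.index(params["after"]) + 1
    return ALL_POSTS[start:start + page_size]


paged_names = [post.fullname for post in iter_pages(fake_fetch_page)]
print(paged_names)
```

As far as I know, PRAW's listing generators also follow pagination on their own when given a larger limit, so `subreddit.hot(limit=100)` may be a simpler way to reach the same items.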
Source: https://stackoverflow.com/questions/50500360/can-you-stream-posts-that-have-made-it-to-hot