Question
I'm trying to get the top 25 submissions of all time from a given subreddit using PRAW:
import praw
subreddit = 'gamedeals'
r = praw.Reddit(user_agent='getting top 25 of all time by /u/sqrg')
submissions = r.get_subreddit(subreddit).get_top_from_all(limit=25)
titlesFile = open("text.txt", 'w')
for s in submissions:
    titlesFile.write(s.title + '\n')
titlesFile.close()
I get the following error:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa3' in position 63: ordinal not in range(128)
So I changed the line inside the for loop to:
titlesFile.write(s.title.encode('utf-8', 'replace') + '\n')
And it works, but in the text.txt file I get &amp; instead of &. I could change them with some string replace function, but is there any way to directly write the correct title? Also, why did I have to use the encode() method?
Answer 1:
Enable the setting that decodes HTML entities:
r = praw.Reddit(user_agent='getting top 25 of all time by /u/sqrg')
r.config.decode_html_entities = True
Config file docs: https://praw.readthedocs.org/en/latest/pages/configuration_files.html
More info here: https://github.com/praw-dev/praw/issues/186
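On the encode() part of the question: in Python 2, writing a unicode string to a file opened with plain open() makes Python encode it implicitly with the ASCII codec, which fails on characters such as u'\xa3' (the pound sign). Opening the file with an explicit encoding avoids the manual encode() call. Below is a minimal sketch of that idea, assuming Python 3.4+ for html.unescape; the write_titles helper and the sample titles are hypothetical stand-ins for the s.title values, and html.unescape is only a fallback for the case where the PRAW setting above is not available.

```python
import html
import io
import os
import tempfile


def write_titles(titles, path):
    """Write one title per line as UTF-8, decoding HTML entities first."""
    # Passing encoding="utf-8" makes the file object do the encoding,
    # so no per-line .encode() call is needed.
    with io.open(path, "w", encoding="utf-8") as f:
        for title in titles:
            # html.unescape turns "&amp;" back into "&".
            f.write(html.unescape(title) + "\n")


# Hypothetical titles standing in for the s.title values from PRAW.
sample_titles = ["Fish &amp; Chips for \u00a35", "Plain title"]
out_path = os.path.join(tempfile.gettempdir(), "titles_demo.txt")
write_titles(sample_titles, out_path)

# Read the file back to check what was actually written.
with io.open(out_path, encoding="utf-8") as f:
    lines = f.read().splitlines()
```

With the loop from the question, the call would look like write_titles((s.title for s in submissions), "text.txt").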
Source: https://stackoverflow.com/questions/19759358/getting-amp-instead-of-in-title-return-using-praw