Getting &amp; instead of & in title return using PRAW

Submitted by 时光总嘲笑我的痴心妄想 on 2019-12-13 17:10:02

Question


I'm trying to get the top 25 submissions of all time from a given subreddit using PRAW:

import praw
subreddit = 'gamedeals'
r = praw.Reddit(user_agent='getting top 25 of all time by /u/sqrg')
submissions = r.get_subreddit(subreddit).get_top_from_all(limit=25)
titlesFile = open("text.txt", 'w')
for s in submissions:
    titlesFile.write(s.title + '\n')
titlesFile.close()

I get the following error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xa3' in position 63: ordinal not in range(128)

So I changed the line inside the for loop to:

titlesFile.write(s.title.encode('utf-8', 'replace') + '\n')

And it works, but in the text.txt file I get &amp; instead of &. I could fix them with a string replace function, but is there a way to write the correct title directly? Also, why did I have to use the encode() method?


Answer 1:


Enable the setting that decodes HTML entities:

r = praw.Reddit(user_agent='getting top 25 of all time by /u/sqrg')
r.config.decode_html_entities = True

Config file docs: https://praw.readthedocs.org/en/latest/pages/configuration_files.html

More info here: https://github.com/praw-dev/praw/issues/186
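
On the second part of the question: in Python 2, s.title is a unicode object, and writing it to a file opened with plain open() makes Python encode it implicitly as ASCII, which fails on characters such as £ (u'\xa3'). Calling encode('utf-8') does that conversion explicitly. If the config setting is not available, the entities can also be decoded by hand. The following is only a minimal sketch, assuming Python 3 and its standard html module rather than anything from PRAW; clean_title, write_titles and the sample titles are hypothetical names and data invented for illustration.

```python
import html
import io
import os
import tempfile


def clean_title(raw_title):
    """Turn HTML entities such as &amp; back into the characters they stand for."""
    return html.unescape(raw_title)


def write_titles(titles, path):
    """Write one decoded title per line, encoded as UTF-8."""
    # Passing encoding= lets the file object do the encoding,
    # so no manual .encode() call is needed on each title.
    with io.open(path, "w", encoding="utf-8") as f:
        for title in titles:
            f.write(clean_title(title) + "\n")


# Hypothetical sample titles standing in for the s.title values (assumption).
sample_titles = ["Tom &amp; Jerry bundle", "Game for \u00a35 &lt;today only&gt;"]
out_path = os.path.join(tempfile.gettempdir(), "titles_demo.txt")
write_titles(sample_titles, out_path)
```

In the original loop this would amount to passing each s.title through clean_title before writing it, instead of doing a string replace for every entity.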



Source: https://stackoverflow.com/questions/19759358/getting-amp-instead-of-in-title-return-using-praw
