Python POST request encoding

百般思念 submitted on 2019-12-11 10:23:27

Question


Here's the situation: I'm sending POST requests and trying to fetch the response with Python. The problem is that non-Latin letters in the response come out distorted. This doesn't happen when I fetch the same page through a direct link (with no search results), but a POST request doesn't generate a link I could use instead.

Here's what I do:

import urllib
import urllib2
url = 'http://donelaitis.vdu.lt/main_helper.php?id=4&nr=1_2_11'
data = 'q=bus&ieskoti=true&lang1=en&lang2=en+-%3E+lt+%28+71813+lygiagre%C4%8Di%C5%B3+sakini%C5%B3+%29&lentele=vertikalus&reg=false&rodyti=dalis&rusiuoti=freq' 
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
the_page = response.read()
file = open("pagesource.txt", "w")
file.write(the_page)
file.close()

Whenever I try

thepage = the_page.encode('utf-8')

I get this error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 1008: ordinal not in range(128)

Whenever I try to change the response header to Content-Type:text/html;charset=utf-8 like this:

response['Content-Type'] = 'text/html;charset=utf-8'

I get this error:

AttributeError: addinfourl instance has no attribute '__setitem__'

My question: is it possible to edit or remove response or request headers? If not, is there another way to solve this problem other than copying the source into Notepad++ and fixing the encoding manually?

I'm new to Python and data mining, so please let me know if I'm doing something wrong.

Thanks


Answer 1:


Why don't you try thepage = the_page.decode('utf-8') instead of encode? What you want is to go from UTF-8 encoded text to unicode, the encoding-agnostic internal string type.
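To see why encode fails with a UnicodeDecodeError, it may help to know that in Python 2, calling .encode() on a byte string first decodes it implicitly with the ascii codec. The sketch below performs that hidden step explicitly so the failure is visible. The variable raw is a made-up stand-in for the bytes returned by response.read(), and the page is assumed to be UTF-8.

```python
# Hypothetical stand-in for response.read(); assumed to be UTF-8 bytes.
raw = b'sakini\xc5\xb3'

# Python 2's str.encode() first decodes the bytes as ASCII behind the scenes.
# Doing that step explicitly reproduces the kind of error seen in the question.
try:
    raw.decode('ascii')
    error_name = None
except UnicodeDecodeError as exc:
    error_name = type(exc).__name__

# Decoding with the codec the page actually uses succeeds.
text = raw.decode('utf-8')
```

The byte 0xc5 is the first half of a two-byte UTF-8 sequence, which is why the ascii codec rejects it.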




Answer 2:


Two things. Firstly, you don't want to encode the response, you want to decode it:

thepage = the_page.decode('utf-8')

And secondly, you don't want to set the header on the response, you set it on the request, using the add_header method:

req.add_header('Content-Type', 'text/html;charset=utf-8')
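For the decoding step, a minimal sketch of going from the raw bytes to text and then to a file with an explicit encoding might look like this. The bytes in the_page are a made-up stand-in for response.read() (no network call is made), the page is assumed to be UTF-8, and the temporary directory is only there to keep the example self-contained.

```python
import io
import os
import tempfile

# Hypothetical stand-in for the_page = response.read(): raw UTF-8 bytes.
the_page = b'lygiagre\xc4\x8di\xc5\xb3 sakini\xc5\xb3'

# Bytes -> text: decode, do not encode.
text = the_page.decode('utf-8')

# Write the text back out, naming the file encoding explicitly.
out_path = os.path.join(tempfile.mkdtemp(), 'pagesource.txt')
with io.open(out_path, 'w', encoding='utf-8') as f:
    f.write(text)

# Reading the file back as bytes gives the original UTF-8 sequence.
with io.open(out_path, 'rb') as f:
    round_trip = f.read()
```

io.open is used here because it accepts an encoding argument in both Python 2 and Python 3, so the file contents do not depend on the platform default.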


Source: https://stackoverflow.com/questions/9464083/python-post-request-encoding
