问题
I would like to make a POST request to upload a file to a web service (and get response) using python. For example, I can do the following POST request with curl
:
curl -F "file=@style.css" -F output=json http://jigsaw.w3.org/css-validator/validator
How can I make the same request with python urllib/urllib2? The closest I got so far is the following:
with open("style.css", 'r') as f:
content = f.read()
post_data = {"file": content, "output": "json"}
request = urllib2.Request("http://jigsaw.w3.org/css-validator/validator", \
data=urllib.urlencode(post_data))
response = urllib2.urlopen(request)
I got a HTTP Error 500 from the code above. But since my curl
command succeeds, it must be something wrong with my python request?
I am quite new to this topic and please forgive me if the rookie question has very simple answers or mistakes. Thanks in advance for all your helps!
回答1:
Personally I think you should consider the requests library to post files.
url = 'http://jigsaw.w3.org/css-validator/validator'
files = {'file': open('style.css')}
response = requests.post(url, files=files)
Uploading files using urllib2
is not impossible but quite a complicated task: http://pymotw.com/2/urllib2/#uploading-files
回答2:
After some digging around, it seems this post solved my problem. It turns out I need to have the multipart encoder setup properly.
from poster.encode import multipart_encode
from poster.streaminghttp import register_openers
import urllib2
register_openers()
with open("style.css", 'r') as f:
datagen, headers = multipart_encode({"file": f})
request = urllib2.Request("http://jigsaw.w3.org/css-validator/validator", \
datagen, headers)
response = urllib2.urlopen(request)
回答3:
Well, there are multiple ways to do it. As mentioned above, you can send the file in "multipart/form-data". However, the target service may not be expecting this type, in which case you may try some more approaches.
Pass the file object
urllib2 can accept a file object as data
. When you pass this type, the library reads the file as a binary stream and sends it out. However, it will not set the proper Content-Type
header. Moreover, if the Content-Length
header is missing, then it will try to access the len
property of the object, which doesn't exist for the files. That said, you must provide both the Content-Type
and the Content-Length
headers to have the method working:
import os
import urllib2
filename = '/var/tmp/myfile.zip'
headers = {
'Content-Type': 'application/zip',
'Content-Length': os.stat(filename).st_size,
}
request = urllib2.Request('http://localhost', open(filename, 'rb'),
headers=headers)
response = urllib2.urlopen(request)
Wrap the file object
To not deal with the length, you may create a simple wrapper object. With just a little change you can adapt it to get the content from a string if you have the file loaded in memory.
class BinaryFileObject:
"""Simple wrapper for a binary file for urllib2."""
def __init__(self, filename):
self.__size = int(os.stat(filename).st_size)
self.__f = open(filename, 'rb')
def read(self, blocksize):
return self.__f.read(blocksize)
def __len__(self):
return self.__size
Encode the content as base64
Another way is encoding the data
via base64.b64encode
and providing Content-Transfer-Type: base64
header. However, this method requires support on the server side. Depending on the implementation, the service can either accept the file and store it incorrectly, or return HTTP 400
. E.g. the GitHub API won't throw an error, but the uploaded file will be corrupted.
来源:https://stackoverflow.com/questions/27050399/make-an-http-post-request-to-upload-a-file-using-python-urllib-urllib2