python httplib/urllib get filename

这一生的挚爱 提交于 2019-11-28 10:09:31

To get filename from response http headers:

import cgi

response = urllib2.urlopen(URL)
_, params = cgi.parse_header(response.headers.get('Content-Disposition', ''))
filename = params['filename']

To get filename from the URL:

import posixpath
import urlparse 

path = urlparse.urlsplit(URL).path
filename = posixpath.basename(path)

Does not make much sense what you are asking. The only thing that you have is the URL. Either extract the last part from the URL or you may check the HTTP response for something like

content-disposition: attachment;filename="foo.bar"

This header can be set by the server to indicate that the filename is foo.bar. This is usually used for file downloads or something similar.

gmlime

I searched for you question on google and I saw that it was answered in stackoverflow before I believe.

Try looking at this post:

Using urllib2 in Python. How do I get the name of the file I am downloading?

The filename is usually included by the server through the content-disposition header:

content-disposition: attachment; filename=foo.pdf

You have access to the headers through

result = urllib2.urlopen(...)
result.info() <- contains the headers


i>>> import urllib2
ur>>> result = urllib2.urlopen('http://zopyx.com')
>>> print result
<addinfourl at 4302289808 whose fp = <socket._fileobject object at 0x1006dd5d0>>
>>> result.info()
<httplib.HTTPMessage instance at 0x1006fbab8>
>>> result.info().headers
['Date: Mon, 04 Apr 2011 02:08:28 GMT\r\n', 'Server: Zope/(unreleased version, python 2.4.6, linux2) ZServer/1.1

Plone/3.3.4\r\n', 'Content-Length: 15321\r\n', 'Content-Type: text/html; charset=utf-8\r\n', 'Via: 1.1 www.zopyx.com\r\n', 'Cache-Control: max-age=3600\r\n', 'Expires: Mon, 04 Apr 2011 03:08:28 GMT\r\n', 'Connection: close\r\n']

See

http://docs.python.org/library/urllib2.html

Use urllib.request.Request:

import urllib

req = urllib.request.Request(url, method='HEAD')
r = urllib.request.urlopen(req)
print(r.info().get_filename())

Example :

In[1]: urllib.request.urlopen(urllib.request.Request('https://httpbin.org/response-headers?content-disposition=%20attachment%3Bfilename%3D%22example.csv%22', method='HEAD')).info().get_filename()
Out[1]: 'example.csv'
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!