How do you send a HEAD HTTP request in Python 2?

甜味超标 2020-11-22 11:40

What I'm trying to do here is get the headers of a given URL so I can determine the MIME type. I want to be able to see if http://somedomain/foo/ will return a

11 Answers
  • 2020-11-22 11:41

    edit: This answer works, but nowadays you should just use the requests library as mentioned by other answers below.


    Use httplib.

    >>> import httplib
    >>> conn = httplib.HTTPConnection("www.google.com")
    >>> conn.request("HEAD", "/index.html")
    >>> res = conn.getresponse()
    >>> print res.status, res.reason
    200 OK
    >>> print res.getheaders()
    [('content-length', '0'), ('expires', '-1'), ('server', 'gws'), ('cache-control', 'private, max-age=0'), ('date', 'Sat, 20 Sep 2008 06:43:36 GMT'), ('content-type', 'text/html; charset=ISO-8859-1')]
    

    There's also a getheader(name) to get a specific header.
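The interactive session above could be wrapped in a small helper, roughly as sketched here. The names `head_content_type` and `conn_factory` are hypothetical and not part of httplib; the `try`/`except` import is only there so the sketch also runs on Python 3, where the same module is called `http.client`.

```python
try:
    import httplib  # Python 2
except ImportError:
    import http.client as httplib  # Python 3 name for the same module


def head_content_type(host, path="/", conn_factory=httplib.HTTPConnection):
    """Send a HEAD request and return the bare MIME type, or None.

    conn_factory is a hypothetical hook (not an httplib feature) that lets
    callers swap in HTTPSConnection or a stub for testing.
    """
    conn = conn_factory(host)
    try:
        conn.request("HEAD", path)
        res = conn.getresponse()
        # getheader(name) returns None when the header is absent.
        value = res.getheader("content-type")
    finally:
        conn.close()
    if value is None:
        return None
    # Drop parameters such as "; charset=ISO-8859-1" and normalise case.
    return value.split(";", 1)[0].strip().lower()
```

Calling `head_content_type("www.google.com", "/index.html")` should then give just the MIME type part of the `content-type` header shown in the session above, assuming the server still answers the same way.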

  • 2020-11-22 11:47

    I believe the Requests library should be mentioned as well.

  • 2020-11-22 11:47

    Probably easier: use urllib or urllib2.

    >>> import urllib
    >>> f = urllib.urlopen('http://google.com')
    >>> f.info().gettype()
    'text/html'
    

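    f.info() is a dictionary-like object, so you can also do f.info()['content-type'], etc. Note that urlopen() issues a GET rather than a HEAD, so the server still sends the response body.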

    http://docs.python.org/library/urllib.html
    http://docs.python.org/library/urllib2.html
    http://docs.python.org/library/httplib.html

    The docs note that httplib is not normally used directly.

  • 2020-11-22 11:48

    And yet another approach (similar to Pawel's answer):

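    import urllib2
    import types
    
    request = urllib2.Request('http://localhost:8080')
    # Bind a get_method that always returns 'HEAD' to this one instance
    # (three-argument MethodType is Python 2 only).
    request.get_method = types.MethodType(lambda self: 'HEAD', request, request.__class__)
    # urllib2.urlopen(request) will now send HEAD instead of GET.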
    

    The MethodType call is there just to avoid having an unbound method at instance level.
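The snippet above builds the request but never sends it. One possible way to finish the job is a small `Request` subclass, sketched below. `HeadRequest` and `head_headers` are hypothetical names chosen for illustration, and the `try`/`except` import is only there so the sketch also runs on Python 3, where the equivalent module is `urllib.request`.

```python
try:
    import urllib2  # Python 2
except ImportError:
    import urllib.request as urllib2  # Python 3 equivalent


class HeadRequest(urllib2.Request):
    """A Request whose HTTP method is always HEAD (hypothetical helper)."""

    def get_method(self):
        return "HEAD"


def head_headers(url):
    """Send a HEAD request to url and return the response headers object."""
    response = urllib2.urlopen(HeadRequest(url))
    try:
        # info() is dictionary-like, e.g. info()['content-type'].
        return response.info()
    finally:
        response.close()
```

Overriding `get_method` in a subclass has the same effect as patching it onto a single instance; it is simply easier to reuse when many URLs need checking.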

  • 2020-11-22 11:52

    As an aside, when using httplib (at least on 2.5.2), trying to read the response of a HEAD request will block (on readline) and subsequently fail. If you do not call read on the response, you cannot send another request on the same connection; you will need to open a new one, or accept a long delay between requests.
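For reference, here is a sketch of reusing one connection for several HEAD requests. It assumes a later httplib in which `read()` on a HEAD response returns immediately with an empty body; on 2.5.2, as described above, that call may block instead, in which case opening a new connection per request is the safer route. `head_many` and `conn_factory` are hypothetical names.

```python
try:
    import httplib  # Python 2
except ImportError:
    import http.client as httplib  # Python 3 name for the same module


def head_many(host, paths, conn_factory=httplib.HTTPConnection):
    """HEAD each path over a single connection; return (path, status, type) tuples."""
    conn = conn_factory(host)
    results = []
    try:
        for path in paths:
            conn.request("HEAD", path)
            res = conn.getresponse()
            # Finish the response before the next request. Assumption: this
            # returns an empty body at once rather than blocking.
            res.read()
            results.append((path, res.status, res.getheader("content-type")))
    finally:
        conn.close()
    return results
```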

  • 2020-11-22 11:54

    I have found that httplib is slightly faster than urllib2. I timed two programs, one using httplib and the other using urllib2, sending HEAD requests to 10,000 URLs. The httplib one was faster by several minutes. httplib's total stats were:

    real 6m21.334s
    user 0m2.124s
    sys 0m16.372s

    And urllib2's total stats were:

    real 9m1.380s
    user 0m16.666s
    sys 0m28.565s

    Does anybody else have input on this?
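Anyone wanting to repeat this comparison could use a small harness along these lines. It is only a sketch: `time_head_requests` is a hypothetical name, `fetch` stands for whichever httplib or urllib2 function is being measured, and it reports wall-clock time only (the "real" figure), not the user and sys values a shell `time` run gives.

```python
import time


def time_head_requests(fetch, urls):
    """Return the wall-clock seconds spent calling fetch(url) for every URL."""
    start = time.time()
    for url in urls:
        try:
            fetch(url)
        except Exception:
            # A failed request still counts toward the total elapsed time.
            pass
    return time.time() - start
```

Running it once per library over the same URL list gives two numbers that can be compared directly, though network conditions between runs will add noise.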
