In Python, how do I decode GZIP encoding?

星月不相逢 2020-11-28 07:53

I downloaded a webpage in my Python script. In most cases, this works fine.

However, this one had a response header: GZIP encoding, and when I tried to print the sou

8 answers
  • 2020-11-28 08:43

    I use something like this:

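    import urllib2
    from cStringIO import StringIO
    from gzip import GzipFile
    
    def fetch(request):  # wrapper name is illustrative; the original snippet was a function body
        # Python 2: download the body, then try to gunzip it
        data = urllib2.urlopen(request).read()
        try:
            data = GzipFile(fileobj=StringIO(data)).read()
        except IOError:
            # not gzip data, so keep the raw body
            pass
        return data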
    
  • 2020-11-28 08:48

    Similar to Shatu's answer for Python 3, but arranged a little differently:

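    import gzip
    import json
    from urllib.request import Request, urlopen
    
    # `headers` is assumed to be a dict of request headers defined elsewhere
    s = Request("https://someplace.com", None, headers)
    r = urlopen(s, None, 180).read()  # 180-second timeout
    try:
        r = gzip.decompress(r)
    except OSError:
        pass  # body was not gzip-compressed
    result = json.loads(r.decode())  # assuming json_load was an alias for json.loads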
    

    Wrapping gzip.decompress() in try/except catches the OSError raised when the body is not actually gzip data, which can happen when a server returns a mix of compressed and uncompressed responses. Very small payloads can get bigger when gzip-encoded, so the plain data may be sent instead.
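    As a rough sketch of the same idea for Python 3, the mixed compressed/uncompressed case can also be handled by looking at the first two bytes of the body, since every gzip stream starts with 0x1f 0x8b. The helper name maybe_gunzip and the sample payloads are made up for illustration and are not from the answers above.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_gunzip(body: bytes) -> bytes:
    """Return the decompressed body if it is gzip data, else the body unchanged."""
    if body[:2] != GZIP_MAGIC:
        return body  # plain data, nothing to do
    try:
        return gzip.decompress(body)
    except (OSError, EOFError, zlib.error):
        # Starts with the magic bytes but is not a valid gzip stream
        return body


# Hypothetical payloads standing in for a downloaded response body
plain = b'{"ok": true}'
packed = gzip.compress(plain)

print(maybe_gunzip(packed) == plain)
print(maybe_gunzip(plain) == plain)
```

    Checking the magic bytes first avoids using an exception for the common uncompressed case; the try/except stays as a fallback for bodies that only happen to start with those two bytes.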
