Apache HttpClient response content length returns -1

Submitted by 无人久伴 on 2020-05-29 03:49:33

Question


Why does the following code print -1? It seems that the request failed.

public static void main(String[] args)
{
    DefaultHttpClient httpClient = new DefaultHttpClient();
    HttpGet httpGet = new HttpGet("http://www.google.de");

    HttpResponse response;
    try
    {
        response = httpClient.execute(httpGet);
        HttpEntity entity = response.getEntity();
        EntityUtils.consume(entity);

        // Prints -1
        System.out.println(entity.getContentLength());
    }
    catch (ClientProtocolException e)
    {
        e.printStackTrace();
    }
    catch (IOException e)
    {
        e.printStackTrace();
    }
    finally
    {
        httpGet.releaseConnection();
    }
}

Also, is it possible to get the response as a String?


Answer 1:


Try running

Header[] headers = response.getAllHeaders();
for (Header header : headers) {
    System.out.println(header);
}

It will print

Date: Tue, 10 Sep 2013 19:10:04 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: PREF=ID=dad7e2356ddb3b7a:FF=0:TM=1378840204:LM=1378840204:S=vQcLzVPbOOTxfvL4; expires=Thu, 10-Sep-2015 19:10:04 GMT; path=/; domain=.google.de
Set-Cookie: NID=67=S11HcqAV454IGRGMRo-AJpxAPxClJeRs4DRkAJQ5vI3YBh4anN3qS0EVeiYX_4XDTGN-mY86xTBoJ3Ncca7eNSdtGjcaG31pbCOuqsZEQMWwKn-7-6Dnizx395snehdA; expires=Wed, 12-Mar-2014 19:10:04 GMT; path=/; domain=.google.de; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Alternate-Protocol: 80:quic
Transfer-Encoding: chunked

This is not a problem; the page you requested simply doesn't provide a Content-Length header in its response. As such, HttpEntity#getContentLength() returns -1, which only means the length is not known in advance, not that the request failed.

EntityUtils has a number of methods, some of which return a String.
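On the String part of the question: in HttpClient 4.x, EntityUtils.toString(entity) reads the body and decodes it for you. It should be called in place of EntityUtils.consume(entity) rather than after it, since the content stream can normally be read only once. The sketch below shows roughly what that amounts to using only the JDK, so it runs without the library. The names BodyToString and readAll are made up for this example, and the in-memory stream stands in for whatever entity.getContent() would return.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HttpClient.
final class BodyToString {

    // Reads the stream to its end and decodes the bytes with the given charset.
    static String readAll(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), charset);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a response body; the charset matches the Content-Type header above.
        InputStream body = new ByteArrayInputStream(
                "<html>hi</html>".getBytes(StandardCharsets.ISO_8859_1));
        String text = readAll(body, StandardCharsets.ISO_8859_1);
        System.out.println(text);          // prints <html>hi</html>
        System.out.println(text.length()); // prints 15
    }
}
```

The length is only known here because the whole body has been read, which is the same reason getContentLength() cannot report it up front.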


Running curl more recently produces

> curl --head http://www.google.de
HTTP/1.1 200 OK
Date: Fri, 03 Apr 2020 15:38:18 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
Server: gws
X-XSS-Protection: 0
X-Frame-Options: SAMEORIGIN
Set-Cookie: 1P_JAR=2020-04-03-15; expires=Sun, 03-May-2020 15:38:18 GMT; path=/; domain=.google.de; Secure
Set-Cookie: NID=201=H8GdKY8_vE5Ehy6qSkmQru13HqdGEj2tvZUFqvTDAVBxFoL4POI0swPtfI45v1TBjrJuAAfbcNMUddniIf9HHituCAFwUqmUFMDwxDYK5qUlcWiB1A64OcGp6PTT6LKur2r_3z-ToSvLf8RZhKWdny6E8SaArMpkaOqUEWp4aoQ; expires=Sat, 03-Oct-2020 15:38:18 GMT; path=/; domain=.google.de; HttpOnly
Transfer-Encoding: chunked
Accept-Ranges: none
Vary: Accept-Encoding

The headers contain a Transfer-Encoding value of chunked. With chunked, the response contains "chunks" preceded by their length. An HTTP client uses those to read the entire response.

The HTTP Specification states that the Content-Length header should not be present when Transfer-Encoding has a value of chunked and MUST be ignored if it is.
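To make that rule concrete, here is a minimal sketch of how a client might decide which length to report. It is not HttpClient's actual implementation; the class BodyLength and the method declaredLength are hypothetical names, and the headers are passed in as a plain Map for simplicity (Map.of needs Java 9 or later).

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper illustrating the rule, not HttpClient code.
final class BodyLength {

    // Returns the declared body length, or -1 when it is not known up front.
    static long declaredLength(Map<String, String> headers) {
        // HTTP header names are case-insensitive.
        Map<String, String> h = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        h.putAll(headers);

        String transferEncoding = h.get("Transfer-Encoding");
        if (transferEncoding != null
                && transferEncoding.toLowerCase(Locale.ROOT).contains("chunked")) {
            return -1; // chunked framing wins; any Content-Length is ignored
        }

        String contentLength = h.get("Content-Length");
        if (contentLength == null) {
            return -1; // no length declared at all
        }
        try {
            long n = Long.parseLong(contentLength.trim());
            return n >= 0 ? n : -1;
        } catch (NumberFormatException e) {
            return -1; // malformed value, treat as unknown
        }
    }

    public static void main(String[] args) {
        System.out.println(declaredLength(Map.of("Transfer-Encoding", "chunked"))); // prints -1
        System.out.println(declaredLength(Map.of("Content-Length", "1234")));       // prints 1234
    }
}
```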




Answer 2:


Please note the Transfer-Encoding response header. Its value is chunked, which means the data is delivered block by block. Transfer-Encoding: chunked and Content-Length do not appear in the same response. There are two possible reasons:

  1. The server does not want to send the content length.
  2. The server does not know the content length, because it flushes data that is larger than its buffer before the full response has been generated.

So when there is no Content-Length header, the size of each chunk appears just before that chunk's content in the body. For example:

HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Set-Cookie: JSESSIONID=8A7461DDA53B4C4DD0E89D73219CB5F8; Path=/
Content-Type: text/html;charset=UTF-8
Transfer-Encoding: chunked
Date: Wed, 18 Mar 2015 07:10:05 GMT

b
helloworld!
3
123
0

The headers and body above show two chunks of data. Chunk sizes are written in hexadecimal, so the first chunk is b (11) bytes and the second is 3 bytes. The final 0 marks the end of the body, so the total content length is 14.
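The arithmetic can be checked with a small decoder. This is only a sketch of the chunked format using the JDK, not what HttpClient does internally, and the names ChunkedBody and decode are invented for the example. One detail to keep in mind: on the wire each chunk size is written in hexadecimal and every size line and chunk ends with CRLF, so a chunk of 11 bytes is announced as b.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical decoder for illustration; trailers after the last chunk are not parsed.
final class ChunkedBody {

    // Reads one line terminated by LF and returns it without CR/LF.
    private static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        while (true) {
            int b = in.read();
            if (b == -1) {
                throw new EOFException("unexpected end of chunked body");
            }
            if (b == '\n') {
                return new String(line.toByteArray(), StandardCharsets.US_ASCII);
            }
            if (b != '\r') {
                line.write(b);
            }
        }
    }

    // Decodes a chunked body and returns the concatenated chunk data.
    static byte[] decode(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            int semicolon = sizeLine.indexOf(';'); // drop optional chunk extensions
            String hex = (semicolon >= 0 ? sizeLine.substring(0, semicolon) : sizeLine).trim();
            int size = Integer.parseInt(hex, 16);  // chunk sizes are hexadecimal
            if (size < 0) {
                throw new IOException("negative chunk size");
            }
            if (size == 0) {
                return body.toByteArray();         // a zero-size chunk ends the body
            }
            byte[] data = new byte[size];
            int off = 0;
            while (off < size) {
                int n = in.read(data, off, size - off);
                if (n == -1) {
                    throw new EOFException("chunk shorter than announced");
                }
                off += n;
            }
            body.write(data, 0, size);
            readLine(in);                          // consume the CRLF after the chunk data
        }
    }

    public static void main(String[] args) throws IOException {
        String wire = "b\r\nhelloworld!\r\n3\r\n123\r\n0\r\n\r\n";
        byte[] decoded = decode(new ByteArrayInputStream(wire.getBytes(StandardCharsets.US_ASCII)));
        System.out.println(decoded.length);                                  // prints 14
        System.out.println(new String(decoded, StandardCharsets.US_ASCII)); // prints helloworld!123
    }
}
```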

regards, Xici




Answer 3:


If you really want the content length and don't care about the content itself, you can do this. Note that it reads the whole body into memory first:

EntityUtils.toByteArray(httpResponse.getEntity()).length
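If the body might be large, one alternative is to drain the stream and count the bytes instead of keeping them. The sketch below uses only the JDK; ByteCounter and countBytes are hypothetical names, and with HttpClient the stream would presumably come from entity.getContent().

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of HttpClient.
final class ByteCounter {

    // Reads the stream to its end and returns how many bytes it produced,
    // reusing one small buffer instead of storing the whole body.
    static long countBytes(InputStream in) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a 20000-byte response body.
        InputStream body = new ByteArrayInputStream(new byte[20000]);
        System.out.println(countBytes(body)); // prints 20000
    }
}
```

Either way the body has to be read in full; with a chunked response there is no way to learn the total size without doing so.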



Source: https://stackoverflow.com/questions/18726892/apache-httpclient-response-content-length-returns-1
