Question
Why does the following code print -1? It looks as if the request failed.
public static void main(String[] args)
{
    DefaultHttpClient httpClient = new DefaultHttpClient();
    HttpGet httpGet = new HttpGet("http://www.google.de");
    HttpResponse response;
    try
    {
        response = httpClient.execute(httpGet);
        HttpEntity entity = response.getEntity();
        EntityUtils.consume(entity);
        // Prints -1
        System.out.println(entity.getContentLength());
    }
    catch (ClientProtocolException e)
    {
        e.printStackTrace();
    }
    catch (IOException e)
    {
        e.printStackTrace();
    }
    finally
    {
        httpGet.releaseConnection();
    }
}
And is it possible to get the response as a String?
Answer 1:
Try running
Header[] headers = response.getAllHeaders();
for (Header header : headers) {
    System.out.println(header);
}
It will print
Date: Tue, 10 Sep 2013 19:10:04 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: PREF=ID=dad7e2356ddb3b7a:FF=0:TM=1378840204:LM=1378840204:S=vQcLzVPbOOTxfvL4; expires=Thu, 10-Sep-2015 19:10:04 GMT; path=/; domain=.google.de
Set-Cookie: NID=67=S11HcqAV454IGRGMRo-AJpxAPxClJeRs4DRkAJQ5vI3YBh4anN3qS0EVeiYX_4XDTGN-mY86xTBoJ3Ncca7eNSdtGjcaG31pbCOuqsZEQMWwKn-7-6Dnizx395snehdA; expires=Wed, 12-Mar-2014 19:10:04 GMT; path=/; domain=.google.de; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Alternate-Protocol: 80:quic
Transfer-Encoding: chunked
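This is not a problem. The page you requested simply doesn't provide a Content-Length header in its response, so HttpEntity#getContentLength() returns -1.
EntityUtils has a number of methods, some of which return a String, for example EntityUtils.toString(entity). Call it instead of EntityUtils.consume(entity), because a streamed response body can only be read once.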
Running curl more recently produces
> curl --head http://www.google.de
HTTP/1.1 200 OK
Date: Fri, 03 Apr 2020 15:38:18 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
Server: gws
X-XSS-Protection: 0
X-Frame-Options: SAMEORIGIN
Set-Cookie: 1P_JAR=2020-04-03-15; expires=Sun, 03-May-2020 15:38:18 GMT; path=/; domain=.google.de; Secure
Set-Cookie: NID=201=H8GdKY8_vE5Ehy6qSkmQru13HqdGEj2tvZUFqvTDAVBxFoL4POI0swPtfI45v1TBjrJuAAfbcNMUddniIf9HHituCAFwUqmUFMDwxDYK5qUlcWiB1A64OcGp6PTT6LKur2r_3z-ToSvLf8RZhKWdny6E8SaArMpkaOqUEWp4aoQ; expires=Sat, 03-Oct-2020 15:38:18 GMT; path=/; domain=.google.de; HttpOnly
Transfer-Encoding: chunked
Accept-Ranges: none
Vary: Accept-Encoding
The headers contain a Transfer-Encoding value of chunked. With chunked, the response body is sent as a series of "chunks", each preceded by its length. An HTTP client uses those lengths to read the entire response.
The HTTP specification states that the Content-Length header should not be present when Transfer-Encoding has a value of chunked, and MUST be ignored if it is.
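The rule above can be sketched in a few lines of plain Java. This is only an illustration of the precedence between the two headers, not HttpClient's actual implementation; the class name DeclaredLength, the method declaredLength, and the convention that header names arrive in lower case are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class DeclaredLength {

    /**
     * Returns the body length announced by the headers, or -1 when it is unknown.
     * Assumption of this sketch: header names in the map are already lower case.
     */
    public static long declaredLength(Map<String, String> headers) {
        String transferEncoding = headers.get("transfer-encoding");
        if (transferEncoding != null
                && transferEncoding.toLowerCase(Locale.ROOT).contains("chunked")) {
            // Chunked framing takes precedence; any Content-Length is ignored.
            return -1;
        }
        String contentLength = headers.get("content-length");
        if (contentLength == null) {
            return -1;
        }
        try {
            return Long.parseLong(contentLength.trim());
        } catch (NumberFormatException e) {
            // A malformed value is treated as "unknown" in this sketch.
            return -1;
        }
    }

    public static void main(String[] args) {
        Map<String, String> chunked = new LinkedHashMap<>();
        chunked.put("transfer-encoding", "chunked");
        chunked.put("content-length", "14");
        System.out.println(declaredLength(chunked));

        Map<String, String> fixed = new LinkedHashMap<>();
        fixed.put("content-length", "14");
        System.out.println(declaredLength(fixed));
    }
}
```

The first map yields -1 even though it carries a Content-Length, which mirrors what the question observed for the chunked Google response.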
Answer 2:
Note the Transfer-Encoding response header. Its value is chunked, which means the data is delivered block by block. Transfer-Encoding: chunked and Content-Length do not appear together. There are two possible reasons:
- The server does not want to send the content length.
- The server does not know the content length when it flushes data that is larger than its buffer.
So when there is no Content-Length header, the size of each chunk appears before the chunk's content. For example:
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Set-Cookie: JSESSIONID=8A7461DDA53B4C4DD0E89D73219CB5F8; Path=/
Content-Type: text/html;charset=UTF-8
Transfer-Encoding: chunked
Date: Wed, 18 Mar 2015 07:10:05 GMT
11
helloworld!
3
123
0
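The headers and body above show two chunks. The first is 11 bytes and the second is 3 bytes, so the total content length is 14 bytes. Strictly speaking, chunk sizes are written in hexadecimal on the wire, so a real server would announce the 11-byte chunk as b rather than 11.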
regards, Xici
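To make the chunk arithmetic concrete, here is a minimal sketch of a chunked-body decoder using only the JDK. It is not how HttpClient is implemented; the class name ChunkedBodyDecoder and its methods are hypothetical, and trailers after the last chunk are simply ignored. The sample input is the two-chunk body from this answer, with the sizes written in hexadecimal as the wire format requires.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ChunkedBodyDecoder {

    /** Decodes a chunked body and returns the payload without the framing. */
    public static byte[] decode(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            // Drop optional chunk extensions that may follow a ';'.
            int semicolon = sizeLine.indexOf(';');
            if (semicolon >= 0) {
                sizeLine = sizeLine.substring(0, semicolon);
            }
            sizeLine = sizeLine.trim();
            if (sizeLine.isEmpty()) {
                throw new IOException("Missing chunk size");
            }
            int size = Integer.parseInt(sizeLine, 16); // sizes are hexadecimal
            if (size == 0) {
                break; // a zero-sized chunk marks the end of the body
            }
            byte[] chunk = new byte[size];
            int read = 0;
            while (read < size) {
                int n = in.read(chunk, read, size - read);
                if (n < 0) {
                    throw new IOException("Unexpected end of stream inside a chunk");
                }
                read += n;
            }
            body.write(chunk, 0, size);
            readLine(in); // consume the CRLF that terminates the chunk data
        }
        return body.toByteArray();
    }

    /** Reads one line, accepting CRLF or LF as the terminator. */
    private static String readLine(InputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (c == '\r') {
                continue;
            }
            if (c == '\n') {
                break;
            }
            line.append((char) c);
        }
        return line.toString();
    }

    public static void main(String[] args) throws IOException {
        String wire = "b\r\nhelloworld!\r\n3\r\n123\r\n0\r\n\r\n";
        byte[] payload = decode(
                new ByteArrayInputStream(wire.getBytes(StandardCharsets.US_ASCII)));
        System.out.println(payload.length);                                 // 14
        System.out.println(new String(payload, StandardCharsets.US_ASCII)); // helloworld!123
    }
}
```

The total of 14 bytes is only known after the terminating zero-sized chunk has been read, which is why the client cannot report a content length up front.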
Answer 3:
If you only want the content length and do not care about the content itself, you can read the whole body into memory and measure it:
EntityUtils.toByteArray(httpResponse.getEntity()).length
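For large responses, holding the entire body in a byte array may be wasteful. A possible alternative, sketched here with only the JDK, is to count the bytes while streaming. The class StreamLength and the method countBytes are hypothetical names; with HttpClient the stream would presumably come from entity.getContent(), read before the entity is consumed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class StreamLength {

    /** Reads the stream to its end and returns the number of bytes seen. */
    public static long countBytes(InputStream in) throws IOException {
        byte[] buffer = new byte[8192]; // reused, so memory use stays constant
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a response body; a real one would come from the HTTP entity.
        InputStream body = new ByteArrayInputStream(new byte[20000]);
        System.out.println(countBytes(body)); // 20000
    }
}
```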
Source: https://stackoverflow.com/questions/18726892/apache-httpclient-response-content-length-returns-1