HttpClient Get images from response

筅森魡賤 submitted on 2019-12-24 12:18:08

Question


I'm using Apache HttpClient to perform GET/POST requests.

I was wondering whether I can save the images that are loaded as part of a response, without having to download them again from their URLs.

This question was asked about a year ago, but no one answered it: Can I get cached images using HttpClient?

I tried:

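import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import org.apache.http.HttpEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;

// Fetch whatever the URL returns and write the raw bytes to img.png.
// try-with-resources closes the client, response and streams even on error.
try (CloseableHttpClient httpclient = HttpClients.createDefault();
     CloseableHttpResponse response = httpclient.execute(new HttpGet(url))) {
    HttpEntity entity = response.getEntity();
    try (InputStream is = entity.getContent();
         FileOutputStream fos = new FileOutputStream(new File("img.png"))) {
        int inByte;
        while ((inByte = is.read()) != -1) {
            fos.write(inByte);
        }
    }
}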

But apparently it only downloads text. Can I make HttpClient download the images for that particular URL? Is this doable?


Answer 1:


A web page is just the HTML code of the page; the images it shows are separate resources that the HTML only references by URL.

When a browser accesses a web page, it downloads the HTML code and then parses it. If there are things like IMG tags, embedded objects (Flash, applets, etc.), frames and so on, the browser takes each one's URL and issues a new HTTP request to download that resource. It does so for every image. Then, once all the parts of the page are ready (in its cache), it renders the page.

This is a simplified description, of course, as browsers tend to optimize these things by keeping connections open and keeping caches around. So to reiterate, to get the images in a page:

  1. Download HTML from the given URL.
  2. Parse the HTML and find the IMG tags.
  3. For every relevant IMG, download the image data from the SRC URL associated with it (resolving relative URLs against the page URL) and save it to a file.
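
Step 2 can be sketched with the standard library alone. The example below is only an illustration: the class name `ImageUrlExtractor` and the method `extract` are made up for this sketch, and it finds IMG tags with a regular expression, which is a simplification that will miss unquoted or unusual markup. A real HTML parser would be more robust.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HttpClient.
public class ImageUrlExtractor {
    // Matches a quoted src attribute inside an <img ...> tag.
    // Assumption: a regex is good enough for this sketch; it is not a full HTML parser.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\ssrc\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE);

    // Returns one absolute URL per matched IMG tag,
    // resolving relative src values against the page URL.
    public static List<URI> extract(String pageUrl, String html) {
        URI base = URI.create(pageUrl);
        List<URI> result = new ArrayList<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            // group(2) is the text between the quotes
            result.add(base.resolve(m.group(2).trim()));
        }
        return result;
    }
}
```

The HTML string passed to `extract` would be the body of the first response, for example the text read from `entity.getContent()`.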

It is important to understand that an HttpClient response represents only one object: the HTML page or a single image, depending on the URL you gave it. If you want to download an entire page and all its images, you have to issue a separate request for each of those objects yourself; HttpClient doesn't do so automatically.
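
As a rough sketch of "one request per object", the loop below saves each image URL to its own file. `ImageSaver`, `saveAll`, `fileName` and the `Fetcher` interface are hypothetical names introduced for this example; `Fetcher` stands in for whatever code performs the GET, so the sketch itself does not depend on any HTTP library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HttpClient.
public class ImageSaver {
    // Assumption: abstracts "issue one GET and return the response body".
    public interface Fetcher {
        InputStream open(URI url) throws IOException;
    }

    // Performs one fetch per image URL and writes each body to its own file.
    public static List<Path> saveAll(List<URI> imageUrls, Path dir, Fetcher fetcher)
            throws IOException {
        Files.createDirectories(dir);
        List<Path> saved = new ArrayList<>();
        for (int i = 0; i < imageUrls.size(); i++) {
            URI url = imageUrls.get(i);
            // Prefix with the index so two images with the same name do not collide.
            Path target = dir.resolve(i + "-" + fileName(url));
            try (InputStream in = fetcher.open(url)) {
                Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            }
            saved.add(target);
        }
        return saved;
    }

    // Last path segment of the URL, or "image" when the path has none.
    static String fileName(URI url) {
        String path = url.getPath();
        if (path == null || path.isEmpty() || path.endsWith("/")) {
            return "image";
        }
        return path.substring(path.lastIndexOf('/') + 1);
    }
}
```

With Apache HttpClient 4.x, the fetcher could plausibly be a lambda such as `url -> httpclient.execute(new HttpGet(url)).getEntity().getContent()`, reusing a single client instance for all requests. That wiring is an assumption of this sketch, and it omits status-code checks and the case where the response has no entity.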



Source: https://stackoverflow.com/questions/26450715/httpclient-get-images-from-response
