Question
I'm using Apache HttpClient to perform GET/POST requests, and I was wondering whether the images loaded with a response can be saved without having to download them again from their URLs.
This question was asked about a year ago, but no one answered it: Can I get cached images using HttpClient?
I tried:
// Request the URL and take the body of the single response that comes back.
CloseableHttpClient httpclient = HttpClients.createDefault();
HttpGet httpget = new HttpGet(url);
HttpResponse response = httpclient.execute(httpget);
HttpEntity entity = response.getEntity();
InputStream is = entity.getContent();
// Copy the response body byte by byte into img.png.
FileOutputStream fos = new FileOutputStream(new File("img.png"));
int inByte;
while ((inByte = is.read()) != -1) {
    fos.write(inByte);
}
is.close();
fos.close();
But apparently it downloads only text. Can I make HttpClient download the images referenced by that particular URL, or is this not doable?
Answer 1:
A web page is just the HTML code of the page.
When a browser accesses a web page, it downloads the HTML code and then parses it. If there are things like IMG tags, embedded objects (Flash, applets, etc.), frames and so on, the browser takes each one's URL and makes a new HTTP request to download it. It does so for every image. Then, having all the various parts of the page ready (in cache), it renders the page.
This is a simplified description, of course, as browsers tend to optimize these things by keeping connections open and keeping caches around. So to reiterate, to get the images in a page:
- Download HTML from the given URL.
- Parse the HTML and find the IMG tags.
- For every relevant IMG, download the image data from the SRC URL associated with it (resolved against the page URL if it is relative) and save it to a file.
It is important to understand that an HttpClient response represents only one object: the HTML page or a single image, depending on what URL you gave it. If you want to download an entire page and all its images, you have to make an HttpClient request for each of the objects yourself; it doesn't do so automatically.
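The three steps listed above could be sketched roughly as follows. To keep the example self-contained it uses the JDK's own java.net.http.HttpClient (Java 11+) rather than Apache HttpClient, and a simplified regular expression in place of a real HTML parser, so it will miss or misread unusual markup. The class name, method names, and the image-0, image-1, ... file naming are all made up for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PageImageDownloader {

    // Simplified matcher for <img ... src="..."> or src='...'; not a full HTML parser.
    private static final Pattern IMG_SRC = Pattern.compile(
            "<img\\b[^>]*?\\ssrc\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Step 2: find the IMG tags and resolve each SRC against the page URL.
    static List<URI> extractImageUrls(String html, URI pageUrl) {
        List<URI> urls = new ArrayList<>();
        Matcher m = IMG_SRC.matcher(html);
        while (m.find()) {
            try {
                URI resolved = pageUrl.resolve(m.group(2).trim());
                String scheme = resolved.getScheme();
                // Keep only links that can be fetched over HTTP(S); skips data: URIs etc.
                if ("http".equalsIgnoreCase(scheme) || "https".equalsIgnoreCase(scheme)) {
                    urls.add(resolved);
                }
            } catch (IllegalArgumentException e) {
                // Skip SRC values that are not valid URIs.
            }
        }
        return urls;
    }

    // Steps 1 and 3: one request for the HTML, then one request per image.
    static void downloadImages(URI pageUrl, Path targetDir)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        String html = client.send(HttpRequest.newBuilder(pageUrl).build(),
                HttpResponse.BodyHandlers.ofString()).body();
        Files.createDirectories(targetDir);
        int index = 0;
        for (URI imageUrl : extractImageUrls(html, pageUrl)) {
            // Hypothetical naming scheme: image-0, image-1, ...
            Path target = targetDir.resolve("image-" + index++);
            client.send(HttpRequest.newBuilder(imageUrl).build(),
                    HttpResponse.BodyHandlers.ofFile(target));
        }
    }

    public static void main(String[] args) throws Exception {
        // Usage: java PageImageDownloader.java <page-url> <target-dir>
        downloadImages(URI.create(args[0]), Path.of(args[1]));
    }
}
```

The same structure carries over to Apache HttpClient: execute one HttpGet for the page, then one HttpGet per extracted image URL, writing each response entity to its own file. For real pages, an HTML parser is likely a safer choice than the regular expression used here.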
Source: https://stackoverflow.com/questions/26450715/httpclient-get-images-from-response