I can't understand why Java's HttpURLConnection does not follow an HTTP redirect from an HTTP to an HTTPS URL. I use the following code to get the page at htt
HttpURLConnection is not responsible for interpreting the response. It is performing as expected: it fetches the content of the URL requested. It is up to you, the user of the functionality, to interpret the response. It cannot read the developer's intentions unless they are specified.
Redirects are followed only if they use the same protocol. (See the followRedirect() method in the source.) There is no way to disable this check.
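To make the rule concrete, here is a minimal sketch of a same-protocol comparison. It is not the JDK's source; the class and method names are invented for illustration, and it assumes both URLs are absolute.

```java
import java.net.URI;

// Hypothetical helper, not part of the JDK: imitates the same-protocol rule described above.
final class RedirectProtocolCheck {
    // True only when both absolute URLs use the same scheme (compared case-insensitively).
    static boolean sameProtocol(String fromUrl, String toUrl) {
        String from = URI.create(fromUrl).getScheme();
        String to = URI.create(toUrl).getScheme();
        return from != null && from.equalsIgnoreCase(to);
    }

    public static void main(String[] args) {
        // http to http: the kind of redirect that is followed automatically
        System.out.println(sameProtocol("http://example.com/a", "http://example.com/b"));
        // http to https: the kind of redirect that is not
        System.out.println(sameProtocol("http://example.com/a", "https://example.com/a"));
    }
}
```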
Even though we know it mirrors HTTP, from the HTTP protocol point of view, HTTPS is just some other, completely different, unknown protocol. It would be unsafe to follow the redirect without user approval.
For example, suppose the application is set up to perform client authentication automatically. The user expects to be surfing anonymously because he's using HTTP. But if his client follows HTTPS without asking, his identity is revealed to the server.
HttpURLConnection by design won't automatically redirect from HTTP to HTTPS (or vice versa). Following the redirect may have serious security consequences. SSL (hence HTTPS) creates a session that is unique to the user. This session can be reused for multiple requests. Thus, the server can track all of the requests made from a single person. This is a weak form of identity and is exploitable. Also, the SSL handshake can ask for the client's certificate. If sent to the server, then the client's identity is given to the server.
As erickson points out, suppose the application is set up to perform client authentication automatically. The user expects to be surfing anonymously because he's using HTTP. But if his client follows HTTPS without asking, his identity is revealed to the server.
The programmer has to take extra steps to ensure that credentials, client certificates or SSL session id will not be sent before redirecting from HTTP to HTTPS. The default is to send these. If the redirection hurts the user, do not follow the redirection. This is why automatic redirect is not supported.
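One of those extra steps can be sketched as follows: before re-sending a request to a redirect target on a different protocol or host, drop the headers that carry credentials. This is only an illustration under assumptions. The class name, the method name, and the choice of headers are mine, and it does nothing about client certificates or SSL session reuse, which have to be handled in the TLS configuration.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; none of these names come from the JDK.
final class RedirectHeaders {
    // Header names that carry credentials, lower-cased for case-insensitive matching.
    private static final Set<String> SENSITIVE =
            Set.of("authorization", "proxy-authorization", "cookie");

    // Returns the headers to send to the redirect target. Credential-bearing
    // headers are kept only when both the scheme and the host stay the same.
    static Map<String, String> forRedirect(Map<String, String> headers, URI from, URI to) {
        boolean sameScheme = from.getScheme() != null
                && from.getScheme().equalsIgnoreCase(to.getScheme());
        boolean sameHost = from.getHost() != null
                && from.getHost().equalsIgnoreCase(to.getHost());
        boolean sameOrigin = sameScheme && sameHost;

        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : headers.entrySet()) {
            boolean sensitive = SENSITIVE.contains(e.getKey().toLowerCase(Locale.ROOT));
            if (sameOrigin || !sensitive) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }
}
```

The returned map would then be applied with setRequestProperty on the connection opened for the redirect target.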
With that understood, here's the code which will follow the redirects.
    URL resourceUrl, base, next;
    Map<String, Integer> visited;
    HttpURLConnection conn;
    String location;
    int times;
    ...
    visited = new HashMap<>();
    while (true)
    {
        // Count visits per URL so that a redirect cycle is detected
        times = visited.compute(url, (key, count) -> count == null ? 1 : count + 1);
        if (times > 3)
            throw new IOException("Stuck in redirect loop");
        resourceUrl = new URL(url);
        conn = (HttpURLConnection) resourceUrl.openConnection();
        conn.setConnectTimeout(15000);
        conn.setReadTimeout(15000);
        conn.setInstanceFollowRedirects(false); // Handle redirects ourselves so they are easy to detect
        conn.setRequestProperty("User-Agent", "Mozilla/5.0...");
        switch (conn.getResponseCode())
        {
            case HttpURLConnection.HTTP_MOVED_PERM:
            case HttpURLConnection.HTTP_MOVED_TEMP:
                location = conn.getHeaderField("Location");
                location = URLDecoder.decode(location, "UTF-8");
                base = new URL(url);
                next = new URL(base, location); // Deal with relative URLs
                url = next.toExternalForm();
                continue;
        }
        break;
    }
    is = conn.getInputStream(); // HttpURLConnection has no openStream(); that method belongs to URL
    ...
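The "deal with relative URLs" step in the loop above can be tried in isolation. The sketch below uses java.net.URI.resolve rather than the URL constructor from the answer, and the helper names are made up for illustration. It also lists 303, 307 and 308, which servers use for redirects as well but which the switch above does not handle. Note that URI.create throws IllegalArgumentException if the Location value contains characters that are illegal in a URI, such as spaces.

```java
import java.net.URI;

// Hypothetical helper for illustration; the names are not from the original answer.
final class RedirectSupport {
    // True for the status codes that normally carry a Location header.
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
    }

    // Resolves a Location header value against the URL that was requested.
    static String resolveLocation(String requestedUrl, String location) {
        return URI.create(requestedUrl).resolve(location).toString();
    }

    public static void main(String[] args) {
        String requested = "http://example.com/a/b";
        // Absolute Location: used as-is, including the switch to https
        System.out.println(resolveLocation(requested, "https://example.com/login"));
        // Host-relative Location: scheme and host come from the requested URL
        System.out.println(resolveLocation(requested, "/login"));
        // Path-relative Location: resolved against the requested path
        System.out.println(resolveLocation(requested, "../c"));
    }
}
```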
Has something called HttpURLConnection.setFollowRedirects(false) by any chance?
You could always call conn.setInstanceFollowRedirects(true); if you want to make sure you don't affect the rest of the behaviour of the app.
Another option can be to use Apache HttpComponents Client:
    <dependency>
        <groupId>org.apache.httpcomponents</groupId>
        <artifactId>httpclient</artifactId>
    </dependency>
Sample code:
    CloseableHttpClient httpclient = HttpClients.createDefault();
    HttpGet httpget = new HttpGet("https://media-hearth.cursecdn.com/avatars/330/498/212.png");
    CloseableHttpResponse response = httpclient.execute(httpget);
    final HttpEntity entity = response.getEntity();
    final InputStream is = entity.getContent();
As mentioned by some of you above, setFollowRedirects and setInstanceFollowRedirects only work automatically when the redirect stays on the same protocol, i.e. from http to http or from https to https.
setFollowRedirects is at class level (it is static) and sets the default for all instances of the URL connection, whereas setInstanceFollowRedirects applies only to a given instance. This way we can have different behavior for different instances.
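A small sketch of that difference, using only java.net classes. The wrapper class and its open method are invented for the example. openConnection() only creates the connection object, so nothing is sent over the network here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical demo class; only the HttpURLConnection calls are real JDK API.
final class FollowRedirectsScope {
    // Creates a connection object without connecting to the server.
    static HttpURLConnection open(String url) {
        try {
            return (HttpURLConnection) URI.create(url).toURL().openConnection();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpURLConnection.setFollowRedirects(true);   // class-level default

        HttpURLConnection first = open("http://example.com/");
        first.setInstanceFollowRedirects(false);      // changes this instance only

        HttpURLConnection second = open("http://example.com/");

        System.out.println(first.getInstanceFollowRedirects());   // the per-instance override
        System.out.println(second.getInstanceFollowRedirects());  // still the class-level default
        System.out.println(HttpURLConnection.getFollowRedirects());
    }
}
```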
I found a very good example here http://www.mkyong.com/java/java-httpurlconnection-follow-redirect-example/