HttpClient 4 - how to capture last redirect URL

北恋 2020-11-29 18:53

I have rather simple HttpClient 4 code that calls HttpGet to get HTML output. The HTML returns with scripts and image locations all set to local (e.g.

8 answers
  • 2020-11-29 19:11

    An IMHO improved way, based on ZZ Coder's solution, is to use a response interceptor that simply tracks the last redirect location. That way you don't lose information such as the part after a hash (#); without the response interceptor the fragment is lost. Example: http://j.mp/OxbI23

    private static HttpClient createHttpClient() throws NoSuchAlgorithmException, KeyManagementException {
        SSLContext sslContext = SSLContext.getInstance("SSL");
        TrustManager[] trustAllCerts = new TrustManager[] { new TrustAllTrustManager() };
        sslContext.init(null, trustAllCerts, new java.security.SecureRandom());
    
        SSLSocketFactory sslSocketFactory = new SSLSocketFactory(sslContext);
        SchemeRegistry schemeRegistry = new SchemeRegistry();
        schemeRegistry.register(new Scheme("https", 443, sslSocketFactory));
        schemeRegistry.register(new Scheme("http", 80, new PlainSocketFactory()));
    
        HttpParams params = new BasicHttpParams();
        ClientConnectionManager cm = new org.apache.http.impl.conn.SingleClientConnManager(schemeRegistry);
    
        // some pages require a user agent
        AbstractHttpClient httpClient = new DefaultHttpClient(cm, params);
        HttpProtocolParams.setUserAgent(httpClient.getParams(), "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:13.0) Gecko/20100101 Firefox/13.0.1");
    
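        // RedirectStrategy is an interface, so it cannot be instantiated directly;
        // DefaultRedirectStrategy (or a custom implementation) is needed here
        httpClient.setRedirectStrategy(new DefaultRedirectStrategy());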
    
        httpClient.addResponseInterceptor(new HttpResponseInterceptor() {
            @Override
            public void process(HttpResponse response, HttpContext context)
                    throws HttpException, IOException {
                if (response.containsHeader("Location")) {
                    Header[] locations = response.getHeaders("Location");
                    if (locations.length > 0)
                        context.setAttribute(LAST_REDIRECT_URL, locations[0].getValue());
                }
            }
        });
    
        return httpClient;
    }
    
    private String getUrlAfterRedirects(HttpContext context) {
        String lastRedirectUrl = (String) context.getAttribute(LAST_REDIRECT_URL);
        if (lastRedirectUrl != null)
            return lastRedirectUrl;
        else {
            HttpUriRequest currentReq = (HttpUriRequest) context.getAttribute(ExecutionContext.HTTP_REQUEST);
            HttpHost currentHost = (HttpHost)  context.getAttribute(ExecutionContext.HTTP_TARGET_HOST);
            String currentUrl = (currentReq.getURI().isAbsolute()) ? currentReq.getURI().toString() : (currentHost.toURI() + currentReq.getURI());
            return currentUrl;
        }
    }
    
    public static final String LAST_REDIRECT_URL = "last_redirect_url";
    

    Use it just like ZZ Coder's solution, passing your own context to execute():

    HttpContext context = new BasicHttpContext();
    HttpResponse response = httpClient.execute(httpGet, context);
    String url = getUrlAfterRedirects(context);
    
  • 2020-11-29 19:11

    I think an easier way to find the last URL is to subclass DefaultRedirectHandler.

    package ru.test.test;
    
    import java.net.URI;
    
    import org.apache.http.HttpResponse;
    import org.apache.http.ProtocolException;
    import org.apache.http.impl.client.DefaultRedirectHandler;
    import org.apache.http.protocol.HttpContext;
    
    public class MyRedirectHandler extends DefaultRedirectHandler {
    
        public URI lastRedirectedUri;
    
        @Override
        public boolean isRedirectRequested(HttpResponse response, HttpContext context) {
    
            return super.isRedirectRequested(response, context);
        }
    
        @Override
        public URI getLocationURI(HttpResponse response, HttpContext context)
                throws ProtocolException {
    
            lastRedirectedUri = super.getLocationURI(response, context);
    
            return lastRedirectedUri;
        }
    
    }
    

    Code to use this handler:

      DefaultHttpClient httpclient = new DefaultHttpClient();
      MyRedirectHandler handler = new MyRedirectHandler();
      httpclient.setRedirectHandler(handler);
    
      HttpGet get = new HttpGet(url);
    
      HttpResponse response = httpclient.execute(get);
    
      HttpEntity entity = response.getEntity();
      // fall back to the original URL if no redirect happened
      String lastUrl = url;
      if(handler.lastRedirectedUri != null){
          lastUrl = handler.lastRedirectedUri.toString();
      }
    
  • 2020-11-29 19:14

    This is how I managed to get the redirect URL:

    Header[] arr = httpResponse.getHeaders("Location");
    for (Header head : arr){
        String whatever = head.getValue();
    }
    

    Or, if you are sure that there is only one redirect location, do this:

    httpResponse.getFirstHeader("Location").getValue();
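
    Note that the Location value may be relative, and the header is absent when the response is not a redirect. A minimal sketch, using only java.net.URI, of turning the header value into an absolute URL; the class and method names here are hypothetical and not part of HttpClient:

```java
import java.net.URI;

/** Hypothetical helper, not part of HttpClient: resolves a Location header value. */
final class LocationResolver {

    /**
     * Returns the absolute redirect target, or the request URI itself when
     * no Location header was present (locationHeader == null).
     */
    static URI resolveLocation(URI requestUri, String locationHeader) {
        if (locationHeader == null) {
            return requestUri;
        }
        // URI.resolve returns the argument unchanged if it is already absolute,
        // otherwise it resolves it relative to requestUri. It throws
        // IllegalArgumentException if the header value is not a valid URI.
        return requestUri.resolve(locationHeader.trim());
    }

    public static void main(String[] args) {
        URI request = URI.create("http://example.com/a/b?x=1");
        System.out.println(resolveLocation(request, "/login"));
        System.out.println(resolveLocation(request, "https://other.example/c#frag"));
        System.out.println(resolveLocation(request, null));
    }
}
```

    With HttpClient, the second argument would be the value read from the Location header, or null if getFirstHeader("Location") returned null.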
    
  • 2020-11-29 19:17

    That would be the current URL, which you can get by calling

      HttpGet#getURI();
    

    EDIT: You didn't mention how you are handling redirects. That works for us because we handle the 302 ourselves.

    It sounds like you are using DefaultRedirectHandler. We used to do that. It's somewhat tricky to get the current URL: you need to pass your own context. Here are the relevant code snippets:

            HttpGet httpget = new HttpGet(url);
            HttpContext context = new BasicHttpContext(); 
            HttpResponse response = httpClient.execute(httpget, context); 
            if (response.getStatusLine().getStatusCode() != HttpStatus.SC_OK)
                throw new IOException(response.getStatusLine().toString());
            HttpUriRequest currentReq = (HttpUriRequest) context.getAttribute( 
                    ExecutionContext.HTTP_REQUEST);
            HttpHost currentHost = (HttpHost)  context.getAttribute( 
                    ExecutionContext.HTTP_TARGET_HOST);
            String currentUrl = (currentReq.getURI().isAbsolute()) ? currentReq.getURI().toString() : (currentHost.toURI() + currentReq.getURI());
    

    The default redirect handling didn't work for us, so we changed it, but I no longer remember what the problem was.

  • 2020-11-29 19:18

    I found this in the HttpComponents Client documentation:

    CloseableHttpClient httpclient = HttpClients.createDefault();
    HttpClientContext context = HttpClientContext.create();
    HttpGet httpget = new HttpGet("http://localhost:8080/");
    CloseableHttpResponse response = httpclient.execute(httpget, context);
    try {
        HttpHost target = context.getTargetHost();
        List<URI> redirectLocations = context.getRedirectLocations();
        URI location = URIUtils.resolve(httpget.getURI(), target, redirectLocations);
        System.out.println("Final HTTP location: " + location.toASCIIString());
        // Expected to be an absolute URI
    } finally {
        response.close();
    }
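
    The idea of combining the original URI with the list of redirect locations can be illustrated without the library. The sketch below is a simplified, JDK-only approximation under my own assumptions: the class FinalLocation and its method of are hypothetical, and this is not HttpClient's implementation. It picks the last redirect location and, if that has no fragment, re-attaches the most recent fragment seen earlier in the chain:

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper; a simplified approximation, not HttpClient's own code. */
final class FinalLocation {

    /**
     * Picks the last redirect location, or the original URI when there were
     * no redirects. If the chosen URI has no fragment, the most recent
     * fragment from an earlier redirect (or from the original) is re-attached.
     */
    static URI of(URI original, List<URI> redirects) {
        if (redirects == null || redirects.isEmpty()) {
            return original;
        }
        URI last = redirects.get(redirects.size() - 1);
        if (last.getRawFragment() != null) {
            return last;
        }
        String fragment = original.getRawFragment();
        // Walk backwards so the most recent fragment wins over the original's.
        for (int i = redirects.size() - 2; i >= 0; i--) {
            String candidate = redirects.get(i).getRawFragment();
            if (candidate != null) {
                fragment = candidate;
                break;
            }
        }
        return fragment == null ? last : URI.create(last + "#" + fragment);
    }

    public static void main(String[] args) {
        URI original = URI.create("http://a.example/start");
        List<URI> chain = Arrays.asList(
                URI.create("http://b.example/x#sec"),
                URI.create("http://c.example/y"));
        System.out.println("Final location: " + of(original, chain));
    }
}
```

    The null check matters because the redirect list may be null when no redirect took place.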
    
  • 2020-11-29 19:25
        // A HEAD request follows the redirects without downloading the response body
        HttpHead httpHead = new HttpHead("<put your URL here>");
        HttpClient httpClient = HttpClients.createDefault();
        HttpClientContext context = HttpClientContext.create();
        httpClient.execute(httpHead, context);
        List<URI> redirectURIs = context.getRedirectLocations();
        if (redirectURIs != null && !redirectURIs.isEmpty()) {
            for (URI redirectURI : redirectURIs) {
                System.out.println("Redirect URI: " + redirectURI);
            }
            URI finalURI = redirectURIs.get(redirectURIs.size() - 1);
        }
    