How to check programmatically if a page's URL is redirecting?

刺人心 2021-01-19 08:12

I am trying to extract the content of a webpage A. Using Groovy, I've tried the following:

......
String urlStr = "url-of-webpage-A"
String pageText = urlSt         


        
2 answers
  • 2021-01-19 08:22

    In Java, URL.openConnection() returns a URLConnection that you can cast to HttpURLConnection when the URL is http or https. Call setInstanceFollowRedirects(false) on it so that redirects are not followed automatically.

    Then call getResponseCode() and check whether it returns HTTP_MOVED_PERM (301), HTTP_MOVED_TEMP (302) or HTTP_SEE_OTHER (303). All three indicate a redirect.

    If you need to know where you are being redirected to, call getHeaderField("Location") to read the Location header.
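
The steps above could be put together roughly as follows. This is a minimal sketch, not a definitive implementation: the class name `RedirectCheck` and the helpers `isRedirect` and `redirectTarget` are made up for illustration. It also treats 307 and 308 as redirects, which goes beyond the three codes listed above; HttpURLConnection has no named constants for those two, so they appear as plain numbers.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper class; the names are illustrative only.
final class RedirectCheck {

    /** True for the status codes that point the client at a Location header. */
    static boolean isRedirect(int code) {
        return code == HttpURLConnection.HTTP_MOVED_PERM   // 301
            || code == HttpURLConnection.HTTP_MOVED_TEMP   // 302
            || code == HttpURLConnection.HTTP_SEE_OTHER    // 303
            || code == 307 || code == 308;                 // no constants in HttpURLConnection
    }

    /** Returns the raw Location header if the URL redirects, otherwise null. */
    static String redirectTarget(String urlStr) throws IOException {
        // URI.create(...).toURL() builds the same URL object as new URL(urlStr)
        HttpURLConnection con =
            (HttpURLConnection) URI.create(urlStr).toURL().openConnection();
        con.setInstanceFollowRedirects(false); // we only want to inspect the first response
        try {
            return isRedirect(con.getResponseCode()) ? con.getHeaderField("Location") : null;
        } finally {
            con.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java RedirectCheck.java <url>");
            return;
        }
        String target = redirectTarget(args[0]);
        System.out.println(target == null ? "no redirect" : "redirects to " + target);
    }
}
```

Note that the Location value is returned exactly as the server sent it, so it may be a relative reference rather than a full URL.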

  • 2021-01-19 08:39

    In Groovy, you could do what Joachim suggests like this:

    String location = "url-of-webpage-A"
    boolean wasRedirected = false
    String pageContent = null
    
    while( location ) {
      new URL( location ).openConnection().with { con ->
        // We'll do redirects ourselves
        con.instanceFollowRedirects = false
    
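        // Get the location to jump to (null unless the server sent a Location header)
        // and resolve it against the current URL, because Location may be relative
        String next = con.getHeaderField( "Location" )
        location = next ? new URL( con.getURL(), next ).toString() : null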
        if( !wasRedirected && location ) {
          wasRedirected = true
        }
    
        // Read the HTML and close the inputstream
        pageContent = con.inputStream.withReader { it.text }
      }
    }
    
    println "wasRedirected:$wasRedirected contentLength:${pageContent.length()}"
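
For comparison, the same follow-the-redirects-by-hand loop could look like this in plain Java. This is only a sketch under a few assumptions that are not in the answer above: the class name `RedirectFollower`, the `MAX_REDIRECTS` cap of 10 (added so a redirect loop cannot run forever), and decoding the body as UTF-8 are all choices made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names and the hop limit are illustrative only.
final class RedirectFollower {

    /** Assumed cap on the number of redirects to follow. */
    static final int MAX_REDIRECTS = 10;

    /** Resolves a Location value, which may be relative, against the current URI. */
    static URI resolve(URI current, String location) {
        return current.resolve(location);
    }

    /** Follows redirects manually and returns the body of the final page. */
    static String fetch(String start) throws IOException {
        URI current = URI.create(start);
        for (int hop = 0; hop <= MAX_REDIRECTS; hop++) {
            HttpURLConnection con = (HttpURLConnection) current.toURL().openConnection();
            con.setInstanceFollowRedirects(false); // we handle redirects ourselves
            try {
                int code = con.getResponseCode();
                String location = con.getHeaderField("Location");
                boolean redirect = code >= 300 && code < 400 && location != null;
                if (!redirect) {
                    // Final page: read the whole body (assumed to be UTF-8) and close the stream
                    try (InputStream in = con.getInputStream()) {
                        return new String(in.readAllBytes(), StandardCharsets.UTF_8);
                    }
                }
                current = resolve(current, location);
            } finally {
                con.disconnect();
            }
        }
        throw new IOException("More than " + MAX_REDIRECTS + " redirects from " + start);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java RedirectFollower.java <url>");
            return;
        }
        System.out.println("contentLength:" + fetch(args[0]).length());
    }
}
```

Unlike the Groovy loop, this version does not read the body of the intermediate redirect responses; it only reads the page it finally lands on.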
    

    If you don't want to follow the redirect and only want the contents of the first page, you simply need to do:

    String location = "url-of-webpage-A"
    String pageContent = new URL( location ).openConnection().with { con ->
      // Don't follow redirects; we only want the first response
      con.instanceFollowRedirects = false
    
      // Get the location to jump to (in case of a redirect)
      location = con.getHeaderField( "Location" )
    
      // Read the HTML and close the inputstream
      con.inputStream.withReader { it.text }
    }
    
    if( location ) { 
      println "Page wanted to redirect to $location"
    }
    println "Content was:"
    println pageContent    
    