JSoup over VPN/proxy

隐瞒了意图╮ 2021-01-06 02:59

I'm trying to use JSoup to scrape some pages that are on a staging server. To view the pages on the staging server with a browser, I need to be connected to a VPN.

I

3 Answers
  • 2021-01-06 03:29

    As of jsoup 1.9 you can set the proxy directly on the connection: https://jsoup.org/apidocs/org/jsoup/Connection.html#proxy-java.net.Proxy-

    Jsoup.connect("http://your.url.here").proxy("<proxy-host>", <proxy-port>).get();
    
  • 2021-01-06 03:31

    You can set the Java system properties for the proxy:

    // for https:// URLs, also set https.proxyHost and https.proxyPort
    System.setProperty("http.proxyHost", "<proxyip>"); // set proxy server
    System.setProperty("http.proxyPort", "<proxyport>"); // set proxy port
    
    Document doc = Jsoup.connect("http://your.url.here").get(); // Jsoup now connects via proxy
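
Before pointing Jsoup at the staging server, it can help to confirm that the JVM actually picks up these properties. The following is a minimal sketch using only the JDK: it sets the properties for both http and https and asks the default ProxySelector which proxy it would choose for a URL. The class name ProxyPropertiesCheck, the host proxy.example.test and the port 3128 are placeholders invented for illustration, and the sketch assumes the JDK's default ProxySelector has not been replaced.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

class ProxyPropertiesCheck {

    /** Sets the JVM-wide proxy properties for both http and https URLs. */
    static void useProxy(String host, int port) {
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", String.valueOf(port));
        System.setProperty("https.proxyHost", host);
        System.setProperty("https.proxyPort", String.valueOf(port));
    }

    /** Returns the first proxy the default ProxySelector would try for the URL. */
    static Proxy selectedProxy(String url) {
        List<Proxy> proxies = ProxySelector.getDefault().select(URI.create(url));
        return proxies.get(0);
    }

    public static void main(String[] args) {
        useProxy("proxy.example.test", 3128); // placeholder host and port
        System.out.println(selectedProxy("http://your.url.here/"));
        System.out.println(selectedProxy("https://your.url.here/"));
    }
}
```

If either line reports a direct (no proxy) connection, the properties were not applied for that scheme, and Jsoup would bypass the proxy for those URLs as well.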
    

    Alternatively, download the page into a string through the proxy and then parse it:

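    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.InetSocketAddress;
    import java.net.Proxy;
    import java.net.URL;
    import java.nio.charset.StandardCharsets;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;

    final URL website = new URL("http://your.url.here"); // the page you want to fetch

    // -- Set up the connection through the proxy
    Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress("<proxyserver>", 1234)); // proxy host and port
    HttpURLConnection httpUrlConnection = (HttpURLConnection) website.openConnection(proxy);
    httpUrlConnection.connect();

    // -- Download the page into a buffer (assumes the page is UTF-8 encoded)
    StringBuilder buffer = new StringBuilder();
    try (BufferedReader br = new BufferedReader(
            new InputStreamReader(httpUrlConnection.getInputStream(), StandardCharsets.UTF_8))) {
        String str;
        while ((str = br.readLine()) != null) {
            buffer.append(str).append('\n'); // readLine() strips the line terminator
        }
    }

    // -- Parse the buffer with Jsoup
    Document doc = Jsoup.parse(buffer.toString());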
    

    You can also use HttpClient for the download step in this approach.

  • 2021-01-06 03:33

    To add to ollo's answer: if your proxy requires username/password authentication, register a default Authenticator as well.

    final String authUser = "<username>";
    final String authPassword = "<password>";
    Authenticator.setDefault(
       new Authenticator() {
          public PasswordAuthentication getPasswordAuthentication() {
             return new PasswordAuthentication(
                   authUser, authPassword.toCharArray());
          }
       }
    );
    
    System.setProperty("http.proxyHost", "<yourproxyhost>");
    System.setProperty("http.proxyPort", "<yourproxyport>");
    System.setProperty("http.proxyUser", authUser);
    System.setProperty("http.proxyPassword", authPassword);
    
    Document doc = Jsoup.connect("http://your.url.here").get();
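
The default Authenticator is consulted for origin-server challenges as well as proxy challenges, so the anonymous class above offers the same credentials to both. A possible variation, sketched here under assumptions, answers only when the requestor is a proxy. The class name ProxyOnlyAuthenticator and the sample credentials are hypothetical; the calls used (getRequestorType, Authenticator.requestPasswordAuthentication) are standard java.net APIs.

```java
import java.net.Authenticator;
import java.net.PasswordAuthentication;

class ProxyOnlyAuthenticator extends Authenticator {
    private final String user;
    private final char[] password;

    ProxyOnlyAuthenticator(String user, String password) {
        this.user = user;
        this.password = password.toCharArray();
    }

    @Override
    protected PasswordAuthentication getPasswordAuthentication() {
        // Answer only proxy challenges; null means "no credentials available".
        if (getRequestorType() != RequestorType.PROXY) {
            return null;
        }
        return new PasswordAuthentication(user, password);
    }

    public static void main(String[] args) {
        // Placeholder credentials for illustration only.
        Authenticator.setDefault(new ProxyOnlyAuthenticator("staging-user", "staging-pass"));

        // Simulate a proxy challenge and a server challenge without any network traffic.
        PasswordAuthentication forProxy = Authenticator.requestPasswordAuthentication(
                "proxy.example.test", null, 3128, "http", "proxy", "basic",
                null, RequestorType.PROXY);
        PasswordAuthentication forServer = Authenticator.requestPasswordAuthentication(
                "your.url.here", null, 80, "http", "site", "basic",
                null, RequestorType.SERVER);

        System.out.println("proxy challenge answered: " + (forProxy != null));
        System.out.println("server challenge answered: " + (forServer != null));
    }
}
```

Registering it with Authenticator.setDefault(new ProxyOnlyAuthenticator(authUser, authPassword)) would take the place of the anonymous class; the proxy host and port properties stay the same.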
    