How to add proxy support to Jsoup?

独厮守ぢ 2020-11-28 03:47

I am a beginner in Java, and my first task is to parse some 10,000 URLs and extract some information from them. I am using Jsoup for this, and it's working fine.

7 Answers
  • 2020-11-28 04:48

    You don't have to fetch the page through Jsoup itself. Here's my solution, though it may not be the best one.

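      import java.io.BufferedReader;
      import java.io.InputStreamReader;
      import java.net.HttpURLConnection;
      import java.net.InetSocketAddress;
      import java.net.Proxy;
      import java.net.URL;
      import java.nio.charset.StandardCharsets;
      import org.jsoup.Jsoup;
      import org.jsoup.nodes.Document;

      URL url = new URL("http://www.example.com/");
      Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress("127.0.0.1", 8080)); // or whatever your proxy is
      HttpURLConnection uc = (HttpURLConnection) url.openConnection(proxy);
      uc.connect();

      // Read the response body line by line, keeping the line breaks.
      // Assumption: the page is UTF-8; adjust the charset if it is not.
      StringBuilder tmp = new StringBuilder();
      try (BufferedReader in = new BufferedReader(new InputStreamReader(uc.getInputStream(), StandardCharsets.UTF_8))) {
          String line;
          while ((line = in.readLine()) != null) {
              tmp.append(line).append('\n');
          }
      }

      // Pass the URL as the base URI so relative links can be resolved.
      Document doc = Jsoup.parse(tmp.toString(), url.toString());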
    

    And there it is. This fetches the HTML source of the page through a proxy and then parses it with Jsoup.
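
    Since the question involves some 10,000 URLs, it may help to wrap the steps above in a small reusable helper. The following is only a minimal sketch using the standard library: the class name `ProxyFetch`, its method names, the 10-second timeouts, and the UTF-8 charset are all assumptions made for illustration, not part of the original answer.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the name and methods are illustrative only.
final class ProxyFetch {

    // Builds an HTTP proxy descriptor for the given host and port.
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, new InetSocketAddress(host, port));
    }

    // Reads the whole stream as text, ending every line with '\n'.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    // Fetches a page through the proxy and returns its source as a string.
    // Assumptions: the page is UTF-8 and 10-second timeouts are acceptable.
    static String fetch(String address, Proxy proxy) throws IOException {
        HttpURLConnection uc = (HttpURLConnection) new URL(address).openConnection(proxy);
        uc.setConnectTimeout(10_000);
        uc.setReadTimeout(10_000);
        try {
            return readAll(uc.getInputStream(), StandardCharsets.UTF_8);
        } finally {
            uc.disconnect();
        }
    }
}
```

    With jsoup on the classpath, the result could then be parsed with `Jsoup.parse(ProxyFetch.fetch(address, proxy), address)` for each URL. Newer jsoup releases also offer a `proxy(host, port)` method on the `Connection` returned by `Jsoup.connect(...)`, which avoids the manual fetch entirely; check whether your version has it.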
