URLConnection is not allowing me to access data on HTTP errors (404, 500, etc.)

夕颜 2021-02-01 18:33

I am writing a crawler and need to read the data from the response stream regardless of whether the status is 200 or not. curl does this, as does any standard browser.

The following will

2 answers
  •  无人共我
    2021-02-01 18:43

    Simple:

    URLConnection connection = url.openConnection();
    InputStream is;
    if (connection instanceof HttpURLConnection) {
       HttpURLConnection httpConn = (HttpURLConnection) connection;
       // Check the status first: getInputStream() throws IOException on 4xx/5xx.
       int statusCode = httpConn.getResponseCode();
       if (statusCode >= 400) {
         is = httpConn.getErrorStream(); // may be null if the server sent no body
       } else {
         is = httpConn.getInputStream();
       }
    } else {
       is = connection.getInputStream();
    }
    

    See the Javadoc for HttpURLConnection.getErrorStream() for an explanation. The way I would handle this is as follows:

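    URLConnection connection = url.openConnection();
    InputStream is = null;
    try {
        is = connection.getInputStream();
    } catch (IOException ioe) {
        // For HTTP, getInputStream() throws on error statuses such as 404 or 500;
        // the body the server sent is then available from getErrorStream().
        if (connection instanceof HttpURLConnection) {
            HttpURLConnection httpConn = (HttpURLConnection) connection;
            int statusCode = httpConn.getResponseCode();
            if (statusCode != 200) {
                is = httpConn.getErrorStream(); // may be null if there is no body
            }
        }
    }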
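
For a crawler, the same idea could be wrapped in a small helper that always returns the body as text. The following is only a sketch: the class name ErrorBodyReader, the method readBody, and the choice to decode as UTF-8 are assumptions made for illustration, not something from the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; the name and the UTF-8 decoding are illustrative assumptions.
final class ErrorBodyReader {

    /** Returns the response body as text for both success and error statuses. */
    static String readBody(URL url) throws IOException {
        URLConnection connection = url.openConnection();
        InputStream is;
        if (connection instanceof HttpURLConnection) {
            HttpURLConnection httpConn = (HttpURLConnection) connection;
            // Ask for the status before touching the stream:
            // getInputStream() throws IOException on 4xx/5xx responses.
            int statusCode = httpConn.getResponseCode();
            is = (statusCode >= 400) ? httpConn.getErrorStream()
                                     : httpConn.getInputStream();
        } else {
            is = connection.getInputStream();
        }
        if (is == null) {
            // getErrorStream() returns null when the server sent no error body.
            return "";
        }
        try (InputStream in = is) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            // Assumption: UTF-8. A real crawler should honor the Content-Type charset.
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```

Calling ErrorBodyReader.readBody(url) on a page that answers 404 should then return the server's error page instead of throwing. Connection-level failures such as an unknown host still surface as IOException.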
    
