I am trying to parse the HTML of the following URL:
https://www.smuc.ac.kr/mbs/smuc/jsp/board/list.jsp?boardId=6993&id=smuc_040100000000
I'm getting the
Following George's comment, I will post this as an answer.
I made a Kotlin version for checklist's solution, as follows:
package crawlers

import java.security.KeyManagementException
import java.security.NoSuchAlgorithmException
import java.security.SecureRandom
import java.security.cert.X509Certificate
import javax.net.ssl.SSLContext
import javax.net.ssl.SSLSocketFactory
import javax.net.ssl.TrustManager
import javax.net.ssl.X509TrustManager

class SSLHelperKotlin {
    companion object {
        // Builds a socket factory whose trust manager accepts every certificate.
        @JvmStatic
        fun socketFactory(): SSLSocketFactory {
            val trustAllCerts = arrayOf<TrustManager>(
                object : X509TrustManager {
                    override fun getAcceptedIssuers() = arrayOf<X509Certificate>()
                    // Both checks are intentionally empty, so nothing is rejected.
                    override fun checkClientTrusted(certs: Array<X509Certificate>, authType: String) {}
                    override fun checkServerTrusted(certs: Array<X509Certificate>, authType: String) {}
                }
            )
            return try {
                val sslContext: SSLContext = SSLContext.getInstance("SSL")
                sslContext.init(null, trustAllCerts, SecureRandom())
                sslContext.socketFactory
            } catch (e: NoSuchAlgorithmException) {
                throw RuntimeException("Failed to create a SSL socket factory", e)
            } catch (e: KeyManagementException) {
                throw RuntimeException("Failed to create a SSL socket factory", e)
            }
        }
    }
}
It's also available on this gist.
You can ignore TLS certificate validation by setting validateTLSCertificates(false):
Document document = Jsoup.connect("URL").timeout(10000).validateTLSCertificates(false).get();
Since reading the page also takes a while, increase the timeout with timeout(10000).
The selected answer will not work with the latest releases of jsoup, because validateTLSCertificates was deprecated and then removed. I have created the following helper class:
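import java.security.KeyManagementException;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocketFactory;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;
import org.jsoup.Connection;
import org.jsoup.Jsoup;

public class SSLHelper {

    // Returns a jsoup connection that uses the trust-all socket factory below.
    public static Connection getConnection(String url) {
        return Jsoup.connect(url).sslSocketFactory(SSLHelper.socketFactory());
    }

    // Builds a socket factory whose trust manager accepts every certificate.
    private static SSLSocketFactory socketFactory() {
        TrustManager[] trustAllCerts = new TrustManager[]{new X509TrustManager() {
            public X509Certificate[] getAcceptedIssuers() {
                return new X509Certificate[0];
            }

            public void checkClientTrusted(X509Certificate[] certs, String authType) {
            }

            public void checkServerTrusted(X509Certificate[] certs, String authType) {
            }
        }};
        try {
            SSLContext sslContext = SSLContext.getInstance("SSL");
            sslContext.init(null, trustAllCerts, new SecureRandom());
            return sslContext.getSocketFactory();
        } catch (NoSuchAlgorithmException | KeyManagementException e) {
            throw new RuntimeException("Failed to create a SSL socket factory", e);
        }
    }
}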
Then I simply call it as follows:
Document doc = SSLHelper.getConnection(url).userAgent(USER_AGENT).get();
(*) - https://dzone.com/articles/how-setup-custom - helpful in coming up with the solution
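To try the trust-all part of this helper without jsoup on the classpath, here is a minimal sketch that uses only JDK classes. The class name TrustAllDemo and the method trustAllManager() are hypothetical names chosen for illustration, and the sketch asks for the "TLS" protocol name instead of "SSL"; both are assumptions of this example, not something the answers above prescribe.

```java
import java.security.KeyManagementException;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocketFactory;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

// Hypothetical demo class; not part of jsoup or of the answers above.
public class TrustAllDemo {

    // Trust manager that accepts every certificate chain without checking it.
    public static X509TrustManager trustAllManager() {
        return new X509TrustManager() {
            @Override
            public X509Certificate[] getAcceptedIssuers() {
                return new X509Certificate[0];
            }

            @Override
            public void checkClientTrusted(X509Certificate[] certs, String authType) {
                // Intentionally empty: nothing is rejected.
            }

            @Override
            public void checkServerTrusted(X509Certificate[] certs, String authType) {
                // Intentionally empty: nothing is rejected.
            }
        };
    }

    // Builds a socket factory backed only by the trust-all manager.
    public static SSLSocketFactory socketFactory() {
        try {
            // Assumption of this sketch: "TLS" instead of the "SSL" used above.
            SSLContext sslContext = SSLContext.getInstance("TLS");
            sslContext.init(null, new TrustManager[]{trustAllManager()}, new SecureRandom());
            return sslContext.getSocketFactory();
        } catch (NoSuchAlgorithmException | KeyManagementException e) {
            throw new RuntimeException("Failed to create a SSL socket factory", e);
        }
    }

    public static void main(String[] args) throws Exception {
        // No validation happens, so even an empty chain passes without an exception.
        trustAllManager().checkServerTrusted(new X509Certificate[0], "RSA");
        System.out.println("Socket factory created: " + (socketFactory() != null));
    }
}
```

Because the resulting factory accepts any certificate, it is probably best kept to crawlers and test code where the target host is already known, as in the question.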