Question
We consistently get the following error when we increase either the number of threads or the number of executors for the Fetcher bolt:
org.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:286) ~[stormjar.jar:?]
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:263) ~[stormjar.jar:?]
at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:190) ~[stormjar.jar:?]
at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184) ~[stormjar.jar:?]
at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184) ~[stormjar.jar:?]
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:71) ~[stormjar.jar:?]
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:220) ~[stormjar.jar:?]
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:164) ~[stormjar.jar:?]
at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:139) ~[stormjar.jar:?]
at com.digitalpebble.stormcrawler.protocol.httpclient.HttpProtocol.getProtocolOutput(HttpProtocol.java:206) ~[stormjar.jar:?]
Is this due to a resource leak or to some hard limit on the size of the HTTP thread pool? If it is the thread pool, is there a way to increase the pool size?
Answer 1:
HttpProtocol sets a maximum number of connections for the pool, equal to the number of fetcher threads (fetcher.threads.number). Because the pool is static, it is shared by all the executors running on the same worker, so several FetcherBolt executors on one worker compete for a pool sized for only one of them. I'd recommend using one FetcherBolt instance per worker: the number of fetching threads on that worker will then match fetcher.threads.number and you won't have this problem.
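To make the mechanism concrete, here is a minimal stdlib-only sketch of it. It is a model, not StormCrawler or HttpClient code: the class SharedPoolModel and the method countTimeouts are hypothetical names, a Semaphore stands in for the static connection pool, and every thread is assumed to hold its connection (a slow fetch) until all the others have tried to lease one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of one worker JVM; not part of StormCrawler.
class SharedPoolModel {

    /**
     * One pool of poolSize connections is shared by `executors` bolt
     * instances, each running `threadsPerExecutor` fetch threads.
     * Returns how many threads timed out waiting for a connection.
     */
    public static int countTimeouts(int poolSize, int executors, int threadsPerExecutor) {
        int totalThreads = executors * threadsPerExecutor;
        Semaphore pool = new Semaphore(poolSize); // stands in for the static pool
        CountDownLatch allAttempted = new CountDownLatch(totalThreads);
        AtomicInteger timeouts = new AtomicInteger();

        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < totalThreads; i++) {
            Thread t = new Thread(() -> {
                boolean leased = false;
                try {
                    // Wait a bounded time for a connection, like a pool lease timeout.
                    leased = pool.tryAcquire(200, TimeUnit.MILLISECONDS);
                    if (!leased) {
                        timeouts.incrementAndGet();
                    }
                    allAttempted.countDown();
                    if (leased) {
                        // Assumption: the fetch is slow, so the connection stays
                        // leased until every thread has tried to get one.
                        allAttempted.await();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    if (leased) {
                        pool.release();
                    }
                }
            });
            threads.add(t);
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for threads", e);
        }
        return timeouts.get();
    }

    public static void main(String[] args) {
        // Pool sized for 4 threads (the fetcher.threads.number analogue).
        System.out.println("1 executor:  " + countTimeouts(4, 1, 4) + " timeouts");
        System.out.println("2 executors: " + countTimeouts(4, 2, 4) + " timeouts");
    }
}
```

With one executor, the four threads match the four connections and nothing times out. With two executors on the same worker, eight threads compete for four connections and, under the slow-fetch assumption above, four of them time out. In a real crawl the number of timeouts depends on how long each fetch holds its connection.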
Alternatively, you could give the okhttp protocol a try. It is more robust for open and large-scale crawls. See the wiki page on protocols for a feature comparison.
Source: https://stackoverflow.com/questions/49149490/stormcrawler-timeout-waiting-for-connection-from-pool