I'm using Storm 0.8.1 to read incoming messages off an Amazon SQS queue and am consistently getting exceptions when doing so:
2013-12-02 02:21:38 executor [ER
Answering my own question here for the ages.
There's currently a bug in how Oracle's and OpenJDK's Java enforce the XML entity expansion limit: the expansion counter is shared across parses, so parsing multiple XML documents eventually hits the default upper bound even when no single document comes close to it.
Although I thought that our version (6b27-1.12.6-1ubuntu0.12.04.4) wasn't affected, running the sample code given in the OpenJDK bug report did indeed verify that we were susceptible to the bug.
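For anyone who wants to check their own JVM, here is a minimal sketch of that kind of test. It is not the code from the OpenJDK bug report; the class name, the internal entity, and the document counts are all made up for illustration. It assumes the JAXP default limit of 64,000 expansions and that creating a fresh parser per document is enough to expose the shared counter. Each document stays far below the limit on its own, so a failure partway through suggests the count is carrying over between parses.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.SAXException;

// Hypothetical helper, not the reproduction from the bug report.
public class ExpansionLimitCheck {

    // Builds a small document that references an internal entity `refs` times.
    static String buildDocument(int refs) {
        StringBuilder sb = new StringBuilder("<!DOCTYPE r [<!ENTITY e \"x\">]><r>");
        for (int i = 0; i < refs; i++) {
            sb.append("&e;");
        }
        return sb.append("</r>").toString();
    }

    // Parses the same document `docs` times with a new parser each time.
    // Returns the number of documents parsed before the first SAX failure,
    // or `docs` if every parse succeeded.
    static int parseRepeatedly(int docs, int refsPerDoc) {
        byte[] xml = buildDocument(refsPerDoc).getBytes(Charset.forName("UTF-8"));
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        for (int i = 0; i < docs; i++) {
            try {
                DocumentBuilder builder = factory.newDocumentBuilder();
                builder.parse(new ByteArrayInputStream(xml));
            } catch (SAXException e) {
                System.out.println("Failed on document " + (i + 1) + ": " + e.getMessage());
                return i;
            } catch (Exception e) {
                // Configuration or I/O problems are unrelated to the limit.
                throw new IllegalStateException(e);
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        // 100 documents x 1,000 references = 100,000 expansions in total,
        // but only 1,000 per document.
        int ok = parseRepeatedly(100, 1000);
        System.out.println(ok + " of 100 documents parsed");
    }
}
```

On a JVM without the bug, all 100 documents should parse. On an affected one, I would expect the loop to stop early with an entity expansion error.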
To work around the issue, I needed to pass the system property jdk.xml.entityExpansionLimit=0 to the Storm workers. Adding the following to storm.yaml across my cluster mitigated the problem:
supervisor.childopts: "-Djdk.xml.entityExpansionLimit=0"
worker.childopts: "-Djdk.xml.entityExpansionLimit=0"
I should note that setting the limit to 0 removes it entirely, which technically opens you up to a denial-of-service attack. Since our XML documents only come from SQS, I'm not worried about someone forging malicious XML to kill our workers.
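If you want to confirm that the flag from storm.yaml actually reached a worker JVM, a small check like the one below can be called from inside a worker (for example, when a spout or bolt starts up). This is only a sketch: the class and method names are hypothetical, and it reads the property with the standard System.getProperty call and nothing Storm-specific.

```java
// Hypothetical helper for checking the workaround inside a worker JVM.
public class ExpansionLimitProperty {

    static final String KEY = "jdk.xml.entityExpansionLimit";

    // Turns the raw property value into a readable status line.
    // A null value means the -D flag did not reach this JVM.
    static String describe(String value) {
        if (value == null) {
            return KEY + " is not set (JVM default applies)";
        }
        if ("0".equals(value.trim())) {
            return KEY + "=0 (no limit)";
        }
        return KEY + "=" + value;
    }

    public static void main(String[] args) {
        System.out.println(describe(System.getProperty(KEY)));
    }
}
```

Running it with java -Djdk.xml.entityExpansionLimit=0 ExpansionLimitProperty should report that no limit is in effect.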