Below is a screenshot of my topology's Storm UI. This was taken after the topology finished processing 10k messages.
(The topology is configured with 4 workers and uses
There can be multiple reasons. First of all, you need to understand how the numbers are measured:

- Complete latency is measured at the Spout, from the time a tuple is emitted until the tuple is fully processed, i.e., until Spout.ack() is called.
- Execute latency is the time Bolt.execute() runs.
- Process latency is the time from when Bolt.execute() is called until the bolt acks the given input tuple.

If you do not ack each incoming input tuple in Bolt.execute() immediately (which is absolutely ok), process latency can be much higher than execute latency.
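To see why delayed acking inflates process latency, here is a self-contained toy simulation (plain Java, not the Storm API; the tick-based clock, the batch size, and the class name are made up for illustration): a bolt that acks only every fifth tuple ends up with a process latency several times its execute latency, even though execute() itself is cheap.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class LatencyDemo {

    // Returns {avgExecuteLatency, avgProcessLatency} in logical time units.
    // One tuple arrives per clock tick; execute() itself always takes 1 unit,
    // but the bolt only acks once `batchSize` tuples have piled up.
    public static double[] simulate(int tuples, int batchSize) {
        Queue<Integer> pending = new ArrayDeque<>(); // arrival ticks of un-acked tuples
        long totalExecute = 0;
        long totalProcess = 0;
        int acked = 0;
        for (int tick = 0; tick < tuples; tick++) {
            long executeTime = 1;              // cheap execute() call
            totalExecute += executeTime;
            pending.add(tick);
            if (pending.size() == batchSize) { // delayed batch ack
                while (!pending.isEmpty()) {
                    // process latency = wait time inside the batch + execute time
                    totalProcess += (tick - pending.poll()) + executeTime;
                    acked++;
                }
            }
        }
        return new double[] {
            (double) totalExecute / tuples,
            (double) totalProcess / acked
        };
    }

    public static void main(String[] args) {
        double[] r = simulate(100, 5);
        System.out.println("avg execute latency: " + r[0]); // 1.0
        System.out.println("avg process latency: " + r[1]); // 3.0 -- 3x higher
    }
}
```

With immediate acking (batch size 1) the two averages coincide; batching the acks is what opens the gap.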
Furthermore, the process latencies need not add up to the complete latency, because tuples can sit in internal input/output buffers. This adds additional time until the last ack is done, thus increasing complete latency. Furthermore, the ackers need to process all incoming acks and notify the Spout about fully processed tuples. This also adds to the complete latency.
One cause of the problem could be too-large internal buffers between operators. This can be resolved either by increasing the dop (degree of parallelism) or by setting the parameter TOPOLOGY_MAX_SPOUT_PENDING
-- this limits the number of tuples within the topology. Thus, if too many tuples are in-flight, the spout stops emitting tuples until it receives acks. Because tuples then no longer queue up in internal buffers, complete latency goes down. If this does not help, you might need to increase the number of ackers: if acks are not processed fast enough, they buffer up, which also increases the complete latency.
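Both knobs can be set via Storm's Config object before submitting the topology. A minimal sketch (the concrete values 1000 and 4 are placeholders to tune for your own topology; older Storm versions use the backtype.storm package instead of org.apache.storm):

```java
import org.apache.storm.Config;

Config conf = new Config();
// Cap the number of un-acked tuples in flight per spout task
// (sets Config.TOPOLOGY_MAX_SPOUT_PENDING; 1000 is a placeholder).
conf.setMaxSpoutPending(1000);
// Run more acker executors if acks queue up
// (sets Config.TOPOLOGY_ACKER_EXECUTORS; 4 is a placeholder).
conf.setNumAckers(4);
```

Start with a moderate max-spout-pending value and lower it until complete latency stops improving; setting it too low will throttle throughput.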