> When encountering a `RuntimeException` during stream processing, should the stream processing abort? Should it first finish? Should the exception be rethrown on …
There is no difference in the behavior of these two streams regarding exception reporting. The problem is that you put both tests one after another into one method and let them access shared data structures.
There is a subtle, perhaps not sufficiently documented (if intentional) behavior: when a stream operation completes exceptionally, it does not wait for the completion of all concurrent operations.

So when you catch the exception of the first stream operation, there are still some threads running and accessing your shared data. When you then reset your `AtomicBoolean`, one of these threads belonging to the first job will read the `false` value, flip it to `true`, print the message, and throw an exception which gets lost, as the stream operation has already completed exceptionally. Further, some of these threads will raise your counter after you reset it, which is why it ends up higher than the second job alone would allow. Your second job does not complete exceptionally, as all threads belonging to the second job read a `true` value from the `AtomicBoolean`.
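This early exceptional completion can be observed directly. The sketch below (my own example, not your code) counts processed elements, throws from one element, and compares the counter at catch time with the counter after the common pool has quiesced; the second reading is frequently higher, showing that work continued after the exception was caught:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

public class EarlyCompletionDemo {
    // Returns {counter value at catch time, counter value after quiescence}.
    static int[] run() {
        AtomicInteger counter = new AtomicInteger();
        try {
            IntStream.range(0, 1_000_000).parallel().forEach(i -> {
                counter.incrementAndGet();
                if (i == 0) throw new RuntimeException("boom");
            });
        } catch (RuntimeException expected) {
            // the exception arrives here while sibling tasks may still be running
        }
        int atCatch = counter.get();
        // wait for the leftover tasks of the aborted job to finish
        ForkJoinPool.commonPool().awaitQuiescence(1, TimeUnit.DAYS);
        return new int[] { atCatch, counter.get() };
    }

    public static void main(String[] args) {
        int[] r = run();
        // the exact numbers are timing-dependent; often r[0] < r[1]
        System.out.println("at catch: " + r[0] + ", after quiescence: " + r[1]);
    }
}
```

Since this is about concurrency, the two numbers vary from run to run; only their ordering (the second is never smaller) is guaranteed.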
There are some ways to spot this. When you remove the first stream operation, the second completes exceptionally as expected. Also, inserting the statement

    ForkJoinPool.commonPool().awaitQuiescence(1, TimeUnit.DAYS);

between the two stream operations fixes the problem, as it waits for all pending tasks of the common pool to complete.
However, the cleaner solution is to let each stream operation use its own counter and flag.
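As a minimal sketch of that fix (a hypothetical job, not your actual code), each invocation creates a fresh `AtomicBoolean`, so leftover threads of an earlier job cannot flip another job's flag, and both jobs fail reliably:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.IntStream;

public class SeparateState {
    // Each job owns its flag; previous jobs' leftover threads cannot touch it.
    static String run(String name) {
        AtomicBoolean failed = new AtomicBoolean();
        try {
            IntStream.range(0, 1_000).parallel().forEach(i -> {
                // exactly one worker per job wins this CAS and throws
                if (failed.compareAndSet(false, true))
                    throw new RuntimeException(name + " failed");
            });
            return name + " completed normally";
        } catch (RuntimeException e) {
            return "caught: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run("job 1")); // prints "caught: job 1 failed"
        System.out.println(run("job 2")); // prints "caught: job 2 failed"
    }
}
```

The same applies to a shared counter: give each job its own `AtomicInteger` instead of resetting a shared one.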
That said, there is a subtle, implementation-dependent difference that causes the problem to disappear if you just swap the two operations. The `IntStream.range` operation produces a stream with a known size, which allows splitting it into concurrent tasks that intrinsically know how many elements to process. This allows abandoning these tasks in the exceptional case, as described above. On the other hand, combining an infinite stream as returned by `generate` with `limit` does not produce a sized stream (though that would be possible). Since such a stream is treated as having an unknown size, the subtasks have to synchronize on a counter to ensure that the limit is obeyed. This causes the subtasks to (sometimes) complete even in the exceptional case. But as said, that is a side effect of an implementation detail, not an intentional waiting for completion. And since it's about concurrency, the result may differ when you run it multiple times.