Why doesn't the code below print any output, whereas if we remove parallel, it prints 0, 1?
IntStream.iterate(0, i -> ( i + 1 ) % 2)
.parallel()
The real cause is that an ordered parallel .distinct() is a full barrier operation, as described in the documentation:
Preserving stability for distinct() in parallel pipelines is relatively expensive (requires that the operation act as a full barrier, with substantial buffering overhead), and stability is often not needed.
A "full barrier operation" means that all the upstream operations must complete before the downstream can start. There are only two full barrier operations in the Stream API: .sorted() (every time) and .distinct() (in the ordered parallel case). Because you feed an infinite stream that has not been short-circuited into .distinct(), the barrier never completes and you end up with an infinite loop. By contract, an ordered .distinct() cannot just emit elements to the downstream in any order: among equal elements it must always emit the one that appears first in encounter order. While it is theoretically possible to implement parallel ordered .distinct() better, the implementation would be much more complex.
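The ordering contract is easy to observe on a finite stream, where the barrier is able to complete. A minimal sketch (the class name and the input values are made up for illustration):

```java
import java.util.Arrays;
import java.util.stream.IntStream;

public class OrderedDistinctDemo {
    // Ordered parallel distinct() on a finite source: the barrier can complete,
    // and the first occurrence of each value is kept in encounter order.
    static int[] firstOccurrences() {
        return IntStream.of(3, 1, 3, 2, 1)
                .parallel()
                .distinct()
                .toArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(firstOccurrences())); // prints [3, 1, 2]
    }
}
```

With an infinite source the same pipeline never gets past the barrier, which is the situation in the question.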
As for a solution, @user140547 is right: add .unordered() before .distinct(). This switches distinct() to the unordered algorithm, which just uses a shared ConcurrentHashMap to store all the observed elements and emits every new element to the downstream. Note that adding .unordered() after .distinct() will not help.
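A rough sketch of the fix, not a definitive version of the original program: the class name, the result set and limit(2) are my own assumptions. limit(2) is used because the source only ever produces two distinct values, so a larger limit could never be satisfied and the pipeline would keep running after printing them.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

public class UnorderedDistinctDemo {
    // unordered() is placed before distinct(), so distinct() no longer has to
    // act as a full barrier and can pass new elements downstream immediately.
    static Set<Integer> distinctValues() {
        // Thread-safe set, because forEach runs on several threads in parallel.
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        IntStream.iterate(0, i -> (i + 1) % 2)
                .parallel()
                .unordered()
                .distinct()
                .limit(2) // assumption: only two distinct values exist
                .forEach(seen::add);
        // Copy into a TreeSet so the result is reported in a stable order.
        return new TreeSet<>(seen);
    }

    public static void main(String[] args) {
        System.out.println(distinctValues()); // should print [0, 1]
    }
}
```

The order in which the elements reach forEach is not defined here, which is why the sketch collects them into a sorted set before printing.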