parallel processing with infinite stream in Java

夕颜 2020-12-31 10:05

Why doesn't the code below print any output, whereas if we remove parallel(), it prints 0, 1?

IntStream.iterate(0, i -> ( i + 1 ) % 2)
         .parallel()
         


        
4 answers
  •  别那么骄傲
    2020-12-31 10:59

    The real cause is that an ordered parallel .distinct() is a full barrier operation, as described in the documentation:

    Preserving stability for distinct() in parallel pipelines is relatively expensive (requires that the operation act as a full barrier, with substantial buffering overhead), and stability is often not needed.

    A "full barrier operation" means that all upstream operations must complete before any downstream operation can start. There are only two full barrier operations in the Stream API: .sorted() (always) and .distinct() (in the ordered parallel case). Because you feed an infinite stream with no short-circuiting step into .distinct(), you end up with an infinite loop. By contract, an ordered .distinct() cannot emit elements downstream in arbitrary order: among equal elements it must always emit the one that appears first in encounter order. While it is theoretically possible to implement parallel ordered .distinct() better, the implementation would be much more complex.
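    To make the barrier behaviour concrete, here is a minimal sketch on a small finite, sequential stream, using .sorted() as the barrier. The class name BarrierDemo and the method trace() are hypothetical names chosen for this illustration; they do not come from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class BarrierDemo {

    // Hypothetical helper: records when each element passes the stage
    // before sorted() ("up") and the stage after it ("down").
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        Stream.of(3, 1, 2)
              .peek(x -> log.add("up:" + x))        // upstream of the barrier
              .sorted()                             // full barrier: buffers every element first
              .forEach(x -> log.add("down:" + x));  // downstream of the barrier
        return log;
    }

    public static void main(String[] args) {
        // prints [up:3, up:1, up:2, down:1, down:2, down:3]
        System.out.println(trace());
    }
}
```

    Every "up" entry is logged before the first "down" entry. With an infinite source the "up" phase never ends, so nothing reaches the downstream stage.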

    As for the solution, @user140547 is right: add .unordered() before .distinct(). This switches distinct() to its unordered algorithm, which just uses a shared ConcurrentHashMap to store all observed elements and emits every new element downstream. Note that adding .unordered() after .distinct() will not help.
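    A sketch of the suggested fix, under some assumptions of mine: the limit(2), the collection into a TreeSet, and the names UnorderedDistinctDemo and firstTwoDistinct() are illustrative choices, not part of the original question. The limit is set to 2 because the source only ever produces two distinct values, so a larger limit could never be satisfied.

```java
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class UnorderedDistinctDemo {

    // Hypothetical helper: collects the distinct values of the
    // infinite stream 0, 1, 0, 1, ... using a parallel pipeline.
    static SortedSet<Integer> firstTwoDistinct() {
        return IntStream.iterate(0, i -> (i + 1) % 2)
                .parallel()
                .unordered()  // placed BEFORE distinct(), as the answer recommends
                .distinct()   // unordered variant: no full barrier
                .limit(2)     // assumption: only two distinct values exist
                .boxed()
                .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        System.out.println(firstTwoDistinct());
    }
}
```

    The TreeSet is only there to give the result a stable order for printing, since an unordered pipeline may deliver the two values in either order.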
