Collecting from a parallel stream in Java 8

那年仲夏 submitted on 2019-11-30 08:37:25

The Collection object used to receive the data being collected does not need to be concurrent. You can give it a simple ArrayList.

That is because the values from a parallel stream are not actually collected into a single Collection object. Each thread collects its own data into its own container, and then all the sub-results are merged into a single final Collection object.
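The split-and-merge behaviour described above can be sketched with the three-argument form of collect(), which spells out the supplier, accumulator, and combiner explicitly. This is only a minimal illustration: the class name ParallelCollectSketch and the method collectInParallel are made up for this example and do not come from the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ParallelCollectSketch {

    // Hypothetical helper: collects 0..n-1 from a parallel stream
    // using only plain, non-concurrent ArrayList instances.
    static List<Integer> collectInParallel(int n) {
        List<Integer> result = IntStream.range(0, n)
                .parallel()
                .boxed()
                .collect(
                        ArrayList::new,      // supplier: a fresh container for each sub-task
                        ArrayList::add,      // accumulator: adds to that sub-task's own container
                        ArrayList::addAll);  // combiner: merges two partial containers
        return result;
    }

    public static void main(String[] args) {
        List<Integer> sequential = IntStream.range(0, 1000)
                .boxed()
                .collect(Collectors.toList());
        // Compares the parallel result with a sequentially built list.
        System.out.println(collectInParallel(1000).equals(sequential));
    }
}
```

No container is ever shared between two threads at the same time, which is why a plain ArrayList is sufficient here.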

This is all well-documented in the Collector javadoc, and the Collector is the parameter you're giving to the collect() method:

<R,A> R collect(Collector<? super T,A,R> collector)

"But there is no option that I can see to collect from parallel stream in thread safe way to provide list as output."

This is entirely wrong.

The whole point of streams is that you can use a non-thread-safe Collection and still get perfectly valid, thread-safe results. This follows from how streams are implemented, and it was a key part of their design. A Collector defines a supplier() method that creates a new result container for each part of the stream that is processed separately. Those containers are then merged with one another by the Collector's combiner().

So this is perfectly thread safe:

 Stream.of(1,2,3,4).parallel()
          .collect(Collectors.toList());

Since there are 4 elements in this stream, it can be split into as many as 4 parts, so up to 4 instances of ArrayList will be created and merged at the end into a single result. The exact number depends on how the stream is split across the available threads.
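One way to observe this is to wrap the supplier in a counter. The sketch below builds a list collector with Collector.of and counts how many containers are requested. The class SupplierCountSketch, the field containersCreated, and the method collectAndCount are hypothetical names introduced for this example. The count it reports is not guaranteed and may differ between machines and JVM versions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collector;
import java.util.stream.Stream;

class SupplierCountSketch {

    // Hypothetical counter: how many result containers the supplier has handed out.
    static final AtomicInteger containersCreated = new AtomicInteger();

    static List<Integer> collectAndCount() {
        containersCreated.set(0);
        Collector<Integer, List<Integer>, List<Integer>> countingToList = Collector.of(
                () -> {                                   // supplier: counts each new container
                    containersCreated.incrementAndGet();
                    return new ArrayList<>();
                },
                List::add,                                // accumulator: fills one container
                (left, right) -> {                        // combiner: merges two containers
                    left.addAll(right);
                    return left;
                });
        return Stream.of(1, 2, 3, 4).parallel().collect(countingToList);
    }

    public static void main(String[] args) {
        System.out.println("result = " + collectAndCount());
        System.out.println("containers created = " + containersCreated.get());
    }
}
```

Whatever the count turns out to be, the merged list contains all four elements in encounter order, because the combiner always joins the left part before the right part.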

On the other hand, concurrent collectors such as Collectors.toConcurrentMap() and Collectors.groupingByConcurrent() create a single result container, and all threads put their results directly into it.
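For contrast, here is a small sketch of a concurrent collector, using Collectors.toConcurrentMap as one example of that family. The class ConcurrentCollectSketch, the method labelsByNumber, and the "value-" labels are invented for illustration.

```java
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ConcurrentCollectSketch {

    // Hypothetical helper: maps each number to a text label using a concurrent collector.
    static ConcurrentMap<Integer, String> labelsByNumber() {
        return Stream.of(1, 2, 3, 4)
                .parallel()
                .collect(Collectors.toConcurrentMap(
                        Function.identity(),      // key: the number itself
                        i -> "value-" + i));      // value: an illustrative label
    }

    public static void main(String[] args) {
        System.out.println(labelsByNumber());
    }
}
```

Because the threads write into a shared map, the container itself has to be thread-safe, which is why this collector returns a ConcurrentMap and not a plain HashMap.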
