Is flatMap guaranteed to be lazy? [duplicate]

若如初见. 提交于 2019-11-27 14:26:25
Eugene

Under the current implementation, flatmap is eager; like any other stateful intermediate operation (like sorted and distinct). And it's very easy to prove :

 int result = Stream.of(1)
            .flatMap(x -> Stream.generate(() -> ThreadLocalRandom.current().nextInt()))
            .findFirst()
            .get();

    System.out.println(result);

This never finishes as flatMap is computed eagerly. For your example:

urls.stream()
    .flatMap(url -> fetchDataFromInternet(url).stream())
    .filter(...)
    .findFirst()
    .get();

It means that for each url, the flatMap will block all others operation that come after it, even if you care about a single one. So let's suppose that from a single url your fetchDataFromInternet(url) generates 10_000 lines, well your findFirst will have to wait for all 10_000 to be computed, even if you care about only one.

EDIT

This is fixed in Java 10, where we get our laziness back: see JDK-8075939

EDIT 2

This is fixed in Java 8 too (8u222): JDK-8225328

It’s not clear why you set up an example that does not address the actual question, you’re interested in. If you want to know, whether the processing is lazy when applying a short-circuiting operation like findFirst(), well, then use an example using findFirst() instead of forEach that processes all elements anyway. Also, put the logging statement right into the function whose evaluation you want to track:

Stream.of("hello", "world")
      .flatMap(s -> {
          System.out.println("flatMap function evaluated for \""+s+'"');
          return s.chars().boxed();
      })
      .peek(c -> System.out.printf("processing element %c%n", c))
      .filter(c -> c>'h')
      .findFirst()
      .ifPresent(c -> System.out.printf("found an %c%n", c));
flatMap function evaluated for "hello"
processing element h
processing element e
processing element l
processing element l
processing element o
found an l

This demonstrates that the function passed to flatMap gets evaluated lazily as expected while the elements of the returned (sub-)stream are not evaluated as lazy as possible, as already discussed in the Q&A you have linked yourself.

So, regarding your fetchDataFromInternet method that gets invoked from the function passed to flatMap, you will get the desired laziness. But not for the data it returns.

Today I also stumbled up on this bug. Behavior is not so strait forward, cause simple case, like below, is working fine, but similar production code doesn't work.

 stream(spliterator).map(o -> o).flatMap(Stream::of)..flatMap(Stream::of).findAny()

For guys who cannot wait another couple years for migration to JDK-10 there is a alternative true lazy stream. It doesn't support parallel. It was dedicated for JavaScript translation, but it worked out for me, cause interface is the same.

StreamHelper is collection based, but it is easy to adapt Spliterator.

https://github.com/yaitskov/j4ts/blob/stream/src/main/java/javaemul/internal/stream/StreamHelper.java

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!