In the Javadoc for the stream package, at the end of the section Parallelism, I read:
Most stream operations accept parameters that des
I have a hard time understanding this "in most cases". In which cases is it acceptable or desirable to have a stateful stream operation?
Suppose the following scenario. You have a Stream and you need to list the items in natural order, prefixing each one with its order number. So, for example, on input you have Banana, Apple and Grape. The output should be:
1. Apple
2. Banana
3. Grape
How do you solve this task with the Java Stream API? Pretty easily:
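import static java.util.Arrays.asList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

List<String> f = asList("Banana", "Apple", "Grape");
// Shared counter that the map() lambda mutates for every element
AtomicInteger number = new AtomicInteger(0);
String result = f.stream()
        .sorted()
        .sequential()
        .map(i -> String.format("%d. %s", number.incrementAndGet(), i))
        .collect(Collectors.joining("\n"));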
Now if you look at this pipeline you'll see 3 stateful operations:
sorted() – stateful by definition. See the documentation for Stream.sorted(): "This is a stateful intermediate operation".
map() – by itself can be stateless or stateful, but in this case it is stateful. To label positions you need to keep track of how many items have already been labeled.
collect() – is a mutable reduction operation (from the docs for Stream.collect()). Mutable operations are stateful by definition, because they change (mutate) shared state.
There is some controversy about why sorted() is stateful. From the Stream API documentation:
Stateless operations, such as filter and map, retain no state from previously seen element when processing a new element -- each element can be processed independently of operations on other elements. Stateful operations, such as distinct and sorted, may incorporate state from previously seen elements when processing new elements.
So when applying the terms stateful and stateless to the Stream API, we are talking more about the function that processes each element of a stream, and not about the function that processes the stream as a whole.
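To make the per-element distinction concrete, here is a minimal sketch. The class and method names (StatelessVsStatefulSketch, upperCase, numberWithCounter, numberByIndex) are made up for illustration and are not part of the Stream API. The first mapper looks only at the element it is given, the second one depends on a counter shared between elements, and the third shows one possible way to get the same numbering without shared mutable state, assuming it is acceptable to collect the sorted items into a list first.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper class, used only for this illustration.
class StatelessVsStatefulSketch {

    // Stateless mapper: the result for each element depends only on that element.
    static List<String> upperCase(List<String> items) {
        return items.stream()
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    // Stateful mapper: the lambda updates a counter shared by all elements, so the
    // label an element receives depends on how many elements were mapped before it.
    // This relies on the stream being sequential.
    static List<String> numberWithCounter(List<String> items) {
        AtomicInteger counter = new AtomicInteger(0);
        return items.stream()
                .sorted()
                .map(s -> counter.incrementAndGet() + ". " + s)
                .collect(Collectors.toList());
    }

    // Stateless alternative (an assumption of this sketch, not the approach above):
    // sort first, then derive each number from the element's index.
    static List<String> numberByIndex(List<String> items) {
        List<String> sorted = items.stream().sorted().collect(Collectors.toList());
        return IntStream.range(0, sorted.size())
                .mapToObj(i -> (i + 1) + ". " + sorted.get(i))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("Banana", "Apple", "Grape");
        System.out.println(upperCase(fruits));
        System.out.println(numberWithCounter(fruits));
        System.out.println(numberByIndex(fruits));
    }
}
```

The index-based version does not care whether the IntStream runs sequentially or in parallel, because no lambda in it remembers anything from a previous element.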
Also note that there is some confusion between the terms stateless and deterministic. They are not the same.
A deterministic function provides the same result given the same arguments.
A stateless function retains no state from previous calls.
Those are different definitions, and in the general case they do not depend on each other. Determinism is about a function's result value, while statelessness is about its implementation.
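A small sketch may help separate the two notions. The names below (DeterminismVsStatelessnessSketch, cachedSquare, stamp) are hypothetical and exist only for this example. The first function is deterministic but stateful, because it remembers earlier results in a cache. The second is stateless but not deterministic, because it keeps nothing between calls yet reads the system clock.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper class, used only for this illustration.
class DeterminismVsStatelessnessSketch {

    // State retained between calls: results computed earlier are kept here.
    private static final Map<Integer, Long> CACHE = new ConcurrentHashMap<>();

    // Deterministic but stateful: the same argument always gives the same result,
    // yet the implementation stores data from previous calls.
    static long cachedSquare(int n) {
        return CACHE.computeIfAbsent(n, k -> (long) k * k);
    }

    // Stateless but not deterministic: nothing is retained between calls,
    // but the result depends on the clock, so equal arguments can give different results.
    static String stamp(String s) {
        return s + "@" + System.nanoTime();
    }

    public static void main(String[] args) {
        System.out.println(cachedSquare(4));
        System.out.println(cachedSquare(4));
        System.out.println(stamp("Apple"));
        System.out.println(stamp("Apple"));
    }
}
```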