Java 8 streams: sequential and parallel execution produce different results?

Asked by 逝去的感伤 on 2020-12-05 06:17

Running the following stream example in Java 8:

    System.out.println(Stream
        .of("a", "b", "c", "d", "e", "f")
        .reduce("", (s1, s2) -> s1 + "/" + s2));

prints /a/b/c/d/e/f, but the same pipeline with .parallel() added produces a different result (e.g. /a//b//c//d//e//f). Why do sequential and parallel execution differ?

3 Answers
  • 2020-12-05 06:46

    For someone who just started with lambdas and streams, it took quite some time to get to the "AHA" moment, until I really understood what is going on here. I'll rephrase this a bit to make it easier (at least how I wish it had been explained) for a stream newbie like me.

    It's all under the reduce documentation that states:

    The identity value MUST be an identity for the accumulator function. This means that for all t, accumulator.apply(identity, t) is equal to t.

    We can easily prove that, the way the code is written, the identity requirement is broken:

    private static void checkIdentity() {
         BinaryOperator<String> operator = (s1, s2) -> s1 + "/" + s2;
         String result = operator.apply("", "a");
         System.out.println(result);             // prints "/a", not "a"
         System.out.println(result.equals("a")); // prints "false"
    }
    

    An empty String combined with another String should produce the second String if "" were a true identity; it does not (it produces "/a"). Thus "" is NOT an identity for this accumulator, so the reduce method cannot guarantee the same result for a parallel invocation.
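    A minimal sketch of one possible fix (class name hypothetical): drop the bogus identity and use the Optional-returning reduce overload. The accumulator itself is associative, and no "" is ever injected, so sequential and parallel runs agree:

    ```java
    import java.util.Optional;
    import java.util.stream.Stream;

    public class ReduceWithoutIdentity {
        public static void main(String[] args) {
            // No identity argument: reduce returns an Optional and only ever
            // combines actual stream elements, never a seed value.
            Optional<String> sequential = Stream.of("a", "b", "c", "d", "e", "f")
                    .reduce((s1, s2) -> s1 + "/" + s2);
            Optional<String> parallel = Stream.of("a", "b", "c", "d", "e", "f")
                    .parallel()
                    .reduce((s1, s2) -> s1 + "/" + s2);
            System.out.println(sequential.get()); // a/b/c/d/e/f
            System.out.println(parallel.get());   // a/b/c/d/e/f
        }
    }
    ```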

  • 2020-12-05 06:59

    From reduce's documentation:

    The identity value must be an identity for the accumulator function. This means that for all t, accumulator.apply(identity, t) is equal to t.

    Which is not true in your case - combining "" and "a" creates "/a", not "a".

    I have extracted the accumulator function and added a printout to show what happens:

    BinaryOperator<String> accumulator = (s1, s2) -> {
        System.out.println("joining \"" + s1 + "\" and \"" + s2 + "\"");
        return s1 + "/" + s2;
    };
    System.out.println(Stream
                    .of("a", "b", "c", "d", "e", "f")
                    .parallel()
                    .reduce("", accumulator)
    );
    

    This is example output (it differs between runs):

    joining "" and "d"
    joining "" and "f"
    joining "" and "b"
    joining "" and "a"
    joining "" and "c"
    joining "" and "e"
    joining "/b" and "/c"
    joining "/e" and "/f"
    joining "/a" and "/b//c"
    joining "/d" and "/e//f"
    joining "/a//b//c" and "/d//e//f"
    /a//b//c//d//e//f
    

    You can add a check to your function to handle the empty identity separately:

    System.out.println(Stream
            .of("a", "b", "c", "d", "e", "f")
            .parallel()
            .reduce("", (s1, s2) -> s1.isEmpty() ? s2 : s1 + "/" + s2)
    );
    

    As Marko Topolnik noticed, checking s2 is not required, since the accumulator does not have to be a commutative function.
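    We can verify that the patched accumulator now satisfies the identity law from the documentation, i.e. accumulator.apply(identity, t) equals t for every t (a small self-check sketch, class name hypothetical):

    ```java
    import java.util.List;
    import java.util.function.BinaryOperator;

    public class IdentityCheck {
        public static void main(String[] args) {
            // Patched accumulator: returns s2 unchanged when s1 is the "" identity.
            BinaryOperator<String> fixed = (s1, s2) -> s1.isEmpty() ? s2 : s1 + "/" + s2;
            // The reduce contract requires accumulator.apply(identity, t) == t for all t.
            for (String t : List.of("a", "b", "c")) {
                System.out.println(fixed.apply("", t).equals(t)); // true
            }
        }
    }
    ```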

  • 2020-12-05 07:09

    To add to the other answers:

    You might want to use mutable reduction instead. The Stream documentation specifies that doing something like

    String concatenated = strings.reduce("", String::concat)
    

    will give poor performance:

    We would get the desired result, and it would even work in parallel. However, we might not be happy about the performance! Such an implementation would do a great deal of string copying, and the run time would be O(n^2) in the number of characters. A more performant approach would be to accumulate the results into a StringBuilder, which is a mutable container for accumulating strings. We can use the same technique to parallelize mutable reduction as we do with ordinary reduction.

    So you should accumulate into a StringBuilder (via collect) instead.
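    For joining strings with a separator, the idiomatic mutable reduction is Collectors.joining, which accumulates into a StringBuilder-backed StringJoiner and also works correctly in parallel. A minimal sketch applied to the question's stream:

    ```java
    import java.util.stream.Collectors;
    import java.util.stream.Stream;

    public class JoiningExample {
        public static void main(String[] args) {
            // Mutable reduction via collect: each thread appends into its own
            // container, which avoids the O(n^2) copying of repeated String concat,
            // and the delimiter is only placed between elements.
            String joined = Stream.of("a", "b", "c", "d", "e", "f")
                    .parallel()
                    .collect(Collectors.joining("/"));
            System.out.println(joined); // a/b/c/d/e/f
        }
    }
    ```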
