Unexpected results in Spark MapReduce

时光说笑 2021-01-19 14:23

I'm new to Spark and want to understand how MapReduce gets done under the hood to ensure I use it properly. This post provided a great answer, but my results don't seem to

1 answer
  • 2021-01-19 14:51

    This happens because subtraction is neither associative nor commutative. Let's start with associativity:

    (- (- (- 14 78) 73) 42) 
    (- (- -64 73) 42)
    (- -137 42) 
    -179
    

    is not the same as

    (- (- 14 78) (- 73 42))
    (- -64 (- 73 42))
    (- -64 31)
    -95
    

    Now it's time for commutativity:

    (- (- (- 14 78) 73) 42) ;; From the previous example
    

    is not the same as

    (- (- (- 42 73) 78) 14)
    (- (- -31 78) 14)
    (- -109 14)
    -123
    

    Spark first applies reduce to each partition and then merges the partial results in arbitrary order. If the function you use is not both associative and commutative, the final result can be non-deterministic.
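
To make the effect concrete without a cluster, here is a minimal pure-Python sketch that imitates the two stages (reduce inside each partition, then merge the partial results). It is not Spark code: `partitioned_reduce` is a hypothetical helper written for this illustration, and the partition layouts are chosen by hand, whereas Spark decides them for you.

```python
from functools import reduce
from operator import add, sub


def partitioned_reduce(op, partitions):
    """Hypothetical helper: reduce each partition, then merge the partials left to right."""
    partials = [reduce(op, part) for part in partitions if part]
    return reduce(op, partials)


data = [14, 78, 73, 42]

# One partition: ((14 - 78) - 73) - 42
one_partition = partitioned_reduce(sub, [data])

# Two partitions: (14 - 78) - (73 - 42)
two_partitions = partitioned_reduce(sub, [data[:2], data[2:]])

# Same two partitions, merged in the opposite order: (73 - 42) - (14 - 78)
swapped_merge = partitioned_reduce(sub, [data[2:], data[:2]])

print(one_partition)   # -179
print(two_partitions)  # -95
print(swapped_merge)   # 95

# Addition is associative and commutative, so the layout does not matter.
add_one = partitioned_reduce(add, [data])
add_two = partitioned_reduce(add, [data[2:], data[:2]])
print(add_one, add_two)  # 207 207
```

The same four numbers give three different answers under subtraction, depending only on how they are split and in which order the partial results are merged, while addition gives 207 every time.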
