Given the following:
val rdd = List(1,2,3)
I assumed that rdd.reduce((x,y) => (x - y))
would return -4
(i.e. <
As @TzachZohar mentioned, the function passed to reduce must be commutative and associative for the parallel computation to be sound. Calling collect first brings the elements to the driver as a local Array, so reduce is then the ordinary Scala collection method rather than the RDD one; it no longer needs those properties
and produces the result of a sequential (non-parallel), left-to-right computation, namely,
val rdd = sc.parallelize(1 to 3)
rdd.collect.reduce((x,y) => (x-y))
Int = -4
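
To see why the partitioned case can differ, here is a minimal sketch in plain Scala (no Spark needed) that mimics a partitioned reduce by hand. The split into List(1) and List(2, 3) is an assumption made for illustration, as are the names ReduceOrderDemo, sub, partitions, partials and combined; how Spark actually partitions the data, and the order in which it merges partial results, depends on the cluster setup.

```scala
// Plain Scala sketch: compare a sequential reduce with a hand-rolled
// "partitioned" reduce using subtraction, which is neither commutative
// nor associative.
object ReduceOrderDemo {
  val sub: (Int, Int) => Int = (x, y) => x - y

  // Sequential left-to-right reduce, as on a collected Array: (1 - 2) - 3
  val sequential: Int = List(1, 2, 3).reduce(sub)

  // Hypothetical split into two partitions (an assumption for illustration).
  val partitions: List[List[Int]] = List(List(1), List(2, 3))

  // Each partition is reduced on its own: 1 and (2 - 3)
  val partials: List[Int] = partitions.map(_.reduce(sub))

  // The partial results are then combined: 1 - (-1)
  val combined: Int = partials.reduce(sub)

  def main(args: Array[String]): Unit = {
    println(s"sequential = $sequential")
    println(s"partitioned = $combined")
  }
}
```

With this particular split the sequential result is -4 and the partitioned one is 2; a different split or merge order could give yet another value, which is why the RDD version of reduce expects a commutative and associative function.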