Question
I have two DStreams. Let A: DStream[X] and B: DStream[Y]. I want to get the cartesian product of them, in other words, a new C: DStream[(X, Y)] containing all the pairs of X and Y values.
I know there is a cartesian function for RDDs. I was only able to find this similar question, but it's in Java and so does not answer my question.
Answer 1:
The Scala equivalent of the linked question's answer (ignoring Time v3, which isn't used there) is
A.transformWith(B, (rddA: RDD[X], rddB: RDD[Y]) => rddA.cartesian(rddB))
or shorter
A.transformWith(B, (_: RDD[X]).cartesian(_: RDD[Y]))
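For context, here is a minimal sketch of how the call above might be wrapped and tried out locally. The object name CartesianDStreamSketch, the helper cartesianStreams, the queue-backed test inputs, the local[2] master and the 1-second batch interval are all assumptions made for illustration, not part of the original answer. Note that transformWith pairs the RDDs of the two streams batch by batch, so the product is computed within each batch interval, not across the streams' whole history.

```scala
import scala.collection.mutable
import scala.reflect.ClassTag

import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.DStream

// Hypothetical object name, used only for this sketch.
object CartesianDStreamSketch {

  // Hypothetical helper: for each batch interval, pairs every element of
  // a's RDD with every element of b's RDD for that same interval.
  def cartesianStreams[X: ClassTag, Y: ClassTag](
      a: DStream[X],
      b: DStream[Y]): DStream[(X, Y)] =
    a.transformWith(b, (rddA: RDD[X], rddB: RDD[Y]) => rddA.cartesian(rddB))

  def main(args: Array[String]): Unit = {
    // Assumed local setup: two threads, 1-second batches.
    val conf = new SparkConf()
      .setMaster("local[2]")
      .setAppName("cartesian-dstream-sketch")
    val ssc = new StreamingContext(conf, Seconds(1))

    // Queue-backed streams stand in for real sources; each queue holds
    // a single RDD, which becomes the first batch of its stream.
    val queueA = mutable.Queue(ssc.sparkContext.parallelize(Seq(1, 2)))
    val queueB = mutable.Queue(ssc.sparkContext.parallelize(Seq("a", "b")))
    val a = ssc.queueStream(queueA)
    val b = ssc.queueStream(queueB)

    // C: DStream[(Int, String)], the per-batch cartesian product.
    val c = cartesianStreams(a, b)
    c.print()

    ssc.start()
    // Let a few batches run, then shut everything down.
    ssc.awaitTerminationOrTimeout(3000)
    ssc.stop(stopSparkContext = true, stopGracefully = false)
  }
}
```

The ClassTag context bounds are there because both RDD.cartesian and DStream.transformWith require class tags for the element types involved. Keep in mind that cartesian produces |A| x |B| pairs per batch, so it can get expensive when batches are large.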
Source: https://stackoverflow.com/questions/38433509/how-to-get-the-cartesian-product-of-two-dstream-in-spark-streaming-with-scala