How to get the cartesian product of two DStreams in Spark Streaming with Scala?

Submitted by 穿精又带淫゛_ on 2019-12-11 06:13:34

Question


I have two DStreams. Let A:DStream[X] and B:DStream[Y].

I want to get the cartesian product of them, in other words, a new C:DStream[(X, Y)] containing all the pairs of X and Y values.

I know that RDDs have a cartesian function. The only similar question I could find is in Java, so it does not answer my question.


Answer 1:


The Scala equivalent of the linked question's answer (ignoring the Time parameter v3, which isn't used there) is

A.transformWith(B, (rddA: RDD[X], rddB: RDD[Y]) => rddA.cartesian(rddB))

or shorter

A.transformWith(B, (_: RDD[X]).cartesian(_: RDD[Y]))
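
For context, here is a minimal sketch of how the call above might be exercised end to end. It assumes spark-streaming is on the classpath and runs in local mode; the object name CartesianDStreams, the queue-backed inputs, the element types Int and String, and the 1-second batch interval are all illustrative choices, not part of the original answer. Note that transformWith combines the RDDs of the same batch interval, so the product is computed per batch rather than across everything the two streams have ever received.

```scala
import scala.collection.mutable

import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical driver object, for illustration only.
object CartesianDStreams {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("CartesianDStreams")
    val ssc = new StreamingContext(conf, Seconds(1)) // assumed batch interval

    // Assumed test inputs: each queue holds one RDD, consumed as one batch.
    val queueA = mutable.Queue[RDD[Int]](ssc.sparkContext.parallelize(Seq(1, 2)))
    val queueB = mutable.Queue[RDD[String]](ssc.sparkContext.parallelize(Seq("a", "b")))

    val a = ssc.queueStream(queueA) // plays the role of A: DStream[X]
    val b = ssc.queueStream(queueB) // plays the role of B: DStream[Y]

    // Per-batch cartesian product: pairs the RDDs of the same batch interval.
    val c = a.transformWith(b, (rddA: RDD[Int], rddB: RDD[String]) => rddA.cartesian(rddB))

    // Prints the first elements of each batch to stdout.
    c.print()

    ssc.start()
    ssc.awaitTerminationOrTimeout(3000) // let a few batches run, then stop
    ssc.stop(stopSparkContext = true, stopGracefully = false)
  }
}
```

With these inputs, the first batch should contain the four (Int, String) pairs; later batches are empty because the queues are drained. Keep in mind that cartesian produces |A| × |B| elements per batch, so it can become expensive for large batches.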


Source: https://stackoverflow.com/questions/38433509/how-to-get-the-cartesian-product-of-two-dstream-in-spark-streaming-with-scala
