Merge two RDDs in Spark Scala

清歌不尽 2021-01-21 04:01

I have two RDDs.

rdd1 = (String, String)

key1, value11
key2, value12
key3, value13

rdd2 = (String, String)

key2, value2

2 Answers
  •  离开以前
    2021-01-21 04:33

    I think this may be what you are looking for:

    join(otherDataset, [numTasks])  
    

    When called on datasets of type (K, V) and (K, W), returns a dataset of (K, (V, W)) pairs with all pairs of elements for each key. Outer joins are supported through leftOuterJoin, rightOuterJoin, and fullOuterJoin.
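The question's expected output is not shown, so here is only a minimal sketch of what the two most likely candidates return on the sample data above. It assumes a local Spark session (`local[*]`); the object name `RddJoinSketch` and the helper `joins` are made up for illustration and are not part of Spark.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession

// Hypothetical helper object: the name and layout are illustrative only.
object RddJoinSketch {
  type Inner = (String, (String, String))
  type Left  = (String, (String, Option[String]))

  // Builds the two pair RDDs from the question and returns both join
  // results, collected to the driver and sorted by key for stable output.
  def joins(sc: SparkContext): (Seq[Inner], Seq[Left]) = {
    val rdd1 = sc.parallelize(Seq(("key1", "value11"), ("key2", "value12"), ("key3", "value13")))
    val rdd2 = sc.parallelize(Seq(("key2", "value2")))

    // Inner join: keeps only keys present in both RDDs.
    val inner = rdd1.join(rdd2).collect().sortBy(_._1).toSeq

    // Left outer join: keeps every key of rdd1; the rdd2 side is an Option.
    val left = rdd1.leftOuterJoin(rdd2).collect().sortBy(_._1).toSeq

    (inner, left)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: running locally; on a cluster the master is set elsewhere.
    val spark = SparkSession.builder()
      .appName("rdd-join-sketch")
      .master("local[*]")
      .getOrCreate()
    try {
      val (inner, left) = joins(spark.sparkContext)
      inner.foreach(println) // (key2,(value12,value2))
      left.foreach(println)  // (key1,(value11,None)) (key2,(value12,Some(value2))) (key3,(value13,None))
    } finally {
      spark.stop()
    }
  }
}
```

If every key from rdd1 has to survive the merge, `leftOuterJoin` is probably the closer fit, and the `Option` on the right side can be unwrapped with `mapValues` and `getOrElse`. `fullOuterJoin` works the same way but wraps both sides in `Option`.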

    See the associated section of the docs
