Join two DataFrames where the join key is different and only select some columns

别跟我提以往 2021-01-19 04:12

What I would like to do is:

Join two DataFrames A and B using their respective id columns a_id and b_id

3 answers
  •  南笙
    南笙 (original poster)
    2021-01-19 04:47

    Try this solution:

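    from pyspark.sql.functions import col
    # Alias each DataFrame so the 'A.' / 'B.' prefixes resolve; swap in your own key names (e.g. a_id, b_id).
    A_B = A.alias('A').join(B.alias('B'), col('A.id') == col('B.id')).select(
        [col('A.' + xx) for xx in A.columns] + [col('B.other1'), col('B.other2')])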
    

    The two lists passed to select do the work: the first picks every column of A, and the second picks two columns of B.

    [col('A.'+xx) for xx in A.columns] : all columns of A
    
    [col('B.other1'),col('B.other2')] : some columns of B
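
For the case in the question, where the keys have different names, the same idea can be sketched as below. This is only a rough sketch under assumptions: the key names a_id and b_id come from the question, other1 and other2 come from the answer above, and the helper names qualified_columns and join_and_select are hypothetical, made up for illustration.

```python
def qualified_columns(alias, columns):
    """Prefix each column name with a DataFrame alias, e.g. 'A.a_id'."""
    return [f"{alias}.{name}" for name in columns]


def join_and_select(A, B, b_columns, a_key="a_id", b_key="b_id"):
    """Inner-join A and B on differently named keys.

    Keeps every column of A plus only the columns of B listed in b_columns.
    a_key and b_key default to the names used in the question (an assumption).
    """
    # Imported here so qualified_columns can be used without Spark installed.
    from pyspark.sql.functions import col

    # The aliases are what make the 'A.' and 'B.' prefixes resolvable.
    a, b = A.alias("A"), B.alias("B")
    wanted = qualified_columns("A", A.columns) + qualified_columns("B", b_columns)
    condition = col(f"A.{a_key}") == col(f"B.{b_key}")
    return a.join(b, condition).select(*[col(name) for name in wanted])
```

With two existing DataFrames, a call such as join_and_select(A, B, ["other1", "other2"]) should return all of A's columns followed by other1 and other2 from B. Because b_id is not in the selected list, the join key appears only once in the result.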
    
