Join two data frames, select all columns from one and some columns from the other
Question: Let's say I have a Spark data frame df1 with several columns (among which the column 'id'), and a data frame df2 with two columns, 'id' and 'other'. Is there a way to replicate the following command

sqlContext.sql("SELECT df1.*, df2.other FROM df1 JOIN df2 ON df1.id = df2.id")

using only pyspark functions such as join(), select() and the like? I have to implement this join in a function, and I don't want to be forced to have sqlContext as a function parameter. Thanks!

Answer 1: Not sure if the
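As a minimal sketch of one way to express this join without sqlContext.sql, one can alias both frames and build the select list programmatically. The sample data, the SparkSession setup, and the column names 'name' and 'value' below are illustrative assumptions, not from the original post:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("join-example").getOrCreate()

# Hypothetical data matching the shapes described in the question:
# df1 has several columns including 'id'; df2 has 'id' and 'other'.
df1 = spark.createDataFrame([(1, "a", 10), (2, "b", 20)], ["id", "name", "value"])
df2 = spark.createDataFrame([(1, "x"), (2, "y")], ["id", "other"])

# Alias both frames so the overlapping 'id' column stays unambiguous,
# then keep every column of df1 plus only the 'other' column of df2.
result = (
    df1.alias("a")
    .join(df2.alias("b"), col("a.id") == col("b.id"))
    .select([col("a." + c) for c in df1.columns] + [col("b.other")])
)
result.show()
```

Aliasing is what makes "all of df1, one column of df2" expressible without naming every column by hand. Alternatively, joining on the column name itself, df1.join(df2, "id"), deduplicates 'id' automatically, after which .select(df1.columns + ["other"]) achieves the same result.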