PySpark: Add a new column with a tuple created from columns
Question

I have a DataFrame created as follows:

    df = spark.createDataFrame(
        [('a', 5, 'R', 'X'), ('b', 7, 'G', 'S'), ('c', 8, 'G', 'S')],
        ["Id", "V1", "V2", "V3"])

It looks like this:

    +---+---+---+---+
    | Id| V1| V2| V3|
    +---+---+---+---+
    |  a|  5|  R|  X|
    |  b|  7|  G|  S|
    |  c|  8|  G|  S|
    +---+---+---+---+

I'm looking to add a column that is a tuple consisting of V1, V2, and V3. The result should look like this:

    +---+---+---+---+-------+
    | Id| V1| V2| V3|V_tuple|
    +---+---+---+---+-------+
    |  a|  5|  R|  X|(5,R,X)|
    |  b|  7|  G|  S|(7,G,S)|
    |  c|  8|