Spark Select with a List of Columns Scala

渐次进展 2021-02-08 23:13

I am trying to find a good way of doing a Spark select with a List[Column]. I am exploding a column, then passing back all the columns I am interested in along with my exploded column.

1 Answer
  • 2021-02-08 23:20

    For Spark 2.0 it seems that you have two options. Which one applies depends on how you hold your columns (as Strings or as Columns).

    Spark code (spark-sql_2.11/org/apache/spark/sql/Dataset.scala):

    def select(cols: Column*): DataFrame = withPlan {
      Project(cols.map(_.named), logicalPlan)
    }
    
    def select(col: String, cols: String*): DataFrame = select((col +: cols).map(Column(_)) : _*)
    

    You can see that the String overload internally converts its head and tail arguments into a sequence of Columns and then calls the Column overload of select again.
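
    To make the delegation easier to follow, here is a minimal sketch in plain Scala with no Spark dependency. `Column` and `ToySelect` are hypothetical stand-ins that only mirror the shape of the two overloads; they are not Spark's classes, and the toy `select` just returns the column names.

    ```scala
    // Toy stand-in for a column; NOT org.apache.spark.sql.Column.
    final case class Column(name: String)

    object ToySelect {
      // Varargs overload: inside the method, `cols` is a Seq[Column].
      def select(cols: Column*): Seq[String] = cols.map(_.name)

      // String overload: prepend the head to the tail, wrap each name
      // in a Column, then expand the Seq back into varargs with `: _*`.
      def select(col: String, cols: String*): Seq[String] =
        select((col +: cols).map(Column(_)): _*)
    }
    ```

    With this toy model, `ToySelect.select("a", "b")` and `ToySelect.select(List("a", "b").map(Column(_)): _*)` go through the same varargs overload, which is the pattern the recommendations below rely on.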

    So, if you want clear code, I would recommend:

    If columns: List[String]:

    import org.apache.spark.sql.functions.col
    df.select(columns.map(col): _*)
    

    Otherwise, if columns: List[Column]:

    df.select(columns: _*)
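
    For the case in the question (keeping some columns and adding an exploded one), the same `: _*` expansion can be applied after appending the exploded column to the list. This is only a sketch: it assumes spark-sql is on the classpath, and the helper name `selectWithExploded`, the sample data, and the column names `id`, `name`, `values`, and `value` are all made up for illustration.

    ```scala
    import org.apache.spark.sql.{Column, DataFrame, SparkSession}
    import org.apache.spark.sql.functions.{col, explode}

    object ExplodeSelect {
      // Hypothetical helper: select the `keep` columns plus one exploded array column.
      def selectWithExploded(df: DataFrame, keep: List[Column],
                             arrayCol: String, alias: String): DataFrame =
        df.select((keep :+ explode(col(arrayCol)).as(alias)): _*)

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[1]")
          .appName("explode-select")
          .getOrCreate()
        import spark.implicits._

        // Made-up sample data: an id, a name, and an array column to explode.
        val df = Seq((1, "a", Seq(10, 20)), (2, "b", Seq(30))).toDF("id", "name", "values")

        // Columns to carry through unchanged.
        val keep: List[Column] = List(col("id"), col("name"))

        selectWithExploded(df, keep, "values", "value").show()
        spark.stop()
      }
    }
    ```

    The parentheses around `keep :+ ...` are not strictly required, but they make it obvious that the whole list, including the exploded column, is what gets expanded into varargs.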
    