Spark Scala: remove columns containing only null values

南笙 2021-01-12 13:47

Is there a way to remove the columns of a Spark DataFrame that contain only null values? (I am using Scala and Spark 1.6.2.)

At the moment I am doing this:



        
3 answers
  •  走了就别回头了
    2021-01-12 14:48

    If the DataFrame is of a reasonable size, I write it out as JSON and then reload it. The schema inferred on reload leaves out the columns that were entirely null, so you end up with a lighter DataFrame.

    Scala snippet:

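    // Write the DataFrame out as JSON, then read it back with an inferred schema.
    originalDataFrame.write.json(tempJsonPath)
    // On Spark 1.6 use sqlContext.read.json; `spark` (SparkSession) exists from 2.0.
    val lightDataFrame = spark.read.json(tempJsonPath)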
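
If writing the data to disk is not an option, or the original column types need to be preserved (a JSON round trip re-infers them), a different technique is to count the non-null values in each column and drop the columns whose count is zero. The following is a minimal sketch of that idea, not part of the answer above. It assumes column names contain no dots or backticks, and `dropAllNullColumns` is a hypothetical helper name, not a Spark API.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, count}

// Hypothetical helper: returns df without the columns that are null in every row.
// Assumption: column names are plain identifiers (no dots or backticks).
def dropAllNullColumns(df: DataFrame): DataFrame = {
  if (df.columns.isEmpty) {
    df
  } else {
    // count(column) counts only non-null values, so 0 means the column is all null.
    // All counts are computed in a single aggregation pass over the data.
    val nonNullCounts = df
      .select(df.columns.map(c => count(col(c)).alias(c)): _*)
      .first()

    // Look the counts up by position to avoid any name resolution issues.
    val allNullColumns = df.columns.zipWithIndex.collect {
      case (name, i) if nonNullCounts.getLong(i) == 0L => name
    }

    // drop(colName) takes a single name in Spark 1.6, so fold over the names.
    allNullColumns.foldLeft(df)((acc, name) => acc.drop(name))
  }
}
```

Usage would look like `val lightDataFrame = dropAllNullColumns(originalDataFrame)`. Note that a DataFrame with zero rows has a non-null count of zero for every column, so this sketch would drop all of its columns.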
    
