Drop a column if all entries in a Spark DataFrame's specific column are null

轮回少年 2021-01-13 19:11

Using PySpark, how can I select/keep all columns of a DataFrame that contain at least one non-null value; or, equivalently, remove all columns that contain no data?

8 answers
  •  礼貌的吻别
    2021-01-13 19:35

    Or just loop over the columns and drop each one that has no non-null values:

    from pyspark.sql.functions import col

    # Drop every column whose values are all null.
    # Note: this triggers one Spark count job per column.
    for c in df.columns:
        if df.filter(col(c).isNotNull()).count() == 0:
            df = df.drop(c)
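
    A single-pass alternative, since the loop above runs one count job per column: aggregate the non-null counts of all columns in a single job, then select only the columns whose count is above zero. This is a minimal sketch assuming the same df as above; the names non_null_counts and cols_to_keep are illustrative.

    from pyspark.sql import functions as F

    # One aggregation job: non-null count of every column at once
    # (F.count over a column counts only non-null values).
    non_null_counts = df.agg(
        *[F.count(F.col(c)).alias(c) for c in df.columns]
    ).first().asDict()

    # Keep only the columns that contain at least one value.
    cols_to_keep = [c for c, n in non_null_counts.items() if n > 0]
    df = df.select(*cols_to_keep)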
    
