Drop a column if all entries in a Spark DataFrame's specific column are null

轮回少年 2021-01-13 19:11

Using PySpark, how can I select/keep all columns of a DataFrame that contain at least one non-null value; or, equivalently, remove all columns that contain no data?

8 answers
  •  礼貌的吻别
    2021-01-13 19:35

    Or just loop over the columns and drop each one that has no non-null values:

    from pyspark.sql.functions import col

    # Drop every column whose values are all null.
    # Note: this triggers one Spark count job per column.
    for c in df.columns:
        if df.filter(col(c).isNotNull()).count() == 0:
            df = df.drop(c)
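
    A single-pass alternative, since the loop above runs one count job per column: aggregate the non-null counts of all columns in a single job, then select only the columns whose count is above zero. This is a minimal sketch assuming the same df as above; the names non_null_counts and cols_to_keep are illustrative.

    from pyspark.sql import functions as F

    # One aggregation job: non-null count of every column at once
    # (F.count over a column counts only non-null values).
    non_null_counts = df.agg(
        *[F.count(F.col(c)).alias(c) for c in df.columns]
    ).first().asDict()

    # Keep only the columns that contain at least one value.
    cols_to_keep = [c for c, n in non_null_counts.items() if n > 0]
    df = df.select(*cols_to_keep)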
    
