Using PySpark, how can I select/keep all columns of a DataFrame that contain at least one non-null value, or equivalently, remove all columns that contain no data?
For me it worked in a slightly different way than @Suresh's answer:
import pyspark.sql.functions as func

# Keep only the columns that have at least one non-null value.
nonNull_cols = [
    c for c in original_df.columns
    if original_df.filter(func.col(c).isNotNull()).count() > 0
]
new_df = original_df.select(*nonNull_cols)
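Note that the list comprehension above runs one filter/count job per column, which can get slow on wide DataFrames. A minimal sketch of a single-pass alternative (assuming the same original_df; func.count already skips nulls, so one aggregation gives the non-null count for every column at once):

from pyspark.sql import functions as func

# Count non-null values for every column in a single pass over the data.
counts = original_df.agg(
    *[func.count(func.col(c)).alias(c) for c in original_df.columns]
).collect()[0].asDict()

# Keep only the columns whose non-null count is positive.
nonNull_cols = [c for c, n in counts.items() if n > 0]
new_df = original_df.select(*nonNull_cols)

Both versions produce the same result; the aggregation variant just trades per-column jobs for a single job over all columns.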