Using PySpark, how can I select/keep all columns of a DataFrame that contain a non-null value, or equivalently, remove all columns that contain no data?
Or just:

```python
from pyspark.sql.functions import col

# Drop every column whose values are all null.
# Note: this runs one filter/count job per column.
for c in df.columns:
    if df.filter(col(c).isNotNull()).count() == 0:
        df = df.drop(c)
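```

For wide DataFrames, the loop above can be slow because each column triggers a separate Spark job. A single aggregation pass may be faster; here is a minimal sketch (assuming the same `df` variable) using `pyspark.sql.functions.count`, which skips nulls:

```python
from pyspark.sql import functions as F

# Count the non-null values of every column in one aggregation.
non_null_counts = df.agg(
    *[F.count(F.col(c)).alias(c) for c in df.columns]
).collect()[0].asDict()

# Drop the columns whose non-null count is zero.
df = df.drop(*[c for c, n in non_null_counts.items() if n == 0])
```

Both versions leave `df` with only the columns that contain at least one non-null value.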