I am trying to get the rows with null values from a PySpark dataframe. In pandas, I can achieve this by calling isnull() on the dataframe and filtering on the result:

df = df[df.isnull().any(axis=1)]

How can I do the same in PySpark?
This is how you can do it in Scala:
import org.apache.spark.sql.functions._
import spark.implicits._ // needed for .toDF() on a Seq; assumes a SparkSession named spark is in scope

case class Test(id: Int, weight: Option[Int], age: Int, gender: Option[String])
val df1 = Seq(Test(1, Some(100), 23, Some("Male")), Test(2, None, 25, None), Test(3, None, 33, Some("Female"))).toDF()

// Build one isNull predicate per column and OR them together to keep rows with any null
display(df1.filter(df1.columns.map(c => col(c).isNull).reduce((a, b) => a || b)))
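Since the question is about PySpark, here is a minimal sketch of the same idea in Python. It assumes a SparkSession named spark is already available, and the sample data simply mirrors the Scala case class above; the approach is identical: build one isNull() condition per column and OR them together with reduce.

from functools import reduce
from pyspark.sql import functions as F

# Hypothetical sample data mirroring the Scala example above
df = spark.createDataFrame(
    [(1, 100, 23, "Male"), (2, None, 25, None), (3, None, 33, "Female")],
    ["id", "weight", "age", "gender"],
)

# Keep rows where at least one column is null
rows_with_nulls = df.filter(
    reduce(lambda a, b: a | b, [F.col(c).isNull() for c in df.columns])
)
rows_with_nulls.show()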