I'm new to Spark. I have a dataframe df:
+-------+-------+---+
|Column1|Column2|Sub|
+-------+-------+---+
...
You can use the when function:
import org.apache.spark.sql.functions._
df.withColumn("Sub",
  when(col("Column1").isNull, lit(0)).otherwise(col("Column1")) -
    when(col("Column2").isNull, lit(0)).otherwise(col("Column2")))
You should get the final result:
+-------+-------+----+
|Column1|Column2| Sub|
+-------+-------+----+
| 1| 2|-1.0|
| 4| null| 4.0|
| 5| null| 5.0|
| 6| 8|-2.0|
+-------+-------+----+
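For completeness, here is a minimal self-contained sketch of this approach that you can run locally; the SparkSession setup and the sample values (modeled on the tables above, using nullable integer columns) are only for illustration:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

// Local session just for trying this out
val spark = SparkSession.builder().master("local[*]").appName("sub-example").getOrCreate()
import spark.implicits._

// Sample data; None becomes a null in the DataFrame
val df = Seq[(Option[Int], Option[Int])](
  (Some(1), Some(2)),
  (Some(4), None),
  (Some(5), None),
  (Some(6), Some(8))
).toDF("Column1", "Column2")

// A null on either side is treated as 0 before subtracting
df.withColumn("Sub",
  when(col("Column1").isNull, lit(0)).otherwise(col("Column1")) -
    when(col("Column2").isNull, lit(0)).otherwise(col("Column2")))
  .show()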
You can coalesce nulls to zero on both columns and then do the subtraction:
import org.apache.spark.sql.functions._
import spark.implicits._

// None becomes a null in the resulting DataFrame
val df = Seq[(Option[Int], Option[Int])](
  (Some(1), Some(2)),
  (Some(4), None),
  (Some(5), None),
  (Some(6), Some(8))
).toDF("A", "B")
df.withColumn("Sub", abs(coalesce($"A", lit(0)) - coalesce($"B", lit(0)))).show
+---+----+---+
| A| B|Sub|
+---+----+---+
| 1| 2| 1|
| 4|null| 4|
| 5|null| 5|
| 6| 8| 2|
+---+----+---+
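As a side note, you could get the same Sub values by filling the nulls up front with na.fill instead of coalesce. A small sketch of that variant, reusing the df defined above (assumes org.apache.spark.sql.functions._ and spark.implicits._ are in scope):

// Replace nulls with 0 first, then subtract.
// Unlike the coalesce version, this also replaces the nulls shown in A and B in the output.
df.na.fill(0, Seq("A", "B"))
  .withColumn("Sub", abs($"A" - $"B"))
  .show

Whether that trade-off matters depends on whether you need to keep the original nulls visible in A and B; if you do, stick with coalesce, which only affects the Sub column.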