The issues:
1) Spark doesn't call a UDF if its input is a column of primitive type that contains null:
inputDF.show()
+-----+
| x |
+-----+
I would also use Artur's solution, but there is another way that avoids Java's wrapper classes: wrap the column in a struct, so that the UDF receives a Row and can check for null itself:
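import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.{struct, udf}
// assumes spark.implicits._ is in scope for the $"x" syntax

inputDF
  .withColumn("y",
    // the UDF takes a Row (not a primitive), so it is called even when x is null
    udf { (r: Row) =>
      if (r.isNullAt(0)) Some(1) else None
    }.apply(struct($"x"))
  )
  .show()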
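
For comparison, the wrapper-class approach mentioned above might look roughly like the sketch below. This is not necessarily Artur's exact code. It only illustrates the general idea of declaring the UDF parameter as a boxed `java.lang.Integer`, which can hold null, instead of a primitive `Int`. The object name `BoxedUdfSketch`, the helper `flagIfNull`, and the sample data are made up for illustration, and a local Spark session is assumed.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

object BoxedUdfSketch {
  // Hypothetical helper: returns Some(1) when the boxed input is null, else None.
  def flagIfNull(x: java.lang.Integer): Option[Int] =
    if (x == null) Some(1) else None

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is good enough for trying this out.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("boxed-udf-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up input: a nullable integer column named "x".
    val inputDF = Seq[java.lang.Integer](1, null).toDF("x")

    // The parameter type is the boxed java.lang.Integer, so a null value
    // is passed to the function instead of the call being skipped.
    val flagNull = udf(flagIfNull _)

    inputDF.withColumn("y", flagNull($"x")).show()

    spark.stop()
  }
}
```

The struct version avoids the Java types in the signature, at the cost of accessing the value by position through the Row.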