I have the following Apache Spark UDF in Scala (the parameter type of context is reconstructed from the examples below; the body is omitted):

val myFunc = udf {
  (userBias: Float, otherBiases: Map[Long, Float],
   userFactors: Seq[Float], context: Seq[Seq[String]]) =>
    ??? // body omitted
}

How can I pass a constant value for context when calling it?
Spark 2.2+
You can use the typedLit function:
import org.apache.spark.sql.functions.typedLit
myFunc(..., typedLit(context))
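For example, a minimal sketch of the full Spark 2.2+ call, assuming context is a Seq[Seq[String]] and reusing the column expressions from the example below (both assumptions, not part of the original snippet):

import org.apache.spark.sql.functions.typedLit

// Hypothetical constant; typedLit supports Seq, Map, and case classes,
// producing a single Column with the matching Catalyst type.
val context: Seq[Seq[String]] = Seq(Seq("a", "b"), Seq("c"))

myFunc(
  userBias("bias"),
  otherBias("biases"),
  userFactors("features"),
  typedLit(context) // no manual array/lit wrapping needed
)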
Spark < 2.2
Any argument that is passed directly to the UDF has to be a Column, so if you want to pass a constant array you'll have to convert it to a column literal:
import org.apache.spark.sql.functions.{array, lit}
val myFunc: org.apache.spark.sql.UserDefinedFunction = ???

myFunc(
  userBias("bias"),
  otherBias("biases"),
  userFactors("features"),
  // Builds an org.apache.spark.sql.Column: an array of arrays of string literals
  array(context.map(xs => array(xs.map(lit _): _*)): _*)
)
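To see what that expression builds, here is the equivalent hand-written form for a hypothetical two-element context, Array(Seq("a", "b"), Seq("c")):

// Each inner Seq becomes array(lit(...), ...), and the outer array(...)
// combines them into a single array<array<string>> Column:
array(
  array(lit("a"), lit("b")),
  array(lit("c"))
)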
Non-Column objects can be passed only indirectly, using a closure, for example like this:
def myFunc(context: Array[Seq[String]]) = udf {
  (userBias: Float, otherBiases: Map[Long, Float], userFactors: Seq[Float]) =>
    ??? // context is captured by the closure and is available here
}
myFunc(context)(userBias("bias"), otherBias("biases"), userFactors("features"))
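A minimal end-to-end sketch of the closure approach, with a hypothetical DataFrame and a placeholder UDF body standing in for the real logic:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// Hypothetical rows matching the UDF's parameter types.
val df = Seq(
  (1.0f, Map(1L -> 0.5f), Seq(0.1f, 0.2f))
).toDF("bias", "biases", "features")

// The constant is baked into the UDF at definition time via the closure.
def myFunc(context: Array[Seq[String]]) = udf {
  (userBias: Float, otherBiases: Map[Long, Float], userFactors: Seq[Float]) =>
    userBias + userFactors.sum + context.length // placeholder body
}

val context = Array(Seq("a", "b"), Seq("c"))
df.withColumn("score", myFunc(context)($"bias", $"biases", $"features")).show()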