I have previously registered a UDF with Hive. It is permanent, not TEMPORARY. It works in beeline.
CREATE FUNCTION normaliseURL AS 'com.example.hive
It will work in a Spark on YARN environment; however, as suggested, you need to use spark-shell --jars <path-to-your-hive-udf>.jar, with the JAR on the local filesystem rather than in HDFS.
The issue is that Spark 2.0 is not able to execute functions whose JARs are located on HDFS.
Spark SQL: Thriftserver unable to run a registered Hive UDTF
One workaround is to define the function as a temporary function in the Spark job, with the JAR path pointing to a local path on the edge node, and then call the function in the same Spark job.
CREATE TEMPORARY FUNCTION functionName as 'com.test.HiveUDF' USING JAR '/user/home/dir1/functions.jar'
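
As a rough sketch of that workaround from PySpark: the helper names below (build_temp_function_sql, register_and_query) are hypothetical and not part of any Spark API. The only Spark call used is spark.sql, on a SparkSession that is assumed to have been created with Hive support enabled. The check for an hdfs: prefix is just an illustration of the limitation described above, not something Spark itself enforces this way.

```python
def build_temp_function_sql(function_name, class_name, local_jar_path):
    """Return a CREATE TEMPORARY FUNCTION statement for a JAR on the local filesystem."""
    # Illustrative guard: per the discussion above, the JAR should be on the
    # local (edge-node) filesystem, not on HDFS.
    if local_jar_path.startswith("hdfs:"):
        raise ValueError("JAR path must be local to the edge node, not on HDFS")
    return (
        f"CREATE TEMPORARY FUNCTION {function_name} "
        f"AS '{class_name}' USING JAR '{local_jar_path}'"
    )


def register_and_query(spark, function_name, class_name, local_jar_path, query):
    """Register the temporary function on `spark`, then run `query` in the same session."""
    # A temporary function is only visible in the session that created it,
    # so registration and use happen on the same SparkSession.
    spark.sql(build_temp_function_sql(function_name, class_name, local_jar_path))
    return spark.sql(query)
```

In a job this might be called as register_and_query(spark, "functionName", "com.test.HiveUDF", "/user/home/dir1/functions.jar", query), where spark comes from SparkSession.builder.enableHiveSupport().getOrCreate() and query is whatever SELECT uses the function.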