I am having trouble using a UDF on a column of Vectors in PySpark, as illustrated here:
from pyspark import S
In Spark SQL, a vector is represented as a (type, size, indices, values) tuple. You can use a UDF on vector columns in PySpark; you just need to adapt your code to work with the fields of that tuple.
from pyspark.sql.functions import udf
from pyspark.sql.types import DoubleType

# vector[3] is the values field of the serialized vector struct
vector_udf = udf(lambda vector: sum(vector[3]), DoubleType())
df.withColumn('feature_sums', vector_udf(df.features)).first()
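To see why indexing with `[3]` reaches the values, here is a plain-Python sketch of that serialized layout (field order taken from `VectorUDT` in the `Vectors.scala` source linked below; the tuples themselves are illustrative, and no Spark installation is needed to follow it):

```python
# Sketch of the (type, size, indices, values) struct Spark SQL uses for a Vector:
#   type    0 = sparse, 1 = dense
#   size    length of the vector (used for sparse)
#   indices positions of the stored entries (used for sparse)
#   values  the stored numeric values
sparse_vec = (0, 4, [1, 3], [2.0, 5.0])       # 4 slots, non-zeros at positions 1 and 3
dense_vec = (1, None, None, [1.0, 2.0, 3.0])  # dense: every value stored

# Index 3 is the values field, so sum(vector[3]) totals the stored entries
print(sum(sparse_vec[3]))  # 7.0
print(sum(dense_vec[3]))   # 6.0
```

This is why the lambda in the UDF above sums `vector[3]` rather than the vector object itself.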
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala