I have some PySpark 1.5 code that I unfortunately have to back-port to Spark 1.3. I have a column whose elements are alphanumeric, but I only want the digits. An e
As long as you use HiveContext, you can call the corresponding Hive UDF, either with selectExpr:
df.selectExpr("regexp_extract(old_col,'([0-9]+)', 1)")
or with plain SQL:
df.registerTempTable("df")
sqlContext.sql("SELECT regexp_extract(old_col, '([0-9]+)', 1) FROM df")
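The pattern '([0-9]+)' simply captures the first run of digits, and the third argument selects capture group 1. Hive's regexp_extract uses Java regex semantics, but for this pattern Python's re behaves the same, so you can sanity-check the extraction outside Spark with a small sketch (extract_digits is a hypothetical helper, not part of any Spark API):

```python
import re

def extract_digits(value, pattern=r"([0-9]+)", group=1):
    """Mimic regexp_extract(value, pattern, group): return the text
    matched by the given capture group, or '' when nothing matches."""
    m = re.search(pattern, value)
    return m.group(group) if m else ""

print(extract_digits("abc123def"))  # -> 123
print(extract_digits("no digits here"))  # -> (empty string)
```

Note that returning an empty string on no match mirrors what you should expect from regexp_extract when a row's value contains no digits at all.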