Original DataFrame:
0.2 0.3
+----+-------+
|name|country|
+----+-------+
|Raju|UAS    |
|Ram |Pak.   |
|nul
Using StringUtils from Apache Commons Lang:
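import org.apache.commons.lang3.StringUtils
import org.apache.spark.sql.expressions.UserDefinedFunction
import org.apache.spark.sql.functions.{col, udf}
import org.apache.spark.sql.types.DoubleType

// Weight for the name column: 0.0 when null/empty/whitespace, otherwise 0.2
val transcodificationName: UserDefinedFunction =
  udf { (name: String) =>
    if (StringUtils.isBlank(name)) 0.0
    else 0.2
  }

// Weight for the country column: 0.0 when null/empty/whitespace, otherwise 0.3
val transcodificationCountry: UserDefinedFunction =
  udf { (country: String) =>
    if (StringUtils.isBlank(country)) 0.0
    else 0.3
  }

// cast must be applied to the Column returned by the UDF, not to the DataFrame
dataframe
  .withColumn("Nwet", transcodificationName(col("name")).cast(DoubleType))
  .withColumn("wetCon", transcodificationCountry(col("country")).cast(DoubleType))
  .select("Nwet", "wetCon")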
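Edit: a single UDF that takes the column name as a second argument:
import org.apache.spark.sql.functions.lit

val transcodificationColumns: UserDefinedFunction =
  udf { (input: String, columnName: String) =>
    // Check the value itself, not the column name
    if (StringUtils.isBlank(input)) 0.0
    else if (columnName == "name") 0.2
    else if (columnName == "country") 0.3
    else 0.0
  }

// A UDF only accepts Columns, so the column name is passed as a literal with lit(...)
dataframe
  .withColumn("Nwet", transcodificationColumns(col("name"), lit("name")).cast(DoubleType))
  .withColumn("wetCon", transcodificationColumns(col("country"), lit("country")).cast(DoubleType))
  .select("Nwet", "wetCon")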
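
The branching inside the combined UDF can also be checked without a Spark session. Below is a minimal sketch, not part of the original answer: the `BlankWeights` object, its `weights` map and `weightFor` are hypothetical names, and `isBlank` is a standard-library imitation of `StringUtils.isBlank` so the snippet has no dependencies.

```scala
// Hypothetical helper for illustration only; names are not from the original answer.
object BlankWeights {
  // Imitates StringUtils.isBlank: true for null, empty, or whitespace-only strings
  def isBlank(s: String): Boolean =
    s == null || s.forall(c => Character.isWhitespace(c))

  // Per-column weights taken from the UDFs above: name -> 0.2, country -> 0.3
  val weights: Map[String, Double] = Map("name" -> 0.2, "country" -> 0.3)

  // 0.0 for blank input or an unknown column, otherwise the column's weight
  def weightFor(input: String, columnName: String): Double =
    if (isBlank(input)) 0.0
    else weights.getOrElse(columnName, 0.0)
}
```

Assuming the imports shown earlier, it could then be wrapped as `udf((in: String, c: String) => BlankWeights.weightFor(in, c))` and called with `lit("name")` or `lit("country")` as the second argument. Keeping the weights in a map means a new column only needs a new map entry rather than another `else if` branch.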