Spark DataFrame row- and column-level operations using Scala

野的像风 2021-01-29 14:43

Original DataFrame (desired weights: 0.2 for name, 0.3 for country):

    +------+---------+
    |  name| country |
    +------+---------+
    |Raju  |UAS      |
    |Ram   |Pak.     |
    |null  |         |


        
1 Answer
  • 2021-01-29 15:21

    Using Apache Commons' StringUtils:

    import org.apache.commons.lang3.StringUtils
    import org.apache.spark.sql.expressions.UserDefinedFunction
    import org.apache.spark.sql.functions.{col, udf}

    // Map a blank/null name to 0.0, anything else to 0.2.
    val transcodificationName: UserDefinedFunction =
        udf { (name: String) =>
            if (StringUtils.isBlank(name)) 0.0
            else 0.2
        }

    // Map a blank/null country to 0.0, anything else to 0.3.
    val transcodificationCountry: UserDefinedFunction =
        udf { (country: String) =>
            if (StringUtils.isBlank(country)) 0.0
            else 0.3
        }

    // The UDFs already return Double, so no extra cast is needed;
    // note the column name must be "wetCon" consistently.
    dataframe
        .withColumn("Nwet", transcodificationName(col("name")))
        .withColumn("wetCon", transcodificationCountry(col("country")))
        .select("Nwet", "wetCon")
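
    The same mapping can also be expressed without a UDF, using Spark's built-in `when`/`otherwise` and `trim` functions, which Catalyst can optimize better than an opaque UDF. This is a sketch assuming the same `dataframe` and column names as above:

    ```scala
    import org.apache.spark.sql.functions.{col, trim, when}

    // Blank or null name -> 0.0, otherwise 0.2; same idea for country with 0.3.
    val result = dataframe
        .withColumn("Nwet",
            when(col("name").isNull || trim(col("name")) === "", 0.0).otherwise(0.2))
        .withColumn("wetCon",
            when(col("country").isNull || trim(col("country")) === "", 0.0).otherwise(0.3))
        .select("Nwet", "wetCon")
    ```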
    

    edit:

    import org.apache.spark.sql.functions.lit

    // A single UDF for both columns: the second argument selects the weight.
    // Note the blank check must use `input` (the value), not the column name,
    // and literal strings must be wrapped in lit() when passed to a UDF.
    val transcodificationColumns: UserDefinedFunction =
            udf { (input: String, columnName: String) =>
                if (StringUtils.isBlank(input)) 0.0
                else if (columnName.equals("name")) 0.2
                else if (columnName.equals("country")) 0.3
                else 0.0
            }

        dataframe
            .withColumn("Nwet", transcodificationColumns(col("name"), lit("name")))
            .withColumn("wetCon", transcodificationColumns(col("country"), lit("country")))
            .select("Nwet", "wetCon")
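
    For context, a minimal end-to-end sketch; the sample rows and column names are assumed from the question's table, and `StringUtils.isBlank` handles the null name safely:

    ```scala
    import org.apache.commons.lang3.StringUtils
    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{col, udf}

    val spark = SparkSession.builder.master("local[*]").appName("weights").getOrCreate()
    import spark.implicits._

    // Sample data mirroring the question's DataFrame (last row has a null name).
    val dataframe = Seq(("Raju", "UAS"), ("Ram", "Pak."), (null, "")).toDF("name", "country")

    val transcodificationName =
        udf { (name: String) => if (StringUtils.isBlank(name)) 0.0 else 0.2 }
    val transcodificationCountry =
        udf { (country: String) => if (StringUtils.isBlank(country)) 0.0 else 0.3 }

    dataframe
        .withColumn("Nwet", transcodificationName(col("name")))
        .withColumn("wetCon", transcodificationCountry(col("country")))
        .show()
    ```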
    