Copy the current row, modify it, and add it as a new row in Spark

面向向阳花 2021-01-17 06:23

I am using Spark SQL 2.4.1 with Java 8. I have a scenario where I need to copy the current row and create another row from it with a few column values modified. How can this be achieved?

2 answers
  • 2021-01-17 06:30

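    You can use this approach for your scenario: select the rows to copy with filter, change the column with withColumn, and union the result back onto the original DataFrame.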

    df.union(df.filter($"code"==="rate").withColumn("code",concat(lit("new_"), $"code"))).show()
    /*
    +---+--------+------+------+------+
    | id|    code|entity|value1|value2|
    +---+--------+------+------+------+
    | 20|   score|school|    14|    12|
    | 21|   score|school|    13|    13|
    | 22|    rate|school|    11|    14|
    | 22|new_rate|school|    11|    14|
    +---+--------+------+------+------+
    */
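
    The one-liner above can be expanded into a self-contained sketch. It assumes a local SparkSession and the three sample rows from the output above. The object name CopyRowWithUnion, the helper withCopies, and the second modification (setting value1 to 0) are hypothetical and only illustrate changing more than one column. It uses col(...) instead of $"..." so the helper does not depend on import spark.implicits._. The same Dataset methods (filter, withColumn, union) and the static functions in org.apache.spark.sql.functions are also callable from Java.

    ```scala
    import org.apache.spark.sql.{DataFrame, SparkSession}
    import org.apache.spark.sql.functions.{col, concat, lit}

    object CopyRowWithUnion {
      // Returns the input rows plus one modified copy of every row whose code is "rate".
      def withCopies(df: DataFrame): DataFrame = {
        val copies = df
          .filter(col("code") === "rate")
          .withColumn("code", concat(lit("new_"), col("code")))
          .withColumn("value1", lit(0)) // hypothetical second modification

        // union matches columns by position; withColumn on an existing
        // column keeps its position, so the two schemas line up.
        df.union(copies)
      }

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[*]")
          .appName("copy-row-union")
          .getOrCreate()
        import spark.implicits._

        val df = Seq(
          (20, "score", "school", 14, 12),
          (21, "score", "school", 13, 13),
          (22, "rate", "school", 11, 14)
        ).toDF("id", "code", "entity", "value1", "value2")

        withCopies(df).show()
        spark.stop()
      }
    }
    ```

    Because union resolves columns by position rather than by name, any extra column added to the copies would also have to be added to the original DataFrame before the union.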
    
  • 2021-01-17 06:33

    Use when to check whether code === "rate". If it matches, replace the column value with array(lit("rate"), lit("new_rate")); otherwise use array($"code"). Then explode the code column so that each array element becomes its own row.

    See the code below.

    scala> df.show(false)
    +---+-----+------+------+------+
    |id |code |entity|value1|value2|
    +---+-----+------+------+------+
    |20 |score|school|14    |12    |
    |21 |score|school|13    |13    |
    |22 |rate |school|11    |14    |
    +---+-----+------+------+------+
    
    val colExpr = explode(
        when(
            $"code" === "rate",
            array(
                lit("rate"),
                lit("new_rate")
            )
        )
        .otherwise(array($"code"))
    )
    
    scala> df.withColumn("code",colExpr).show(false)
    +---+--------+------+------+------+
    |id |code    |entity|value1|value2|
    +---+--------+------+------+------+
    |20 |score   |school|14    |12    |
    |21 |score   |school|13    |13    |
    |22 |rate    |school|11    |14    |
    |22 |new_rate|school|11    |14    |
    +---+--------+------+------+------+
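
    The explode above varies only the code column. If the copy has to differ in several columns, one possible extension is to explode an array of structs and unpack the fields afterwards. This is a sketch under assumptions, not part of the answer above: the object name CopyRowWithExplode, the helper withCopies, the temporary column variant, and the value1 reset to 0 are all hypothetical.

    ```scala
    import org.apache.spark.sql.{DataFrame, SparkSession}
    import org.apache.spark.sql.functions.{array, col, concat, explode, lit, struct, when}

    object CopyRowWithExplode {
      // Returns the input rows plus one modified copy of every row whose code is "rate".
      def withCopies(df: DataFrame): DataFrame = {
        // Both structs carry the same field names and types so they fit in one array.
        val original = struct(col("code").as("code"), col("value1").as("value1"))
        val modified = struct(
          concat(lit("new_"), col("code")).as("code"),
          lit(0).as("value1") // hypothetical second modification
        )

        // Two variants for "rate" rows, a single variant for everything else.
        val variants = when(col("code") === "rate", array(original, modified))
          .otherwise(array(original))

        df.withColumn("variant", explode(variants))
          .withColumn("code", col("variant.code"))
          .withColumn("value1", col("variant.value1"))
          .drop("variant")
      }

      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .master("local[*]")
          .appName("copy-row-explode")
          .getOrCreate()
        import spark.implicits._

        val df = Seq(
          (20, "score", "school", 14, 12),
          (21, "score", "school", 13, 13),
          (22, "rate", "school", 11, 14)
        ).toDF("id", "code", "entity", "value1", "value2")

        withCopies(df).show(false)
        spark.stop()
      }
    }
    ```

    Unlike a union, this reads the source DataFrame only once, and the copy is emitted right next to its original row.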
    