I am using spark-sql 2.4.1 with Java 8. I have a scenario where I need to copy the current row and create another row with a few columns modified. How can this be achieved?
You can use a union for this: filter the rows you want to copy, modify the column in the copies, and append them to the original DataFrame.
df.union(df.filter($"code"==="rate").withColumn("code",concat(lit("new_"), $"code"))).show()
/*
+---+--------+------+------+------+
| id|    code|entity|value1|value2|
+---+--------+------+------+------+
| 20|   score|school|    14|    12|
| 21|   score|school|    13|    13|
| 22|    rate|school|    11|    14|
| 22|new_rate|school|    11|    14|
+---+--------+------+------+------+
*/
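The question asks about changing a few columns in the copied row, not only one. Below is a minimal sketch of the same union approach extended to two columns. It assumes a local SparkSession, and resetting value1 to 0 in the copy is a purely hypothetical rule chosen for illustration; it does not come from the question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, concat, lit}

object CopyRowSketch {
  def main(args: Array[String]): Unit = {
    // Local session for experimentation only; adjust master/appName as needed.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("copy-row-sketch")
      .getOrCreate()
    import spark.implicits._

    // Sample data matching the tables shown in this thread.
    val df = Seq(
      (20, "score", "school", 14, 12),
      (21, "score", "school", 13, 13),
      (22, "rate", "school", 11, 14)
    ).toDF("id", "code", "entity", "value1", "value2")

    // Select the rows to duplicate, then overwrite columns in place.
    val copies = df
      .filter(col("code") === "rate")
      .withColumn("code", concat(lit("new_"), col("code")))
      .withColumn("value1", lit(0)) // hypothetical rule, not from the question

    // union matches columns by position, so the copies keep df's column order.
    df.union(copies).show()

    spark.stop()
  }
}
```

Because union pairs columns by position, overwrite existing columns with withColumn rather than adding or reordering them. If the column order of the two sides might differ, unionByName (available since Spark 2.3) matches columns by name instead.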
Alternatively, use when to check whether code === "rate". If it matches, replace the column value with array(lit("rate"), lit("new_rate")); otherwise wrap the existing value as array($"code"). Then explode the code column, which produces one output row per array element.
Check the code below.
scala> df.show(false)
+---+-----+------+------+------+
|id |code |entity|value1|value2|
+---+-----+------+------+------+
|20 |score|school|14    |12    |
|21 |score|school|13    |13    |
|22 |rate |school|11    |14    |
+---+-----+------+------+------+
val colExpr = explode(
  when(
    $"code" === "rate",
    array(
      lit("rate"),
      lit("new_rate")
    )
  )
  .otherwise(array($"code"))
)
scala> df.withColumn("code",colExpr).show(false)
+---+--------+------+------+------+
|id |code    |entity|value1|value2|
+---+--------+------+------+------+
|20 |score   |school|14    |12    |
|21 |score   |school|13    |13    |
|22 |rate    |school|11    |14    |
|22 |new_rate|school|11    |14    |
+---+--------+------+------+------+