I have a Spark Scala dataframe with a nested structure:
|-- _History: struct (nullable = true)
| |-- Article:
The simplest approach is to cast the column to a schema string that uses the desired field names (or to an
equivalent StructType definition):
val schema = """struct<
  Article: array<struct<...>>,
  Channel: struct<Cultura: bigint, Deportes: array<bigint>>>"""
df.withColumn("_History", $"_History".cast(schema))
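The same cast can be written with a programmatic StructType instead of a schema string. The following is only a minimal sketch on toy data: the Article element type (string), the original Channel field names (_1 and _2), the helper historyType and the local SparkSession are all assumptions made for illustration, not details taken from the question.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types._

val spark = SparkSession.builder().master("local[1]").appName("rename-nested").getOrCreate()

// Hypothetical helper: builds the _History struct type with the given Channel field names.
// The Article element type (string) is assumed; substitute your real element type.
def historyType(first: String, second: String): StructType = StructType(Seq(
  StructField("Article", ArrayType(StringType)),
  StructField("Channel", StructType(Seq(
    StructField(first, LongType),
    StructField(second, ArrayType(LongType)))))))

// Toy input whose Channel fields carry the assumed placeholder names _1 and _2.
val source = StructType(Seq(StructField("_History", historyType("_1", "_2"))))
val rows = java.util.Arrays.asList(Row(Row(Seq("a1"), Row(1L, Seq(2L, 3L)))))
val df = spark.createDataFrame(rows, source)

// A struct-to-struct cast pairs fields by position, so only the names change here.
val renamed = df.withColumn("_History", col("_History").cast(historyType("Cultura", "Deportes")))

renamed.printSchema()
```

Because the cast pairs fields by position, the target type has to list every field of the struct in its original order, including the ones you are not renaming.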
You could also model this with case classes:
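import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.{struct, udf}

case class ChannelRecord(Cultura: Option[Long], Deportes: Option[Seq[Long]])

// Rebuild Channel by position so its fields take the case class names.
val rename = udf((row: Row) => ChannelRecord(
  if (row.isNullAt(0)) None else Some(row.getLong(0)), // getLong fails on a null field
  Option(row.getSeq[Long](1))))

df.withColumn("_History",
  struct($"_History.Article", rename($"_History.Channel").alias("Channel")))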