I am trying to convert a DataFrame of multiple case classes to an RDD of these case classes. I can't find any solution. This WrappedArray has driven me cra
You can convert indirectly using Dataset[randomClass3]:
aDF.select($"_2.*").as[randomClass3].rdd
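For context, here is a minimal sketch of that one-liner in a form you can run locally. The case class definitions, the sample rows, and the object and method names are assumptions for illustration; only the struct<a: string, b: string> shape of randomClass3 is implied by this thread.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object SecondColumnSketch {
  // Hypothetical case classes: randomClass2's fields are assumed,
  // randomClass3 mirrors struct<a: string, b: string>.
  case class randomClass2(a: String, b: Seq[Double])
  case class randomClass3(a: String, b: String)

  def secondColumn(spark: SparkSession): RDD[randomClass3] = {
    import spark.implicits._ // brings $"..." and the case class encoders into scope

    // Stand-in for the question's aDF, with tuple columns _1 and _2.
    val aDF = Seq(
      (randomClass2("k1", Seq(1.0, 2.0)), randomClass3("aa", "aaa")),
      (randomClass2("k2", Seq(3.0)), randomClass3("bb", "bbb"))
    ).toDF()

    // Flatten the struct in _2 into top-level columns a and b, then decode them.
    aDF.select($"_2.*").as[randomClass3].rdd
  }
}
```

The select is what makes the column names line up with the fields of randomClass3, so the encoder can resolve them by name.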
Spark DataFrame / Dataset[Row] represents data as Row objects, using the mapping described in the Spark SQL, DataFrames and Datasets Guide. Any call to getAs should use this mapping.
For the second column, which is struct<a: string, b: string>, it would be a Row as well:
aDF.rdd.map { _.getAs[Row]("_2") }
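If you want the case class back from that nested Row without going through a Dataset, you have to rebuild it by hand. A rough sketch follows, assuming hypothetical case class definitions and sample data (none of these are from the question), and reading the struct fields by position:

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{Row, SparkSession}

object NestedRowSketch {
  // Hypothetical case classes: randomClass2's fields are assumed,
  // randomClass3 mirrors struct<a: string, b: string>.
  case class randomClass2(a: String, b: Seq[Double])
  case class randomClass3(a: String, b: String)

  def secondColumn(spark: SparkSession): RDD[randomClass3] = {
    import spark.implicits._ // needed for toDF on a local Seq

    // Stand-in for the question's aDF, with tuple columns _1 and _2.
    val aDF = Seq(
      (randomClass2("k1", Seq(1.0, 2.0)), randomClass3("aa", "aaa")),
      (randomClass2("k2", Seq(3.0)), randomClass3("bb", "bbb"))
    ).toDF()

    aDF.rdd.map { row =>
      // The struct column arrives as a nested Row, not as randomClass3.
      val s = row.getAs[Row]("_2")
      // Rebuild the case class field by field (position 0 is a, position 1 is b).
      randomClass3(s.getString(0), s.getString(1))
    }
  }
}
```

This is more verbose than the Dataset route, but it makes the Row mapping explicit.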
As commented by Tzach Zohar, to get back a full RDD you'll need:
aDF.as[(randomClass2, randomClass3)].rdd
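A small end-to-end sketch of that round trip, again with assumed case class definitions and sample rows (the real randomClass2 in the question may look different):

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object FullRddSketch {
  // Hypothetical case classes: randomClass2's fields are assumed,
  // randomClass3 mirrors struct<a: string, b: string>.
  case class randomClass2(a: String, b: Seq[Double])
  case class randomClass3(a: String, b: String)

  def fullRdd(spark: SparkSession): RDD[(randomClass2, randomClass3)] = {
    import spark.implicits._ // encoders for tuples and case classes

    // Stand-in for the question's aDF, with tuple columns _1 and _2.
    val aDF = Seq(
      (randomClass2("k1", Seq(1.0, 2.0)), randomClass3("aa", "aaa")),
      (randomClass2("k2", Seq(3.0)), randomClass3("bb", "bbb"))
    ).toDF()

    // The tuple encoder matches columns _1 and _2 by name and decodes
    // each struct (including the array field) back into its case class.
    aDF.as[(randomClass2, randomClass3)].rdd
  }
}
```

With this route the array column comes back as an ordinary Seq inside randomClass2, so there is no need to handle the WrappedArray yourself.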
I don't know the Scala API, but have you considered the rdd value?
Maybe something like:
aDF.rdd.map { case r: Row => r.getAs[randomClass3]("_2") }