Spark default null columns DataSet


一整个雨季 2021-02-06 19:19

I cannot get Spark to read a JSON file (or a CSV file, for that matter) as a Dataset of a case class with Option[_] fields when not all of the fields are present in the input.

2 Answers

别跟我提以往 (original poster)

2021-02-06 19:21

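Here is an even simpler solution: derive the schema from the case class and pass it to the reader explicitly, so that fields missing from the input are read as null: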

    import org.apache.spark.sql.types.StructType
    import org.apache.spark.sql.catalyst.ScalaReflection

    // CustomData is the case class from the question; jsonRDD holds the JSON input.
    // Build a StructType from the case class; Option[_] fields become nullable columns.
    val structSchema = ScalaReflection.schemaFor[CustomData].dataType.asInstanceOf[StructType]
    // With an explicit schema, Spark skips inference and fills absent fields with null.
    val df = spark.read.schema(structSchema).json(jsonRDD)
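For anyone who wants to try this end to end, here is a minimal sketch of the same idea. The question does not show the fields of `CustomData`, so the three fields below are hypothetical, as are the object name `OptionalFieldsExample`, the helper `readCustomData`, and the sample records. It assumes Spark 2.2 or later, where `json` accepts a `Dataset[String]`. Note that `ScalaReflection` lives in the internal catalyst package; `Encoders.product[CustomData].schema` should give an equivalent schema through the public API.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.catalyst.ScalaReflection
import org.apache.spark.sql.types.StructType

// Hypothetical case class: the original post does not list CustomData's fields.
case class CustomData(id: Option[Long], name: Option[String], score: Option[Double])

object OptionalFieldsExample {
  // Hypothetical helper: reads JSON lines into a typed Dataset using a schema
  // derived from the case class instead of one inferred from the data.
  def readCustomData(spark: SparkSession, jsonLines: Dataset[String]): Dataset[CustomData] = {
    import spark.implicits._
    // Option[_] fields map to nullable columns in the derived StructType.
    val structSchema = ScalaReflection.schemaFor[CustomData].dataType.asInstanceOf[StructType]
    // Fields absent from a record come back as null, which .as[CustomData] decodes to None.
    spark.read.schema(structSchema).json(jsonLines).as[CustomData]
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[1]").appName("optional-fields").getOrCreate()
    import spark.implicits._

    // Sample input in which no record defines every field.
    val jsonLines = Seq(
      """{"id": 1, "name": "a"}""",
      """{"id": 2, "score": 0.5}"""
    ).toDS()

    readCustomData(spark, jsonLines).show()
    spark.stop()
  }
}
```

Without the explicit schema, Spark infers columns only from fields that actually occur in the data, so `.as[CustomData]` can fail when a field never appears; supplying the schema up front avoids that.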
