Spark default null columns DataSet

一整个雨季 2021-02-06 19:19

I cannot get Spark to read a JSON (or CSV, for that matter) file as a Dataset of a case class with Option[_] fields when not all of the fields are present in the input.

2 Answers
  •  别跟我提以往
    2021-02-06 19:21

    Here is an even simpler solution:

        import org.apache.spark.sql.types.StructType
        import org.apache.spark.sql.catalyst.ScalaReflection

        // Derive the schema from the case class instead of inferring it from the data.
        val structSchema = ScalaReflection.schemaFor[CustomData].dataType.asInstanceOf[StructType]
        // With an explicit schema, fields missing from a record are read as null.
        // jsonRDD is assumed to be an RDD[String] holding one JSON record per element.
        val df = spark.read.schema(structSchema).json(jsonRDD)
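
For completeness, here is a minimal end-to-end sketch of the same idea that goes all the way to a typed Dataset. It is an illustration under assumptions, not the original poster's code: the case class fields, the object name `OptionalFieldsExample`, and the helper `readCustomData` are hypothetical, and the schema is derived with the public `Encoders.product` API rather than the internal `ScalaReflection` class. It also passes the JSON lines as a `Dataset[String]`, which assumes Spark 2.2 or later.

```scala
import org.apache.spark.sql.{Dataset, Encoders, SparkSession}

// Hypothetical stand-in for the question's CustomData; the fields are assumptions.
case class CustomData(id: Long, name: Option[String], score: Option[Double])

object OptionalFieldsExample {

  // Reads JSON lines using a schema derived from the case class, so that
  // fields absent from a record come back as null and map to None.
  def readCustomData(spark: SparkSession, lines: Seq[String]): Dataset[CustomData] = {
    import spark.implicits._

    // Schema taken from the case class encoder rather than inferred from the data.
    val schema = Encoders.product[CustomData].schema

    val input: Dataset[String] = lines.toDS()
    spark.read.schema(schema).json(input).as[CustomData]
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("optional-fields-example")
      .getOrCreate()

    val rows = readCustomData(spark, Seq(
      """{"id": 1, "name": "a", "score": 2.5}""",
      """{"id": 2}""" // name and score are missing on purpose
    )).collect()

    rows.foreach(println)
    spark.stop()
  }
}
```

Only the `Option[_]` fields are left out of the sample input. A non-optional field such as `id` cannot hold null, so a record that omits it would likely fail when converted with `.as[CustomData]`.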
    
