I'm migrating some code from Spark 1.6 to Spark 2.1 and struggling with the following issue:
This worked perfectly in Spark 1.6:
import org.apache.spark.
The error message is clear: it says that Some was used where bigint is required:
scala.Some is not a valid external type for schema of bigint
So you need to use Option combined with getOrElse, so that null is stored in the Row when the Option is empty. The following code should work for you:
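import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{LongType, StructField, StructType}
// ss is assumed to be an existing SparkSession
val sc = ss.sparkContext
val sqlContext = ss.sqlContext
val schema = StructType(Seq(StructField("i", LongType, nullable = true)))
// Unwrap the Option before building the Row; an empty Option becomes null
val rows = sc.parallelize(Seq(Row(Option(1L).getOrElse(null))))
sqlContext.createDataFrame(rows, schema).show()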
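If many fields need the same treatment, the unwrapping step can be pulled into a small helper. The following is only a sketch in plain Scala (no Spark needed to try it); RowValues and toNullable are hypothetical names made up for illustration, not part of the Spark API.

```scala
// Hypothetical helper (not part of Spark): unwrap an Option into the
// nullable value that Row expects for a nullable column.
object RowValues {
  // Some(x) becomes x; None becomes null. The result type is Any because
  // primitives such as Long cannot hold null and are boxed instead.
  def toNullable[A](opt: Option[A]): Any = opt.getOrElse(null)
}

val present  = RowValues.toNullable(Some(1L))      // the boxed value 1L
val missing  = RowValues.toNullable(None)          // null
val fromNull = RowValues.toNullable(Option(null))  // Option(null) is None, so null
```

The returned value can then be passed straight to Row(...), for example Row(RowValues.toNullable(maybeId)), where maybeId stands for whatever Option[Long] you have at hand.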
I hope this answer is helpful
There is actually a JIRA ticket, SPARK-19056, about this issue, which turned out not to be a bug at all.
So this behavior is intentional.
Allowing Option in Row is never documented and brings a lot of troubles when we apply the encoder framework to all typed operations. Since Spark 2.0, please use Dataset for typed operation/custom objects. e.g.
val ds = Seq(1 -> None, 2 -> Some("str")).toDS
ds.toDF // schema: <_1: int, _2: string>