What causes this Serialization error in Apache Spark 1.4.0 when calling:
sc.parallelize(strList, 4)
This exception is thrown:
I had the same problem in a project built with Gradle. I fixed it by excluding the transitive jackson-databind dependency from the dependency that was pulling it in:
dependencies {
    compile('dependency.causing:problem:version') {
        exclude module: 'jackson-databind'
    }
    ....
}
That worked perfectly for me.
This worked for me in sbt: <dependency> excludeAll ExclusionRule(organization = "com.fasterxml.jackson.core")
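For anyone unsure where that goes: in build.sbt the exclusion is attached to whichever dependency drags in the conflicting Jackson. A minimal sketch, assuming sbt; the coordinates "com.example" % "some-library" % "1.0.0" are placeholders, not a real artifact:

```scala
// build.sbt fragment. The module coordinates below are hypothetical;
// substitute the dependency that actually brings in the wrong Jackson.
libraryDependencies += ("com.example" % "some-library" % "1.0.0")
  .excludeAll(ExclusionRule(organization = "com.fasterxml.jackson.core"))
```

Note that this strips every com.fasterxml.jackson.core module from that one dependency, so it only helps if a compatible Jackson still arrives some other way (for example through Spark itself).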
@Interfector is correct. I ran into this issue as well. Here is a snippet from my sbt build file, including the dependencyOverrides section that fixed it:
libraryDependencies ++= Seq(
  "com.amazonaws" % "amazon-kinesis-client" % "1.4.0",
  "org.apache.spark" %% "spark-core" % "1.4.0",
  "org.apache.spark" %% "spark-streaming" % "1.4.0",
  "org.apache.spark" %% "spark-streaming-kinesis-asl" % "1.4.0",
  "com.amazonaws" % "aws-java-sdk" % "1.10.2"
)

dependencyOverrides ++= Set(
  "com.fasterxml.jackson.core" % "jackson-databind" % "2.4.4"
)
I suspect this is caused by the classpath providing a different version of Jackson than the one Spark expects (2.4.4, if I'm not mistaken). You will need to adjust your classpath so that the correct Jackson version is picked up first by Spark.
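If you want to confirm which Jackson actually wins on the classpath, something like the following may help. It is only a rough sketch: WhichJar and locate are names I made up, and it assumes the class worth checking is com.fasterxml.jackson.databind.ObjectMapper.

```scala
// Hypothetical helper (not part of Spark or Jackson): reports which jar
// or directory a class is loaded from, to help spot conflicting versions.
object WhichJar {
  def locate(className: String): Option[String] =
    try {
      // Load without initializing the class; only its origin is needed.
      val cls = Class.forName(className, false, getClass.getClassLoader)
      for {
        domain <- Option(cls.getProtectionDomain)
        source <- Option(domain.getCodeSource) // null for JDK bootstrap classes
        url    <- Option(source.getLocation)
      } yield url.toString
    } catch {
      case _: ClassNotFoundException => None // class is not on the classpath
    }

  def main(args: Array[String]): Unit = {
    val name = "com.fasterxml.jackson.databind.ObjectMapper"
    println(locate(name).getOrElse(s"$name: not found or no code source"))
  }
}
```

Run it with the same classpath as the Spark job (pasting the object into spark-shell and calling WhichJar.locate should also work). The jar file name in the printed location usually includes the version, which shows whether a jackson-databind other than the one Spark expects is being loaded.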