Spark Parallelize? (Could not find creator property with name 'id')

灰色年华 2020-12-30 01:21

What causes this serialization error in Apache Spark 1.4.0 when calling:

sc.parallelize(strList, 4)

An exception is thrown with the message: Could not find creator property with name 'id'
4 answers
  • 2020-12-30 02:05

    I had the same problem in a project built with Gradle. I fixed it by excluding the transitive jackson-databind dependency from the dependency that was causing the conflict:

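    dependencies {
        // Exclude the transitive jackson-databind pulled in by the offending dependency.
        // The opening brace must stay on the same line for Groovy to pass the closure.
        compile('dependency.causing:problem:version') {
            exclude module: 'jackson-databind'
        }

        // ....
    }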
    

    That worked perfectly for me.

  • 2020-12-30 02:13

    This worked for me in sbt: <dependency> excludeAll ExclusionRule(organization = "com.fasterxml.jackson.core"), where <dependency> is the dependency that pulls in the conflicting Jackson.
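
    A minimal sketch of that exclusion in a build.sbt. The choice of aws-java-sdk as the offending dependency is only an assumption for illustration; substitute whichever dependency brings the conflicting Jackson into your build:

    ```scala
    // build.sbt (sketch). "aws-java-sdk" is a hypothetical culprit, not confirmed by this thread.
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "1.4.0",
      // Drop every Jackson core artifact this dependency would bring in transitively,
      // so that the Jackson version Spark depends on is the one left on the classpath.
      "com.amazonaws" % "aws-java-sdk" % "1.10.2" excludeAll ExclusionRule(organization = "com.fasterxml.jackson.core")
    )
    ```

    Note that excluding the whole organization removes jackson-core, jackson-annotations, and jackson-databind from that dependency, which is broader than excluding jackson-databind alone.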

  • 2020-12-30 02:20

    @Interfector is correct. I ran into this issue too. Here is a snippet from my sbt file; the 'dependencyOverrides' section is what fixed it.

    libraryDependencies ++= Seq(
      "com.amazonaws" % "amazon-kinesis-client" % "1.4.0",
      "org.apache.spark" %% "spark-core" % "1.4.0",
      "org.apache.spark" %% "spark-streaming" % "1.4.0",
      "org.apache.spark" %% "spark-streaming-kinesis-asl" % "1.4.0",
      "com.amazonaws" % "aws-java-sdk" % "1.10.2"
    )
    
    dependencyOverrides ++= Set(
      "com.fasterxml.jackson.core" % "jackson-databind" % "2.4.4"
    )
    
  • 2020-12-30 02:22

    I suspect this is caused by the classpath supplying a different version of Jackson than the one Spark expects (2.4.4, if I'm not mistaken). You will need to adjust your classpath so that the correct Jackson version is picked up first for Spark.
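
    One way to check this suspicion is to print which jackson-databind the JVM actually loaded. The following is a rough sketch, assuming jackson-databind is on the classpath; the object name JacksonVersionCheck is made up for this example:

    ```scala
    import com.fasterxml.jackson.databind.ObjectMapper

    // Hypothetical helper: reports the jackson-databind version and the jar it came from.
    object JacksonVersionCheck {
      def main(args: Array[String]): Unit = {
        // ObjectMapper implements Versioned, so version() reports the loaded library version.
        val mapper = new ObjectMapper()
        println(s"jackson-databind version: ${mapper.version()}")

        // The code source points at the jar that won on the classpath (may be absent).
        val location = Option(classOf[ObjectMapper].getProtectionDomain.getCodeSource)
          .map(_.getLocation.toString)
          .getOrElse("unknown")
        println(s"loaded from: $location")
      }
    }
    ```

    If the reported version differs from the one Spark was built against, one of the exclusion or override approaches from the other answers should bring them back in line.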
