Hadoop Writables NotSerializableException with Apache Spark API

遥遥无期 2021-02-08 07:02

My Spark Java application throws a NotSerializableException on Hadoop Writables.

public final class myAPP {
  public static void main(String[] args) throws Exception          


        
2 Answers
  • 2021-02-08 07:08

    As of Spark v1.4.0, you can use the Java API method SparkConf.registerKryoClasses to register classes for serialization with Kryo:
    https://spark.apache.org/docs/latest/api/java/org/apache/spark/SparkConf.html#registerKryoClasses(java.lang.Class[])

    It takes an array of Class objects, each of which can be obtained with Class.forName:
    http://docs.oracle.com/javase/7/docs/api/java/lang/Class.html#forName(java.lang.String)

    For example:

    new SparkConf().registerKryoClasses(new Class<?>[]{
        Class.forName("org.apache.hadoop.io.LongWritable"),
        Class.forName("org.apache.hadoop.io.Text")
    });
    

    Hope this helps.
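
For background: Spark's default serializer is Java serialization, which only accepts objects that implement java.io.Serializable. Hadoop Writables such as LongWritable and Text do not implement that interface, which is where the NotSerializableException comes from, and Kryo does not require it. Below is a minimal, Spark-free sketch of that failure. FakeLongWritable is a hypothetical stand-in invented for this illustration, not the real Hadoop class, and javaSerializable is a helper name made up here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

final class SerializationDemo {

    // Hypothetical stand-in for a Hadoop Writable: it holds a value but,
    // like LongWritable, does not implement java.io.Serializable.
    static final class FakeLongWritable {
        private final long value;
        FakeLongWritable(long value) { this.value = value; }
        long get() { return value; }
    }

    // Returns true if plain Java serialization accepts the object,
    // false if it rejects it with NotSerializableException.
    static boolean javaSerializable(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        FakeLongWritable w = new FakeLongWritable(42L);
        System.out.println(javaSerializable(w));        // false: no Serializable
        System.out.println(javaSerializable(w.get()));  // true: boxed Long is Serializable
        System.out.println(javaSerializable("a line")); // true: String is Serializable
    }
}
```

The last two calls hint at a second way out besides Kryo: converting Writables to ordinary Java values (Long, String) before they need to be serialized.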

  • 2021-02-08 07:09

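    Set the list of classes to register directly (this assumes spark.serializer is already set to org.apache.spark.serializer.KryoSerializer):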

    sparkConf.set("spark.kryo.classesToRegister", "org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text")
    

    Alternatively, you can avoid Writables altogether by loading the RDD with

    ctx.textFile(args[0]);

    which returns the lines of the file as plain Strings.
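
The spark.kryo.classesToRegister value used in this answer is a comma-separated list of fully qualified class names. If the list is assembled in code, a small helper along these lines could build the string. This is only a sketch: KryoClassList and toProperty are names invented here, not part of Spark.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class KryoClassList {

    // Joins class names into the comma-separated form expected by
    // spark.kryo.classesToRegister, trimming whitespace and dropping
    // empty entries and duplicates while preserving order.
    static String toProperty(List<String> classNames) {
        return classNames.stream()
                .map(String::trim)
                .filter(name -> !name.isEmpty())
                .distinct()
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "org.apache.hadoop.io.LongWritable",
                "org.apache.hadoop.io.Text");
        // The result would be passed as the second argument of sparkConf.set(...).
        System.out.println(toProperty(names));
    }
}
```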
