Spark Java application throws NotSerializableException on Hadoop Writables.
public final class myAPP {
public static void main(String[] args) throws Exception
As of Spark v1.4.0, you can register classes for Kryo serialization through the Java API SparkConf.registerKryoClasses: https://spark.apache.org/docs/latest/api/java/org/apache/spark/SparkConf.html#registerKryoClasses(java.lang.Class[])
Pass in an array of Class objects, each of which can be obtained with Class.forName: http://docs.oracle.com/javase/7/docs/api/java/lang/Class.html#forName(java.lang.String)
For example:
new SparkConf().registerKryoClasses(new Class<?>[]{
Class.forName("org.apache.hadoop.io.LongWritable"),
Class.forName("org.apache.hadoop.io.Text")
});
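For background on why the exception shows up at all: Spark's default serializer is plain Java serialization, and Hadoop Writables such as LongWritable and Text do not implement java.io.Serializable. Here is a minimal sketch of that failure mode using only the JDK. The classes PlainValue and SerializableValue and the helper javaSerializable are hypothetical stand-ins made up for illustration, not Hadoop or Spark types.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationDemo {
    // Hypothetical stand-in for a Hadoop Writable: it does NOT implement Serializable.
    static class PlainValue {
        long value = 42L;
    }

    // Same shape, but marked Serializable, so Java serialization accepts it.
    static class SerializableValue implements Serializable {
        private static final long serialVersionUID = 1L;
        long value = 42L;
    }

    // Returns true if Java serialization accepts the object,
    // false if it rejects it with NotSerializableException.
    static boolean javaSerializable(Object o) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(javaSerializable(new PlainValue()));        // prints "false"
        System.out.println(javaSerializable(new SerializableValue())); // prints "true"
    }
}
```

Kryo does not require the Serializable marker interface, which is why switching the serializer and registering the Writable classes gets around the exception.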
Hope this helps.
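Alternatively, set the spark.kryo.classesToRegister property directly. This property only lists the classes, so also set spark.serializer to the Kryo serializer:
sparkConf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
sparkConf.set("spark.kryo.classesToRegister", "org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text");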
Or you can simply use
ctx.textFile(args[0]);
to load the RDD. It returns an RDD of plain Strings, so no Writables need to be serialized.