Task not serializable: java.io.NotSerializableException when calling function outside closure only on classes not objects

悲&欢浪女 2020-11-22 05:29

Getting strange behavior when calling function outside of a closure:

  • when the function is in an object, everything works
  • when function is in a class ge
9 answers
  •  北海茫月
    2020-11-22 05:45

    I faced a similar issue, and this is what I understand from Grega's answer:

    object NOTworking extends App {
      new testing().doIT
    }

    // adding `extends Serializable` won't help
    class testing {
      val list = List(1, 2, 3)

      // Spark.ctx is assumed to be a SparkContext defined elsewhere
      val rddList = Spark.ctx.parallelize(list)

      def doIT = {
        // the closure calls someFunc, which is a method of this class
        val after = rddList.map(someFunc(_))
        // the crash happens here, because Spark evaluates lazily
        after.collect().map(println(_))
      }

      def someFunc(a: Int) = a + 1
    }
    

    Your doIT method has to serialize the closure someFunc(_). Methods are not serializable on their own, so Spark tries to serialize the enclosing testing instance instead, and that class is not serializable either.
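
    The difference between a class and an object can be reproduced without Spark. The following is only a minimal sketch, assuming Scala 2.12 or later and plain Java serialization (which Spark also uses for closures). The names SerializationCheck, InClass and InObject are made up for illustration and do not come from the question.

    ```scala
    import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

    // Hypothetical helper: reports whether plain Java serialization accepts a value.
    object SerializationCheck {
      def isSerializable(value: AnyRef): Boolean =
        try {
          val out = new ObjectOutputStream(new ByteArrayOutputStream())
          out.writeObject(value)
          out.close()
          true
        } catch {
          case _: NotSerializableException => false
        }
    }

    // Method on a plain class: the closure has to capture `this` to call it,
    // and InClass itself is not Serializable.
    class InClass {
      def someFunc(a: Int): Int = a + 1
      def closure: Int => Int = x => someFunc(x)
    }

    // Method on a singleton object: the call goes through the object's static
    // reference, so the closure captures nothing.
    object InObject {
      def someFunc(a: Int): Int = a + 1
      def closure: Int => Int = x => someFunc(x)
    }

    object Demo {
      def main(args: Array[String]): Unit = {
        println(SerializationCheck.isSerializable(new InClass().closure))
        println(SerializationCheck.isSerializable(InObject.closure))
      }
    }
    ```

    Under these assumptions, the closure built inside the class should be rejected because serializing it drags in the InClass instance, while the one built inside the object should pass.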

    To make your code work, define someFunc inside the doIT method. For example:

    def doIT = {
      // someFunc is now defined locally, inside doIT
      def someFunc(a: Int) = a + 1

      val after = rddList.map(someFunc(_))
      after.collect().map(println(_))
    }
    

    If several functions are involved, all of them must be available in the parent context.
