Spark: java.io.NotSerializableException: org.apache.avro.Schema$RecordSchema

悲&欢浪女 2021-02-10 01:43

I am creating an Avro RDD with the following code.

 def convert2Avro(data : String ,schema : Schema)  : AvroKey[GenericRecord] = {
   var wrap         


        
2 answers
  •  误落风尘
    2021-02-10 02:12

    The Schema.RecordSchema class does not implement Serializable, so Spark cannot ship it over the network to the executors as part of the closure. A workaround is to convert the schema to a string, pass that string to the method, and reconstruct the Schema object inside the method.

    val schemaString = schema.toString
    val avroRDD = fieldsRDD.map(x => convert2Avro(x, schemaString))
    

    Inside the method, reconstruct the schema from the string:

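    import org.apache.avro.Schema
    import org.apache.avro.generic.{GenericData, GenericRecord}
    import org.apache.avro.mapred.AvroKey

    def convert2Avro(data: String, schemaString: String): AvroKey[GenericRecord] = {
      // Rebuild the Schema from its JSON string on the executor
      val schema = new Schema.Parser().parse(schemaString)
      val wrapper = new AvroKey[GenericRecord]()
      val record = new GenericData.Record(schema)
      record.put("empname", "John")
      wrapper.datum(record)
      wrapper
    }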
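
The approach above parses the schema string once per record. A possible refinement, sketched here under assumptions, is to parse it once per partition with mapPartitions. The helper name convertPartition is hypothetical (it is not part of the answer), and the sketch assumes the schema has a string field named "empname" and that each input row is the value to store in it.

```scala
import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericRecord}
import org.apache.avro.mapred.AvroKey

// Hypothetical helper: converts all rows of one partition,
// parsing the schema a single time instead of once per record.
def convertPartition(rows: Iterator[String],
                     schemaString: String): Iterator[AvroKey[GenericRecord]] = {
  // Only the String crosses the network; the Schema is rebuilt locally.
  val schema = new Schema.Parser().parse(schemaString)
  rows.map { row =>
    val record = new GenericData.Record(schema)
    record.put("empname", row) // assumes a string field "empname" exists
    new AvroKey[GenericRecord](record)
  }
}

// Assumed usage with the names from the answer
// (fieldsRDD: RDD[String], schema: Schema):
//   val schemaString = schema.toString
//   val avroRDD = fieldsRDD.mapPartitions(rows => convertPartition(rows, schemaString))
```

The closure passed to mapPartitions captures only schemaString, which is a plain String and therefore serializable, so the NotSerializableException should not be triggered by the schema itself.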
    
