Spark non-serializable exception when parsing JSON with json4s

慢半拍i · 2021-01-04 11:42

I've run into an issue when parsing JSON in my Spark job. I'm using Spark 1.1.0, json4s, and the Cassandra Spark Connector.

2 Answers
  •  悲哀的现实 · 2021-01-04 12:18

    I had the same error when I put the implicit val formats = ... declaration inside the method that contains the parsing, instead of declaring it on the class (object).

    So this would throw an error:

    import org.apache.spark.streaming.StreamingContext
    import org.json4s._
    // Assuming the Jackson backend; org.json4s.native.JsonMethods works the same way
    import org.json4s.jackson.JsonMethods._

    object Application {

      //... Lots of other code here, which eventually calls
      // setupStream(...)

      def setupStream(streamingContext: StreamingContext,
                      brokers: String,
                      topologyTopicName: String) = {
        // Declared inside the method: the closures passed to map()
        // capture it, so Spark tries to serialize it and fails
        implicit val formats = DefaultFormats
        _createDStream(streamingContext, brokers, topologyTopicName)
          // Remove the message key, which is always null in our case
          .map(_._2)
          .map((json: String) => parse(json).camelizeKeys
            .extract[Record[TopologyMetadata, Unused]])
          .print()
      }
    }
    

    But this would be fine:

    object Application {

      // Declared on the object: the closures below resolve it through
      // the singleton on each executor instead of capturing it
      implicit val formats = DefaultFormats

      //... Lots of other code here, which eventually calls
      // setupStream(...)

      def setupStream(streamingContext: StreamingContext,
                      brokers: String,
                      topologyTopicName: String) = {
        _createDStream(streamingContext, brokers, topologyTopicName)
          // Remove the message key, which is always null in our case
          .map(_._2)
          .map((json: String) => parse(json).camelizeKeys
            .extract[Record[TopologyMetadata, Unused]])
          .print()
      }
    }
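
    Why the second version works: a val declared on a top-level object is accessed through the singleton on each executor, so the closures passed to map never capture the formats instance and Spark has nothing extra to serialize. If you cannot move the declaration onto an object, a minimal sketch of another common workaround (reusing the _createDStream helper and the Record, TopologyMetadata and Unused types from the snippets above) is to declare the formats inside the function passed to map, so it is created on the executor itself:

    object Application {

      def setupStream(streamingContext: StreamingContext,
                      brokers: String,
                      topologyTopicName: String) = {
        _createDStream(streamingContext, brokers, topologyTopicName)
          .map(_._2)
          .map { (json: String) =>
            // Created inside the task on the executor, so the closure
            // captures nothing that has to be serialized
            implicit val formats = DefaultFormats
            parse(json).camelizeKeys
              .extract[Record[TopologyMetadata, Unused]]
          }
          .print()
      }
    }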
    
