I've run into an issue while trying to parse JSON in my Spark job. I'm using Spark 1.1.0, json4s, and the Cassandra Spark Connector.
I had the same error when I put the implicit val formats = ... declaration inside the method that contains the parsing, instead of declaring it on the enclosing object.
So this would throw an error:
    object Application {
      //... Lots of other code here, which eventually calls
      // setupStream(...)
      def setupStream(streamingContext: StreamingContext,
                      brokers: String,
                      topologyTopicName: String) = {
        // Declared inside the method: this is the placement that failed
        implicit val formats = DefaultFormats
        _createDStream(streamingContext, brokers, topologyTopicName)
          // Remove the message key, which is always null in our case
          .map(_._2)
          .map((json: String) => parse(json).camelizeKeys
            .extract[Record[TopologyMetadata, Unused]])
          .print()
      }
    }
But this would be fine:
    object Application {
      // Declared on the object: this is the placement that worked
      implicit val formats = DefaultFormats
      //... Lots of other code here, which eventually calls
      // setupStream(...)
      def setupStream(streamingContext: StreamingContext,
                      brokers: String,
                      topologyTopicName: String) = {
        _createDStream(streamingContext, brokers, topologyTopicName)
          // Remove the message key, which is always null in our case
          .map(_._2)
          .map((json: String) => parse(json).camelizeKeys
            .extract[Record[TopologyMetadata, Unused]])
          .print()
      }
    }
This was already answered in an open json4s ticket. The workaround is to put the implicit declaration inside the function passed to map:
    val count = rdd
      .map(r => { implicit val formats = DefaultFormats; checkUa(r._2, r._1) })
      .reduce((x, y) => x + y)
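
For anyone wondering why the placement of the declaration could matter at all, here is a rough sketch that uses only the standard library. It assumes the underlying error is a closure serialization failure (the thread does not spell out the exception) and that the formats value in use is not java.io.Serializable. FakeFormats, Shared, ClosureDemo and isSerializable are hypothetical names made up for this illustration; none of them come from json4s or Spark. Spark has to serialize the functions passed to map before shipping them to executors, so the sketch simply tries to Java-serialize three closures.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for a formats value that is NOT java.io.Serializable.
class FakeFormats {
  def render(s: String): String = s.trim
}

// Hypothetical holder: a value that lives on a singleton object.
object Shared {
  val formats = new FakeFormats
}

object ClosureDemo {
  // Returns true if Java serialization accepts the given function object.
  def isSerializable(f: AnyRef): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream).writeObject(f)
      true
    } catch {
      case _: NotSerializableException => false
    }

  // Method-local value: the closure captures it, so it must be serialized too.
  def capturesLocal(): String => String = {
    val formats = new FakeFormats
    (s: String) => formats.render(s)
  }

  // Object-level value: the closure looks it up through the singleton,
  // so nothing non-serializable is captured.
  def usesObjectMember(): String => String =
    (s: String) => Shared.formats.render(s)

  // Value created inside the closure body: nothing is captured at all.
  def createsInside(): String => String =
    (s: String) => { val formats = new FakeFormats; formats.render(s) }

  def main(args: Array[String]): Unit = {
    println(isSerializable(capturesLocal()))
    println(isSerializable(usesObjectMember()))
    println(isSerializable(createsInside()))
  }
}
```

Under those assumptions, only the first closure is rejected. The object-level value and the value created inside the function body both avoid capturing anything non-serializable, which would be consistent with both answers here. The trade-off with the inline version is that the value is rebuilt on every call, which is cheap for a small object but worth keeping in mind for anything expensive to construct.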