问题
I am processing kafka messages and inserting into kudu table using spark streaming with manual offset commit here is my code.
val topicsSet = topics.split(",").toSet
val kafkaParams = Map[String, Object](
ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG -> brokers,
ConsumerConfig.GROUP_ID_CONFIG -> groupId,
ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG -> classOf[StringDeserializer],
ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG -> classOf[StringDeserializer],
ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG -> (false: java.lang.Boolean),
ConsumerConfig.AUTO_OFFSET_RESET_CONFIG -> "earliest" //"latest" //"earliest"
)
val stream = KafkaUtils.createDirectStream[String, String](
ssc,
PreferConsistent,
Subscribe[String, String](topicsSet, kafkaParams)
)
stream.foreachRDD { rdd =>
var offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
//offsetRanges.foreach(println)
var msgOffsetsRdd = rdd.map(msg =>{
val msgOffset = OffsetRange(msg.topic(), msg.partition(), msg.offset(), msg.offset()+1)
println(msg)
msgOffset
}
)
val msgOffsets = msgOffsetsRdd.collect() //here idea was to get only processed messages offsets for commit
stream.asInstanceOf[CanCommitOffsets].commitAsync(msgOffsets)
}
Let us table this example While inserting data into kudu I got the error I need to process those messages again, if I stop the job and start it again I am able to get uncommitted message can't we get all uncommitted messages in the streaming?
回答1:
You have the message, why don't to put a retry logic in case of failure. Kafka will give you the same message when you reconnect in case your consumer crashes, Not sure if Kafka will give the same message while the connection is still open.
You can have some retry logic in your code if the failure is due to unavailability of destination datastore , Or if insert the failed due incorrect message format, you can save those messages into a temporary cache, datastore or another kafka topic to retry later or examine whats wrong with those messages.
来源:https://stackoverflow.com/questions/55383558/kafka-uncommitted-message-not-getting-consumed-again