Kafka uncommitted message not getting consumed again

烈酒焚心 submitted on 2019-12-13 03:46:29

Question


I am processing Kafka messages and inserting them into a Kudu table using Spark Streaming with manual offset commits. Here is my code:

import org.apache.kafka.clients.consumer.ConsumerConfig
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.kafka010._
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

val topicsSet = topics.split(",").toSet
val kafkaParams = Map[String, Object](
  ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG -> brokers,
  ConsumerConfig.GROUP_ID_CONFIG -> groupId,
  ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG -> classOf[StringDeserializer],
  ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG -> classOf[StringDeserializer],
  ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG -> (false: java.lang.Boolean),
  ConsumerConfig.AUTO_OFFSET_RESET_CONFIG -> "earliest" // or "latest"
)

val stream = KafkaUtils.createDirectStream[String, String](
  ssc,
  PreferConsistent,
  Subscribe[String, String](topicsSet, kafkaParams)
)

stream.foreachRDD { rdd =>
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  //offsetRanges.foreach(println)

  // build a one-message OffsetRange for every record that gets processed
  val msgOffsetsRdd = rdd.map { msg =>
    val msgOffset = OffsetRange(msg.topic(), msg.partition(), msg.offset(), msg.offset() + 1)
    println(msg)
    msgOffset
  }

  // here the idea was to collect only the offsets of processed messages and commit those
  val msgOffsets = msgOffsetsRdd.collect()
  stream.asInstanceOf[CanCommitOffsets].commitAsync(msgOffsets)
}

Take this example: while inserting data into Kudu I got an error, so I need to process those messages again. If I stop the job and start it again, I do receive the uncommitted messages. Can't we get all uncommitted messages again within the running streaming job?


Answer 1:


You already have the message, so why not add retry logic in case of failure? Kafka will give you the same message when you reconnect after your consumer crashes; I am not sure whether Kafka will redeliver the same message while the connection is still open.
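For example, a minimal retry sketch. The insert function is passed in as a parameter here, so this is only an illustration, not your actual Kudu insert call; the attempt count and backoff are placeholders:

def insertWithRetry(record: String, insert: String => Unit, maxAttempts: Int = 3): Boolean = {
  var attempt = 0
  var succeeded = false
  while (!succeeded && attempt < maxAttempts) {
    try {
      insert(record)          // e.g. your existing Kudu insert
      succeeded = true
    } catch {
      case e: Exception =>
        attempt += 1
        if (attempt < maxAttempts) Thread.sleep(1000L * attempt) // simple linear backoff
    }
  }
  succeeded
}

You would call this from inside your map or a foreachPartition, and only commit the batch's offsets once the inserts have succeeded.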

You can have some retry logic in your code if the failure is due to unavailability of the destination datastore. Or, if the insert failed due to an incorrect message format, you can save those messages into a temporary cache, a datastore, or another Kafka topic, so you can retry them later or examine what is wrong with them.
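A rough sketch of that last option, assuming a dead-letter topic named "kudu-insert-failures" and the same brokers string your consumer uses (both names are placeholders):

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.kafka.common.serialization.StringSerializer

val props = new Properties()
props.put("bootstrap.servers", brokers)                          // same broker list as the consumer
props.put("key.serializer", classOf[StringSerializer].getName)
props.put("value.serializer", classOf[StringSerializer].getName)
val producer = new KafkaProducer[String, String](props)

// when an insert fails, park the raw message on a dead-letter topic instead of blocking the stream
def sendToDeadLetter(value: String): Unit =
  producer.send(new ProducerRecord[String, String]("kudu-insert-failures", value))

In a Spark Streaming job you would normally create the producer once per executor (for example inside foreachPartition or behind a lazy singleton) rather than on the driver, so it does not get serialized with the closure.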



Source: https://stackoverflow.com/questions/55383558/kafka-uncommitted-message-not-getting-consumed-again
