问题
When we produce messages we can define a callback, this callback can expect an exception:
kafkaProducer.send(producerRecord, new Callback() {
public void onCompletion(RecordMetadata recordMetadata, Exception e) {
if (e == null) {
// OK
} else {
// NOT OK
}
}
});
Considered the buitl-in retry logic in the producer, I wonder which kind of exception should developers deal explicitly with?
回答1:
According to the Callback Java Docs there are the following Exception possible happening during callback:
The exception thrown during processing of this record. Null if no error occurred. Possible thrown exceptions include:
Non-Retriable exceptions (fatal, the message will never be sent):
- InvalidTopicException
- OffsetMetadataTooLargeException
- RecordBatchTooLargeException
- RecordTooLargeException
- UnknownServerException
Retriable exceptions (transient, may be covered by increasing #.retries):
- CorruptRecordException
- InchvalidMetadataException
- NotEnoughReplicasAfterAppendException
- NotEnoughReplicasException
- OffsetOutOfRangeException
- TimeoutException
- UnknownTopicOrPartitionException
Maybe this is a unsatisfactory answer, but in the end which Exceptions and how to handle them completely relies on your use case and business requirements.
However, as a developer you also need to deal with the retry mechanism itself of the Kafka Producer. The retries are mainly driven by:
retries: Setting a value greater than zero will cause the client to resend any record whose send fails with a potentially transient error. Note that this retry is no different than if the client resent the record upon receiving the error. Allowing retries without setting max.in.flight.requests.per.connection (default: 5) to 1 will potentially change the ordering of records because if two batches are sent to a single partition, and the first fails and is retried but the second succeeds, then the records in the second batch may appear first. Note additionally that produce requests will be failed before the number of retries has been exhausted if the timeout configured by delivery.timeout.ms expires first before successful acknowledgement. Users should generally prefer to leave this config unset and instead use delivery.timeout.ms to control retry behavior.
retry.backoff.ms: The amount of time to wait before attempting to retry a failed request to a given topic partition. This avoids repeatedly sending requests in a tight loop under some failure scenarios.
回答2:
You may get BufferExhaustedException or TimeoutException
Just bring your Kafka down after the producer has produced one record. And then continue producing records. After sometime, you should be seeing exceptions in the callback.
This is because, when you sent the first record, the metadata is fetched, after that, the records will be batched and buffered and they expire eventually after some timeout during which you may see these exceptions.
I suppose that the timeout is delivery.timeout.ms which when expired give you a TimeoutException
exception there.
回答3:
Trying to add more info to @Mike's answer, I think only a few Exceptions are enum in Callback Interface.
Here you can see the whole list: kafka.common.errors
And here, you can see which ones are retriables and which ones are not: kafka protocol guide
And the code could be sht like this:
producer.send(record, callback)
def callback: Callback = new Callback {
override def onCompletion(recordMetadata: RecordMetadata, e: Exception): Unit = {
if(null != e) {
if (e == RecordTooLargeException || e == UnknownServerException || ..) {
log.error("Winter is comming") //it's non-retriable
writeDiscardRecordsToSomewhereElse
} else {
log.warn("It's no that cold") //it's retriable
}
} else {
log.debug("It's summer. Everything is fine")
}
}
}
来源:https://stackoverflow.com/questions/61654578/kafka-producer-callback-exception