Map operation over a KStream fails when not specifying default serdes and using custom ones -> org.apache.kafka.streams.errors.StreamsException

走远了吗. Submitted on 2020-01-30 11:53:44

Question


Since I am working with JSON values, I haven't set up default serdes.

I process a KStream, consuming it with the necessary String and Product (JSON) serdes, but the next step (a map operation) fails:

val props = Properties()
props[StreamsConfig.APPLICATION_ID_CONFIG] = applicationName
props[StreamsConfig.BOOTSTRAP_SERVERS_CONFIG] = kafkaBootstrapServers

val productSerde: Serde<Product> = Serdes.serdeFrom(JsonPojoSerializer<Product>(), JsonPojoDeserializer(Product::class.java))

builder.stream(INVENTORY_TOPIC, Consumed.with(Serdes.String(), productSerde))
            .map { key, value ->
                KeyValue(key, XXX)
            }
            .aggregate(...)

If I remove the map operation, the execution works fine.

I haven't found a way to specify the serdes for map(). How can it be done?

Error:

Caused by: org.apache.kafka.streams.errors.StreamsException: A serializer (key: org.apache.kafka.common.serialization.ByteArraySerializer / value: org.apache.kafka.common.serialization.ByteArraySerializer) is not compatible to the actual key or value type (key type: java.lang.String / value type: com.codependent.kafkastreams.inventory.dto.Product). Change the default Serdes in StreamConfig or provide correct Serdes via method parameters.
    at org.apache.kafka.streams.processor.internals.SinkNode.process(SinkNode.java:92)

Answer 1:


Multiple issues:

  1. After you call map() you call groupByKey().aggregate(). This triggers a data repartition, so after map() the data is written into an internal repartition topic. Therefore, you need to provide the corresponding Serdes within groupByKey(), too (see the sketch after this list).

  2. However, because you don't modify the key, you should actually call mapValues() instead, to avoid the unnecessary repartitioning.

  3. Note that you need to provide Serdes for each operator that should not use the default Serdes from the config. Serdes are not passed along downstream; they are in-place overrides on the operator where they are specified. (Improving this is work in progress for Kafka 2.1.)
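
A minimal sketch of what the fixed topology could look like in the question's Kotlin code, assuming the 2.0-era Kafka Streams API the question appears to use (in 2.1+ Consumed moved to the kstream package and Grouped.with() replaces Serialized.with()). It reuses INVENTORY_TOPIC, Product, and productSerde from the question; the pass-through value transformation and the merge logic are hypothetical placeholders, and reduce() stands in for the elided aggregate(...) just to keep the example self-contained:

import org.apache.kafka.common.serialization.Serdes
import org.apache.kafka.common.utils.Bytes
import org.apache.kafka.streams.Consumed
import org.apache.kafka.streams.StreamsBuilder
import org.apache.kafka.streams.kstream.Materialized
import org.apache.kafka.streams.kstream.Serialized
import org.apache.kafka.streams.state.KeyValueStore

val builder = StreamsBuilder()

builder.stream(INVENTORY_TOPIC, Consumed.with(Serdes.String(), productSerde))
        // mapValues() leaves the key untouched, so no repartition topic is needed
        .mapValues { product -> product /* hypothetical value transformation */ }
        // with no default serdes configured, the grouping (and any internal
        // repartition/changelog topic it creates) needs explicit serdes, too
        .groupByKey(Serialized.with(Serdes.String(), productSerde))
        // the state store backing the aggregation also needs explicit serdes,
        // passed via Materialized
        .reduce(
            { _, newProduct -> newProduct },  // hypothetical merge logic
            Materialized.with<String, Product, KeyValueStore<Bytes, ByteArray>>(
                Serdes.String(), productSerde)
        )

Since the key never changes, mapValues() removes the need for the repartition topic, and the explicit serdes on groupByKey() and Materialized cover the internal topics and the state store that the ByteArraySerializer error in the question is complaining about.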



Source: https://stackoverflow.com/questions/52026148/map-operation-over-a-kstream-fails-when-not-specifying-default-serdes-and-using
