Creating a Kafka aggregator and joining it with an event


Question


I am trying to create an aggregator that listens for multiple records and consolidates them into one. After consolidation, I wait for a process event by joining the event stream with the aggregated Application table in the listen() method. When the process event arrives, some business logic is triggered. Both the aggregator and the process listener are defined in a single Spring Boot project.

@Bean
public Function<KStream<FormUUID, FormData>, KStream<UUID, Application>> process()
{
    // Deduplicate FormData per FormUUID, re-key by reference number,
    // then fold all forms for a reference into a single Application.
    return formEvent -> formEvent.groupByKey()
            .reduce((k, v) -> v)
            .toStream()
            .selectKey((k, v) -> k.getReferenceNo())
            .groupByKey()
            .aggregate(Application::new, (key, value, aggr) -> aggr.performAggregate(value),
                    Materialized.<UUID, Application, KeyValueStore<Bytes, byte[]>> as("appStore")
                            .withKeySerde(new JsonSerde<>(UUID.class))
                            .withValueSerde(new JsonSerde<>(Application.class)))
            .toStream();
}

@Bean
public BiConsumer<KStream<String, ProcessEvent>, KTable<String, Application>> listen()
{
    // Join incoming process events with the aggregated Application table
    // and trigger the business logic for each matching record.
    return (eventStream, appTable) ->
            eventStream.join(appTable, (event, app) -> app)
                    .foreach((k, app) -> app.createQuote());
}

However, I am now facing a SerializationException. The first part (the aggregation) works fine, but the join fails with the following exceptions:

java.lang.ClassCastException: com.xxxxx.datamapper.domain.FormData cannot be cast to com.xxxxx.datamapper.domain.Application
at org.apache.kafka.streams.kstream.internals.KStreamPeek$KStreamPeekProcessor.process(KStreamPeek.java:42) ~[kafka-streams-2.3.1.jar:?]
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:117) ~[kafka-streams-2.3.1.jar:?]
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:201) ~[kafka-streams-2.3.1.jar:?]
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:180) ~[kafka-streams-2.3.1.jar:?]

org.apache.kafka.streams.errors.ProcessorStateException: task [0_0] Failed to flush state store APPLICATION_TOPIC-STATE-STORE-0000000001
    at org.apache.kafka.streams.processor.internals.ProcessorStateManager.flush(ProcessorStateManager.java:280) ~[kafka-streams-2.3.1.jar:?]
    at org.apache.kafka.streams.processor.internals.AbstractTask.flushState(AbstractTask.java:204) ~[kafka-streams-2.3.1.jar:?]
    at org.apache.kafka.streams.processor.internals.StreamTask.flushState(StreamTask.java:519) ~[kafka-streams-2.3.1.jar:?]

I think the problem is in my application.yml. Since the spring.json.key.default.type property is set to FormUUID, that same type is being applied to the Application object used in the listen() method. I want to configure the types for the remaining classes (UUID, Application and ProcessEvent) in my application.yml, but I am not sure how to set the mapping type for each consumer and producer that is defined.

spring.cloud:
 function.definition: process;listen
 stream:
  kafka.streams:
    bindings:
      process-in-0.consumer.application-id: form-aggregator
      listen-in-0.consumer.application-id: event-processor
      listen-in-1.consumer.application-id: event-processor
    binder.configuration:
      default.key.serde: org.springframework.kafka.support.serializer.JsonSerde
      default.value.serde: org.springframework.kafka.support.serializer.JsonSerde
      spring.json.key.default.type: com.xxxx.datamapper.domain.FormUUID
      spring.json.value.default.type: com.xxxx.datamapper.domain.FormData
      commit.interval.ms: 1000
  bindings:
    process-in-0.destination: FORM_DATA_TOPIC
    process-out-0.destination: APPLICATION_TOPIC
    listen-in-0.destination: PROCESS_TOPIC
    listen-in-1: 
      destination: APPLICATION_TOPIC
      consumer:
       useNativeDecoding: true
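
What I had in mind is something per binding rather than the global defaults, roughly along the lines of the sketch below, using the binder's per-binding keySerde/valueSerde consumer properties. I am not sure these are the right properties for this, and the JSON target types would presumably still have to be resolved by the binder or through type headers:

spring.cloud:
 stream:
  kafka.streams:
    bindings:
      process-in-0.consumer.keySerde: org.springframework.kafka.support.serializer.JsonSerde
      process-in-0.consumer.valueSerde: org.springframework.kafka.support.serializer.JsonSerde
      listen-in-0.consumer.keySerde: org.apache.kafka.common.serialization.Serdes$StringSerde
      listen-in-0.consumer.valueSerde: org.springframework.kafka.support.serializer.JsonSerde
      listen-in-1.consumer.keySerde: org.apache.kafka.common.serialization.Serdes$StringSerde
      listen-in-1.consumer.valueSerde: org.springframework.kafka.support.serializer.JsonSerde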

Answer 1:


If you are using the latest Horsham versions of the Spring Cloud Stream Kafka Streams binder, you do not need to set any explicit Serdes for inbound and outbound. However, you still need to provide them wherever the Kafka Streams API itself requires them, as in the case of your aggregate method call above. If you are facing this serialization error on the inbound of the second processor, I suggest removing all Serdes from the configuration; you can simplify it as shown below (given that you are on the latest Horsham release). The binder will infer the correct Serdes to use on the inbound/outbound.

One benefit of delegating this to the binder is that you don't need to provide any explicit key/value types through configuration, because the binder will introspect the types. Make sure the POJO types you are using are JSON friendly. See if that works; if you are still having issues, please create a small sample application where we can reproduce the issue and we will take a look.

spring.cloud:
 function.definition: process;listen
 stream:
  kafka.streams:
    bindings:
      process-in-0.consumer.application-id: form-aggregator
      listen-in-0.consumer.application-id: event-processor
      listen-in-1.consumer.application-id: event-processor
    binder.configuration:
      commit.interval.ms: 1000
  bindings:
    process-in-0.destination: FORM_DATA_TOPIC
    process-out-0.destination: APPLICATION_TOPIC
    listen-in-0.destination: PROCESS_TOPIC
    listen-in-1.destination: APPLICATION_TOPIC
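
For the spots where the Kafka Streams API itself asks for Serdes (the Materialized in your aggregate call, and optionally the join in listen()), a minimal sketch of passing typed JsonSerde instances explicitly might look like this. The Joined overload is standard Kafka Streams and the POJO names are taken from the question; treat it as an illustration rather than a required change:

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.kstream.Joined;
import org.springframework.kafka.support.serializer.JsonSerde;

@Bean
public BiConsumer<KStream<String, ProcessEvent>, KTable<String, Application>> listen()
{
    // Explicit Serdes at the call site: String keys, JSON values typed to the POJOs.
    return (eventStream, appTable) ->
            eventStream.join(appTable, (event, app) -> app,
                    Joined.with(Serdes.String(),
                            new JsonSerde<>(ProcessEvent.class),
                            new JsonSerde<>(Application.class)))
                    .foreach((k, app) -> app.createQuote());
}

Whether this is needed on top of the binder's own inference depends on the binder version; the configuration cleanup above is the first thing to try.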


Source: https://stackoverflow.com/questions/61897937/creating-a-kafka-aggregator-and-joining-it-with-an-event
