Question
I am trying to create an aggregator that listens for multiple records and consolidates them into one. After consolidation, I wait for a process event by joining a stream with the aggregated application in the listen() method. When the process event arrives, some business logic is triggered. Both the aggregator and the process listener are defined in a single Spring Boot project.
@Bean
public Function<KStream<FormUUID, FormData>, KStream<UUID, Application>> process()
{
    return formEvent -> formEvent.groupByKey()
            .reduce((k, v) -> v)                      // keep only the latest record per FormUUID
            .toStream()
            .selectKey((k, v) -> k.getReferenceNo())  // re-key by the application reference number
            .groupByKey()
            .aggregate(Application::new, (key, value, aggr) -> aggr.performAggregate(value),
                    Materialized.<UUID, Application, KeyValueStore<Bytes, byte[]>>as("appStore")
                            .withKeySerde(new JsonSerde<>(UUID.class))
                            .withValueSerde(new JsonSerde<>(Application.class)))
            .toStream();
}
@Bean
public BiConsumer<KStream<String, ProcessEvent>, KTable<String, Application>> listen()
{
    return (eventStream, appTable) ->
    {
        // Join each process event with the aggregated application and trigger the business logic
        eventStream.join(appTable, (event, app) -> app)
                   .foreach((k, app) -> app.createQuote());
    };
}
However, I am now facing a SerializationException. The first part (the aggregation) works fine, but the join fails with the exceptions below:
java.lang.ClassCastException: com.xxxxx.datamapper.domain.FormData cannot be cast to com.xxxxx.datamapper.domain.Application
at org.apache.kafka.streams.kstream.internals.KStreamPeek$KStreamPeekProcessor.process(KStreamPeek.java:42) ~[kafka-streams-2.3.1.jar:?]
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:117) ~[kafka-streams-2.3.1.jar:?]
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:201) ~[kafka-streams-2.3.1.jar:?]
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:180) ~[kafka-streams-2.3.1.jar:?]
org.apache.kafka.streams.errors.ProcessorStateException: task [0_0] Failed to flush state store APPLICATION_TOPIC-STATE-STORE-0000000001
at org.apache.kafka.streams.processor.internals.ProcessorStateManager.flush(ProcessorStateManager.java:280) ~[kafka-streams-2.3.1.jar:?]
at org.apache.kafka.streams.processor.internals.AbstractTask.flushState(AbstractTask.java:204) ~[kafka-streams-2.3.1.jar:?]
at org.apache.kafka.streams.processor.internals.StreamTask.flushState(StreamTask.java:519) ~[kafka-streams-2.3.1.jar:?]
I think the problem is in my application.yml. Since the spring.json.key.default.type property is set to FormUUID, that same type is also applied to the Application object used in the listen method. I want to configure the types for the remaining classes (UUID, Application, and ProcessEvent) in my application.yml, but I am not sure how to configure the mapping type for each consumer and producer defined.
spring.cloud:
  function.definition: process;listen
  stream:
    kafka.streams:
      bindings:
        process-in-0.consumer.application-id: form-aggregator
        listen-in-0.consumer.application-id: event-processor
        listen-in-1.consumer.application-id: event-processor
      binder.configuration:
        default.key.serde: org.springframework.kafka.support.serializer.JsonSerde
        default.value.serde: org.springframework.kafka.support.serializer.JsonSerde
        spring.json.key.default.type: com.xxxx.datamapper.domain.FormUUID
        spring.json.value.default.type: com.xxxx.datamapper.domain.FormData
        commit.interval.ms: 1000
    bindings:
      process-in-0.destination: FORM_DATA_TOPIC
      process-out-0.destination: APPLICATION_TOPIC
      listen-in-0.destination: PROCESS_TOPIC
      listen-in-1:
        destination: APPLICATION_TOPIC
        consumer:
          useNativeDecoding: true
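For context, here is roughly what the second processor looks like when wired in plain Kafka Streams with explicit per-topic Serdes. This is only an illustrative sketch (the binder normally builds this wiring for you); it shows that each input topic carries its own key/value types, which is why a single spring.json.*.default.type pair cannot cover all bindings. Key and value types are assumed from the listen() signature above; imports from org.apache.kafka.streams.* and spring-kafka's JsonSerde are assumed.

// Illustrative sketch only: the listen() join expressed in plain Kafka Streams,
// with an explicit Serde pair per input topic.
StreamsBuilder builder = new StreamsBuilder();
JsonSerde<ProcessEvent> eventSerde = new JsonSerde<>(ProcessEvent.class);
JsonSerde<Application> appSerde = new JsonSerde<>(Application.class);

KStream<String, ProcessEvent> events =
        builder.stream("PROCESS_TOPIC", Consumed.with(Serdes.String(), eventSerde));
KTable<String, Application> apps =
        builder.table("APPLICATION_TOPIC", Consumed.with(Serdes.String(), appSerde));

events.join(apps, (event, app) -> app)
      .foreach((k, app) -> app.createQuote());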
Answer 1:
If you are using the latest Horsham versions of the Spring Cloud Stream Kafka Streams binder, you do not need to set any explicit Serdes for inbound and outbound. However, you still need to provide them wherever the Kafka Streams API requires them, as in your aggregate method call above. Since you are facing this serialization error on the inbound of the second processor, I suggest trying to remove all Serdes from the configuration. You can simplify it as below (given that you are on the latest Horsham release). The binder will infer the correct Serdes to use on the inbound/outbound. One benefit of delegating this to the binder is that you don't need to provide any explicit key/value types through configuration, because the binder will introspect the types. Make sure the POJO types you are using are JSON friendly. See if that works. If you are still having issues, please create a small sample application where we can reproduce the issue and we will take a look.
spring.cloud:
  function.definition: process;listen
  stream:
    kafka.streams:
      bindings:
        process-in-0.consumer.application-id: form-aggregator
        listen-in-0.consumer.application-id: event-processor
        listen-in-1.consumer.application-id: event-processor
      binder.configuration:
        commit.interval.ms: 1000
    bindings:
      process-in-0.destination: FORM_DATA_TOPIC
      process-out-0.destination: APPLICATION_TOPIC
      listen-in-0.destination: PROCESS_TOPIC
      listen-in-1.destination: APPLICATION_TOPIC
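As an illustration of what "JSON friendly" means here: the types need a public no-argument constructor and accessible getters/setters so that Jackson (which backs JsonSerde) can serialize and deserialize them. A minimal sketch with assumed fields (referenceNo and status are hypothetical, not from the post):

// Sketch of a JSON-friendly POJO for JsonSerde; fields are assumed for illustration.
public class Application {

    private UUID referenceNo;   // hypothetical field
    private String status;      // hypothetical field

    public Application() { }    // Jackson requires a no-arg constructor

    public UUID getReferenceNo() { return referenceNo; }
    public void setReferenceNo(UUID referenceNo) { this.referenceNo = referenceNo; }

    public String getStatus() { return status; }
    public void setStatus(String status) { this.status = status; }
}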
Source: https://stackoverflow.com/questions/61897937/creating-a-kafka-aggregator-and-joining-it-with-an-event