Question
I have a working setup for Spring Cloud Kafka Streams using the functional programming style.
There are two use cases, which are configured via application.properties.
Both of them work individually, but as soon as I activate both at the same time, I get a serialization error for the output stream of the second use case:
Exception in thread "ActivitiesAppId-05296224-5ea1-412a-aee4-1165870b5c75-StreamThread-1" org.apache.kafka.streams.errors.StreamsException:
Error encountered sending record to topic outputActivities for task 0_0 due to:
...
Caused by: org.apache.kafka.common.errors.SerializationException:
Can't serialize data [com.example.connector.model.Activity@497b37ff] for topic [outputActivities]
Caused by: com.fasterxml.jackson.databind.exc.InvalidDefinitionException:
Incompatible types: declared root type ([simple type, class com.example.connector.model.Material]) vs com.example.connector.model.Activity
The last line here is important, as the "declared root type" comes from the Material class, not the Activity class, which is probably the source of the error.
Again, when I only activate the second use case before starting the application, everything works fine. So I assume that the "Materials" processor somehow interferes with the "Activities" processor (or its serializer), but I don't know when and where.
Setup
1.) use case: "Materials"
- one input stream -> transformation -> one output stream
@Bean
public Function<KStream<String, MaterialRaw>, KStream<String, Material>> processMaterials() {...}
application.properties
spring.cloud.stream.kafka.streams.binder.functions.processMaterials.applicationId=MaterialsAppId
spring.cloud.stream.bindings.processMaterials-in-0.destination=inputMaterialsRaw
spring.cloud.stream.bindings.processMaterials-out-0.destination=outputMaterials
2.) use case: "Activities"
- two input streams -> joining -> one output stream
@Bean
public BiFunction<KStream<String, ActivityRaw>, KStream<String, Assignee>, KStream<String, Activity>> processActivities() {...}
application.properties
spring.cloud.stream.kafka.streams.binder.functions.processActivities.applicationId=ActivitiesAppId
spring.cloud.stream.bindings.processActivities-in-0.destination=inputActivitiesRaw
spring.cloud.stream.bindings.processActivities-in-1.destination=inputAssignees
spring.cloud.stream.bindings.processActivities-out-0.destination=outputActivities
The two processors are also registered as stream functions in application.properties:
spring.cloud.stream.function.definition=processActivities;processMaterials
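For reference, activating both use cases at once means the following combined configuration is active (assembled from the snippets above):
spring.cloud.stream.function.definition=processActivities;processMaterials
spring.cloud.stream.kafka.streams.binder.functions.processMaterials.applicationId=MaterialsAppId
spring.cloud.stream.kafka.streams.binder.functions.processActivities.applicationId=ActivitiesAppId
spring.cloud.stream.bindings.processMaterials-in-0.destination=inputMaterialsRaw
spring.cloud.stream.bindings.processMaterials-out-0.destination=outputMaterials
spring.cloud.stream.bindings.processActivities-in-0.destination=inputActivitiesRaw
spring.cloud.stream.bindings.processActivities-in-1.destination=inputAssignees
spring.cloud.stream.bindings.processActivities-out-0.destination=outputActivities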
Thanks!
Update - Here's how I use the processors in the code:
Implementation
// Material model
@Getter
@Setter
@AllArgsConstructor
@NoArgsConstructor
public class MaterialRaw {
private String id;
private String name;
}
@Getter
@Setter
@AllArgsConstructor
@NoArgsConstructor
public class Material {
private String id;
private String name;
}
// Material processor
@Bean
public Function<KStream<String, MaterialRaw>, KStream<String, Material>> processMaterials() {
    return materialsRawStream -> materialsRawStream.map((recordKey, materialRaw) -> {
        // some transformation
        final var newId = materialRaw.getId() + "---foo";
        final var newName = materialRaw.getName() + "---bar";
        final var material = new Material(newId, newName);
        // output
        return new KeyValue<>(recordKey, material);
    });
}
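Not part of the original post, but a quick way to sanity-check this mapping in isolation is a plain Kafka Streams test. A minimal sketch, assuming kafka-streams-test-utils is on the classpath and reusing the topic names from the configuration above (imports omitted, as elsewhere in this post):
// Sketch only: wires processMaterials() into a bare topology and drives it with
// TopologyTestDriver, bypassing the Spring Cloud Stream binder entirely.
final var builder = new StreamsBuilder();
final KStream<String, MaterialRaw> input = builder.stream("inputMaterialsRaw",
        Consumed.with(Serdes.String(), new JsonSerde<>(MaterialRaw.class)));
processMaterials().apply(input)
        .to("outputMaterials", Produced.with(Serdes.String(), new JsonSerde<>(Material.class)));

final var props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "MaterialsTest");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "dummy:1234");

try (final var driver = new TopologyTestDriver(builder.build(), props)) {
    final var inputTopic = driver.createInputTopic("inputMaterialsRaw",
            new StringSerializer(), new JsonSerde<>(MaterialRaw.class).serializer());
    final var outputTopic = driver.createOutputTopic("outputMaterials",
            new StringDeserializer(), new JsonSerde<>(Material.class).deserializer());

    inputTopic.pipeInput("key1", new MaterialRaw("m1", "steel"));
    final var record = outputTopic.readKeyValue();
    // record.value now holds Material("m1---foo", "steel---bar")
}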
// Activity model
@Getter
@Setter
@AllArgsConstructor
@NoArgsConstructor
public class ActivityRaw {
private String id;
private String name;
}
@Getter
@Setter
@AllArgsConstructor
@NoArgsConstructor
public class Assignee {
private String id;
private String assignedAt;
}
/**
* Combination of `ActivityRaw` and `Assignee`
*/
@Getter
@Setter
@AllArgsConstructor
@NoArgsConstructor
public class Activity {
private String id;
private Integer number;
private String assignedAt;
}
// Activity processor
@Bean
public BiFunction<KStream<String, ActivityRaw>, KStream<String, Assignee>, KStream<String, Activity>> processActivities() {
    return (activitiesRawStream, assigneesStream) -> {
        final var joinWindow = JoinWindows.of(Duration.ofDays(30));
        final var streamJoined = StreamJoined.with(
                Serdes.String(),
                new JsonSerde<>(ActivityRaw.class),
                new JsonSerde<>(Assignee.class)
        );
        final var joinedStream = activitiesRawStream.leftJoin(
                assigneesStream,
                new ActivityJoiner(),
                joinWindow,
                streamJoined
        );
        final var mappedStream = joinedStream.map((recordKey, activity) -> {
            return new KeyValue<>(recordKey, activity);
        });
        return mappedStream;
    };
}
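The ActivityJoiner referenced in the leftJoin above is not shown in the question. For a left join it has to implement ValueJoiner<ActivityRaw, Assignee, Activity>, with the right-hand Assignee being null when no match arrives inside the join window; a hypothetical sketch:
// Hypothetical reconstruction; the original class is not part of the question.
public class ActivityJoiner implements ValueJoiner<ActivityRaw, Assignee, Activity> {

    @Override
    public Activity apply(ActivityRaw activityRaw, Assignee assignee) {
        return new Activity(
                activityRaw.getId(),
                null, // `number` is not derivable from the models shown above
                assignee != null ? assignee.getAssignedAt() : null // null on an unmatched left join
        );
    }
}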
Answer 1:
This turns out to be an issue with the way the binder infers Serde types when there are multiple functions with different outbound target types, one with Activity and another with Material in your case. We will have to address this in the binder. I created an issue here.
In the meantime, you can follow this workaround.
Create a custom Serde class as below (a concrete subclass captures the Activity type parameter, so it survives type erasure):
public class ActivitySerde extends JsonSerde<Activity> {}
Then, explicitly use this Serde for the outbound of your processActivities function via configuration, for example:
spring.cloud.stream.kafka.streams.bindings.processActivities-out-0.producer.valueSerde=com.example.so65003575.ActivitySerde
Please change the package to the appropriate one if you are trying this workaround.
Here is another, recommended approach. If you define a bean of type Serde with the target type, it takes precedence, as the binder does a match against the KStream type. Therefore, you can also solve this without defining the extra class from the workaround above.
@Bean
public Serde<Activity> activitySerde() {
    return new JsonSerde<>(Activity.class);
}
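Presumably the same pattern covers the other outbound as well; a sketch (not part of the original answer) in case the Material output ever hits the same inference problem:
@Bean
public Serde<Material> materialSerde() {
    return new JsonSerde<>(Material.class);
}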
Here are the docs that explain all these details.
Answer 2:
You need to specify which binder to use for each function: s.c.s.bindings.xxx.binder=....
However, without that, I would have expected an error such as "multiple binders found but no default specified", which is what happens with message channel binders.
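Spelled out against the bindings from the question, that shorthand would look like this (kstream is the Kafka Streams binder's type name; the exact value to use is an assumption that depends on how your binders are declared):
spring.cloud.stream.bindings.processMaterials-in-0.binder=kstream
spring.cloud.stream.bindings.processMaterials-out-0.binder=kstream
spring.cloud.stream.bindings.processActivities-in-0.binder=kstream
spring.cloud.stream.bindings.processActivities-in-1.binder=kstream
spring.cloud.stream.bindings.processActivities-out-0.binder=kstream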
Source: https://stackoverflow.com/questions/65003575/spring-cloud-kafka-cant-serialize-data-for-output-stream-when-two-processors-a