Question
Is there a way to use the same topic as the source for two different processing routines, when using Kafka Streams DSL?
StreamsBuilder streamsBuilder = new StreamsBuilder();
// use the topic as a stream
streamsBuilder.stream("topic")...
// use the same topic as a source for KTable
streamsBuilder.table("topic")...
return streamsBuilder.build();
The naive implementation above throws a TopologyException at runtime: Invalid topology: Topic topic has already been registered by another source. That is entirely valid once you look at the underlying Processor API. Is using the Processor API the only way out?
UPDATE: The closest alternative I've found so far:
StreamsBuilder streamsBuilder = new StreamsBuilder();
final KStream<Object, Object> stream = streamsBuilder.stream("topic");
// use the topic as a stream
stream...
// create a KTable from the KStream
stream.groupByKey().reduce((oldValue, newValue) -> newValue)...
return streamsBuilder.build();
Answer 1:
Reading the same topic as both a stream and a table is semantically questionable, IMHO. A stream models immutable facts, while a changelog topic, which is what you would read into a KTable, models updates.
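To make the facts-versus-updates distinction concrete, here is a minimal plain-Java sketch. It uses no Kafka classes at all; the class name StreamVsTableSketch and the helpers asStream and asTable are hypothetical names invented for illustration, and tombstones (null-valued deletes) are ignored. It only shows how the same sequence of records yields different results under the two interpretations.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; no Kafka classes are involved.
class StreamVsTableSketch {

    // Stream view: every record is an independent fact, so all records are kept in order.
    static List<String> asStream(List<Map.Entry<String, String>> records) {
        List<String> facts = new ArrayList<>();
        for (Map.Entry<String, String> r : records) {
            facts.add(r.getKey() + "=" + r.getValue());
        }
        return facts;
    }

    // Table view: a record whose key already exists is an update that replaces the
    // previous value, i.e. the per-key result of (oldValue, newValue) -> newValue.
    static Map<String, String> asTable(List<Map.Entry<String, String>> records) {
        Map<String, String> table = new LinkedHashMap<>();
        for (Map.Entry<String, String> r : records) {
            table.merge(r.getKey(), r.getValue(), (oldValue, newValue) -> newValue);
        }
        return table;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> records = List.of(
                Map.entry("alice", "1"),
                Map.entry("bob", "7"),
                Map.entry("alice", "2"));
        System.out.println(asStream(records)); // prints [alice=1, bob=7, alice=2]
        System.out.println(asTable(records));  // prints {alice=2, bob=7}
    }
}
```

The same three records produce three facts in the stream view but only two rows in the table view, which is why one topic rarely makes sense under both readings.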
If you want to use a single topic in multiple streams, you can reuse the same KStream object multiple times (semantically, this is like a broadcast):
KStream stream = ...
stream.filter();
stream.map();
Also compare https://issues.apache.org/jira/browse/KAFKA-6687: there are plans to remove this restriction. I doubt, though, that we will allow using one topic as a KStream and a KTable at the same time (compare my comment above).
Answer 2:
Yes, you can, but you need multiple StreamsBuilder instances:
StreamsBuilder streamsBuilder1 = new StreamsBuilder();
streamsBuilder1.stream("topic");
StreamsBuilder streamsBuilder2 = new StreamsBuilder();
streamsBuilder2.table("topic");
Topology topology1 = streamsBuilder1.build();
Topology topology2 = streamsBuilder2.build();
KafkaStreams kafkaStreams1 = new KafkaStreams(topology1, streamsConfig1);
KafkaStreams kafkaStreams2 = new KafkaStreams(topology2, streamsConfig2);
Also make sure that each StreamsConfig uses a different application.id value.
Source: https://stackoverflow.com/questions/52426744/use-the-same-topic-as-a-source-more-than-once-with-kafka-streams-dsl