Use the same topic as a source more than once with Kafka Streams DSL

Submitted by 浪子不回头ぞ on 2020-12-31 04:36:56

Question


Is there a way to use the same topic as the source for two different processing routines, when using Kafka Streams DSL?

StreamsBuilder streamsBuilder = new StreamsBuilder();

// use the topic as a stream
streamsBuilder.stream("topic")...

// use the same topic as a source for KTable
streamsBuilder.table("topic")...

return streamsBuilder.build();

The naive implementation above throws a TopologyException at runtime: Invalid topology: Topic topic has already been registered by another source. That is a reasonable error once you look at the underlying Processor API, where a topic can be registered by only one source. Is dropping down to the Processor API the only way out?

UPDATE: The closest alternative I've found so far:

StreamsBuilder streamsBuilder = new StreamsBuilder();

final KStream<Object, Object> stream = streamsBuilder.stream("topic");

// use the topic as a stream
stream...

// create a KTable from the KStream
stream.groupByKey().reduce((oldValue, newValue) -> newValue)...

return streamsBuilder.build();
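
To see why the reduce call above behaves like a table, here is a minimal plain-Java sketch of the same reducer applied to a few key/value records. It deliberately does not use Kafka Streams: the class name LatestValueSketch, the helper toTable, and the sample records are all hypothetical, and Map.merge only stands in for the per-key aggregation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

// Hypothetical illustration only; not part of the Kafka Streams API.
class LatestValueSketch {
    // Same shape as the reducer in the question: ignore the old value, keep the new one.
    static final BinaryOperator<String> KEEP_LATEST = (oldValue, newValue) -> newValue;

    /** Folds {key, value} records into one row per key, processed in arrival order. */
    static Map<String, String> toTable(List<String[]> records) {
        Map<String, String> table = new LinkedHashMap<>();
        for (String[] record : records) {
            // Map.merge stores the value for a new key, otherwise applies the reducer.
            table.merge(record[0], record[1], KEEP_LATEST);
        }
        return table;
    }

    public static void main(String[] args) {
        List<String[]> records = List.of(
            new String[] {"alice", "v1"},
            new String[] {"bob", "v1"},
            new String[] {"alice", "v2"});
        System.out.println(toTable(records)); // prints {alice=v2, bob=v1}
    }
}
```

One difference to keep in mind: reduce on a grouped stream skips records with a null value, whereas a KTable read directly from a topic treats a null value as a delete for that key, so the two approaches are not fully equivalent.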

Answer 1:


Reading the same topic as both a stream and a table is semantically questionable, IMHO. Streams model immutable facts, while the changelog topics that you would read into a KTable model updates.

If you want to use a single topic in multiple streams, you can reuse the same KStream object multiple times (it's semantically like a broadcast):

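KStream stream = ...
stream.filter(...); // first branch: sees every record of the stream
stream.map(...);    // second branch: sees the same records independently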
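
The broadcast behaviour can be pictured with a small plain-Java model. This is only a sketch of the semantics, not Kafka Streams code: the Source class, the attach and emit methods, and the sample records are hypothetical stand-ins for a stream with two downstream operators.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical model of "one stream, several downstream branches".
class BroadcastSketch {
    /** Stand-in for a stream: each emitted record is handed to every attached branch. */
    static final class Source<T> {
        private final List<Consumer<T>> branches = new ArrayList<>();

        void attach(Consumer<T> branch) {
            branches.add(branch);
        }

        void emit(T record) {
            for (Consumer<T> branch : branches) {
                branch.accept(record);
            }
        }
    }

    /** Runs a filter-like branch and a map-like branch over the same records. */
    static List<List<String>> run(List<String> records) {
        Source<String> source = new Source<>();
        List<String> filtered = new ArrayList<>();
        List<String> mapped = new ArrayList<>();
        // Branch 1 keeps only records starting with "a" (plays the role of filter).
        source.attach(r -> { if (r.startsWith("a")) filtered.add(r); });
        // Branch 2 upper-cases every record (plays the role of map).
        source.attach(r -> mapped.add(r.toUpperCase()));
        for (String r : records) {
            source.emit(r);
        }
        return List.of(filtered, mapped);
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("a1", "b2", "a3"))); // prints [[a1, a3], [A1, B2, A3]]
    }
}
```

Neither branch consumes records away from the other: the record that the first branch drops still reaches the second one.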

Also compare: https://issues.apache.org/jira/browse/KAFKA-6687 (there are plans to remove this restriction. I doubt that we will allow using one topic as a KStream and a KTable at the same time, though; compare my comment above).




Answer 2:


Yes, you can, but you need multiple StreamsBuilder instances:

StreamsBuilder streamsBuilder1 = new StreamsBuilder();
streamsBuilder1.stream("topic");

StreamsBuilder streamsBuilder2 = new StreamsBuilder();
streamsBuilder2.table("topic");

Topology topology1 = streamsBuilder1.build();
Topology topology2 = streamsBuilder2.build();

KafkaStreams kafkaStreams1 = new KafkaStreams(topology1, streamsConfig1);
KafkaStreams kafkaStreams2 = new KafkaStreams(topology2, streamsConfig2);

Also make sure that each StreamsConfig has a different application.id value.
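
As a rough sketch, the two configurations could be built as plain java.util.Properties objects. The helper configFor, both application ids, and the broker address localhost:9092 are assumptions made for illustration; real applications will also need serdes and other settings.

```java
import java.util.Properties;

// Hypothetical helper class; only the property keys are real Kafka Streams settings.
class TwoAppConfigs {
    /** Builds a minimal config in which only application.id varies between applications. */
    static Properties configFor(String applicationId) {
        Properties props = new Properties();
        props.put("application.id", applicationId);       // must differ per application
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        return props;
    }

    public static void main(String[] args) {
        Properties streamsConfig1 = configFor("topic-as-stream-app"); // hypothetical id
        Properties streamsConfig2 = configFor("topic-as-table-app");  // hypothetical id
        System.out.println(streamsConfig1.getProperty("application.id"));
        System.out.println(streamsConfig2.getProperty("application.id"));
    }
}
```

Kafka Streams uses application.id as the consumer group id, so two applications with different ids each read the whole topic independently instead of splitting its partitions between them.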



Source: https://stackoverflow.com/questions/52426744/use-the-same-topic-as-a-source-more-than-once-with-kafka-streams-dsl
