Can Kafka Streams be configured to wait for KTable to load?

Posted by 百般思念 on 2020-05-11 03:20:07

Question


I'm using a materialized KTable in a left join with my KStream (the stream is the left side).

However, it seems to start processing immediately, without waiting for the current version of the KTable to load.

The source topic for the KTable holds a lot of values, and when I start the application, a lot of joins fail (well, not really fail, since it is a left join, but they find no matching table entry).

Can I delay the start so that it waits for the initial load of the table topic?


Answer 1:


Processing is time-synchronized in Kafka Streams. Hence, the table input topic and the stream input topic are processed in record timestamp order. This is semantically sound: in a stream-table join, you don't want to join a stream record with an older or a newer version of the KTable, but with the version that was current at the stream record's timestamp.

If your data is not properly timestamped, you can try to specify a custom timestamp extractor for the table via builder.table(..., Consumed.with(...)) that returns timestamps ensuring the proper behavior (e.g., maybe timestamps smaller than the timestamp of the first stream record?).

  • https://docs.confluent.io/current/streams/developer-guide/config-streams.html#streams-developer-guide-timestamp-extractor
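To see why the table timestamps matter, here is a minimal plain-Java simulation of a timestamp-ordered left join. It does not use the Kafka Streams API; `TimestampOrderedJoinSketch`, `Rec`, `leftJoin`, and `withTimestamp` are hypothetical names made up for this sketch, and `withTimestamp` only stands in for what a custom timestamp extractor on the table would do.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TimestampOrderedJoinSketch {

    // One input record; fromTable tells which input topic it belongs to.
    static final class Rec {
        final boolean fromTable;
        final String key;
        final String value;
        final long timestamp;

        Rec(boolean fromTable, String key, String value, long timestamp) {
            this.fromTable = fromTable;
            this.key = key;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Processes both inputs in timestamp order. Table records update the
    // table state; stream records are left-joined against the state as it
    // is at that moment (null when no table entry exists yet).
    static List<String> leftJoin(List<Rec> tableInput, List<Rec> streamInput) {
        List<Rec> all = new ArrayList<>(tableInput);
        all.addAll(streamInput);
        // Stable sort: on equal timestamps table records go first.
        // That tie-breaking is an assumption of this sketch only.
        all.sort(Comparator.comparingLong((Rec r) -> r.timestamp));

        Map<String, String> table = new HashMap<>();
        List<String> joined = new ArrayList<>();
        for (Rec r : all) {
            if (r.fromTable) {
                table.put(r.key, r.value);
            } else {
                joined.add(r.key + ":" + r.value + "+" + table.get(r.key));
            }
        }
        return joined;
    }

    // Stand-in for a custom timestamp extractor on the table input:
    // every table record gets the same fixed timestamp.
    static List<Rec> withTimestamp(List<Rec> records, long timestamp) {
        List<Rec> result = new ArrayList<>();
        for (Rec r : records) {
            result.add(new Rec(r.fromTable, r.key, r.value, timestamp));
        }
        return result;
    }

    public static void main(String[] args) {
        // The table record carries a later timestamp than the stream record.
        List<Rec> table = List.of(new Rec(true, "k1", "tableValue", 200L));
        List<Rec> stream = List.of(new Rec(false, "k1", "streamValue", 100L));

        // Stream record is processed first, so the left join finds nothing.
        System.out.println(leftJoin(table, stream)); // [k1:streamValue+null]

        // With table timestamps forced to 0, the table entry is loaded first.
        System.out.println(leftJoin(withTimestamp(table, 0L), stream)); // [k1:streamValue+tableValue]
    }
}
```

In a real application the equivalent logic would presumably live in an implementation of Kafka Streams' `TimestampExtractor` interface that is passed to the table through `Consumed`, as the answer suggests.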

Note that proper timestamp synchronization requires Kafka Streams 2.1. Older versions synchronize time in a best-effort manner only and may not provide the behavior you want. For more details, see KIP-353.

  • https://cwiki.apache.org/confluence/display/KAFKA/KIP-353%3A+Improve+Kafka+Streams+Timestamp+Synchronization
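KIP-353 also introduced the `max.task.idle.ms` setting, which controls how long a task waits for data on all of its input partitions before it processes what it already has. A sketch of the corresponding Streams configuration entry; the value of 5000 ms is an arbitrary assumption for illustration, not a recommendation:

```properties
# Wait up to 5 seconds for records on all input partitions
# (stream topic and table topic) before processing continues.
max.task.idle.ms=5000
```

The default is 0, meaning a task does not wait, so a larger value trades some latency for better ordering between the stream and the table input.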



Answer 2:


You could use a GlobalKTable. It is fully loaded from its input topic on startup, before any stream records are processed.



Source: https://stackoverflow.com/questions/56556270/can-kafka-streams-be-configured-to-wait-for-ktable-to-load
