apache-kafka-streams

KTable unable to fetch data from Materialized view

Submitted by 試著忘記壹切 on 2020-01-14 06:07:11
Question: I am using Kafka Streams with Spring Boot. In my use case, when I receive a customer event I need to store it in the customer-store materialized view, and when I receive an order event I need to join the customer and the order, then store the result in the customer-order materialized view.

StoreBuilder customerStateStore = Stores.keyValueStoreBuilder(Stores.persistentKeyValueStore("customer-store"), Serdes.String(), customerSerde)
    .withLoggingEnabled(new HashMap<>());
streamsBuilder.stream("customer", Consumed.with…
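A minimal sketch of the topology being described, assuming Kafka 2.5+ (for toTable); Customer, Order, CustomerOrder and their serdes are hypothetical application types, not taken from the question:

```java
import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;

public class CustomerOrderTopology {
    // Customer, Order, CustomerOrder are assumed domain types.
    public static void build(StreamsBuilder builder,
                             Serde<Customer> customerSerde,
                             Serde<Order> orderSerde,
                             Serde<CustomerOrder> customerOrderSerde) {
        // Customers become a queryable table materialized as "customer-store".
        KTable<String, Customer> customers = builder.table("customer",
            Consumed.with(Serdes.String(), customerSerde),
            Materialized.<String, Customer, KeyValueStore<Bytes, byte[]>>as("customer-store")
                .withKeySerde(Serdes.String())
                .withValueSerde(customerSerde));

        // Each order is joined with its customer and the result is
        // materialized as the "customer-order" view (toTable is Kafka 2.5+).
        builder.stream("order", Consumed.with(Serdes.String(), orderSerde))
            .join(customers, (order, customer) -> new CustomerOrder(customer, order))
            .toTable(Materialized.<String, CustomerOrder, KeyValueStore<Bytes, byte[]>>as("customer-order")
                .withKeySerde(Serdes.String())
                .withValueSerde(customerOrderSerde));
    }
}
```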

External processing using Kafka Streams

Submitted by 守給你的承諾、 on 2020-01-13 05:58:48
Question: There are several questions regarding message enrichment using external data, and the recommendation is almost always the same: ingest the external data using Kafka Connect and then join the records using state stores. Although this fits most cases, there are several other use cases in which it does not, such as IP-to-location and user-agent detection, to name a few. Enriching a message with an IP-based location usually requires a lookup by a range of IPs, but currently there is no built-in…
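A minimal sketch of the inline alternative the question hints at; lookupLocation is a hypothetical stand-in for any range-based IP resolver (e.g. a local GeoIP database), and the topic names are illustrative:

```java
import java.util.function.Function;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Produced;

public class IpEnrichmentTopology {
    public static void build(StreamsBuilder builder,
                             Function<String, String> lookupLocation) {
        builder.stream("events", Consumed.with(Serdes.String(), Serdes.String()))
            // The lookup runs inline, outside the state-store machinery, so
            // it is not restored/replayed the way a KTable join would be.
            .mapValues(ip -> ip + "|" + lookupLocation.apply(ip))
            .to("events-enriched", Produced.with(Serdes.String(), Serdes.String()));
    }
}
```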

What are the internal topics used in Kafka?

Submitted by 江枫思渺然 on 2020-01-06 08:02:08
Question: We are using the Kafka Streams API for aggregation, in which we also use group by. We also use a state store where it saves the input topics' data. What I notice is that Kafka internally creates three kinds of topics:

Changelog-<storeid>-<partition>
Repartition-<storeid>-<partition>
<topicname>-<partition>

What I am not able to understand is: why does it create a changelog topic when I have all the data in <topic>-<partition>? Does the repartition topic contain data after grouping? And I see that the size of…
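For orientation, a minimal sketch of a topology that produces both kinds of internal topics; topic and store names are illustrative:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.state.KeyValueStore;

public class InternalTopicsExample {
    public static void build(StreamsBuilder builder) {
        builder.stream("input", Consumed.with(Serdes.String(), Serdes.String()))
            // Changing the key forces a repartition topic so that equal keys
            // end up on the same task before the aggregation runs.
            .selectKey((key, value) -> value)
            .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
            // The store behind count() is backed by a changelog topic: it holds
            // the aggregated state rather than the raw input, so the store can
            // be restored on failover without recomputing the whole input topic.
            .count(Materialized.<String, Long, KeyValueStore<Bytes, byte[]>>as("counts-store"))
            .toStream()
            .to("counts", Produced.with(Serdes.String(), Serdes.Long()));
    }
}
```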

Kafka Streams: Any guarantees on ordering of saves to state stores when using at_least_once?

Submitted by 一个人想着一个人 on 2020-01-06 06:18:30
Question: We have a Kafka Streams Java topology built with the Processor API. In the topology, we have a single processor that saves to multiple state stores. As we use at_least_once, we would expect to see some inconsistencies between the state stores - e.g. an incoming record results in writes to both state store A and state store B, but a crash between the saves results in only the save to store A getting written to the Kafka changelog topic. Are we guaranteed that the order in which we save will also be the…
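For context, a minimal Processor API sketch of the shape of topology being asked about; the store names are hypothetical:

```java
import org.apache.kafka.streams.processor.AbstractProcessor;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.KeyValueStore;

public class DualStoreProcessor extends AbstractProcessor<String, String> {
    private KeyValueStore<String, String> storeA;
    private KeyValueStore<String, String> storeB;

    @Override
    @SuppressWarnings("unchecked")
    public void init(ProcessorContext context) {
        super.init(context);
        storeA = (KeyValueStore<String, String>) context.getStateStore("store-a");
        storeB = (KeyValueStore<String, String>) context.getStateStore("store-b");
    }

    @Override
    public void process(String key, String value) {
        // Under at_least_once, a crash between these two puts can leave the
        // changelog for store A ahead of the changelog for store B on restart.
        storeA.put(key, value);
        storeB.put(key, value);
    }
}
```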

How does Kafka Streams send a final aggregation with KTable#suppress?

Submitted by 牧云@^-^@ on 2020-01-06 05:27:07
Question: What I'd like to do is this:

Consume records from a topic
Count the values for each 1-second window
Detect windows whose record count is < 4
Send the FINAL result to another topic

I use suppress to send the final result, but I got an error like this:

09:18:07,963 ERROR org.apache.kafka.streams.processor.internals.ProcessorStateManager - task [1_0] Failed to flush state store KSTREAM-AGGREGATE-STATE-STORE-0000000002: java.lang.ClassCastException: org.apache.kafka.streams.kstream.Windowed cannot be cast to…
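A hedged sketch of how such a pipeline is commonly written, with explicit serdes on the window store (missing or mismatched serdes are one frequent source of this ClassCastException) and untilWindowCloses for final results; topic and store names are illustrative:

```java
import java.time.Duration;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.Suppressed;
import org.apache.kafka.streams.kstream.TimeWindows;
import org.apache.kafka.streams.state.WindowStore;

public class FinalWindowCount {
    public static void build(StreamsBuilder builder) {
        builder.stream("input", Consumed.with(Serdes.String(), Serdes.String()))
            .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
            .windowedBy(TimeWindows.of(Duration.ofSeconds(1)).grace(Duration.ZERO))
            // Explicit serdes on the windowed store.
            .count(Materialized.<String, Long, WindowStore<Bytes, byte[]>>as("window-counts")
                .withKeySerde(Serdes.String())
                .withValueSerde(Serdes.Long()))
            // Emit each window exactly once, after it closes.
            .suppress(Suppressed.untilWindowCloses(Suppressed.BufferConfig.unbounded()))
            .toStream()
            .filter((windowedKey, count) -> count < 4)
            // Unwrap the windowed key before producing to the output topic.
            .map((windowedKey, count) -> KeyValue.pair(windowedKey.key(), count))
            .to("final-output", Produced.with(Serdes.String(), Serdes.Long()));
    }
}
```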

Kafka Streams (Suppress): Closing a TimeWindow by timeout

Submitted by ≡放荡痞女 on 2020-01-06 04:40:07
Question: I have the following piece of code to aggregate data hourly based on event time:

KStream<Windowed<String>, SomeUserDefinedClass> windowedResults = inputStream
    .groupByKey(Grouped.with(Serdes.String(), new SomeUserDefinedSerde<>()))
    .windowedBy(TimeWindows.of(Duration.ofMinutes(60)).grace(Duration.ofMinutes(15)))
    .aggregate(
        // do some aggregation
    )
    .suppress(Suppressed.untilTimeLimit(Duration.ofMinutes(75), Suppressed.BufferConfig.unbounded()))
    .toStream();

The issue is that I am unable to…
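Worth noting: suppress() advances on stream time, not wall-clock time, so windows on an idle input never close on their own. A rough sketch of a common workaround, a wall-clock punctuator that periodically emits buffered results itself (store wiring omitted; names are illustrative):

```java
import java.time.Duration;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.processor.PunctuationType;

public class WallClockEmitter implements Transformer<String, Long, KeyValue<String, Long>> {
    private ProcessorContext context;

    @Override
    public void init(ProcessorContext context) {
        this.context = context;
        // Fires on wall-clock time even when no new records arrive.
        context.schedule(Duration.ofMinutes(1), PunctuationType.WALL_CLOCK_TIME,
            timestamp -> {
                // Scan a window store here and context.forward(...) the
                // results whose windows have passed their close time.
            });
    }

    @Override
    public KeyValue<String, Long> transform(String key, Long value) {
        return KeyValue.pair(key, value); // pass records through unchanged
    }

    @Override
    public void close() {}
}
```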

Update KTable based on partial data attributes

Submitted by 依然范特西╮ on 2020-01-05 17:57:51
Question: I am trying to update a KTable with partial data of an object. E.g. the User object is {"id":1, "name":"Joe", "age":28}. The object is streamed into a topic and grouped by key into a KTable. Now the user object is updated partially as follows: {"id":1, "age":33} and streamed into the table. But the updated table looks as follows: {"id":1, "name":null, "age":28}. The expected output is {"id":1, "name":"Joe", "age":33}. How can I use Kafka Streams and Spring Cloud Stream to achieve the expected…
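A minimal sketch of one common fix: fold each (possibly partial) event into the previous aggregate instead of letting the new record replace it. The User POJO here is an assumed stand-in for the application's domain type:

```java
import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;

public class PartialUpdateTopology {
    // Minimal assumed POJO; in the real app this is the domain model.
    public static class User {
        public String name;
        public Integer age;
    }

    public static void build(StreamsBuilder builder, Serde<User> userSerde) {
        builder.stream("user-events", Consumed.with(Serdes.String(), userSerde))
            .groupByKey()
            .aggregate(
                User::new,                    // initializer: empty user
                (id, partial, current) -> {
                    // Copy only the fields present in the partial event.
                    if (partial.name != null) current.name = partial.name;
                    if (partial.age != null) current.age = partial.age;
                    return current;
                },
                Materialized.<String, User, KeyValueStore<Bytes, byte[]>>as("user-store")
                    .withKeySerde(Serdes.String())
                    .withValueSerde(userSerde)
            );
    }
}
```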

InvalidStateStoreException: the state store is not open in Kafka Streams

Submitted by 谁说胖子不能爱 on 2020-01-05 05:31:31
Question:

StreamsBuilder builder = new StreamsBuilder();
Map<String, ?> serdeConfig = Collections.singletonMap(SCHEMA_REGISTRY_URL_CONFIG, schemaRegistryUrl);
Serde keySerde = getSerde(keyClass);
keySerde.configure(serdeConfig, true);
Serde valueSerde = getSerde(valueClass);
valueSerde.configure(serdeConfig, false);
StoreBuilder<KeyValueStore<K, V>> store = Stores.keyValueStoreBuilder(
    Stores.persistentKeyValueStore("mystore"), keySerde, valueSerde).withCachingEnabled();
builder.addGlobalStore(store, …
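A minimal sketch of the usual remedy (using the Kafka 2.5+ StoreQueryParameters API): the store is only open once the instance has finished starting and restoring, so retry the lookup until it becomes queryable:

```java
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StoreQueryParameters;
import org.apache.kafka.streams.errors.InvalidStateStoreException;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

public class StoreLookup {
    // Blocks until the store is open; a crude retry, adequate for a sketch.
    public static <K, V> ReadOnlyKeyValueStore<K, V> waitForStore(
            KafkaStreams streams, String storeName) throws InterruptedException {
        while (true) {
            try {
                return streams.store(StoreQueryParameters.fromNameAndType(
                        storeName, QueryableStoreTypes.keyValueStore()));
            } catch (InvalidStateStoreException e) {
                // Thrown while the instance is starting, rebalancing, or
                // restoring; the store is simply not open yet.
                Thread.sleep(100);
            }
        }
    }
}
```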

Kafka Streams rebalancing latency spikes on high throughput kafka-streams services

Submitted by 旧街凉风 on 2020-01-04 06:50:48
Question: We are starting to work with Kafka Streams; our service is a very simple stateless consumer. We have tight requirements on latency, and we are facing too-high latency problems when the consumer group is rebalancing. In our scenario, rebalancing will happen relatively often: rolling updates of the code, scaling the service up/down, containers being shuffled by the cluster scheduler, containers dying, hardware failing. One of the first tests we have done is having a small consumer group with 4…
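A hedged sketch of configuration commonly used to soften rebalances, chiefly static membership (brokers and clients 2.3+); the values here are illustrative, not tuned recommendations:

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.streams.StreamsConfig;

public class RebalanceTuning {
    public static Properties props(String instanceId) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "latency-sensitive-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        // Static membership: a bounced container that rejoins with the same
        // instance id does not trigger a full rebalance.
        props.put(StreamsConfig.consumerPrefix(ConsumerConfig.GROUP_INSTANCE_ID_CONFIG),
                  instanceId); // must be stable and unique per container
        // Give a restarting instance time to return before it is evicted.
        props.put(StreamsConfig.consumerPrefix(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG),
                  "30000");
        return props;
    }
}
```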