apache-kafka-streams

Spring @StreamListener process(KStream<?,?> stream) Partition

拟墨画扇 submitted on 2019-12-24 20:14:23
Question: I have a topic with multiple partitions. In my stream processor I just wanted to stream from one partition, and could not figure out how to configure this:

    spring.cloud.stream.kafka.streams.bindings.input.consumer.application-id=s-processor
    spring.cloud.stream.bindings.input.destination=uinput
    spring.cloud.stream.bindings.input.group=r-processor
    spring.cloud.stream.bindings.input.contentType=application/java-serialized-object
    spring.cloud.stream.bindings.input.consumer.header-mode=raw
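A minimal sketch of one workaround, reusing the topic name "uinput" from the configuration above and assuming partition 0 is the one wanted: Kafka Streams (and the Spring binder on top of it) always subscribes to every partition of a topic, so reading a single partition means dropping down to a plain KafkaConsumer with assign().

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class SinglePartitionConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                // assign() pins the consumer to partition 0 of "uinput" and bypasses
                // consumer-group partition balancing entirely.
                consumer.assign(Collections.singletonList(new TopicPartition("uinput", 0)));
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        System.out.printf("partition=%d offset=%d value=%s%n",
                                record.partition(), record.offset(), record.value());
                    }
                }
            }
        }
    }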

Is KSQL making remote requests under the hood, or is a Table actually a global KTable?

时光总嘲笑我的痴心妄想 submitted on 2019-12-24 18:28:24
Question: I have a Kafka topic containing customer records, called "customer-created". Each customer is a new record in the topic. There are 4 partitions. I have two ksql-server instances running, based on the docker image confluentinc/cp-ksql-server:5.3.0. Both use the same KSQL Service Id. I've created a table:

    CREATE TABLE t_customer (id VARCHAR, firstname VARCHAR, lastname VARCHAR)
      WITH (KAFKA_TOPIC = 'customer-created', VALUE_FORMAT='JSON', KEY = 'id');

I'm new to KSQL, but my understanding was
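For context, a minimal Kafka Streams sketch contrasting the two abstractions the question is about (topic name and serdes are assumptions): a regular KTable is sharded by partition across instances, while a GlobalKTable is fully replicated to each instance.

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.GlobalKTable;
    import org.apache.kafka.streams.kstream.KTable;

    public class TableVsGlobalTable {
        public static void main(String[] args) {
            StreamsBuilder builder = new StreamsBuilder();

            // Regular KTable: each application instance materializes only its assigned
            // partitions, so a key lookup on the "wrong" instance requires interactive
            // queries / an RPC to the instance owning that partition.
            KTable<String, String> partitioned =
                    builder.table("customer-created", Consumed.with(Serdes.String(), Serdes.String()));

            // GlobalKTable: every instance consumes all partitions and holds the full
            // table locally, so lookups never leave the instance.
            GlobalKTable<String, String> replicated =
                    builder.globalTable("customer-created", Consumed.with(Serdes.String(), Serdes.String()));
        }
    }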

Kafka Streams for counting a total sum?

蹲街弑〆低调 submitted on 2019-12-24 18:16:09
Question: A topic named "addcash" has 3 partitions (the number of Kafka cluster machines is also 3), and a lot of user recharge messages flow into it. I want to count the total amount of money every day. I learned from some articles about Kafka Streams: Kafka Streams runs the topology as tasks, the number of tasks depends on the number of the topic's partitions, and every task has an individual state store. So when I count the total amount via the state store, are there three values, not a total
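A minimal sketch of one common approach, with illustrative serdes and output names: re-keying every record to a single constant key funnels the amounts from all partitions into one aggregate, at the cost of that aggregation running single-threaded (a per-day total would add a windowedBy step on the grouped stream).

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.Grouped;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KTable;
    import org.apache.kafka.streams.kstream.Materialized;

    public class TotalCash {
        public static void main(String[] args) {
            StreamsBuilder builder = new StreamsBuilder();
            KStream<String, Long> recharges =
                    builder.stream("addcash", Consumed.with(Serdes.String(), Serdes.Long()));

            KTable<String, Long> total = recharges
                    // groupBy with a constant key forces a repartition, so every record
                    // lands in one partition and a single task owns the running total.
                    .groupBy((key, amount) -> "total", Grouped.with(Serdes.String(), Serdes.Long()))
                    .reduce(Long::sum, Materialized.as("total-cash-store"));

            total.toStream().to("addcash-total");
        }
    }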

Kafka Streams - Explain the reason why KTable and its associated Store only get updated every 30 seconds

て烟熏妆下的殇ゞ submitted on 2019-12-24 10:13:53
Question: I have this simple KTable definition that generates a store:

    KTable<String, JsonNode> table = kStreamBuilder.<String, JsonNode>table(ORDERS_TOPIC, ORDERS_STORE);
    table.print();

I publish messages into the ORDERS_TOPIC, but the store is only updated every 30 seconds. This is the log, where there is a message about committing because the 30000ms interval has elapsed:

    2017-07-25 23:53:15.465 DEBUG 17540 --- [ StreamThread-1] o.a.k.c.consumer.internals.Fetcher : Sending fetch for partitions
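The behavior described is governed by two Streams settings: the commit interval and the record cache. A minimal sketch with illustrative values; setting the cache to 0 forwards every update immediately.

    import java.util.Properties;
    import org.apache.kafka.streams.StreamsConfig;

    public class LowLatencyConfig {
        public static Properties props() {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "orders-app");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            // Commit (and therefore flush) every second instead of the 30-second
            // default visible in the log above.
            props.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 1000);
            // Disable the record cache so each input record reaches the store and
            // table.print() immediately instead of being buffered until commit.
            props.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 0);
            return props;
        }
    }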

How does co-partitioning ensure that partitions from 2 different topics end up assigned to the same Kafka Streams task?

匆匆过客 submitted on 2019-12-24 10:03:23
Question: While I understand the prerequisite of co-partitioning as explained in "Why does co-partitioning of two Kstreams in kafka require same number of partitions for both the streams?", I do not understand the mechanism that makes sure that the partitions of each topic that have the same key get assigned to the same Kafka Streams task. I do not see how a Kafka consumer group would enable that. The way I understand it is that we have 2 independent consumer groups, which actually may have
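For reference, a minimal join sketch (topic names and serdes are assumptions) that triggers the co-partitioning requirement: Kafka Streams does not use the plain consumer-group assignor here but installs its own StreamsPartitionAssignor, which groups partition i of both topics into the same task.

    import java.time.Duration;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.JoinWindows;
    import org.apache.kafka.streams.kstream.Joined;
    import org.apache.kafka.streams.kstream.KStream;

    public class CoPartitionedJoin {
        public static void main(String[] args) {
            StreamsBuilder builder = new StreamsBuilder();
            KStream<String, String> left =
                    builder.stream("topic-a", Consumed.with(Serdes.String(), Serdes.String()));
            KStream<String, String> right =
                    builder.stream("topic-b", Consumed.with(Serdes.String(), Serdes.String()));

            // For this join, topic-a partition i and topic-b partition i are grouped
            // into the same task by Kafka Streams' own partition assignor, which is
            // why both topics must have the same partition count.
            left.join(right,
                      (l, r) -> l + "|" + r,
                      JoinWindows.of(Duration.ofMinutes(5)),
                      Joined.with(Serdes.String(), Serdes.String(), Serdes.String()))
                .to("joined-output");
        }
    }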

Kafka Stream: KTable materialization

纵然是瞬间 submitted on 2019-12-24 09:58:29
Question: How do I identify when the KTable materialization to a topic has completed? For example, assume the KTable has a few million rows. Pseudocode below:

    KTable<String, String> kt = kgroupedStream.groupByKey(..).reduce(..); // Assume this produces a few million rows

At some point in time, I want to schedule a thread to invoke the following, which writes to the topic:

    kt.toStream().to("output_topic_name");

I want to ensure all the data is written as part of the above invocation. Also, once the above "to" method is
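A minimal sketch of the usual structure, with illustrative names: toStream().to() is declared as part of the topology before start-up rather than scheduled later, since on an unbounded stream there is no "materialization finished" moment; a state listener at least signals when the instance has finished restoring local state and reaches RUNNING.

    import java.util.Properties;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KTable;

    public class MaterializeAndEmit {
        public static void main(String[] args) {
            StreamsBuilder builder = new StreamsBuilder();
            KTable<String, String> kt = builder.<String, String>stream("input")
                    .groupByKey()
                    .reduce((oldValue, newValue) -> newValue);

            // Declared up front: every KTable update is forwarded downstream as it
            // happens, so no separate scheduling step is needed.
            kt.toStream().to("output_topic_name");

            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "materialize-app");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            KafkaStreams streams = new KafkaStreams(builder.build(), props);
            // The state listener reveals when local state restoration is done and the
            // instance transitions to RUNNING.
            streams.setStateListener((newState, oldState) ->
                    System.out.println(oldState + " -> " + newState));
            streams.start();
        }
    }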

KafkaProducer sendOffsetsToTransaction needs offset+1 to successfully commit current offset

霸气de小男生 submitted on 2019-12-24 09:31:23
Question: I'm trying to use a transaction in a Kafka Processor to make sure I don't process the same message twice. Given a message (A), I need to create a list of messages that will be produced to another topic in a transaction, and I want to commit the original message (A) in the same transaction. From the documentation I found the producer method sendOffsetsToTransaction, which seems to commit an offset in the transaction only if the transaction succeeds. This is the code inside the process() method
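A minimal sketch of a transactional forward-and-commit step (topic name and group id are assumptions): the value committed must be the position of the next record to read, hence record.offset() + 1.

    import java.util.Collections;
    import java.util.Map;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.TopicPartition;

    public class TransactionalForwarder {
        // Assumes producer.initTransactions() was called once at start-up and that
        // the consumer's enable.auto.commit is false.
        static void forward(KafkaProducer<String, String> producer,
                            ConsumerRecord<String, String> record) {
            producer.beginTransaction();
            try {
                producer.send(new ProducerRecord<>("output-topic", record.key(), record.value()));
                Map<TopicPartition, OffsetAndMetadata> offsets = Collections.singletonMap(
                        new TopicPartition(record.topic(), record.partition()),
                        // The committed offset is the *next* record to read; committing
                        // record.offset() itself would replay this record on restart.
                        new OffsetAndMetadata(record.offset() + 1));
                producer.sendOffsetsToTransaction(offsets, "processor-group");
                producer.commitTransaction();
            } catch (RuntimeException e) {
                producer.abortTransaction();
                throw e;
            }
        }
    }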

Kafka Streams tests do not close correctly

谁说胖子不能爱 submitted on 2019-12-24 08:14:27
Question: I have 2 unit tests; when I run them I get the error below.

1) Test:

    @Test
    public void simpleInsertAndOutputEventPrint() throws IOException, URISyntaxException {
        GenericRecord record = getInitialEvent();
        testDriver.pipeInput(recordFactory.create(record));
        GenericRecord result = testDriver.readOutput(detailsEventTopic, stringDeserializer, genericAvroSerde.deserializer()).value();
        Assert.assertEquals(1, result.get("tt"));
    }

2) Test:

    @Test
    public void stateStoreSimpleInsertOutputPrint() {
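A minimal sketch of the usual fix when two TopologyTestDriver tests interfere with each other, assuming a JUnit 4 setup: create the driver fresh for each test and close it in tearDown, so the first test's state directory is released before the second test runs.

    import org.apache.kafka.streams.TopologyTestDriver;
    import org.junit.After;
    import org.junit.Before;

    public class StreamsTestLifecycle {
        private TopologyTestDriver testDriver;

        @Before
        public void setUp() {
            // Build the Topology and Properties as in the real tests, then:
            // testDriver = new TopologyTestDriver(topology, props);
        }

        @After
        public void tearDown() {
            if (testDriver != null) {
                testDriver.close(); // releases the state directory between tests
            }
        }
    }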

Kafka Streams Metadata request only contains internal topics

和自甴很熟 submitted on 2019-12-24 08:12:02
Question: I'm running a Kafka Streams app with version 2.1.0. I found that after running for some time, my app (63 nodes) enters the ERROR state node by node. Eventually, all 63 nodes are down. The exception is:

    ERROR o.a.k.s.p.i.ProcessorStateManager - task [2_2] Failed to flush state store KSTREAM-REDUCE-STATE-STORE-0000000014: org.apache.kafka.streams.errors.StreamsException: task [2_2] Abort sending since an error caught with a previous record (key 110646599468 value InterimMessage [sessionStart
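Without the full stack trace the root cause is unclear, but as a hedge against one failed send cascading through all instances, a custom ProductionExceptionHandler (a Kafka Streams extension point since 1.1) can log and continue instead of failing the task; whether CONTINUE is acceptable depends on the application.

    import java.util.Map;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.streams.errors.ProductionExceptionHandler;

    public class LogAndContinueProductionHandler implements ProductionExceptionHandler {
        @Override
        public ProductionExceptionHandlerResponse handle(ProducerRecord<byte[], byte[]> record,
                                                         Exception exception) {
            // Log the failed record and keep the task alive instead of propagating
            // the error and taking the instance into ERROR state.
            System.err.println("Failed to produce to " + record.topic() + ": " + exception);
            return ProductionExceptionHandlerResponse.CONTINUE;
        }

        @Override
        public void configure(Map<String, ?> configs) { }
    }

It would be registered via props.put(StreamsConfig.DEFAULT_PRODUCTION_EXCEPTION_HANDLER_CLASS_CONFIG, LogAndContinueProductionHandler.class).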

Kafka Stream: Consumer commit frequency

为君一笑 submitted on 2019-12-24 07:28:03
Question: With at-least-once guarantees, I understand that there is a possibility of duplicates in case of failures. However: 1) How frequently does the Kafka Streams library perform commits? 2) Do users ever need to consider committing in addition to the above? 3) Is there a best practice on how frequently commits should be performed? Answer 1: Kafka Streams commits at regular intervals that can be configured via the parameter commit.interval.ms (default is 30 seconds; if exactly-once processing is enabled,
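Complementing the answer, a minimal Processor API sketch with illustrative names: ProcessorContext.commit() is the only commit hook exposed to user code, and it merely requests a commit at the next opportunity rather than committing immediately, so applications normally rely on commit.interval.ms alone.

    import org.apache.kafka.streams.processor.AbstractProcessor;

    public class CommitRequestingProcessor extends AbstractProcessor<String, String> {
        private long seen = 0;

        @Override
        public void process(String key, String value) {
            context().forward(key, value);
            if (++seen % 10_000 == 0) {
                // A hint, not a command: Kafka Streams commits at the next
                // opportunity, in addition to its regular commit.interval.ms schedule.
                context().commit();
            }
        }
    }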