apache-kafka-streams

Spring @StreamListener process(KStream<?,?> stream) Partition

拟墨画扇 submitted on 2019-12-24 20:14:23
Question: I have a topic with multiple partitions. In my stream processor I just wanted to stream from one partition, and could not figure out how to configure this:

    spring.cloud.stream.kafka.streams.bindings.input.consumer.application-id=s-processor
    spring.cloud.stream.bindings.input.destination=uinput
    spring.cloud.stream.bindings.input.group=r-processor
    spring.cloud.stream.bindings.input.contentType=application/java-serialized-object
    spring.cloud.stream.bindings.input.consumer.header-mode=raw
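A minimal sketch of one workaround, reusing the topic name "uinput" from the configuration above and assuming partition 0 is the one wanted: Kafka Streams (and the Spring binder on top of it) always subscribes to every partition of a topic, so reading a single partition means dropping down to a plain KafkaConsumer with assign().

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class SinglePartitionConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                // assign() pins the consumer to partition 0 of "uinput" and bypasses
                // consumer-group partition balancing entirely.
                consumer.assign(Collections.singletonList(new TopicPartition("uinput", 0)));
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        System.out.printf("partition=%d offset=%d value=%s%n",
                                record.partition(), record.offset(), record.value());
                    }
                }
            }
        }
    }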

Is KSQL making remote requests under the hood, or is a Table actually a global KTable?

时光总嘲笑我的痴心妄想 submitted on 2019-12-24 18:28:24
Question: I have a Kafka topic containing customer records, called "customer-created". Each customer is a new record in the topic. There are 4 partitions. I have two ksql-server instances running, based on the docker image confluentinc/cp-ksql-server:5.3.0. Both use the same KSQL Service Id. I've created a table:

    CREATE TABLE t_customer (id VARCHAR, firstname VARCHAR, lastname VARCHAR)
      WITH (KAFKA_TOPIC = 'customer-created', VALUE_FORMAT='JSON', KEY = 'id');

I'm new to KSQL, but my understanding was
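For context, a minimal Kafka Streams sketch contrasting the two abstractions the question is about (topic name and serdes are assumptions): a regular KTable is sharded by partition across instances, while a GlobalKTable is fully replicated to each instance.

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.GlobalKTable;
    import org.apache.kafka.streams.kstream.KTable;

    public class TableVsGlobalTable {
        public static void main(String[] args) {
            StreamsBuilder builder = new StreamsBuilder();

            // Regular KTable: each application instance materializes only its assigned
            // partitions, so a key lookup on the "wrong" instance requires interactive
            // queries / an RPC to the instance owning that partition.
            KTable<String, String> partitioned =
                    builder.table("customer-created", Consumed.with(Serdes.String(), Serdes.String()));

            // GlobalKTable: every instance consumes all partitions and holds the full
            // table locally, so lookups never leave the instance.
            GlobalKTable<String, String> replicated =
                    builder.globalTable("customer-created", Consumed.with(Serdes.String(), Serdes.String()));
        }
    }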

Kafka Streams for counting a total sum?

蹲街弑〆低调 submitted on 2019-12-24 18:16:09
Question: A topic named "addcash" has 3 partitions (the number of Kafka cluster machines is also 3), and a lot of user recharge messages flow into it. I want to count the total amount of money every day. I learned from some articles about Kafka Streams: Kafka Streams runs the topology as tasks, the number of tasks depends on the number of the topic's partitions, and every task has an individual state store. So when I count the total amount via the state store, are there three values, not a total
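A minimal sketch of one common approach, with illustrative serdes and output names: re-keying every record to a single constant key funnels the amounts from all partitions into one aggregate, at the cost of that aggregation running single-threaded (a per-day total would add a windowedBy step on the grouped stream).

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.Grouped;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KTable;
    import org.apache.kafka.streams.kstream.Materialized;

    public class TotalCash {
        public static void main(String[] args) {
            StreamsBuilder builder = new StreamsBuilder();
            KStream<String, Long> recharges =
                    builder.stream("addcash", Consumed.with(Serdes.String(), Serdes.Long()));

            KTable<String, Long> total = recharges
                    // groupBy with a constant key forces a repartition, so every record
                    // lands in one partition and a single task owns the running total.
                    .groupBy((key, amount) -> "total", Grouped.with(Serdes.String(), Serdes.Long()))
                    .reduce(Long::sum, Materialized.as("total-cash-store"));

            total.toStream().to("addcash-total");
        }
    }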

Kafka Streams - Explain the reason why KTable and its associated Store only get updated every 30 seconds

て烟熏妆下的殇ゞ submitted on 2019-12-24 10:13:53
Question: I have this simple KTable definition that generates a store:

    KTable<String, JsonNode> table = kStreamBuilder.<String, JsonNode>table(ORDERS_TOPIC, ORDERS_STORE);
    table.print();

I publish messages into the ORDERS_TOPIC, but the store is only updated every 30 seconds. This is the log, where there is a message about committing because the 30000ms interval has elapsed:

    2017-07-25 23:53:15.465 DEBUG 17540 --- [ StreamThread-1] o.a.k.c.consumer.internals.Fetcher : Sending fetch for partitions
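The behavior described is governed by two Streams settings: the commit interval and the record cache. A minimal sketch with illustrative values; setting the cache to 0 forwards every update immediately.

    import java.util.Properties;
    import org.apache.kafka.streams.StreamsConfig;

    public class LowLatencyConfig {
        public static Properties props() {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "orders-app");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            // Commit (and therefore flush) every second instead of the 30-second
            // default visible in the log above.
            props.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 1000);
            // Disable the record cache so each input record reaches the store and
            // table.print() immediately instead of being buffered until commit.
            props.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 0);
            return props;
        }
    }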

How does co-partitioning ensure that partitions from 2 different topics end up assigned to the same Kafka Streams task?

匆匆过客 submitted on 2019-12-24 10:03:23
Question: While I understand the prerequisite of co-partitioning as explained in "Why does co-partitioning of two Kstreams in kafka require same number of partitions for both the streams?", I do not understand the mechanism that makes sure that the partitions of each topic that have the same key get assigned to the same Kafka Streams task. I do not see how a Kafka consumer group would enable that. The way I understand it is that we have 2 independent consumer groups, which actually may have
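For reference, a minimal join sketch (topic names and serdes are assumptions) that triggers the co-partitioning requirement: Kafka Streams does not use the plain consumer-group assignor here but installs its own StreamsPartitionAssignor, which groups partition i of both topics into the same task.

    import java.time.Duration;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.JoinWindows;
    import org.apache.kafka.streams.kstream.Joined;
    import org.apache.kafka.streams.kstream.KStream;

    public class CoPartitionedJoin {
        public static void main(String[] args) {
            StreamsBuilder builder = new StreamsBuilder();
            KStream<String, String> left =
                    builder.stream("topic-a", Consumed.with(Serdes.String(), Serdes.String()));
            KStream<String, String> right =
                    builder.stream("topic-b", Consumed.with(Serdes.String(), Serdes.String()));

            // For this join, topic-a partition i and topic-b partition i are grouped
            // into the same task by Kafka Streams' own partition assignor, which is
            // why both topics must have the same partition count.
            left.join(right,
                      (l, r) -> l + "|" + r,
                      JoinWindows.of(Duration.ofMinutes(5)),
                      Joined.with(Serdes.String(), Serdes.String(), Serdes.String()))
                .to("joined-output");
        }
    }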

Kafka Stream: KTable materialization

纵然是瞬间 submitted on 2019-12-24 09:58:29
Question: How do I identify when the KTable materialization to a topic has completed? For example, assume the KTable has a few million rows. Pseudocode below:

    KTable<String, String> kt = kgroupedStream.groupByKey(..).reduce(..); // Assume this produces a few million rows

At some point in time, I want to schedule a thread to invoke the following, which writes to the topic:

    kt.toStream().to("output_topic_name");

I want to ensure all the data is written as part of the above invocation. Also, once the above "to" method is
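A minimal sketch of the usual structure, with illustrative names: toStream().to() is declared as part of the topology before start-up rather than scheduled later, since on an unbounded stream there is no "materialization finished" moment; a state listener at least signals when the instance has finished restoring local state and reaches RUNNING.

    import java.util.Properties;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KTable;

    public class MaterializeAndEmit {
        public static void main(String[] args) {
            StreamsBuilder builder = new StreamsBuilder();
            KTable<String, String> kt = builder.<String, String>stream("input")
                    .groupByKey()
                    .reduce((oldValue, newValue) -> newValue);

            // Declared up front: every KTable update is forwarded downstream as it
            // happens, so no separate scheduling step is needed.
            kt.toStream().to("output_topic_name");

            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "materialize-app");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            KafkaStreams streams = new KafkaStreams(builder.build(), props);
            // The state listener reveals when local state restoration is done and the
            // instance transitions to RUNNING.
            streams.setStateListener((newState, oldState) ->
                    System.out.println(oldState + " -> " + newState));
            streams.start();
        }
    }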

KafkaProducer sendOffsetsToTransaction needs offset+1 to successfully commit current offset

霸气de小男生 submitted on 2019-12-24 09:31:23
Question: I'm trying to use a transaction in a Kafka Processor to make sure I don't process the same message twice. Given a message (A), I need to create a list of messages that will be produced to another topic in a transaction, and I want to commit the original message (A) in the same transaction. From the documentation I found the producer method sendOffsetsToTransaction, which seems to commit an offset in the transaction only if the transaction succeeds. This is the code inside the process() method
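A minimal sketch of a transactional forward-and-commit step (topic name and group id are assumptions): the value committed must be the position of the next record to read, hence record.offset() + 1.

    import java.util.Collections;
    import java.util.Map;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.TopicPartition;

    public class TransactionalForwarder {
        // Assumes producer.initTransactions() was called once at start-up and that
        // the consumer's enable.auto.commit is false.
        static void forward(KafkaProducer<String, String> producer,
                            ConsumerRecord<String, String> record) {
            producer.beginTransaction();
            try {
                producer.send(new ProducerRecord<>("output-topic", record.key(), record.value()));
                Map<TopicPartition, OffsetAndMetadata> offsets = Collections.singletonMap(
                        new TopicPartition(record.topic(), record.partition()),
                        // The committed offset is the *next* record to read; committing
                        // record.offset() itself would replay this record on restart.
                        new OffsetAndMetadata(record.offset() + 1));
                producer.sendOffsetsToTransaction(offsets, "processor-group");
                producer.commitTransaction();
            } catch (RuntimeException e) {
                producer.abortTransaction();
                throw e;
            }
        }
    }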

Kafka Streams tests do not close correctly

谁说胖子不能爱 submitted on 2019-12-24 08:14:27
Question: I have 2 unit tests; when I run them I get the error below.

1) Test:

    @Test
    public void simpleInsertAndOutputEventPrint() throws IOException, URISyntaxException {
        GenericRecord record = getInitialEvent();
        testDriver.pipeInput(recordFactory.create(record));
        GenericRecord result = testDriver.readOutput(detailsEventTopic, stringDeserializer, genericAvroSerde.deserializer()).value();
        Assert.assertEquals(1, result.get("tt"));
    }

2) Test:

    @Test
    public void stateStoreSimpleInsertOutputPrint() {
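A minimal sketch of the usual fix when two TopologyTestDriver tests interfere with each other, assuming a JUnit 4 setup: create the driver fresh for each test and close it in tearDown, so the first test's state directory is released before the second test runs.

    import org.apache.kafka.streams.TopologyTestDriver;
    import org.junit.After;
    import org.junit.Before;

    public class StreamsTestLifecycle {
        private TopologyTestDriver testDriver;

        @Before
        public void setUp() {
            // Build the Topology and Properties as in the real tests, then:
            // testDriver = new TopologyTestDriver(topology, props);
        }

        @After
        public void tearDown() {
            if (testDriver != null) {
                testDriver.close(); // releases the state directory between tests
            }
        }
    }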

Kafka Streams Metadata request only contains internal topics

和自甴很熟 submitted on 2019-12-24 08:12:02
Question: I'm running a Kafka Streams app with version 2.1.0. I found that after running for some time, my app (63 nodes) enters the ERROR state node by node. Eventually, all 63 nodes are down. The exception is:

    ERROR o.a.k.s.p.i.ProcessorStateManager - task [2_2] Failed to flush state store KSTREAM-REDUCE-STATE-STORE-0000000014: org.apache.kafka.streams.errors.StreamsException: task [2_2] Abort sending since an error caught with a previous record (key 110646599468 value InterimMessage [sessionStart
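Without the full stack trace the root cause is unclear, but as a hedge against one failed send cascading through all instances, a custom ProductionExceptionHandler (a Kafka Streams extension point since 1.1) can log and continue instead of failing the task; whether CONTINUE is acceptable depends on the application.

    import java.util.Map;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.streams.errors.ProductionExceptionHandler;

    public class LogAndContinueProductionHandler implements ProductionExceptionHandler {
        @Override
        public ProductionExceptionHandlerResponse handle(ProducerRecord<byte[], byte[]> record,
                                                         Exception exception) {
            // Log the failed record and keep the task alive instead of propagating
            // the error and taking the instance into ERROR state.
            System.err.println("Failed to produce to " + record.topic() + ": " + exception);
            return ProductionExceptionHandlerResponse.CONTINUE;
        }

        @Override
        public void configure(Map<String, ?> configs) { }
    }

It would be registered via props.put(StreamsConfig.DEFAULT_PRODUCTION_EXCEPTION_HANDLER_CLASS_CONFIG, LogAndContinueProductionHandler.class).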

Kafka Stream: Consumer commit frequency

为君一笑 submitted on 2019-12-24 07:28:03
Question: With at-least-once guarantees, I understand that there is a possibility of duplicates in case of failures. However: 1) How frequently does the Kafka Streams library perform commits? 2) Do users ever need to consider committing in addition to the above? 3) Is there a best practice on how frequently commits should be performed? Answer 1: Kafka Streams commits at regular intervals that can be configured via the parameter commit.interval.ms (default is 30 seconds; if exactly-once processing is enabled,
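Complementing the answer, a minimal Processor API sketch with illustrative names: ProcessorContext.commit() is the only commit hook exposed to user code, and it merely requests a commit at the next opportunity rather than committing immediately, so applications normally rely on commit.interval.ms alone.

    import org.apache.kafka.streams.processor.AbstractProcessor;

    public class CommitRequestingProcessor extends AbstractProcessor<String, String> {
        private long seen = 0;

        @Override
        public void process(String key, String value) {
            context().forward(key, value);
            if (++seen % 10_000 == 0) {
                // A hint, not a command: Kafka Streams commits at the next
                // opportunity, in addition to its regular commit.interval.ms schedule.
                context().commit();
            }
        }
    }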