Question
I have a topic named "addcash" with 3 partitions (the Kafka cluster also has 3 machines), and a lot of user recharge messages flow into it. I want to count the total amount of money every day. From some articles about Kafka Streams I learned that Kafka Streams runs the topology as tasks, that the number of tasks depends on the number of partitions of the topic, and that every task has its own state store. So when I compute the total amount with a state store, will three values be returned instead of a single total? What is the right way to do it? Thanks!
Answer 1:
That is correct.
You have two ways to do this:
1. Compute the partial sums first, then follow up with KTable.groupBy(...).reduce(...) and set a single global key to bring all partial aggregates together.
2. Create an additional single-partition topic, write the partial results into this topic, read the data back with Kafka Streams, and do a second aggregation that adds those partial numbers together. You can express this in a single program by using through("my-single-partition-topic"); to connect the first and second part of the aggregation. For the second aggregation step in this solution, you would need to use transform() rather than the DSL.
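To make the first approach more concrete, here is a minimal plain-Java sketch of what re-grouping the partial sums under a single key does. It does not use Kafka at all: the class TotalSumSketch, its methods, and the keys "user-a" and "user-b" are hypothetical names chosen for illustration. The assumption it illustrates is that KTable.groupBy(...).reduce(...) takes an adder and a subtractor, so that when a partial sum changes, its old value is removed from the total before the new value is added.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java simulation of a two-stage sum; not Kafka Streams code.
// A rough, unverified shape of the corresponding DSL might be:
//   stream.groupByKey().reduce(Long::sum)                 // stage 1: partial sums
//         .groupBy((k, v) -> KeyValue.pair("total", v),   // stage 2: single global key
//                  Grouped.with(Serdes.String(), Serdes.Long()))
//         .reduce(Long::sum, (agg, old) -> agg - old);    // adder, subtractor
class TotalSumSketch {
    // Stage 1 state: one running partial sum per original key.
    private final Map<String, Long> partialSums = new HashMap<>();
    // Stage 2 state: the aggregate stored under the single global key.
    private long total = 0L;

    // Process one recharge message.
    void onRecharge(String key, long amount) {
        Long oldPartial = partialSums.get(key);
        long newPartial = (oldPartial == null ? 0L : oldPartial) + amount;
        partialSums.put(key, newPartial);

        // Stage 2: the partial sum for this key changed, so retract the
        // old value (subtractor) and apply the new one (adder).
        if (oldPartial != null) {
            total -= oldPartial;
        }
        total += newPartial;
    }

    long total() {
        return total;
    }

    public static void main(String[] args) {
        TotalSumSketch sketch = new TotalSumSketch();
        sketch.onRecharge("user-a", 10L);
        sketch.onRecharge("user-b", 5L);
        sketch.onRecharge("user-a", 7L);
        System.out.println(sketch.total()); // prints 22
    }
}
```

Without the subtractor step, the second update for "user-a" would add 17 on top of the 10 already counted, and the total would be overstated.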
Source: https://stackoverflow.com/questions/48218885/kafka-streams-for-count-a-total-num