How to discover and filter out duplicate records in Kafka Streams
Question

Say you have a topic with a null key and the value is {id:1, name:Chris, age:99}. Let's say you want to count the number of people by name. You would do something like the following:

    nameStream.groupBy((key, value) -> value.getName())
              .count();

Now let's say it is valid to receive duplicate records, and you can tell that a record is a duplicate based on its id. For example:

    {id:1, name:Chris, age:99}
    {id:1, name:Chris, age:xx}

should result in a count of one, and

    {id:1, name:Chris, age:99}
    {id:2, name:Chris, age
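
To pin down the counting rule being asked for, here is a minimal plain-Java sketch of the intended semantics: each id contributes to its name's count at most once. It deliberately does not use the Kafka Streams API. The class `DedupCountSketch`, the `Person` type, the method `countByName`, and the sample ages are all hypothetical names and data chosen for illustration. In a real Kafka Streams topology the set of already-seen ids would presumably have to live in fault-tolerant state rather than an in-memory `HashSet`, which this sketch does not attempt to show.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class DedupCountSketch {

    // Hypothetical value type mirroring the {id, name, age} records in the question.
    static final class Person {
        final int id;
        final String name;
        final String age;

        Person(int id, String name, String age) {
            this.id = id;
            this.name = name;
            this.age = age;
        }
    }

    // Counts people per name, letting each id contribute at most once.
    static Map<String, Long> countByName(List<Person> records) {
        Set<Integer> seenIds = new HashSet<>();
        Map<String, Long> counts = new LinkedHashMap<>();
        for (Person p : records) {
            // Set.add returns false when the id is already present, i.e. a duplicate.
            if (!seenIds.add(p.id)) {
                continue;
            }
            counts.merge(p.name, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Person> input = List.of(
                new Person(1, "Chris", "99"),
                new Person(1, "Chris", "xx"),  // same id as above: treated as a duplicate
                new Person(2, "Chris", "50")); // assumed sample record with a new id
        System.out.println(countByName(input)); // prints {Chris=2}
    }
}
```

The first record seen for an id wins here; whether the first or the latest duplicate should be kept is a design choice the question leaves open.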