I need to deduplicate a Kafka stream of messages by similarity in a rolling fashion. We can assume that only messages within 1 day of each other can possibly be duplicates. The current strat