问题
I come across the following two phrases from the book "Mastering Kafka Streams and ksqlDB" and author used two terms, what does they really mean "compacted topics" and "uncompacted topics"
Does they got anything to with respect to "log compaction" ?
Tables can be thought of as updates to a database. In this view of the logs, only the current state (either the latest record for a given key or some kind of aggregation) for each key is retained. Tables are usually built from compacted topics.
Streams can be thought of as inserts in database parlance. Each distinct record remains in this view of the log. Streams are usually built from uncompacted topics.
回答1:
Yes, log compaction
according to kafka docs
Log compaction ensures that Kafka will always retain at least the last known value for each message key within the log of data for a single topic partition
https://kafka.apache.org/documentation/#compaction
If log compaction is enabled on topic, Kafka removes any old records when there is a newer version of it with the same key in the partition log.
For more detailed explanation of log compaction refer - https://medium.com/swlh/introduction-to-topic-log-compaction-in-apache-kafka-3e4d4afd2262
回答2:
Yes, these terms are synonymous.
Ref: Log Compaction
回答3:
From this article:
The idea behind compacted topics is that no duplicate keys exist. Only the most recent value for a message key is maintained.
It is mostly used for the scenarios such as restoring to the previous state before the application crashed or system failed, or while reloading cache after application restarts.
As an example of the above kafka has the topic __consumer_offsets
which can be used to to continue from the last message which was read after a crash or a restart. A schema registry is also often used to ensure compatible communication between producers and consumers. The schemas used are maintained in the __schemas
topic.
来源:https://stackoverflow.com/questions/63009907/kafka-uncompacted-topics-vs-compacted-topics