Purge Kafka Topic

泄露秘密 提交于 2019-11-26 21:10:06
steven appleyard

Temporarily update the retention time on the topic to one second:

kafka-topics.sh --zookeeper <zkhost>:2181 --alter --topic <topic name> --config retention.ms=1000

And in newer Kafka releases, you can also do it with kafka-configs --entity-type topics

kafka-configs.sh --zookeeper <zkhost>:2181 --entity-type topics --alter --entity-name <topic name> --add-config retention.ms=1000

then wait for the purge to take effect (about one minute). Once purged, restore the previous retention.ms value.

rjaiswal

To purge the queue you can delete the topic:

bin/kafka-topics.sh --zookeeper localhost:2181 --delete --topic test

then re-create it:

bin/kafka-topics.sh --create --zookeeper localhost:2181 \
    --replication-factor 1 --partitions 1 --topic test
Thomas Bratt

Here are the steps I follow to delete a topic named MyTopic:

  1. Describe the topic, and take not of the broker ids
  2. Stop the Apache Kafka daemon for each broker ID listed.
  3. Connect to each broker, and delete the topic data folder, e.g. rm -rf /tmp/kafka-logs/MyTopic-0. Repeat for other partitions, and all replicas
  4. Delete the topic metadata: zkCli.sh then rmr /brokers/MyTopic
  5. Start the Apache Kafka daemon for each stopped machine

If you miss you step 3, then Apache Kafka will continue to report the topic as present (for example when if you run kafka-list-topic.sh).

Tested with Apache Kafka 0.8.0.

While the accepted answer is correct, that method has been deprecated. Topic configuration should now be done via kafka-configs.

kafka-configs --zookeeper localhost:2181 --entity-type topics --alter --add-config retention.ms=1000 --entity-name MyTopic

Configurations set via this method can be displayed with the command

kafka-configs --zookeeper localhost:2181 --entity-type topics --describe --entity-name MyTopic

Tested in Kafka 0.8.2, for the quick-start example: First, Add one line to server.properties file under config folder:

delete.topic.enable=true

then, you can run this command:

bin/kafka-topics.sh --zookeeper localhost:2181 --delete --topic test

From kafka 1.1

Purge a topic

bin/kafka-configs.sh --zookeeper localhost:2181 --alter --entity-type topics ->-entity-name tp_binance_kline --add-config retention.ms=100

wait 1 minute, to be secure that kafka purge the topic remove the configuration, and then go to default value

bin/kafka-configs.sh --zookeeper localhost:2181 --alter --entity-type topics -> -entity-name tp_binance_kline --delete-config retention.ms

Sometimes, if you've a saturated cluster (too many partitions, or using encrypted topic data, or using SSL, or the controller is on a bad node, or the connection is flaky, it'll take a long time to purge said topic.

I follow these steps, particularly if you're using Avro.

1: Run with kafka tools :

bash kafka-configs.sh --alter --entity-type topics --zookeeper zookeeper01.kafka.com --add-config retention.ms=1 --entity-name <topic-name>

2: Run on Schema registry node:

kafka-avro-console-consumer --consumer-property security.protocol=SSL --consumer-property ssl.truststore.location=/etc/schema-registry/secrets/trust.jks --consumer-property ssl.truststore.password=password --consumer-property ssl.keystore.location=/etc/schema-registry/secrets/identity.jks --consumer-property ssl.keystore.password=password --consumer-property ssl.key.password=password --bootstrap-server broker01.kafka.com:9092 --topic <topic-name> --new-consumer --from-beginning

3: Set topic retention back to the original setting, once topic is empty.

bash kafka-configs.sh --alter --entity-type topics --zookeeper zookeeper01.kafka.com --add-config retention.ms=604800000 --entity-name <topic-name>

Hope this helps someone, as it isn't easily advertised.

UPDATE: This answer is relevant for Kafka 0.6. For Kafka 0.8 and later see answer by @Patrick.

Yes, stop kafka and manually delete all files from corresponding subdirectory (it's easy to find it in kafka data directory). After kafka restart the topic will be empty.

kafka don't have direct method for purge/clean-up topic (Queues), but can do this via deleting that topic and recreate it.

first of make sure sever.properties file has and if not add delete.topic.enable=true

then, Delete topic bin/kafka-topics.sh --zookeeper localhost:2181 --delete --topic myTopic

then create it again.

bin/kafka-topics.sh --zookeeper localhost:2181 --create --topic myTopic --partitions 10 --replication-factor 2

The simplest approach is to set the date of the individual log files to be older than the retention period. Then the broker should clean them up and remove them for you within a few seconds. This offers several advantages:

  1. No need to bring down brokers, it's a runtime operation.
  2. Avoids the possibility of invalid offset exceptions (more on that below).

In my experience with Kafka 0.7.x, removing the log files and restarting the broker could lead to invalid offset exceptions for certain consumers. This would happen because the broker restarts the offsets at zero (in the absence of any existing log files), and a consumer that was previously consuming from the topic would reconnect to request a specific [once valid] offset. If this offset happens to fall outside the bounds of the new topic logs, then no harm and the consumer resumes at either the beginning or the end. But, if the offset falls within the bounds of the new topic logs, the broker attempts to fetch the message set but fails because the offset doesn't align to an actual message.

This could be mitigated by also clearing the consumer offsets in zookeeper for that topic. But if you don't need a virgin topic and just want to remove the existing contents, then simply 'touch'-ing a few topic logs is far easier and more reliable, than stopping brokers, deleting topic logs, and clearing certain zookeeper nodes.

Thomas' advice is great but unfortunately zkCli in old versions of Zookeeper (for example 3.3.6) do not seem to support rmr. For example compare the command line implementation in modern Zookeeper with version 3.3.

If you are faced with an old version of Zookeeper one solution is to use a client library such as zc.zk for Python. For people not familiar with Python you need to install it using pip or easy_install. Then start a Python shell (python) and you can do:

import zc.zk
zk = zc.zk.ZooKeeper('localhost:2181')
zk.delete_recursive('brokers/MyTopic') 

or even

zk.delete_recursive('brokers')

if you want to remove all the topics from Kafka.

user4713340

To clean up all the messages from a particular topic using your application group (GroupName should be same as application kafka group name).

./kafka-path/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic topicName --from-beginning --group application-group

Could not add as comment because of size: Not sure if this is true, besides updating retention.ms and retention.bytes, but I noticed topic cleanup policy should be "delete" (default), if "compact", it is going to hold on to messages longer, i.e., if it is "compact", you have to specify delete.retention.ms also.

./bin/kafka-configs.sh --zookeeper localhost:2181 --describe --entity-name test-topic-3-100 --entity-type topics
Configs for topics:test-topic-3-100 are retention.ms=1000,delete.retention.ms=10000,cleanup.policy=delete,retention.bytes=1

Also had to monitor earliest/latest offsets should be same to confirm this successfully happened, can also check the du -h /tmp/kafka-logs/test-topic-3-100-*

./bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list "BROKER:9095" --topic test-topic-3-100 --time -1 | awk -F ":" '{sum += $3} END {print sum}' 26599762

./bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list "BROKER:9095" --topic test-topic-3-100 --time -2 | awk -F ":" '{sum += $3} END {print sum}' 26599762

The other problem is, you have to get current config first so you remember to revert after deletion is successful: ./bin/kafka-configs.sh --zookeeper localhost:2181 --describe --entity-name test-topic-3-100 --entity-type topics

Another, rather manual, approach for purging a topic is:

in the brokers:

  1. stop kafka broker
    sudo service kafka stop
  2. delete all partition log files (should be done on all brokers)
    sudo rm -R /kafka-storage/kafka-logs/<some_topic_name>-*

in zookeeper:

  1. run zookeeper command line interface
    sudo /usr/lib/zookeeper/bin/zkCli.sh
  2. use zkCli to remove the topic metadata
    rmr /brokers/topic/<some_topic_name>

in the brokers again:

  1. restart broker service
    sudo service kafka start
tushararora19
./kafka-topics.sh --describe --zookeeper zkHost:2181 --topic myTopic

This should give retention.ms configured. Then you can use above alter command to change to 1second (and later revert back to default).

Topic:myTopic   PartitionCount:6        ReplicationFactor:1     Configs:retention.ms=86400000

From Java, using the new AdminZkClient instead of the deprecated AdminUtils:

  public void reset() {
    try (KafkaZkClient zkClient = KafkaZkClient.apply("localhost:2181", false, 200_000,
        5000, 10, Time.SYSTEM, "metricGroup", "metricType")) {

      for (Map.Entry<String, List<PartitionInfo>> entry : listTopics().entrySet()) {
        deleteTopic(entry.getKey(), zkClient);
      }
    }
  }

  private void deleteTopic(String topic, KafkaZkClient zkClient) {

    // skip Kafka internal topic
    if (topic.startsWith("__")) {
      return;
    }

    System.out.println("Resetting Topic: " + topic);
    AdminZkClient adminZkClient = new AdminZkClient(zkClient);
    adminZkClient.deleteTopic(topic);

    // deletions are not instantaneous
    boolean success = false;
    int maxMs = 5_000;
    while (maxMs > 0 && !success) {
      try {
        maxMs -= 100;
        adminZkClient.createTopic(topic, 1, 1, new Properties(), null);
        success = true;
      } catch (TopicExistsException ignored) {
      }
    }

    if (!success) {
      Assert.fail("failed to create " + topic);
    }
  }

  private Map<String, List<PartitionInfo>> listTopics() {
    Properties props = new Properties();
    props.put("bootstrap.servers", kafkaContainer.getBootstrapServers());
    props.put("group.id", "test-container-consumer-group");
    props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

    KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
    Map<String, List<PartitionInfo>> topics = consumer.listTopics();
    consumer.close();

    return topics;
  }

Following @steven appleyard answer I executed the following commands on Kafka 2.2.0 and they worked for me.

bin/kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --entity-name <topic-name> --describe

bin/kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --entity-name <topic-name> --alter --add-config retention.ms=1000

bin/kafka-configs.sh --zookeeper localhost:2181 --entity-type topics --entity-name <topic-name> --alter --delete-config retention.ms
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!