Topics, partitions and keys

梦毁少年i 2021-01-30 07:09

I am looking for some clarification on the subject. In the Kafka documentation I found the following:

Kafka only provides a total order over messages within a partition, no

2 answers
  •  悲&欢浪女
    2021-01-30 07:52

    Does it mean that if I want more than 1 consumer (from the same group) reading from one topic, I need more than 1 partition?

    Let's look at the following properties of Kafka:

    • each partition is consumed by exactly one consumer in the group
    • one consumer in the group can consume more than one partition
    • the number of consumer processes in a group should be <= the number of partitions; any extra consumers receive no partition and sit idle

    With these properties, Kafka is able to provide both ordering guarantees and load balancing over a pool of consumer processes.

    To answer your question: yes. Within a single group, if you want N consumers that all receive messages, you need at least N partitions.
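The properties above can be illustrated with a toy model. This is only a sketch in plain Java, not Kafka's actual assignor code: the class name `GroupAssignmentSketch`, the method `assign`, and the consumer names are all hypothetical, and the simple "deal partitions out in turn" strategy is an assumption chosen for readability.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of how partitions are spread over the consumers of ONE group.
// Hypothetical names; this is not Kafka's assignor implementation.
class GroupAssignmentSketch {

    // Deal partitions 0..numPartitions-1 out to the consumers in turn,
    // so every partition ends up with exactly one owner in the group.
    static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        if (consumers.isEmpty()) {
            return assignment; // nobody to assign to
        }
        for (int p = 0; p < numPartitions; p++) {
            String owner = consumers.get(p % consumers.size());
            assignment.get(owner).add(p);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // 2 consumers, 3 partitions: one consumer owns two partitions.
        System.out.println(assign(Arrays.asList("c1", "c2"), 3)); // {c1=[0, 2], c2=[1]}
        // 3 consumers, 2 partitions: c3 owns nothing and sits idle.
        System.out.println(assign(Arrays.asList("c1", "c2", "c3"), 2)); // {c1=[0], c2=[1], c3=[]}
    }
}
```

The second call shows why extra consumers do not help: with only 2 partitions, the third consumer in the group has nothing to read.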

    Does it mean I need the same number of partitions as consumers in the same group?

    I think this is covered by the answer to the first question above.

    How many consumers can read from one partition?

    Within a group, a partition is read by exactly one consumer, so the number of consumers that can read from one partition equals the number of consumer groups subscribed to that topic.
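A small sketch may make the "one reader per group" rule concrete. It is plain Java with no Kafka dependency; `PartitionReadersSketch`, `readersOf`, and the group names `billing` and `audit` are made up for illustration, and picking the owner with a modulo is an assumption, not how Kafka assigns partitions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model: each consumer group independently gives a partition to exactly
// one of its members. Hypothetical names; not Kafka's assignor logic.
class PartitionReadersSketch {

    // Returns the consumers reading `partition`, one entry per group.
    static List<String> readersOf(int partition, Map<String, List<String>> groups) {
        List<String> readers = new ArrayList<>();
        for (Map.Entry<String, List<String>> group : groups.entrySet()) {
            List<String> members = group.getValue();
            if (members.isEmpty()) {
                continue; // a group without members reads nothing
            }
            // Within a group, exactly one member owns the partition.
            String owner = members.get(partition % members.size());
            readers.add(group.getKey() + "/" + owner);
        }
        return readers;
    }

    public static void main(String[] args) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        groups.put("billing", Arrays.asList("b1", "b2"));
        groups.put("audit", Arrays.asList("a1"));
        // Two groups subscribe, so partition 0 has two readers.
        System.out.println(readersOf(0, groups)); // [billing/b1, audit/a1]
    }
}
```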

    Relationship between keys and partitions with regard to API

    First, we must understand that the producer is responsible for choosing which partition within the topic each record is assigned to.

    Now, let's see how the producer does so. First, look at the class definition of ProducerRecord.java:

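    public class ProducerRecord<K, V> {

        private final String topic;       // destination topic
        private final Integer partition;  // optional explicit partition; may be null
        private final Headers headers;
        private final K key;              // optional; used to pick a partition when partition is null
        private final V value;
        private final Long timestamp;

        // constructors and accessors omitted
    }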
    

    Here, the field that we have to understand from the class is partition.

    From the ProducerRecord docs:

    • If a valid partition number is specified, that partition will be used when sending the record.
    • If no partition is specified but a key is present a partition will be chosen using a hash of the key.
    • If neither key nor partition is present a partition will be assigned in a round-robin fashion.
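The three rules above can be sketched in plain Java. This is an illustration under stated assumptions, not Kafka's partitioner: the class `PartitionChoiceSketch` and its `choose` method are hypothetical, and `Arrays.hashCode` stands in for the murmur2 hash that Kafka's default partitioner applies to the serialized key. Newer client versions also use a "sticky" strategy for keyless records rather than strict round-robin, so treat the third branch as a simplification.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the three partition-selection rules. Hypothetical class;
// Arrays.hashCode is only a stand-in for Kafka's murmur2 key hash.
class PartitionChoiceSketch {
    private final int numPartitions;
    private final AtomicInteger counter = new AtomicInteger(0);

    PartitionChoiceSketch(int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        this.numPartitions = numPartitions;
    }

    int choose(Integer partition, String key) {
        // Rule 1: an explicit, valid partition is used as given.
        if (partition != null) {
            if (partition < 0 || partition >= numPartitions) {
                throw new IllegalArgumentException("invalid partition " + partition);
            }
            return partition;
        }
        // Rule 2: no partition but a key: hash the key bytes, map into range.
        if (key != null) {
            int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
            return (hash & 0x7fffffff) % numPartitions; // mask keeps the value non-negative
        }
        // Rule 3: neither partition nor key: rotate through the partitions.
        return (counter.getAndIncrement() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        PartitionChoiceSketch chooser = new PartitionChoiceSketch(3);
        System.out.println(chooser.choose(2, "user-42"));    // 2: explicit partition wins
        System.out.println(chooser.choose(null, "user-42")); // same key always maps to the same partition
        System.out.println(chooser.choose(null, null));      // 0: first round-robin pick
        System.out.println(chooser.choose(null, null));      // 1: second round-robin pick
    }
}
```

Because the key-based branch is deterministic, all records with the same key land in the same partition, which is what lets per-key ordering follow from per-partition ordering.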
