KafkaAvroSerializer for serializing Avro without schema.registry.url

前端 未结 5 1035
轮回少年
轮回少年 2021-02-02 13:33

I\'m a noob to Kafka and Avro. So i have been trying to get the Producer/Consumer running. So far i have been able to produce and consume simple Bytes and Strings, using the fol

5条回答
  •  悲&欢浪女
    2021-02-02 14:09

    Note first: KafkaAvroSerializer is not provided in vanilla apache kafka - it is provided by Confluent Platform. (https://www.confluent.io/), as part of its open source components (http://docs.confluent.io/current/platform.html#confluent-schema-registry)

    Rapid answer: no, if you use KafkaAvroSerializer, you will need a schema registry. See some samples here: http://docs.confluent.io/current/schema-registry/docs/serializer-formatter.html

    The basic idea with schema registry is that each topic will refer to an avro schema (ie, you will only be able to send data coherent with each other. But a schema can have multiple version, so you still need to identify the schema for each record)

    We don't want to write the schema for everydata like you imply - often, schema is bigger than your data! That would be a waste of time parsing it everytime when reading, and a waste of ressources (network, disk, cpu)

    Instead, a schema registry instance will do a binding avro schema <-> int schemaId and the serializer will then write only this id before the data, after getting it from registry (and caching it for later use).

    So inside kafka, your record will be [ ] (and magic byte for technical reason), which is an overhead of only 5 bytes (to compare to the size of your schema) And when reading, your consumer will find the corresponding schema to the id, and deserializer avro bytes regarding it. You can find way more in confluent doc

    If you really have a use where you want to write the schema for every record, you will need an other serializer (I think writing your own, but it will be easy, just reuse https://github.com/confluentinc/schema-registry/blob/master/avro-serializer/src/main/java/io/confluent/kafka/serializers/AbstractKafkaAvroSerializer.java and remove the schema registry part to replace it with the schema, same for reading). But if you use avro, I would really discourage this - one day a later, you will need to implement something like avro registry to manage versioning

提交回复
热议问题