I have a Spring application that is my Kafka producer and I was wondering why Avro is the best way to go. I read about it and all it has to offer, but why can't I just serialize…
First of all - Kafka has no idea about the key/value content. It operates on bytes, and it is the client's (producer/consumer) responsibility to take care of de/serialization.
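To make that concrete, here is a minimal, illustrative sketch using the plain kafka-clients API (the broker address and topic name are assumptions): the broker never inspects the payload, it just stores whatever bytes the configured serializers emit.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class PlainStringProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");                // assumed local broker
        props.put("key.serializer", StringSerializer.class.getName());   // turns the key into bytes
        props.put("value.serializer", StringSerializer.class.getName()); // turns the value into bytes

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Kafka only ever sees the byte[] produced by the serializers above.
            producer.send(new ProducerRecord<>("demo-topic", "some-key", "some value"));
        }
    }
}
```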
The most common options so far seem to be JSON, protobuf and Avro.
What I personally like about Avro, and why I usually use it and recommend it to others:
1) It's a reasonably compact binary serialization, with a schema and logical types (which help distinguish a plain long from a timestamp stored as long millis; see the example schema after this list)
2) Avro schemas are very descriptive (fields can carry doc attributes) and the format itself is well documented
3) it has wide support across most widely-used programming languages, which is a must!
4) Confluent (and others) provide a repository for schemas, the so-called "schema registry", to give you centralized storage for your schemas. With Avro, the message then contains just a small schema ID, not the schema itself.
5) If you are using Java, you get a lot of benefit from generating POJO classes from the schema (a producer sketch that uses both the registry and a generated class follows below).
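To illustrate points 1 and 2, here is a small, hypothetical .avsc file (all names are made up for illustration): every field can carry a doc attribute, and createdAt uses the timestamp-millis logical type, so on the wire it is just a long, but every reader of the schema knows it represents a timestamp.

```json
{
  "type": "record",
  "name": "PaymentEvent",
  "namespace": "com.example.events",
  "doc": "Hypothetical example record, only for illustration.",
  "fields": [
    {"name": "paymentId", "type": "string", "doc": "Business key of the payment"},
    {"name": "amountCents", "type": "long", "doc": "Amount in minor currency units"},
    {"name": "createdAt",
     "type": {"type": "long", "logicalType": "timestamp-millis"},
     "doc": "Event time; a plain long on the wire, documented as a timestamp"}
  ]
}
```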
Sure, you can get parts of this with the other options. You should try and compare all the options that suit your use case.
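To sketch what points 4 and 5 can look like in practice, here is a hedged example using Confluent's KafkaAvroSerializer and schema registry. The registry URL, topic name, and the PaymentEvent class (assumed to be generated from the schema above by the Avro Maven/Gradle plugin) are all assumptions, and the exact Java types the generator emits for logical types depend on the plugin version.

```java
import java.time.Instant;
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import io.confluent.kafka.serializers.KafkaAvroSerializer;

import com.example.events.PaymentEvent; // hypothetical class generated from the schema above

public class AvroPaymentProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");                   // assumed local broker
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", KafkaAvroSerializer.class.getName()); // Confluent Avro serializer
        props.put("schema.registry.url", "http://localhost:8081");          // assumed local schema registry

        // The serializer registers/looks up the schema in the registry and writes only
        // a small schema ID into each message, not the schema itself.
        PaymentEvent event = PaymentEvent.newBuilder()
                .setPaymentId("p-123")
                .setAmountCents(4999L)
                .setCreatedAt(Instant.now()) // generated type may differ with older Avro plugin versions
                .build();

        try (KafkaProducer<String, PaymentEvent> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("payments", event.getPaymentId().toString(), event));
        }
    }
}
```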
P.S. My very personal, opinionated advice: if it's not a String, go for Avro. This applies to both keys and values.