Kafka is fast for a number of reasons. To name a few:
- Zero Copy - See https://en.wikipedia.org/wiki/Zero-copy. Basically, Kafka
asks the OS kernel to move the data directly (for example from the page cache
to a network socket) rather than copying it through application-layer buffers.
- Batch Data in Chunks - Kafka is all about batching the
data into chunks. This minimises the number of cross-machine round trips, and
with them the buffering/copying overhead paid on every request.
- Avoids Random Disk Access - As Kafka is an immutable commit log, it does not
need to seek back and forth across the disk doing many random I/O operations;
it can just read and write the disk sequentially. This lets it get throughput
from a physical disk that can be comparable to memory.
- Can Scale Horizontally - The ability to have thousands of partitions for a
single topic, spread among thousands of machines, means Kafka can handle huge
loads.
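
To make the zero-copy point a bit more concrete, here is a minimal Python sketch. Kafka itself runs on the JVM and relies on `FileChannel.transferTo`, so this is only an analogy: Python's `socket.sendfile()` hands the transfer to the kernel via `os.sendfile()` where the platform supports it, and falls back to ordinary `send()` calls elsewhere. The names `send_segment` and `recv_exactly` are made up for this illustration.

```python
import os
import socket
import tempfile


def send_segment(sock: socket.socket, path: str, offset: int = 0, count=None) -> int:
    """Send part of a file over a connected socket; returns the bytes sent.

    socket.sendfile() uses os.sendfile() where available, so the kernel moves
    the bytes to the socket without copying them through a Python buffer.
    """
    with open(path, "rb") as segment:  # sendfile() requires binary mode
        return sock.sendfile(segment, offset=offset, count=count)


def recv_exactly(sock: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes, since recv() may return partial data."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    payload = b"record-" * 100
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(payload)
    # A connected socket pair stands in for a broker-to-consumer connection.
    left, right = socket.socketpair()
    try:
        sent = send_segment(left, f.name)
        print(sent, recv_exactly(right, sent) == payload)
    finally:
        left.close()
        right.close()
        os.unlink(f.name)
```

The `offset` and `count` parameters mirror the idea of serving a slice of a log file to a consumer without the application ever touching the bytes.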
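
The batching and sequential-access points above can be sketched together with a toy append-only log. This is not how Kafka lays out its log segments; the class name `AppendOnlyLog`, the length-prefixed record format, and the `batch_size` threshold are all assumptions chosen to keep the example small.

```python
import os
import struct
import tempfile


class AppendOnlyLog:
    """Toy commit log: buffers records in memory, then appends each batch
    to the end of one file in a single write."""

    def __init__(self, path: str, batch_size: int = 3):
        self.path = path
        self.batch_size = batch_size  # hypothetical flush threshold, in records
        self.pending = []             # records waiting for the next flush
        self.writes = 0               # number of batch writes issued so far

    def append(self, record: bytes) -> None:
        self.pending.append(record)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        if not self.pending:
            return
        # Length-prefix every record and join the batch into one buffer.
        batch = b"".join(struct.pack(">I", len(r)) + r for r in self.pending)
        with open(self.path, "ab") as f:  # "ab" always writes at end of file
            f.write(batch)
        self.writes += 1
        self.pending.clear()

    def read_all(self):
        """Scan the file front to back, yielding records in append order."""
        with open(self.path, "rb") as f:
            while header := f.read(4):
                (size,) = struct.unpack(">I", header)
                yield f.read(size)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.log")
    log = AppendOnlyLog(path, batch_size=3)
    for i in range(7):
        log.append(f"msg-{i}".encode())
    log.flush()  # write out the final partial batch
    print(log.writes, len(list(log.read_all())))
```

Existing bytes are never rewritten, so both the writer and the reader only ever move forward through the file, and the number of writes grows with the number of batches rather than the number of records.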