Background

Another great tool for ensuring high throughput in Kafka is message compression, much like the compressed biscuits in the picture above.

Compression trades time for space: a small amount of CPU is spent to shrink the data, and the smaller data reduces disk and network I/O, which in turn speeds up transmission.
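The trade-off can be seen with a minimal sketch that uses the JDK's built-in GZIP stream. The payload, class name, and method names here are hypothetical and chosen only for illustration; Kafka itself is not involved.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

class CompressionTradeoffDemo {
    // Compress a byte array with GZIP and return the compressed bytes.
    static byte[] gzip(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Build a hypothetical payload of repetitive, log-like lines.
    static byte[] samplePayload(int lines) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < lines; i++) {
            sb.append("{\"level\":\"INFO\",\"service\":\"orders\",\"seq\":")
              .append(i)
              .append("}\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw = samplePayload(1000);
        long start = System.nanoTime();
        byte[] packed = gzip(raw);
        long micros = (System.nanoTime() - start) / 1_000;
        // Elapsed wall-clock time stands in for the CPU cost of compressing.
        System.out.printf("raw=%d bytes, gzip=%d bytes, elapsed=%d us%n",
                raw.length, packed.length, micros);
    }
}
```

On repetitive data like this the compressed size is far smaller than the raw size, at the price of the time spent inside gzip.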

Message compression model

Message format V1

Kafka does not operate directly on individual messages, but on a collection of messages.

Message format V2:

1. The common part of the messages is extracted and stored in the message set; removing it from each message reduces the total size.

2. The CRC check is moved from each message to the message set, which reduces the number of checks and saves CPU.

3. The way compressed messages are stored changed: V1 compresses multiple messages and stores the result in the body field of an outer message, whereas V2 compresses the whole message set, which gives a better compression ratio.
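Why working on a set of messages pays off can be illustrated with a generic sketch. It compares compressing every message on its own with compressing all messages together. This is not Kafka's wire format for V1 or V2; the messages, class name, and helpers are hypothetical, and GZIP from the JDK stands in for whatever codec is configured.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPOutputStream;

class BatchCompressionDemo {
    // Compress a byte array with GZIP and return the compressed bytes.
    static byte[] gzip(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Hypothetical messages that share most of their structure.
    static List<byte[]> sampleMessages(int count) {
        List<byte[]> messages = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            String msg = "{\"topic\":\"orders\",\"status\":\"CREATED\",\"id\":" + i + "}";
            messages.add(msg.getBytes(StandardCharsets.UTF_8));
        }
        return messages;
    }

    // Total size when each message is compressed separately.
    static int perMessageSize(List<byte[]> messages) {
        int total = 0;
        for (byte[] m : messages) {
            total += gzip(m).length;
        }
        return total;
    }

    // Size when all messages are concatenated and compressed once.
    static int wholeBatchSize(List<byte[]> messages) {
        ByteArrayOutputStream joined = new ByteArrayOutputStream();
        for (byte[] m : messages) {
            joined.write(m, 0, m.length);
        }
        return gzip(joined.toByteArray()).length;
    }

    public static void main(String[] args) {
        List<byte[]> messages = sampleMessages(100);
        System.out.println("per-message total: " + perMessageSize(messages) + " bytes");
        System.out.println("whole batch:       " + wholeBatchSize(messages) + " bytes");
    }
}
```

Compressing the batch as one unit lets the codec exploit the redundancy shared across messages and pays the fixed per-stream overhead only once.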

Compression process model

Comparison of compression algorithms

The quality of a compression algorithm is usually measured by two things: its compression ratio and its compression/decompression throughput.

Comparison of common compression algorithms:

Zstandard (zstd) is an open-source compression algorithm from Facebook that provides very high compression ratios.
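One way to make such a comparison concrete is to measure compression ratio and elapsed time on your own data. The JDK does not ship LZ4, Snappy, or zstd, so this sketch uses two Deflater levels as stand-ins for a "fast" and a "high ratio" codec; that substitution, the payload, and all names are assumptions for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

class CompressionMetricsDemo {
    // Compress input at the given Deflater level and return the compressed size in bytes.
    static int deflatedSize(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    // Compression ratio defined as original size divided by compressed size.
    static double ratio(int rawSize, int compressedSize) {
        return (double) rawSize / compressedSize;
    }

    // Hypothetical repetitive payload.
    static byte[] samplePayload(int lines) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < lines; i++) {
            sb.append("user=").append(i % 50).append(" action=click page=/home\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw = samplePayload(5000);
        int[] levels = {Deflater.BEST_SPEED, Deflater.BEST_COMPRESSION};
        for (int level : levels) {
            long start = System.nanoTime();
            int size = deflatedSize(raw, level);
            double millis = (System.nanoTime() - start) / 1_000_000.0;
            System.out.printf("level=%d ratio=%.2f elapsed=%.2f ms%n",
                    level, ratio(raw.length, size), millis);
        }
    }
}
```

A single run like this is only a rough indication; timings vary with JIT warm-up and hardware, so real comparisons should repeat the measurement on representative production data.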

Enabling Compression Scenarios

If the CPU load is high, compression is not suitable.

If the bandwidth is insufficient and the CPU load is not high, enable compression to save a lot of bandwidth.

Try to avoid inconsistent message formats between clients and brokers, because converting between formats adds extra decompression and recompression cost.
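For the bandwidth-bound case above, enabling compression on the producer is a matter of setting compression.type. The sketch below only builds the configuration with java.util.Properties; the broker address, class name, and method name are hypothetical, and handing the properties to a KafkaProducer would additionally require the kafka-clients library, which is left out here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

class CompressionConfigDemo {
    // Values accepted by the producer's compression.type setting.
    static final List<String> PRODUCER_CODECS =
            Arrays.asList("none", "gzip", "snappy", "lz4", "zstd");

    // Build producer properties with the chosen compression codec.
    static Properties producerProps(String codec) {
        if (!PRODUCER_CODECS.contains(codec)) {
            throw new IllegalArgumentException("unsupported codec: " + codec);
        }
        Properties props = new Properties();
        // Hypothetical broker address; replace with a real one.
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("compression.type", codec);
        return props;
    }

    public static void main(String[] args) {
        Properties props = producerProps("lz4");
        System.out.println("compression.type=" + props.getProperty("compression.type"));
    }
}
```

On the broker side, compression.type defaults to producer, which keeps whatever codec the producer used. Setting a different codec on the broker forces it to decompress and recompress incoming batches, which is one of the avoidable costs mentioned above.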

Summary

The purpose of compression is to make data occupy less space and therefore transmit faster, at the cost of some CPU.

It is an effective way to improve Kafka message throughput.

This section reviews how the new version of Kafka compresses messages and how the compression and decompression process works.

It then compares four common compression algorithms and explains how to select the appropriate one for a specific application scenario.

Finally, it gives the configuration parameter for compression: compression.type, which can be set on both the producer and the broker.
