Kafka in a Business Environment: Hands-On Series

This series of blog posts summarizes cases drawn from real business environments and offers advice on tuning Kafka applications and on capacity planning for clusters. Please keep following the series. Copyright for this Kafka tuning series belongs to the author (Qin Kaixin). Reprinting is prohibited; you are welcome to learn from it.

Author: Qin Kaixin
Location: Shenzhen

  • Kafka in a Business Environment – Kafka production environment planning
  • Kafka in a Business Environment – Kafka producer and consumer throughput testing
  • Kafka in a Business Environment – Kafka producer parameter settings and tuning suggestions

1. The producer's core workflow

  • The user's main thread first wraps each message to be sent in an instance of the ProducerRecord class.
  • After serialization, the record is passed to the Partitioner, which determines the target partition, and is then placed in an in-memory buffer inside the producer.
  • A separate worker thread of the producer (the Sender thread) continuously pulls ready messages from the buffer, groups them into batches, and sends them to the corresponding brokers.
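The three steps above can be sketched as a toy, single-threaded model. This is only an illustration, not the real client: the `Accumulator` class, `NUM_PARTITIONS`, and the CRC32-based `partition_for` function are assumptions made for this sketch (the real default partitioner uses a different hash, and the real Sender runs in its own thread).

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count of the target topic


def serialize(value: str) -> bytes:
    """Step 1b: turn the record's key or value into bytes."""
    return value.encode("utf-8")


def partition_for(key: bytes, num_partitions: int) -> int:
    """Step 2: pick a partition. CRC32 is a stand-in hash for this sketch."""
    return zlib.crc32(key) % num_partitions


class Accumulator:
    """Toy in-memory buffer that holds pending messages per partition."""

    def __init__(self):
        self.pending = defaultdict(list)

    def append(self, key: str, value: str) -> int:
        # Main-thread side: serialize, choose the partition, buffer the message.
        k, v = serialize(key), serialize(value)
        partition = partition_for(k, NUM_PARTITIONS)
        self.pending[partition].append((k, v))
        return partition

    def drain(self) -> dict:
        # Sender side: take everything buffered so far, one batch per partition.
        batches, self.pending = dict(self.pending), defaultdict(list)
        return batches


acc = Accumulator()
for i in range(6):
    acc.append(f"device-{i % 2}", f"reading-{i}")

batches = acc.drain()
for partition, batch in sorted(batches.items()):
    print(partition, len(batch))
```

Messages with the same key always land in the same partition, which is why each batch can be sent to a single broker.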

2. Setting the main producer parameters

2.1 The producer parameter acks (no data loss)

The number of acknowledgments the producer requires the leader to have received before a produce request is considered complete and the message “committed.” This parameter controls message durability. Three values are currently available:

acks = 0: the produce request returns immediately without waiting for any acknowledgment from the leader. This option has the highest throughput, but there is no guarantee that the broker actually received the message.

acks = -1 (also written acks = all): the partition leader treats the produce request as successful only after the message has been written to all in-sync replicas (ISR). This option gives the strongest durability guarantee, but in theory the lowest throughput.

acks = 1: the leader replica writes the message to its local log and then acknowledges the produce request, at which point the request is considered successful. If the leader fails right after acknowledging, before the followers have copied the message, the message is lost. This is the middle ground, offering reasonable durability and throughput.

Recommended for business environments:

Set acks = -1 if you need high durability and cannot afford to lose data. In all other cases, set acks = 1.
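The recommendation can be written down as a small helper. This is a minimal sketch: `choose_acks` and the broker address are hypothetical, and the configuration is shown as a plain dictionary rather than passed to any particular client library.

```python
def choose_acks(must_not_lose_data: bool) -> str:
    """Return the acks value suggested above for the given durability need."""
    # "all" is the string form of acks = -1.
    return "all" if must_not_lose_data else "1"


producer_config = {
    "bootstrap.servers": "localhost:9092",  # hypothetical broker address
    "acks": choose_acks(must_not_lose_data=True),
}
print(producer_config["acks"])
```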

2.2 The producer parameter buffer.memory (throughput)

This parameter specifies the size, in bytes, of the buffer that the producer uses to cache messages. The default value is 33554432 (32 MB). The Kafka producer sends messages asynchronously: on startup it creates a buffer in memory to hold the messages waiting to be sent, and a dedicated thread then reads messages from that buffer and actually sends them.

Recommended for business environments:

  • As messages keep being sent, the buffer can fill up. The producer then blocks until memory is freed, for at most the time set by max.block.ms. If the producer keeps throwing TimeoutException, consider increasing buffer.memory.
  • Because the producer is thread-safe, it is often shared by multiple threads, and in that case buffer.memory fills up easily.
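The blocking behaviour described above can be modelled with a bounded pool and a timeout. This is a toy sketch, not the client's actual implementation: `ToyBuffer` and its method names are made up for illustration, and Python's built-in `TimeoutError` stands in for the producer's TimeoutException.

```python
import threading
import time


class ToyBuffer:
    """Toy model of buffer.memory: a fixed number of bytes shared by senders."""

    def __init__(self, total_bytes: int):
        self.free = total_bytes
        self.cond = threading.Condition()

    def allocate(self, size: int, max_block_ms: int) -> None:
        """Reserve `size` bytes, waiting at most `max_block_ms` for free space."""
        deadline = time.monotonic() + max_block_ms / 1000.0
        with self.cond:
            while self.free < size:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    raise TimeoutError("buffer full for longer than max.block.ms")
                self.cond.wait(remaining)
            self.free -= size

    def release(self, size: int) -> None:
        """Called once a batch has been sent and its memory can be reused."""
        with self.cond:
            self.free += size
            self.cond.notify_all()


buf = ToyBuffer(total_bytes=100)
buf.allocate(80, max_block_ms=10)      # fits
try:
    buf.allocate(50, max_block_ms=20)  # only 20 bytes free, so this times out
except TimeoutError as exc:
    print("timed out:", exc)
buf.release(80)                        # the first batch was sent
buf.allocate(50, max_block_ms=10)      # now it fits
```

With several threads calling `allocate` on the same buffer, free space runs out faster, which matches the second recommendation above.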

2.3 The producer parameter compression.type (LZ4)

The producer currently supports the compression types none, gzip, snappy, and lz4.

Recommended for business environments:

On our company's Internet of Things platform, LZ4 currently gives the best results. That said, Facebook open-sourced Zstandard in August 2016. Zstandard has a compression ratio of 2.8, compared with 2.091 for Snappy and 2.101 for LZ4.
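A compression ratio is simply the original size divided by the compressed size, and it is worth measuring on your own payloads. The sketch below uses gzip only because it ships with the Python standard library; LZ4, Snappy, and Zstandard need third-party packages and are not shown. The sample payload is invented for illustration, so the resulting number says nothing about the figures quoted above.

```python
import gzip
import json


def compression_ratio(payload: bytes) -> float:
    """Original size divided by gzip-compressed size."""
    return len(payload) / len(gzip.compress(payload))


# Hypothetical IoT-style payload: many similar JSON readings.
sample = json.dumps(
    [{"device": f"sensor-{i % 10}", "temp": 20 + i % 5} for i in range(1000)]
).encode("utf-8")

ratio = compression_ratio(sample)
print(f"gzip compression ratio: {ratio:.2f}")
```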

2.4 The producer parameter retries (beware of message reordering; EOS)

This sets the number of times the producer retries. On a retry, the producer resends a message whose send failed for a transient reason. Transient failures can be caused by stale metadata, not enough replicas, timeouts, out-of-range offsets, or unknown partitions. If retries > 0 is set, the producer tries again in those cases.

Recommended for business environments:

  • The producer has a parameter named max.in.flight.requests.per.connection. If this parameter is set to a value greater than 1, enabling retries can cause messages to be written out of order.
  • Since version 0.11.0.0, Kafka supports exactly-once semantics (EOS), so with the idempotent producer enabled (enable.idempotence=true), retried messages are not written twice.
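Why retries can reorder messages is easiest to see in a toy model. The function below is an assumption-laden simulation, not how the client is implemented: `deliver` sends up to `max_in_flight` batches per round, and any batch listed in `fails_once` fails on its first attempt and is retried in the next round.

```python
def deliver(batches, max_in_flight, fails_once):
    """Return the order in which batches end up in the partition log."""
    log = []
    pending = list(batches)
    remaining_failures = set(fails_once)
    while pending:
        # Send up to max_in_flight batches without waiting for acknowledgments.
        window, pending = pending[:max_in_flight], pending[max_in_flight:]
        retry = []
        for batch in window:
            if batch in remaining_failures:
                remaining_failures.discard(batch)  # fails once, then succeeds
                retry.append(batch)
            else:
                log.append(batch)
        # Failed batches are retried before anything not yet sent.
        pending = retry + pending
    return log


print(deliver(["b1", "b2", "b3"], max_in_flight=2, fails_once={"b1"}))
print(deliver(["b1", "b2", "b3"], max_in_flight=1, fails_once={"b1"}))
```

With two requests in flight, b2 is written before the retried b1. With one request in flight, the original order is kept.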

2.5 The producer parameter batch.size (throughput and latency)

The producer sends data in batches, so the choice of batch size is critical to producer performance. The producer packs multiple messages destined for the same partition into one batch and sends the batch when it is full. A batch does not have to be full to be sent, however; that depends on another parameter, linger.ms. The default value of batch.size is 16 KB (16384 bytes).

Recommended for business environments:

  • The smaller the batch, the lower the producer's throughput; the larger the batch, the higher the throughput.
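A back-of-the-envelope calculation shows why batch size matters. This sketch assumes fixed-size messages and full batches, and it ignores per-record and per-batch overhead; `requests_needed` and the message counts are hypothetical.

```python
import math


def requests_needed(num_messages: int, message_bytes: int, batch_size: int) -> int:
    """Number of full-batch sends needed for one partition (overhead ignored)."""
    per_batch = max(1, batch_size // message_bytes)
    return math.ceil(num_messages / per_batch)


# 10,000 messages of 1 KB each, all going to the same partition.
for batch_size in (1024, 16384, 65536):
    print(batch_size, requests_needed(10_000, 1024, batch_size))
```

Going from a 1 KB batch to the 16 KB default cuts the number of batches from 10000 to 625 in this example, and a 64 KB batch brings it down to 157.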

2.6 The producer parameter linger.ms (throughput and latency)

The producer sends messages in batches, but when a batch is sent also depends on the value of linger.ms. The default value is 0, meaning the producer does not wait. In that case batches may be sent before they contain many messages, producing a large number of small batches and putting heavy pressure on network I/O.

Recommended for business environments:

  • Set linger.ms to a small positive value to reduce network I/O and improve overall TPS. With linger.ms = 5, for example, a produce request may be delayed by up to 5 ms before being sent.
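The interplay between batch.size and linger.ms boils down to a simple rule: a batch goes out when it is full or when it has waited long enough. The function below is a simplified sketch of that rule under this assumption; `ready_to_send` is a hypothetical name.

```python
def ready_to_send(batch_bytes: int, batch_size: int,
                  waited_ms: int, linger_ms: int) -> bool:
    """A batch is sent when it is full or has lingered for linger_ms."""
    return batch_bytes >= batch_size or waited_ms >= linger_ms


# linger.ms = 0: even a tiny batch is sent right away.
print(ready_to_send(batch_bytes=100, batch_size=16384, waited_ms=0, linger_ms=0))
# linger.ms = 5: the same batch waits for more messages first.
print(ready_to_send(batch_bytes=100, batch_size=16384, waited_ms=2, linger_ms=5))
# A full batch is sent regardless of how long it has waited.
print(ready_to_send(batch_bytes=16384, batch_size=16384, waited_ms=0, linger_ms=5))
```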

2.7 The producer parameter max.in.flight.requests.per.connection (throughput and latency)

The maximum number of unacknowledged produce requests that the producer's I/O thread can have outstanding on a single socket connection. Increasing this value should increase the throughput of the I/O thread and so improve overall producer performance. However, as mentioned earlier, setting this parameter to a value greater than 1 can cause messages to arrive out of order if retries are enabled.

Recommended for business environments:

  • The default value of 5 is a good starting point. If the producer's bottleneck is in the I/O thread and the load on the brokers is low, you can raise the value moderately.
  • Raising this parameter too far increases the producer's overall memory usage and may cause unnecessary lock contention, which in turn reduces TPS.
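Putting the parameters from this section together, a starting configuration might look like the dictionary below. The values are illustrative only: `retries = 3` and `linger.ms = 5` are assumptions, and `ordering_at_risk` is a hypothetical helper that encodes just the ordering rule stated in this article, without considering any other settings.

```python
producer_config = {
    "acks": "all",                  # durability first (section 2.1)
    "buffer.memory": 33554432,      # 32 MB default (section 2.2)
    "compression.type": "lz4",      # section 2.3
    "retries": 3,                   # assumed value for illustration
    "batch.size": 16384,            # 16 KB default (section 2.5)
    "linger.ms": 5,                 # small wait to fill batches (section 2.6)
    "max.in.flight.requests.per.connection": 5,  # default (section 2.7)
}


def ordering_at_risk(config: dict) -> bool:
    """True if retries are on while more than one request may be in flight."""
    retries = config.get("retries", 0)
    in_flight = config.get("max.in.flight.requests.per.connection", 5)
    return retries > 0 and in_flight > 1


# Variant for workloads that need strict ordering at the cost of throughput.
strict = dict(producer_config, **{"max.in.flight.requests.per.connection": 1})

print(ordering_at_risk(producer_config))
print(ordering_at_risk(strict))
```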

Conclusion

Qin Kaixin, Shenzhen, October 27, 2018