This is the 18th day of my participation in the November writing challenge. Event details: The Last Writing Challenge of 2021

What is a consumer group

Consumer groups are one of Kafka's most important features. In short, a consumer group consists of several consumer instances that jointly consume messages from the partitions of one or more topics. Note that each partition can be consumed by only one consumer within a given consumer group.

Multiple consumer groups are independent of each other and can subscribe to the same topics for consumption.
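The independence of groups can be pictured with a small in-memory toy. This is only a sketch, not the Kafka client API: the class `ToyPartition`, its `poll` method, and the group names are all hypothetical. It assumes that each group keeps its own read position in the partition, which is why one group's progress never affects another's.

```python
from collections import defaultdict


class ToyPartition:
    """In-memory stand-in for one partition; not the Kafka client API."""

    def __init__(self, messages):
        self.messages = list(messages)
        # Assumption of this sketch: one read position per consumer group.
        self.next_offset = defaultdict(int)

    def poll(self, group_id, max_records=10):
        """Return the next batch for this group and advance only its position."""
        start = self.next_offset[group_id]
        batch = self.messages[start:start + max_records]
        self.next_offset[group_id] = start + len(batch)
        return batch


partition = ToyPartition(["order-1", "order-2", "order-3"])
print(partition.poll("billing", max_records=2))    # ['order-1', 'order-2']
print(partition.poll("analytics", max_records=3))  # ['order-1', 'order-2', 'order-3']
print(partition.poll("billing"))                   # ['order-3']
```

Both hypothetical groups see every message, and "billing" resumes from where it left off regardless of what "analytics" has read.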

How does a consumer group work

When a consumer group subscribes to one or more topics, all partitions of those topics are “assigned” to the consumer instances in the group, and each instance processes the messages of the partitions assigned to it.

As mentioned earlier, within a group a partition can be assigned to only one consumer instance and cannot be consumed by several instances at the same time, while a single instance may be assigned multiple partitions.

Let me give a specific example. Suppose a consumer group subscribes to three topics, T1, T2, and T3, with three, five, and ten partitions respectively, for a total of 18 partitions. Then:

  • If there are exactly 18 instances in the consumer group, then each instance is responsible for consuming messages for one partition.
  • If the number of instances in the consumer group exceeds the total number of partitions, say 20, then two instances will sit idle and do nothing until another instance fails and triggers a Rebalance.
  • If there are six instances in the consumer group, each instance will be assigned three partitions.

Therefore, it is recommended that the number of instances in a consumer group equal the total number of partitions of the subscribed topics.
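The three cases in the example can be checked with a short simulation. This is a minimal sketch that simply deals partitions out to instances in turn. The function names `assign_partitions` and `summarize` and the instance names are made up for illustration, and Kafka's real assignment strategies may distribute partitions differently.

```python
def assign_partitions(partition_counts, members):
    """Deal every (topic, partition) pair out to the members in turn."""
    if not members:
        raise ValueError("a consumer group needs at least one instance")
    partitions = [
        (topic, index)
        for topic, count in sorted(partition_counts.items())
        for index in range(count)
    ]
    ordered = sorted(members)
    assignment = {member: [] for member in ordered}
    for position, partition in enumerate(partitions):
        # Each partition gets exactly one owner; an owner may get several.
        owner = ordered[position % len(ordered)]
        assignment[owner].append(partition)
    return assignment


def summarize(assignment):
    """Return (fewest, most, idle): partition counts per instance and idle instances."""
    sizes = [len(owned) for owned in assignment.values()]
    return min(sizes), max(sizes), sizes.count(0)


topics = {"T1": 3, "T2": 5, "T3": 10}  # 18 partitions in total
for group_size in (18, 20, 6):
    members = [f"consumer-{n:02d}" for n in range(group_size)]
    print(group_size, summarize(assign_partitions(topics, members)))
# prints:
# 18 (1, 1, 0)
# 20 (0, 1, 2)
# 6 (3, 3, 0)
```

With 18 instances every instance owns one partition, with 20 instances two are idle, and with six instances each owns three.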

Rebalance

The previous section mentioned Rebalance, which is Kafka’s mechanism for assigning partitions to the consumer instances in a consumer group. In the example above, allocating 18 partitions to six consumer instances is a Rebalance. Besides the initial assignment after a consumer group subscribes to topics, several other situations trigger a Rebalance:

  1. When the instances in a consumer group change, including but not limited to a consumer instance going down, a new instance joining, or an instance being kicked out of the group. In these cases either the instance responsible for some partitions is no longer in the group, or a new instance can “share” the work of the others, so a Rebalance is triggered.
  2. When the subscribed topics change. A consumer group can subscribe to topics by pattern matching, for example to all topics whose names begin with abc. When a new topic is created whose name begins with abc, the group is subscribed to it as well, and the partitions must be reassigned to the consumer instances.
  3. When the number of partitions of a subscribed topic changes, the partitions need to be reassigned to the consumer instances to restore balance. (Kafka only allows a topic's partition count to be increased, not decreased.)
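The three triggers above can be sketched as a comparison between two snapshots of a group. This is an illustrative toy rather than how the Kafka coordinator is implemented: the helpers `subscribed_partitions` and `rebalance_reasons`, the topic names, and the pattern `abc.*` are all assumptions made for the example.

```python
import re


def subscribed_partitions(cluster_topics, pattern):
    """Partition counts of the topics whose full name matches the pattern."""
    return {
        topic: count
        for topic, count in cluster_topics.items()
        if re.fullmatch(pattern, topic)
    }


def rebalance_reasons(before, after):
    """Compare two (members, subscribed_partitions) states of one group."""
    old_members, old_topics = before
    new_members, new_topics = after
    reasons = []
    if set(old_members) != set(new_members):
        reasons.append("group membership changed")
    if set(old_topics) != set(new_topics):
        reasons.append("subscribed topics changed")
    shared = set(old_topics) & set(new_topics)
    if any(old_topics[t] != new_topics[t] for t in shared):
        reasons.append("partition count changed")
    return reasons


pattern = r"abc.*"  # hypothetical subscription: every topic starting with abc
cluster = {"abc-orders": 3, "logs": 2}
state = ({"c1", "c2"}, subscribed_partitions(cluster, pattern))

# 1. A new instance joins the group.
joined = ({"c1", "c2", "c3"}, state[1])
print(rebalance_reasons(state, joined))  # ['group membership changed']

# 2. A new topic matching the pattern is created.
with_new_topic = {**cluster, "abc-payments": 4}
print(rebalance_reasons(state, (state[0], subscribed_partitions(with_new_topic, pattern))))
# ['subscribed topics changed']

# 3. Partitions are added to a subscribed topic.
grown = {**cluster, "abc-orders": 6}
print(rebalance_reasons(state, (state[0], subscribed_partitions(grown, pattern))))
# ['partition count changed']

# A topic that does not match the pattern changes nothing for this group.
unrelated = {**cluster, "metrics": 1}
print(rebalance_reasons(state, (state[0], subscribed_partitions(unrelated, pattern))))
# []
```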

Rebalance handles all of these changes automatically and keeps the assignment as balanced as possible, but it still has some problems.

First, while a Rebalance is in progress, all consumer instances in the consumer group stop consuming messages until it completes. This is similar to the stop-the-world pauses often mentioned for garbage collectors, so Rebalances should be avoided as much as possible.

Second, Rebalance is “unintelligent”. Each Rebalance reallocates all the partitions instead of keeping the changes as small as possible. This makes a Rebalance a time-consuming operation when there are many partitions and consumer instances. So again, Rebalances should be avoided as much as possible.
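To get a feel for the cost of recomputing everything, the toy below assigns 18 partitions to six instances, lets a seventh join, recomputes the whole assignment from scratch, and counts how many partitions end up with a different owner. It is a sketch under simplifying assumptions: `full_reassign` is a made-up round-robin helper, not one of Kafka's assignors, so the exact numbers only illustrate the idea.

```python
def full_reassign(partitions, members):
    """Recompute the owner of every partition from scratch, round-robin."""
    ordered = sorted(members)
    return {p: ordered[i % len(ordered)] for i, p in enumerate(partitions)}


def moved(before, after):
    """Partitions whose owner differs between the two assignments."""
    return [p for p in before if before[p] != after[p]]


partitions = list(range(18))
old_members = [f"c{i}" for i in range(6)]
new_members = old_members + ["c6"]  # one new instance joins

before = full_reassign(partitions, old_members)
after = full_reassign(partitions, new_members)
changed = moved(before, after)

# To be balanced, the newcomer needs at least 18 // 7 = 2 partitions,
# so two moves would have been enough in this case.
fewest_needed = len(partitions) // len(new_members)

print(len(changed), fewest_needed)  # 12 2
```

In this toy, 12 of the 18 partitions change owner although handing two partitions to the new instance would have restored balance, which is the kind of unnecessary churn the paragraph above describes.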