
Let’s review the related concepts:

  1. Replicas

    • A replica is a copy of a partition's data; replicas are always defined relative to a partition
    • A partition has multiple replicas, distributed across different brokers
    • One replica is the leader replica; the rest are follower replicas
    • The leader replica serves reads and writes externally; follower replicas only synchronize data from the leader as backups and do not serve external requests
  2. AR/ISR

    • All replicas of a partition are collectively called the AR (Assigned Replicas)
    • The set of replicas kept in sync with the leader is the ISR (In-Sync Replicas) → [leader, synced followers]
  3. LEO/HW

    • LEO (Log End Offset) marks the offset at which the next message will be written → each partition replica has its own LEO
    • HW (High Watermark), the smallest LEO among the ISR replicas; consumers can only pull messages before the HW (a worked example follows this list)
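To make these definitions concrete (the numbers are made up): suppose a partition's ISR holds three replicas whose LEOs are 5, 4, and 3. Then HW = min{5, 4, 3} = 3, so consumers can fetch offsets 0 through 2; the messages at offsets 3 and 4 already exist on the leader but are not yet consumable.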

What happens when a producer sends a message?

→ send(msg)

→ The message is written to the partition's leader replica; once all followers in the ISR have synchronized it, the partition's HW is updated

→ Consumers can now consume this message
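A minimal producer-side sketch of this flow (the broker address and topic name are placeholders): with acks=all, the producer treats a send as successful only after the leader and all in-sync followers have the message.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

public class AcksAllProducer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        // acks=all: the leader acknowledges only after all ISR followers have the message
        props.put("acks", "all");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            RecordMetadata meta = producer
                .send(new ProducerRecord<>("demo-topic", "key", "msg"))
                .get(); // block until the broker acknowledges the write
            System.out.printf("written to partition %d at offset %d%n",
                              meta.partition(), meta.offset());
        }
    }
}
```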

LEO and HW

The most important job of partition replicas is message writing and consumption.

Message writing → LEO: each partition replica has its own LEO, and how follower LEOs catch up to the leader's is what the ISR tracks

Message consumption → HW: the leader's HW is derived from the replicas' LEOs (only the leader serves reads and writes, so only messages below the leader's HW can be consumed)

Here’s a summary of the leader/follower LEO and HW update rules:

Without these rules in hand, it's easy to get confused about what updates when ~ ~ ~

LEO update

leader:

  • Producer sends a message → the leader writes it to its local disk, leader LEO + 1

leader remote (the leader's local record of each follower's LEO):

  • Updated when the leader processes a follower's fetch request
  • The fetch req carries the offset from which the follower wants to pull → that fetch offset = the follower's remote LEO

follower:

  • The follower sends a fetch request → the leader returns messages
  • The follower writes the fetched messages to its local disk, follower LEO + 1
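These three update sites can be collapsed into a toy model, a sketch for illustration only (all names are made up; real broker internals are far more involved):

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of where each LEO is updated (illustration, not Kafka internals).
public class LeoBookkeeping {
    long leaderLeo = 0;                                   // leader's local LEO
    final Map<String, Long> remoteLeo = new HashMap<>();  // leader's record of follower LEOs
    long followerLeo = 0;                                 // one follower's local LEO

    // Producer sends a message → leader appends it to its local log, LEO + 1.
    void onProduce() { leaderLeo++; }

    // Leader processes a follower fetch → the fetch offset is exactly the
    // follower's LEO, so the leader stores it as that follower's remote LEO.
    void onLeaderHandlesFetch(String follower, long fetchOffset) {
        remoteLeo.put(follower, fetchOffset);
    }

    // Follower receives fetched messages → appends locally, LEO + 1 per message.
    void onFollowerAppends(int msgCount) { followerLeo += msgCount; }

    public static void main(String[] args) {
        LeoBookkeeping state = new LeoBookkeeping();
        state.onProduce();                   // leaderLeo: 0 → 1
        state.onLeaderHandlesFetch("f1", 0); // remoteLeo[f1] = 0
        state.onFollowerAppends(1);          // followerLeo: 0 → 1
        System.out.println(state.leaderLeo + " " + state.remoteLeo + " " + state.followerLeo);
    }
}
```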

HW update

leader:

  • Recomputed after the leader updates its own LEO or a remote LEO
  • Then leader HW = min{leader LEO, sync follower remote LEOs…}

follower:

  • The follower's fetch gets resp{msg, leader HW}; after updating its local LEO:
  • **follower HW = min{follower LEO, leader HW from the response}**

Under this HW update mechanism, it takes two rounds of fetches to complete the update:

  1. Follower fetch → obtains the new message, writes it locally, LEO + 1

  2. Follower's next fetch req → asks for the next message, which means the request carries the follower's current LEO

    • Leader updates the follower's remote LEO, then updates its own HW
    • Leader sends resp{leader HW}, then the follower updates its HW

In this way, the LEO and HW on both sides are eventually brought up to date.
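A worked trace makes the two rounds concrete. Take a partition with one leader and one follower, both starting empty (LEO = HW = 0), and produce a single message:

  1. Produce: leader LEO 0 → 1; leader HW = min{leader LEO = 1, remote LEO = 0} = 0
  2. Round 1, the follower fetches from offset 0: the leader records remote LEO = 0 (HW stays 0) and returns {msg, leader HW = 0}; the follower appends the message, follower LEO 0 → 1, follower HW = min{1, 0} = 0
  3. Round 2, the follower fetches from offset 1: the leader records remote LEO = 1, so leader HW = min{1, 1} = 1; the response carries leader HW = 1, and the follower sets follower HW = min{1, 1} = 1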

The Leader Epoch mechanism

The leader epoch does not change the overall operation of the HW mechanism; it only improves log truncation when a leader or follower restarts and recovers.
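A rough sketch of that improvement, assuming a simplified model of the KIP-101 OffsetsForLeaderEpoch exchange (names and numbers here are made up): on restart, instead of truncating straight to its own HW, the follower asks the leader where the follower's last known epoch ends and truncates there, avoiding the loss of messages that were already replicated but whose HW update had not yet arrived.

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch of epoch-based truncation (a model of the idea, not broker code).
public class EpochTruncation {
    // Leader's epoch cache: epoch → start offset of that epoch.
    static long leaderEndOffsetFor(TreeMap<Integer, Long> epochStartOffsets,
                                   int requestedEpoch, long leaderLeo) {
        // The end offset of an epoch is the start offset of the next epoch,
        // or the leader's LEO if the requested epoch is still the current one.
        Map.Entry<Integer, Long> next = epochStartOffsets.higherEntry(requestedEpoch);
        return next == null ? leaderLeo : next.getValue();
    }

    public static void main(String[] args) {
        TreeMap<Integer, Long> leaderEpochCache = new TreeMap<>();
        leaderEpochCache.put(0, 0L);    // epoch 0 began at offset 0
        leaderEpochCache.put(1, 120L);  // epoch 1 began at offset 120
        long leaderLeo = 200;

        long followerLeo = 150;         // follower restarts with LEO 150...
        int followerLastEpoch = 0;      // ...but only ever saw epoch 0

        long epochEnd = leaderEndOffsetFor(leaderEpochCache, followerLastEpoch, leaderLeo);
        long truncateTo = Math.min(epochEnd, followerLeo); // truncate to offset 120
        System.out.println("follower truncates its log to offset " + truncateTo);
    }
}
```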

Why read/write separation is not supported

Consider a component that does support read/write separation: MySQL.

During MySQL primary/secondary replication there is a delay window, and within that window the data on the primary and secondary nodes is inconsistent.

If Kafka served reads from followers, a message would have to travel the full replication path before becoming readable there:

network I/O → leader memory → leader disk → network I/O → follower memory → follower disk

Going network → disk → network end to end introduces significant latency, and for a component with Kafka's real-time requirements that is a poor fit.

The root problem is that a message still being synchronized cannot be read from a replica right away, which breaks read-your-writes consistency (I write a message but cannot immediately read it back).

If reads from replicas were allowed during synchronization, replication lag between replicas would show through: while the leader's message is still being synchronized to replica B, the same message would be sometimes readable and sometimes not, depending on which replica serves the read.

Conclusion:

  1. There is a time window of data inconsistency during synchronization
  2. The synchronization delay is high
  3. Serving reads only from the leader makes read-your-writes easy
  4. Serving reads only from the leader makes Monotonic Reads easy to implement

Still, although follower replicas are not readable, their responsibility of synchronizing the leader's data is guaranteed; they fulfill it by asynchronously pulling from the leader replica.

Since the pull is asynchronous, the data of the leader and the followers is bound to be inconsistent at some point in time (one more argument for why followers are unreadable). So how should we define the standard for being "in sync" with the leader?

Once a message meets that standard, the broker considers it backed up across replicas and guaranteed not to be lost (of course, if none of the brokers have flushed it to disk, the data can still be lost; "not lost" here refers to normal operation).

ISR → In-Sync Replicas: the set of replicas that are in sync with the leader.
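In Kafka the standard is time-based: the broker config replica.lag.time.max.ms bounds how long a follower may fail to catch up to the leader's LEO before it is evicted from the ISR. A minimal broker-config snippet (the value shown is only an example; defaults vary by version):

```properties
# A follower that has not caught up to the leader's LEO within this window
# is considered out of sync and removed from the ISR (example value).
replica.lag.time.max.ms=30000
```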