Consistency problems and Raft consistency algorithm

July 11, 2015

Consistency problem

The consistency algorithm is used to solve the consistency problem, so what is the consistency problem? In distributed systems, consensus problem refers to that for a set of servers, given a set of operations, we need a protocol to make their final results agree. More specifically, when one server receives a set of instructions from a client, it must communicate with the other servers to ensure that all servers receive the same instructions in the same order, so that all servers produce consistent results and look like one machine.

The consistency algorithm in actual production needs to have the following attributes:

  • Safety: that no wrong result is returned anyway
  • Available: As long as most of the machines are in order, they will still be available. A cluster of five machines, for example, allows a maximum of two machines to fail.
  • There is no time dependence to ensure consistency, that is, the system is asynchronous.
  • In general, the running time is determined by the majority of machines, and the overall efficiency will not be affected by a small number of slow machines.

Why address consistency?

We can say that a distributed system has 99.99 reliability… %, but you can’t say it reaches 100%. Why? Because the problem of consistency cannot be solved completely. The following four problems in distributed systems are related to consistency issues:

  1. Reliable multicast
  2. Membership Protocal (Failuer Detector) Manages members in a cluster
  3. Leader Election Algorithm
  4. Mutual exclution, such as the exclusive or allocation of resources

Raft consistency algorithm

Earlier, I introduced some election algorithms from textbooks, which are also consistent algorithms, that is, the leader is considered consistent by all servers in the end. At present, there are two mainstream consistency algorithms in practical application: Paxos and Raft. Zookeeper is the Paxos chosen, while ETCD uses Raft. As a Go enthusiast, I’ll start with Raft.

Raft was introduced because Paxos is too difficult to understand and implement, and the goal is to be as easy to understand as possible without being as reliable as Paxos. But Raft’s 18-page paper In Search of an Understandable Consensus Algorithm is easier for me to understand.

Raft breaks the consistency problem down into three small questions:

  1. Leader election election
  2. Log replication Log replication and synchronization
  3. Safety security

The basic concept

Each Server has three states: Leader, Follower, and candidate

  • Followers: do not send requests but only reply to the leader and candidate’s requests.
  • Leader: processes the requests sent by the client
  • Candidate: Leader candidate

Raft divides time into terms. An election is held at the beginning of each term. There is at most one leader or no leader in each term.

RPC implementation

RequestVote RPC: The RequestVote RPC is initiated by the candidates during the election process. The RequestVote RPC is executed by the RequestVote RPC. Only when the other party’s term and log are at least as new as his or her own, the candidate receiving the majority of votes will be elected leader.

AppendEntries RPC is initiated by the leader to distribute logs, forcing Follwer’s log to match its own.

Leader election

If a follower does not receive a message from the leader after election timeout, it enters a new term, becomes a candidate, votes for itself, and initiates a RequestVote RPC. This state continues until any of the following three events occur:

  1. It won the election
  2. Another Server wins the election
  3. One term passed, still no election results

Why is there a situation of 3? If everyone initiates an election at the same time and votes for themselves, then no Server can get the majority of votes. At this time, it is necessary to enter the next term and elect again. To prevent this from happening continuously, the election time of each Server is randomly set to a different value, so that the first timeout can initiate the next election.

Log replication

Once the leader has been selected, the log can be distributed.

Each log has a log index and term number. When most followers have copied the log, it is said to be committed and ready to execute. The Leader remembers the maximum log index that has been committed and uses it to distribute the next AppendEntries RPC. This is the same as the TCP segment number.

When a leader is reelected, its logs and the followers’ logs may be different, so it forces all the followers to match its own logs. First, the leader finds the maximum number of logs that match the number of followers and overwrites all subsequent logs.

Safety

But so far there is no guarantee of safety. For example, if a follower drops out while the leader is committing the log, and the follower is later selected as the leader, it overwrites the committed logs currently in Follwer. Since these logs are already executed, the resulting machines execute different instructions. This can be prevented by adding an additional restriction to the electoral process, namely:

2. For any term, the Leader will contain the logs that were committed in the previous term.Copy the code

This is the complete Raft algorithm.

An Understandable Consensus Algorithm