1. What is ZooKeeper?

ZooKeeper is a subproject of Hadoop and an open-source distributed coordination service framework. It implements features such as data publish/subscribe, a unified naming service, distributed coordination/notification, configuration management, distributed locks, and distributed queues. Broadly speaking, ZooKeeper is a small database that supports adding, deleting, and modifying file-like nodes and notifies interested clients according to rules; in other words, it combines file storage with notification callbacks. Its data structure resembles a standard file system, the difference being that every node (znode) in ZK can also store data, with a size limit of 1 MB per node. When using Dubbo, ZooKeeper is the recommended registry; Redis and Eureka can also serve as the registry, although I have only used ZooKeeper.
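
For readers who have not used it before, here is a minimal, hypothetical Java sketch of treating ZooKeeper as a small file-like database with the standard client API (the connection string, path, and payload below are placeholders, not values from this article):

```java
import org.apache.zookeeper.*;
import org.apache.zookeeper.data.Stat;

public class ZkQuickStart {
    public static void main(String[] args) throws Exception {
        // Connect to a ZooKeeper ensemble (address and session timeout are placeholders).
        ZooKeeper zk = new ZooKeeper("localhost:2181", 15000, event -> { });

        // Create a znode and store a small payload (each znode holds at most 1 MB).
        String path = "/demo-config";
        if (zk.exists(path, false) == null) {
            zk.create(path, "v1".getBytes(),
                      ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        }

        // Read the data back, much like reading a file in a file system.
        Stat stat = new Stat();
        byte[] data = zk.getData(path, false, stat);
        System.out.println(new String(data) + " (version " + stat.getVersion() + ")");

        zk.close();
    }
}
```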

ZK data structure

ZK service configuration file

As mentioned above, ZK is a database, and its data is stored under dataDir. The configuration above is a cluster configuration with server.1, server.2, and server.3; here it is a pseudo-cluster (all three servers are started on the same machine). In an entry of the form server.X=localhost:A:B, localhost is the server's IP or hostname, A is the port used for communication within the cluster (Followers connecting to the Leader), and B is the port used specifically for Leader election; clientPort is the port that provides service to clients.
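
A minimal sketch of what such a pseudo-cluster zoo.cfg looks like; the port numbers and dataDir path below are illustrative placeholders, not the exact values from the figure:

```properties
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/tmp/zookeeper/server1
clientPort=2181
# server.X=host:quorumPort:electionPort
server.1=localhost:2888:3888
server.2=localhost:2889:3889
server.3=localhost:2890:3890
```

In a pseudo-cluster each of the three servers uses its own dataDir (containing a myid file with its server number) and its own clientPort, while the three server.X lines are the same in every configuration file.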

Term explanations:

**Data publish/subscribe:** A client registers a data-change Watcher on a node when the node is initialized. When the node changes, the change is pushed to the client, and after receiving the change notification the client re-reads the changed data.

**Unified naming service:** Used to obtain globally unique names. The sequential-znode feature can also be used: a created sequential node returns its sequence number, so a unified name with a specific meaning can be generated from the given name, and all clients can create nodes with the same name but different sequence numbers.
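
As a rough illustration of both terms, a hypothetical client can register a Watcher on a configuration node and create sequential znodes for naming. This sketch assumes the /demo-config and /names znodes already exist; all names are made up for the example:

```java
import org.apache.zookeeper.*;
import org.apache.zookeeper.data.Stat;

public class PubSubAndNaming {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 15000, e -> { });

        // Data publish/subscribe: register a watcher and re-read the data on change.
        Watcher watcher = new Watcher() {
            @Override
            public void process(WatchedEvent event) {
                if (event.getType() == Event.EventType.NodeDataChanged) {
                    try {
                        // Watches are one-shot: re-read the data and re-register this watcher.
                        byte[] updated = zk.getData(event.getPath(), this, new Stat());
                        System.out.println("config changed: " + new String(updated));
                    } catch (Exception ex) {
                        ex.printStackTrace();
                    }
                }
            }
        };
        zk.getData("/demo-config", watcher, new Stat());

        // Unified naming service: sequential nodes return names such as /names/task-0000000001.
        String name = zk.create("/names/task-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT_SEQUENTIAL);
        System.out.println("generated name: " + name);
        // The session is kept open here so the watcher can still fire.
    }
}
```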

2. What are the server roles and states?

A server has one of three roles: Leader, Follower, or Observer. The Leader schedules the services in the cluster and guarantees the ordering of transaction processing. Followers vote on Proposals, vote in Leader elections, handle non-transaction requests from clients, and forward transaction requests (adding, deleting, and changing data) to the Leader server. The Observer does not take part in any voting; it improves the cluster's non-transaction (read) processing capability without affecting the cluster's transaction capability.

The server states are LOOKING (the server believes the cluster has no Leader and is looking for one), FOLLOWING (the server's role is Follower), LEADING (the server's role is Leader), and OBSERVING (the server's role is Observer).

A Leader election is triggered in three situations: when the cluster servers are first started, when the Leader node goes down, and when so many Followers go down that the Leader finds that no more than half of the cluster is still following it.

3. How does ZooKeeper solve the data consistency problem?

What is the startup process of the ZooKeeper Server?

To understand how ZooKeeper handles data consistency: ZooKeeper aims for strong consistency, but what it ultimately provides is eventual consistency. First, let's recall what CAP is. ZK follows the CP principle, sacrificing availability to satisfy consistency. As shown in the following figure, after the data in database A is changed to 2, step 2 must not read the old value 1; this requires either that synchronization between the databases is very fast, or that step 2 takes a lock and waits for the data synchronization to complete before reading the result.

Examples of strong consistency
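
In ZooKeeper's own API terms, a client that must not read a stale value can issue a sync() before the read, which makes the server it is connected to catch up with the Leader first; a minimal sketch (the connection string and path are placeholders):

```java
import org.apache.zookeeper.*;
import org.apache.zookeeper.data.Stat;
import java.util.concurrent.CountDownLatch;

public class SyncBeforeRead {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("localhost:2181", 15000, e -> { });
        String path = "/demo-config";

        // sync() asks the connected server to catch up with the Leader, so the
        // following getData() will not return a value older than writes the
        // Leader has already acknowledged.
        CountDownLatch latch = new CountDownLatch(1);
        zk.sync(path, (rc, p, ctx) -> latch.countDown(), null);
        latch.await();

        byte[] data = zk.getData(path, false, new Stat());
        System.out.println("up-to-date value: " + new String(data));
        zk.close();
    }
}
```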

I am reading version 3.6.1 from Git. Open zkServer.sh.

Find the daemon startup script

In its parameters you can find ZOOMAIN="org.apache.zookeeper.server.quorum.QuorumPeerMain"; this class is the entry point of the server source code.

The entry main method calls an initialization method, main.initializeAndRun(args). Inside that method, the call marked in red in the figure below is the branch that enters cluster mode, so let's look at that method.

The cluster mode is determined
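
From memory, the entry class looks roughly like the following simplified sketch (logging, the data-dir purge task, and error handling are omitted; this is not the literal source):

```java
// Simplified sketch of org.apache.zookeeper.server.quorum.QuorumPeerMain (3.6.x)
public static void main(String[] args) throws Exception {
    QuorumPeerMain main = new QuorumPeerMain();
    main.initializeAndRun(args);                 // parse zoo.cfg, then start
}

protected void initializeAndRun(String[] args) throws Exception {
    QuorumPeerConfig config = new QuorumPeerConfig();
    if (args.length == 1) {
        config.parse(args[0]);                   // read the configuration file
    }
    if (args.length == 1 && config.isDistributed()) {
        runFromConfig(config);                   // cluster mode: build and start a QuorumPeer
    } else {
        ZooKeeperServerMain.main(args);          // standalone mode
    }
}
```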

Entering the method, you see a series of setters that read the configuration file values into a QuorumPeer object; the object is then started, and its startup calls the election method.
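
Roughly, QuorumPeer#start() in 3.6.x does the following (a simplified sketch from memory, not the literal source):

```java
// Simplified sketch of org.apache.zookeeper.server.quorum.QuorumPeer#start() (3.6.x)
public synchronized void start() {
    loadDataBase();             // rebuild the in-memory data tree from the snapshot and txn logs in dataDir
    startServerCnxnFactory();   // open clientPort and start accepting client connections
    adminServer.start();        // embedded admin server
    startLeaderElection();      // create the initial vote (for itself) and the FastLeaderElection algorithm
    super.start();              // QuorumPeer is a Thread; its run() loop drives LOOKING/FOLLOWING/LEADING
}
```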

Why does ZooKeeper prefer an odd number of servers?

This follows from ZooKeeper's majority ("more than half") mechanism. A cluster of 6 machines can tolerate only 2 failures, and a cluster of 5 machines can also tolerate 2 failures, so from the perspective of resource utilization an odd number of servers is recommended.
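
A tiny arithmetic sketch of why the sixth machine adds nothing: a quorum is a strict majority, so the number of tolerated failures is the same for 5 and 6 servers.

```java
public class QuorumMath {
    public static void main(String[] args) {
        for (int n : new int[] {5, 6}) {
            int quorum = n / 2 + 1;          // smallest strict majority
            int tolerated = n - quorum;      // servers that may fail while a quorum survives
            System.out.printf("%d servers: quorum = %d, tolerates %d failures%n",
                              n, quorum, tolerated);
        }
        // 5 servers tolerate 2 failures; 6 servers also tolerate only 2,
        // so the sixth machine buys no extra fault tolerance.
    }
}
```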

The method marked in red is the voting check: by default, a proposal passes once more than half of the votes approve it.
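
The majority check itself is tiny; here is a self-contained sketch of the idea behind QuorumMaj#containsQuorum (simplified, ignoring weighted and hierarchical quorum configurations):

```java
import java.util.Set;

/** Sketch of the "more than half" check used when counting acks or votes. */
class MajorityQuorumCheck {
    private final int half;                  // number of voting members / 2

    MajorityQuorumCheck(int votingMembers) {
        this.half = votingMembers / 2;
    }

    /** A proposal (or a candidate's vote set) passes once MORE than half have acked it. */
    boolean containsQuorum(Set<Long> ackSet) {
        return ackSet.size() > half;
    }
}
```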

The leader election method is marked in red

The default electionAlgorithm is 3 (FastLeaderElection)

Voting takes place in the LOOKING branch of the lookForLeader method in the FastLeaderElection class. private boolean totalOrderPredicate(long newId, long newZxid, long newEpoch, long curId, long curZxid, long curEpoch) compares a vote received from another server with the server's own current vote to decide whether the other server's vote is better than its own.

totalOrderPredicate
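
Since the screenshot is not reproduced here, the comparison can be restated as a self-contained method (the real FastLeaderElection#totalOrderPredicate additionally returns false when the proposed server's quorum weight is 0):

```java
public class VoteComparison {
    /**
     * Returns true if the received vote (newId, newZxid, newEpoch) is better than
     * the server's current vote (curId, curZxid, curEpoch).
     * Comparison order: election epoch, then zxid, then server id (myid).
     */
    static boolean totalOrderPredicate(long newId, long newZxid, long newEpoch,
                                       long curId, long curZxid, long curEpoch) {
        return (newEpoch > curEpoch)
            || (newEpoch == curEpoch
                && (newZxid > curZxid
                    || (newZxid == curZxid && newId > curId)));
    }
}
```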

As long as the current server's state is LOOKING, it stays in a loop: it keeps reading notifications from other servers, comparing them with its own vote, updating its vote, sending its vote out, and tallying the results, until a Leader is elected or the loop exits with an error.

Election comparison parameters

① ServerId (myid): for example, with three servers their ids are 1, 2, and 3. The larger the id, the greater its weight in the election algorithm.

② Zxid: the transaction log id, i.e. the largest transaction id stored on the server; every transaction request generates a transaction log entry. The larger the value, the newer the data, and the newer the data, the greater the weight in the election algorithm.

③ Epoch: the logical clock, i.e. the voting round. It is the same within one round of voting, is incremented for each new round, and is compared with the value received in the vote information from other servers.

The voting process at cluster startup

① Each server issues a vote. Initially Server1, Server2, and Server3 all vote for themselves as the Leader server. Each vote contains the basic elements (myid, zxid) of the chosen server, so Server1's vote is (1,0) and Server2's vote is (2,0), and each server sends its vote to all the other machines in the cluster.

② Each server receives the votes from the other servers and checks their validity, including whether the vote belongs to the current round and whether it comes from a server in the LOOKING state.

③ PK the votes: after receiving votes from other servers, the server compares (PKs) each received vote with its own vote:

  1. Compare the zxid first: the server with the larger zxid is preferred as the Leader.

  2. If the zxids are equal, compare the myid: the server with the larger myid is preferred as the Leader server. As a result, after the PK both Server1 and Server2 hold the vote (2,0), so Server2 gains more than half of the votes; Server3 then votes for Server2 directly, and the Leader is finally determined (a small worked sketch of this PK follows below).
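
A small worked sketch of this PK for the example votes, applying the same ordering rule (the epoch is omitted, as in the (myid, zxid) votes above):

```java
public class ElectionWalkthrough {
    // Same ordering as totalOrderPredicate with equal epochs: zxid first, then myid.
    static boolean better(long newId, long newZxid, long curId, long curZxid) {
        return newZxid > curZxid || (newZxid == curZxid && newId > curId);
    }

    public static void main(String[] args) {
        // Fresh cluster start: every server first votes for itself as (myid, zxid).
        // Server1 -> (1,0), Server2 -> (2,0), Server3 -> (3,0).

        // Server1 receives (2,0): zxids are equal, myid 2 > 1, so Server1 switches to (2,0).
        System.out.println("Server1 adopts Server2's vote: " + better(2, 0, 1, 0));   // true

        // Server2 receives (1,0): zxids are equal, myid 1 < 2, so Server2 keeps (2,0).
        System.out.println("Server2 keeps its own vote: " + !better(1, 0, 2, 0));     // true

        // Two of the three servers now vote (2,0), which is more than half,
        // so Server2 becomes the Leader and Server3 follows it.
    }
}
```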

Author: Wang Qiaomin, Creditease Institute of Technology