What is ZooKeeper?

ZooKeeper is a distributed coordination framework, originally a subproject of Apache Hadoop. It solves data-management problems frequently encountered in distributed applications, such as unified naming, status synchronization, cluster management, and configuration management.

ZooKeeper provides two basic features: a file-system-like data structure and a node-listening (watch) mechanism.

Data structure

ZooKeeper maintains a hierarchical data structure similar to a Linux file system.



“/” can be understood as the root directory, and the entries under it are called znodes (directory nodes). As in a file system, we can freely add and remove znodes, and add and remove child znodes under a znode.
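As a rough illustration, here is a minimal Java sketch using the standard org.apache.zookeeper client (the connect string and paths are placeholders, not from the original text) that creates a znode, creates a child under it, and lists the children:

```java
import org.apache.zookeeper.*;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class ZnodeBasics {
    public static void main(String[] args) throws Exception {
        CountDownLatch connected = new CountDownLatch(1);
        // Connect string and session timeout are placeholders.
        ZooKeeper zk = new ZooKeeper("localhost:2181", 15000, event -> {
            if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                connected.countDown();
            }
        });
        connected.await();

        // Create a persistent znode and a child znode, like directories in a file system.
        zk.create("/app1", "hello".getBytes(),
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        zk.create("/app1/conf", "v1".getBytes(),
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

        List<String> children = zk.getChildren("/app1", false);
        System.out.println("children of /app1: " + children); // [conf]

        // A znode must have no children before it can be deleted.
        zk.delete("/app1/conf", -1);
        zk.delete("/app1", -1);
        zk.close();
    }
}
```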

ZooKeeper provides six node types (illustrated in the sketch after this list):

  1. PERSISTENT – persistent directory node
    • Once created, the node exists until it is explicitly deleted. Disconnecting the client from ZooKeeper does not remove it.
  2. PERSISTENT_SEQUENTIAL – persistent sequentially numbered directory node
    • Same as a persistent node, except that ZooKeeper appends a monotonically increasing sequence number to the node name.
  3. EPHEMERAL – ephemeral (temporary) directory node
    • Bound to the client session: when the session ends, the node is deleted. If the client disconnects and reconnects within the session timeout on the same session, the node survives; if it reconnects on a new session, the node is deleted.
  4. EPHEMERAL_SEQUENTIAL – ephemeral sequentially numbered directory node
    • Same as an ephemeral node, except that the node name is sequentially numbered.
  5. TTL node
    • Has a timeout: once the node has had no children and no modifications for longer than its TTL, it is deleted. The server runs a background task that scans every 60 seconds and removes expired nodes. [Disabled by default; it must be enabled via the system property zookeeper.extendedTypesEnabled=true and is considered unstable.]
  6. CONTAINER – container node
    • New in 3.5.3. A container node behaves like an ordinary node, except that once children have been created under it and the last child is deleted, the now-empty container node itself will be deleted at some point in the future. The same background task, scanning every 60 seconds, performs the cleanup.
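A rough Java sketch of the corresponding CreateMode values (paths and the TTL value are placeholders; the TTL and container modes assume zookeeper.extendedTypesEnabled=true on the server, as noted above):

```java
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class NodeTypes {
    static void createExamples(ZooKeeper zk) throws Exception {
        byte[] data = "v".getBytes();

        // 1. Persistent: survives until explicitly deleted.
        zk.create("/p", data, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

        // 2. Persistent sequential: the server appends a counter, e.g. /ps0000000001.
        zk.create("/ps", data, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT_SEQUENTIAL);

        // 3. Ephemeral: deleted when this client's session ends.
        zk.create("/e", data, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

        // 4. Ephemeral sequential: ephemeral + sequence number (used for fair locks).
        zk.create("/es", data, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);

        // 5. TTL: deleted once unmodified and childless for longer than the TTL.
        zk.create("/ttl", data, ZooDefs.Ids.OPEN_ACL_UNSAFE,
                CreateMode.PERSISTENT_WITH_TTL, null, 60_000); // TTL in milliseconds

        // 6. Container: deleted in the background after its last child is removed.
        zk.create("/c", data, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.CONTAINER);
    }
}
```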

Node-listening (watch) mechanism

The ZooKeeper server provides node watching so that clients can detect changes to nodes on the server. ZooKeeper provides three listening modes (a client-side sketch follows this list):

  1. Listen on a single node (get -w /path)
    • When the node is deleted or its data is modified, the server notifies the client that the node changed. [The notification only reports that the node was deleted or its data changed; it does not carry the new data, so the client must fetch the new data itself.]
  2. Listen on a directory (ls -w /path)
    • When a child node is added to or removed from the directory, the server notifies the client.
  3. Listen on a directory and its child nodes recursively (ls -R -w /path)
    • When the watched directory is deleted, its data is modified, or child nodes are added or removed anywhere in the subtree, the server notifies the client.

Note: all ZooKeeper watches take effect only once. Once triggered, the watch is removed, and the client must re-register it to keep receiving notifications. When listening recursively on a directory, each node's watch fires once.
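As a rough sketch of the usual client-side pattern (the path /app1/conf is a placeholder): read the data with a watch, and re-register the watch inside the callback, precisely because each watch fires only once:

```java
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class ConfigWatcher implements Watcher {
    private final ZooKeeper zk;
    private final String path;

    public ConfigWatcher(ZooKeeper zk, String path) throws Exception {
        this.zk = zk;
        this.path = path;
        read(); // register the initial watch
    }

    // Fetch the current data and re-arm the one-shot watch.
    private void read() throws Exception {
        byte[] data = zk.getData(path, this, null); // passing 'this' registers the watch
        System.out.println("config = " + new String(data));
    }

    @Override
    public void process(WatchedEvent event) {
        if (event.getType() == Event.EventType.NodeDataChanged) {
            try {
                // The event does not carry the new data; fetching it
                // both retrieves the value and registers a fresh watch.
                read();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}
```

This watch-and-refetch pattern is what the configuration-center scenario below builds on.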

Usage scenarios of ZooKeeper

Based on these two features, ZooKeeper can easily support the following scenarios:

  1. Distributed configuration center / distributed registry: using ZooKeeper's persistent nodes together with the listening mechanism, configuration and registration information can conveniently be stored on persistent znodes, and applications are notified through watches (see the sketch above) whenever the configuration changes. [Suitable for configuration data of small volume.]
  2. Distributed lock: because znode names are unique under a given parent, distributed locks are relatively simple to implement (a fair-lock sketch follows this list):
    • Fair lock: using ephemeral sequential nodes, each contender for the lock creates a sequential node under a common parent node and then watches the node immediately preceding its own. Lock rule: the node with the smallest sequence number holds the lock; when the holder releases the lock and its node is deleted, the next node in order is notified and acquires the lock.

    • Unfair lock: implemented by watching an ephemeral node with a fixed name. When a request comes in, the client first tries to acquire the lock; if it fails, it watches the node (get -w /path) and waits in a loop. When the thread holding the lock finishes, the server notifies all waiting clients, and they all compete for the lock again. Under high concurrency this is inefficient. Reason: the herd (stampede) effect; only one thread can acquire the lock at a time, yet every waiting thread is woken up.

    • Read/write lock: when an application has both a cache layer and a database layer, cached and stored data can become inconsistent under high concurrency. A read/write lock guarantees that a read is blocked until an in-progress write completes, and that a write cannot start while reads are in progress; it may proceed only after they complete.

      Watch rule: for consecutive reads, watch the most recent preceding write node; if there is no preceding write, no waiting is needed. For a write, watch the node immediately preceding your own.
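A minimal, illustrative fair-lock sketch (class name and lock path are my own; error handling and edge cases such as session loss are omitted):

```java
import org.apache.zookeeper.*;
import org.apache.zookeeper.data.Stat;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class FairLock {
    private static final String PARENT = "/lock"; // assumed pre-created persistent node
    private final ZooKeeper zk;
    private String myNode; // e.g. /lock/seq-0000000007

    public FairLock(ZooKeeper zk) {
        this.zk = zk;
    }

    public void lock() throws Exception {
        // Each contender adds an ephemeral sequential node under the parent.
        myNode = zk.create(PARENT + "/seq-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);

        while (true) {
            List<String> children = zk.getChildren(PARENT, false);
            Collections.sort(children);
            String mine = myNode.substring(PARENT.length() + 1);
            int i = children.indexOf(mine);
            if (i == 0) {
                return; // smallest sequence number: we hold the lock
            }
            // Watch only the node immediately preceding ours.
            String prev = PARENT + "/" + children.get(i - 1);
            CountDownLatch latch = new CountDownLatch(1);
            Stat stat = zk.exists(prev, event -> {
                if (event.getType() == Watcher.Event.EventType.NodeDeleted) {
                    latch.countDown();
                }
            });
            if (stat != null) {
                latch.await(); // woken when the previous node is deleted
            }
            // Loop to re-check: the previous node may already be gone.
        }
    }

    public void unlock() throws Exception {
        zk.delete(myNode, -1); // deleting our node wakes the next waiter
    }
}
```

Because each waiter watches only its immediate predecessor, releasing the lock wakes exactly one client, which avoids the herd effect described for the unfair lock.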

ZooKeeper cluster

  1. Copy zoo.cfg once per instance and modify the following settings (a single-machine pseudo-cluster is used as the example; each copy needs its own dataDir and clientPort)
    • dataDir=/zookeeper/zk1
    • clientPort=2181  ## client connection port
    • server.1=localhost:2888:3888  ## 2888: cluster data-synchronization port, 3888: election port
    • server.2=localhost:2889:3889
    • server.3=localhost:2890:3890
  2. Create a myid file in each dataDir directory containing that server's id (for server.1 the file contains just 1).
  3. Start the three services separately.
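Once the three instances are up, a client can list all three endpoints in its connect string and fail over between them. A short sketch (assuming client ports 2181–2183 for the three copies; only the first port appears in the example config above):

```java
import org.apache.zookeeper.ZooKeeper;

public class ClusterClient {
    public static void main(String[] args) throws Exception {
        // Assumed client ports 2181-2183 for the three pseudo-cluster copies.
        ZooKeeper zk = new ZooKeeper(
                "localhost:2181,localhost:2182,localhost:2183",
                15000,
                event -> System.out.println("event: " + event));
        // The client picks one server from the list and transparently
        // reconnects to another if that server goes down.
        Thread.sleep(2000); // crude wait for the connection in this sketch
        System.out.println("state: " + zk.getState());
        zk.close();
    }
}
```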

ZooKeeper cluster election process (a frequent interview question)

Take a three-node ZooKeeper cluster as an example. Once two or more services are started, the election begins:

  1. First round: when a ZK service starts, it sends a ballot to every other node listed in the configuration file. The ballot contains the sender's serverId, the ZXID of its latest data, and so on. At the same time, it receives ballots from the other nodes and compares them with its own.
  2. Second round: each ZK service sends to the other nodes the highest-priority ballot among all the ballots it has received (including its own). As soon as more than half of the votes go to one service, the election ends and that machine becomes the leader.
  3. When the remaining services connect to the cluster and discover that a leader already exists, they automatically set themselves as followers.
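The "priority" comparison can be sketched roughly as follows (a deliberate simplification of the real FastLeaderElection logic, which also compares election epochs; the class and field names are my own): a ballot with a newer ZXID wins, and ties are broken by the larger serverId.

```java
// Simplified ballot comparison: not the actual ZooKeeper source,
// just an illustration of the ordering the election uses.
public class Ballot {
    final long zxid;     // transaction id of the node's latest data
    final long serverId; // the node's configured id (myid)

    Ballot(long zxid, long serverId) {
        this.zxid = zxid;
        this.serverId = serverId;
    }

    // Returns true if this ballot beats 'other': newer data first,
    // then the larger server id as the tie-breaker.
    boolean beats(Ballot other) {
        if (this.zxid != other.zxid) {
            return this.zxid > other.zxid;
        }
        return this.serverId > other.serverId;
    }
}
```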

Data synchronization in the ZooKeeper cluster

After the election, clients connect to the ZooKeeper leader node to write data. The leader node opens a socket and waits for follower nodes to connect. After a follower connects to the leader, the leader continuously sends new data to it. If no new data is being written and the old data has been fully synchronized, the leader periodically sends empty heartbeat packets over the socket to keep the connections alive.