ZooKeeper is an open-source distributed coordination service. It encapsulates complex, error-prone distributed consistency services into an efficient and reliable set of primitives and exposes them to users through a few simple interfaces.
ZooKeeper is a typical distributed data consistency solution. Distributed applications can build on it to implement data publish/subscribe, load balancing, naming services, cluster management, distributed locks, and distributed queues.
Basic concepts of ZooKeeper
Cluster roles
ZooKeeper defines three server roles: Leader, Follower, and Observer. The Leader server provides both read and write services to clients. All other machines, both Followers and Observers, provide read services. The only difference between the two is that an Observer takes part neither in Leader election nor in the "more than half must succeed" (majority) rule for write operations, so Observers can increase the cluster's read capacity without affecting write performance.
Session
A session is a long-lived TCP connection between a client and a server.
Data Node (Znode)
ZooKeeper stores all data in memory. The data model is a ZNode tree. Each path, with components separated by slashes (/), identifies a ZNode, for example /app/path1. Each ZNode stores its own data content along with a set of attributes.
Version
ZooKeeper maintains a data structure called Stat for each ZNode. Stat records three versions of the ZNode: version (the version of the ZNode's data), cversion (the version of the ZNode's child list), and aversion (the version of the ZNode's ACL).
Event Listener (Watcher)
The Watcher (event listener) is a very important feature of ZooKeeper. ZooKeeper allows users to register Watchers on specified nodes. When certain events are triggered, the ZooKeeper server notifies the interested clients. This mechanism is a key building block of ZooKeeper's distributed coordination service.
Permission Control (ACL)
ZooKeeper uses Access Control Lists (ACLs) to control permissions. An ACL can grant the following permissions:
- CREATE: permission to create child nodes.
- READ: permission to read node data and the list of child nodes.
- WRITE: permission to update node data.
- DELETE: permission to delete child nodes.
- ADMIN: permission to set the ACL of a node.
Note that CREATE and DELETE apply to the child nodes of a node, not to the node itself.
Server Roles (TODO)
Leader
The Leader server is the core of the Zookeeper cluster. Its main tasks are as follows:
- It is the sole scheduler and processor of transaction requests, which guarantees that cluster transactions are processed in order.
- It schedules the servers within the cluster.
Follower
The Follower server follows the state of the ZooKeeper cluster. Its main tasks are as follows:
- Handle non-transactional client requests (reads) and forward transactional requests to the Leader server.
- Participate in voting on transaction request Proposals.
- Participate in Leader election voting.
Observer
The Observer is a new server role introduced by ZooKeeper since version 3.3.0. Literally, the server acts as an observer — it watches the latest state changes in the ZooKeeper cluster and synchronizes those state changes.
The Observer server works in much the same way as a Follower. It processes non-transactional requests independently and forwards transactional requests to the Leader server for processing. The only difference from a Follower is that an Observer does not participate in any form of voting, neither transaction request Proposal voting nor Leader election voting. Simply put, an Observer provides only non-transactional services, and it is typically used to increase the cluster's non-transactional (read) capacity without compromising its transactional capacity.
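The voting rule described above can be sketched in a few lines. This is a minimal illustration, not ZooKeeper's implementation: the function names `quorum_size` and `write_committed` are hypothetical, and the sketch assumes that the voters are the Leader plus the Followers and that a write is committed once more than half of them acknowledge it.

```python
def quorum_size(voting_members: int) -> int:
    """Smallest number of acknowledgements that is more than half of the voters."""
    return voting_members // 2 + 1


def write_committed(acks: int, followers: int) -> bool:
    """Decide whether a proposal has enough acknowledgements.

    Voters are the Leader plus the Followers. Observers are not counted,
    so adding Observers never changes the quorum size.
    """
    voters = followers + 1  # +1 for the Leader itself
    return acks >= quorum_size(voters)


if __name__ == "__main__":
    # 1 Leader + 4 Followers (plus any number of Observers): 5 voters.
    print(quorum_size(5))
    print(write_committed(acks=3, followers=4))
    print(write_committed(acks=2, followers=4))
```

Because Observers never appear in `voters`, read capacity can be scaled out by adding Observers while the number of acknowledgements a write must wait for stays the same.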
ZooKeeper data model: Znode
All ZooKeeper information is stored on data nodes, called Znodes. A Znode is the smallest storage unit in ZooKeeper. Znodes can be attached beneath other Znodes, forming a hierarchy called the Znode tree.
Znode types
- Persistent node: the most common node type in ZooKeeper. Once created, a persistent node remains on the server until it is explicitly deleted.
- Persistent sequential node: has the same properties as a persistent node, plus ordering. When such a node is created, a numeric suffix is appended to its name to indicate its order.
- Ephemeral (temporary) node: a node whose life cycle is tied to that of the client session. When the client session ends, the node is deleted. Unlike persistent nodes, ephemeral nodes cannot have child nodes.
- Ephemeral sequential node: an ephemeral node that, like a persistent sequential node, is created with a numeric suffix appended to its name.
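The four node types can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: `ToyTree` and its methods are hypothetical and are not the ZooKeeper client API. The sequential suffix is modelled as a per-parent counter formatted as 10 zero-padded digits. The toy counter advances only on sequential creates, which is a simplification of what a real server does.

```python
from typing import Dict, Optional


class ToyTree:
    """Toy znode tree: persistent/ephemeral and sequential nodes (hypothetical)."""

    def __init__(self) -> None:
        # path -> owning session id; None means the node is persistent
        self.nodes: Dict[str, Optional[int]] = {"/": None}
        # parent path -> next sequence number handed out under that parent
        self.counters: Dict[str, int] = {}

    def create(self, path: str, session_id: int,
               ephemeral: bool = False, sequential: bool = False) -> str:
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self.nodes:
            raise KeyError(f"parent {parent} does not exist")
        if self.nodes[parent] is not None:
            raise ValueError("ephemeral nodes cannot have children")
        if sequential:
            n = self.counters.get(parent, 0)
            self.counters[parent] = n + 1
            path = f"{path}{n:010d}"  # append a 10-digit, zero-padded suffix
        if path in self.nodes:
            raise ValueError(f"{path} already exists")
        self.nodes[path] = session_id if ephemeral else None
        return path

    def close_session(self, session_id: int) -> None:
        # Ending a session removes every ephemeral node that session owns.
        for path in [p for p, owner in self.nodes.items() if owner == session_id]:
            del self.nodes[path]


if __name__ == "__main__":
    tree = ToyTree()
    tree.create("/locks", session_id=1)  # persistent
    a = tree.create("/locks/lock-", 1, ephemeral=True, sequential=True)
    b = tree.create("/locks/lock-", 2, ephemeral=True, sequential=True)
    print(a, b)
    tree.close_session(1)
    print(sorted(tree.nodes))
```

Ephemeral sequential nodes under a shared parent are the usual building block for distributed locks and leader election: the numeric suffix orders the contenders, and a crashed client's node disappears with its session.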
Znode status information
The content of ZNode consists of two parts: node data content and node status information. In the figure, [persistent node order] is the data content, and the others are state information.
- cZxid (Created ZXID): the transaction ID of the transaction that created the node.
- ctime (Created Time): the time at which the node was created.
- mZxid (Modified ZXID): the transaction ID of the transaction that last modified the node.
- mtime (Modified Time): the time at which the node was last modified.
- pZxid: the transaction ID of the transaction that last modified the node's child list. pZxid is updated only when the child list changes, not when the content of a child changes.
- cversion: the version number of the node's child list.
- dataVersion: the version number of the node's data content.
- aclVersion: the version number of the node's ACL.
- ephemeralOwner: the session ID of the session that created the node, if it is an ephemeral node. For a persistent node, the value is 0.
- dataLength: the length of the node's data content.
- numChildren: the number of direct child nodes.
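To make the update rules for these fields concrete, here is a small toy model. `ToyStat` and `ToyNode` are hypothetical names, not ZooKeeper classes. The sketch assumes that a data update touches the modified ZXID, the data version and the data length, and that a change to the child list touches the pZxid, cversion and child count.

```python
from dataclasses import dataclass


@dataclass
class ToyStat:
    czxid: int            # zxid of the creating transaction
    mzxid: int            # zxid of the last data modification
    pzxid: int            # zxid of the last change to the child list
    cversion: int = 0     # child-list version
    data_version: int = 0
    data_length: int = 0
    num_children: int = 0


class ToyNode:
    def __init__(self, zxid: int, data: bytes = b"") -> None:
        self.data = data
        self.stat = ToyStat(czxid=zxid, mzxid=zxid, pzxid=zxid,
                            data_length=len(data))

    def set_data(self, zxid: int, data: bytes) -> None:
        # A data update does not touch pzxid or cversion.
        self.data = data
        self.stat.mzxid = zxid
        self.stat.data_version += 1
        self.stat.data_length = len(data)

    def child_list_changed(self, zxid: int, delta: int) -> None:
        # Adding (+1) or removing (-1) a child does not touch mzxid.
        self.stat.pzxid = zxid
        self.stat.cversion += 1
        self.stat.num_children += delta


if __name__ == "__main__":
    parent = ToyNode(zxid=1)
    parent.child_list_changed(zxid=2, delta=+1)  # a child is created
    parent.set_data(zxid=3, data=b"hello")       # the parent's own data changes
    print(parent.stat)
```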
Watcher: data change notification
Zookeeper uses the Watcher mechanism to implement publish/subscribe for distributed data. A typical publish/subscribe model system defines a one-to-many subscription relationship that allows multiple subscribers to listen to a topic object at the same time and notify all subscribers when the topic object itself changes in state so that they can act accordingly.
Zookeeper introduces the Watcher mechanism to implement this distributed notification function. Zookeeper allows clients to register a Watcher listener with the server. When the Watcher is triggered by some specified event on the server, an event notification is issued to the client for distributed notification.
The Watcher mechanism involves three parts: the client thread, the client-side WatcherManager, and the ZooKeeper server.
The specific working process is as follows:
- When the client registers a Watcher with the ZooKeeper server, it also stores the Watcher object in the client-side WatcherManager.
- When the ZooKeeper server triggers a Watcher event, it sends a notification to the client.
- The client thread retrieves the corresponding Watcher object from the WatcherManager and runs its callback logic.
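The three steps above can be sketched as a toy in-process model. All class names here (`ToyWatcherManager`, `ToyServer`, `ToyClient`) are hypothetical and there is no networking. The sketch also assumes one-shot watches, meaning a watch is removed once it has fired, which is how ZooKeeper's standard watches behave.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List, Set

Callback = Callable[[str, str], None]  # (path, event_type) -> None


class ToyWatcherManager:
    """Client-side store of Watcher callbacks, keyed by znode path."""

    def __init__(self) -> None:
        self._watchers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def register(self, path: str, callback: Callback) -> None:
        self._watchers[path].append(callback)

    def dispatch(self, path: str, event_type: str) -> None:
        # Step 3: take the callbacks out (one-shot) and run them.
        for callback in self._watchers.pop(path, []):
            callback(path, event_type)


class ToyServer:
    """Remembers which clients watch which path; it never sees the callbacks."""

    def __init__(self) -> None:
        self._watched: DefaultDict[str, Set["ToyClient"]] = defaultdict(set)

    def add_watch(self, path: str, client: "ToyClient") -> None:
        self._watched[path].add(client)

    def set_data(self, path: str) -> None:
        # Step 2: the event fires and a notification goes to each watching client.
        for client in self._watched.pop(path, set()):
            client.on_notification(path, "NodeDataChanged")


class ToyClient:
    def __init__(self, server: ToyServer) -> None:
        self.server = server
        self.manager = ToyWatcherManager()

    def watch(self, path: str, callback: Callback) -> None:
        # Step 1: register with the server and keep the Watcher locally.
        self.server.add_watch(path, self)
        self.manager.register(path, callback)

    def on_notification(self, path: str, event_type: str) -> None:
        self.manager.dispatch(path, event_type)


if __name__ == "__main__":
    server = ToyServer()
    client = ToyClient(server)
    client.watch("/app/config", lambda p, e: print("changed:", p, e))
    server.set_data("/app/config")  # callback runs
    server.set_data("/app/config")  # nothing happens: the watch was one-shot
```

Keeping the callback objects on the client and sending only a small event to it keeps the server-side state light. A client that wants continuous updates re-registers the watch inside its callback.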
ACL: ensuring data security
The ACL mechanism can be understood in terms of three parts: the permission mode (scheme), the authorization object (ID), and the permission. A valid ACL entry is usually written as "scheme:id:permission".
Permission mode: Scheme
- IP: permission control at IP address granularity. For example, "ip:192.168.0.110" grants permission to that single IP address. The IP mode also supports network segments. For example, "ip:192.168.0.1/24" applies to the segment 192.168.0.*.
- Digest: the most commonly used permission mode, and the one closest to the usual notion of access control. It identifies a user with a "username:password" pair. Once the identifier is configured in that form, ZooKeeper applies SHA-1 hashing and then BASE64 encoding to it.
- World: the most open permission mode. It imposes almost no restriction: all users can access the data on ZooKeeper without any permission check. World can also be regarded as a special Digest mode with a single permission identifier, "world:anyone".
- Super: as the name implies, this is the superuser mode, which is also a special Digest mode. In Super mode, the superuser can perform any operation on any ZooKeeper data node.
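The network-segment matching described for the IP mode can be illustrated with Python's standard `ipaddress` module. This is only a sketch of the matching rule, not ZooKeeper's own code, and the helper name `ip_acl_matches` is hypothetical.

```python
import ipaddress


def ip_acl_matches(acl_id: str, client_ip: str) -> bool:
    """Return True if client_ip falls under the IP-mode ID (address or segment)."""
    # strict=False lets "192.168.0.1/24" (host bits set) be read as the
    # 192.168.0.0/24 network; a bare address becomes a single-host network.
    network = ipaddress.ip_network(acl_id, strict=False)
    return ipaddress.ip_address(client_ip) in network


if __name__ == "__main__":
    print(ip_acl_matches("192.168.0.1/24", "192.168.0.77"))
    print(ip_acl_matches("192.168.0.1/24", "192.168.1.77"))
    print(ip_acl_matches("192.168.0.110", "192.168.0.110"))
```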
Authorization object: ID
Permission mode | Authorization object |
---|---|
IP | Usually an IP address or IP segment, for example, 192.168.10.110 or 192.168.10.1/24 |
Digest | User-defined, usually username:BASE64(SHA-1(username:password)), for example, zm:sdfndsllNDLksfn7c= |
World | There is only one ID: anyone |
Super | The superuser |
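The Digest ID format in the table, username:BASE64(SHA-1(username:password)), can be computed with the Python standard library. This is a minimal sketch of that formula only. The function name `digest_acl_id` and the sample credentials are made up for illustration.

```python
import base64
import hashlib


def digest_acl_id(username: str, password: str) -> str:
    """Build username:BASE64(SHA-1(username:password))."""
    raw = f"{username}:{password}".encode("utf-8")
    hashed = hashlib.sha1(raw).digest()               # 20 raw bytes
    encoded = base64.b64encode(hashed).decode("ascii")
    return f"{username}:{encoded}"


if __name__ == "__main__":
    # Hypothetical credentials, for illustration only.
    print(digest_acl_id("zm", "secret"))
```

The password itself never appears in the ACL. Only the hash is stored, and a client proves its identity by presenting the plain "username:password" pair, which the server hashes and compares.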
Permission
Permission refers to the operations that may be performed once the permission check has passed. ZooKeeper defines the following data operation permissions:
- CREATE (C): permission to create child nodes under the data node.
- DELETE (D): permission to delete child nodes of the data node.
- READ (R): permission to read the data node, that is, its data content and its list of child nodes.
- WRITE (W): permission to update the data content of the data node.
- ADMIN (A): management permission on the data node, which allows ACL-related operations on it.
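As an illustration of how a "scheme:id:permission" string breaks down, the sketch below parses one into its three parts and turns the permission letters into a bitmask. The helper names are hypothetical. The bit values (READ=1, WRITE=2, CREATE=4, DELETE=8, ADMIN=16) follow the constants in ZooKeeper's Java `ZooDefs.Perms`.

```python
from typing import Tuple

# Bit values as in ZooKeeper's ZooDefs.Perms.
PERMS = {"r": 1, "w": 2, "c": 4, "d": 8, "a": 16}


def parse_perms(letters: str) -> int:
    """Turn a string such as "cdrwa" into a permission bitmask."""
    mask = 0
    for letter in letters.lower():
        mask |= PERMS[letter]  # raises KeyError on an unknown letter
    return mask


def parse_acl(entry: str) -> Tuple[str, str, int]:
    """Split "scheme:id:permission" into (scheme, id, bitmask).

    The ID itself may contain ':' (for example a Digest ID), so the scheme
    is split off from the left and the permission from the right.
    """
    scheme, rest = entry.split(":", 1)
    acl_id, letters = rest.rsplit(":", 1)
    return scheme, acl_id, parse_perms(letters)


def allows(mask: int, letter: str) -> bool:
    return bool(mask & PERMS[letter])


if __name__ == "__main__":
    print(parse_acl("world:anyone:cdrwa"))
    print(parse_acl("ip:192.168.0.1/24:r"))
```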