Zookeeper
Distributed information sharing center
Based on CP model in CAP theorem
I. Application scenarios
-
Master the election
-
The temporary node becomes Master after being created. The temporary node is monitored after being created
-
-
A distributed lock
-
Exclusive lock: If the temporary node is successfully created, the lock is obtained. If the temporary node is not successfully created, monitor the temporary node
-
A Shared lock:
-
Use temporary order nodes
-
If it is a read request and all nodes with a smaller sequence number are read requests, it indicates that the shared lock is obtained
-
If the request is a write request, if the serial number of the request is not the smallest, wait
To avoid herding, each node should focus only on the node with its own number 1 smaller
-
-
-
Publish/subscribe common configuration
-
Machine list
-
Switch configuration
-
Database Configuration
Small amount of data, dynamic change of content, consistent configurations on all machines
-
-
Cluster management
-
Statistics the number of machines in a cluster
-
Get the machine on and off line
-
Monitor machine status
-
-
The naming service
- Get unique ID
Second, the role of
- Leader
- Provides read/write services
- A unique scheduler and handler of transaction requests to ensure the sequence of transaction processing
- Follower
- Provide read service
- Forward transaction requests to the Leader
- Participate in the Proposal vote
- Participate in Leader election
- Observer
- Provide read service
- Do not participate in the Proposal and Leader election
Session
-
A TCP long connection between a client and a server that maintains a valid session using heartbeat detection
-
The client can send requests and receive responses to the server
-
The client can receive notifications of Watch events from the server
Data node Znode
There are two meanings: 1. machine node 2. data unit
-
The smallest data unit in Zookeeper. A Znode can be attached to a Znode to form a Znode Tree. – In the Znode Tree, the/split path is used
-
Responsible for saving their own data content, attribute information
-
classification
- Persistent node
- Ephemeral is the temporary node
- Sequential nodes Sequential
-
The specific type
- Persistent node
- Exists until manually deleted
- Persistent order node
- Nodes are followed by numeric suffixes to indicate order
- Temporary node
- Nodes that will be automatically deleted
- Bound to a client session. When the session ends, the node is deleted
- Cannot create child nodes
- Temporary sequence node
- A temporary node with an order, followed by a numeric suffix, to indicate the order
- Persistent node
-
Attribute information
-
CZxid: transaction ID when the node is created
-
Ctime: time when a node is created
-
MZxid: indicates the transaction ID of the node that was last modified
-
Mtime: indicates the time when the controller was last modified
-
PZxid: transaction ID of the child node list that was last changed
-
Cversion: indicates the version number of a child node
-
DataVersion: Content version number
-
AclVersion: indicates the ACL version number
-
EphemeralOwner: The session ID of the temporary node – 0 if it is a persistent node
-
DataLength: indicates the dataLength
-
NumChildren: indicates the number of direct child nodes
Grandson nodes don’t count
-
-
version
-
Version: indicates the current Znode version
-
Cversion: indicates the version of a Znode child node
-
Uncomfortable: The ACL version of the current Znode
-
5. Event listener Watcher
○ Push-pull mode
○ The server sends the Watcher event notification to the client. After receiving the notification, the client proactively obtains the latest data from the server
-
Used to implement the publish/subscribe function
-
Clients can register Watcher with a specified Znode, and when a particular event is triggered (when the Znode content or its child node list changes), the server notifies interested clients of the event
-
The registration process
- The client sends a registration request to Zookeeper
- The client stores the Watcher object in its own WatcherManager
-
The notification process
-
Zookeeper triggers the Watcher event and sends a notification to the client
-
The client thread pulls the corresponding Watcher object from the WatcherManager to run the callback logic
-
6. Permission control policy ACL
Access Control Lists
1. Permission Scheme
- IP
- Permissions are controlled by IP address granularity
- Authorization Object: IP address or IP address segment
- Digest
- Configure the permission tag in the format of “username:password”
- Authorization object: user-defined, usually username:BASE64(SHA-1(username:password)). For example, zm: sdfndsllndlksfn7c =
- World
- The most open mode of permission control has little effect
- To whom: Only one, Anyone
- Super
- Super user, can operate on any Znode
- Authorized object: super user
2. The access Permission
- CREATE: Permission to CREATE child nodes
- READ: Permission to obtain node data and child node list
- WRITE: permission to update node data
- DELETE: deletes the permission of the child node
- ADMIN: Sets the ACL permission of a node
ZAB Agreement
○ ZAB has only one Proposer compared to Paxos, so there is no phase in which a Proposer has more than one proposal, so it does not vote on the proposals and only needs to determine whether the Follower is alive and can run a transaction
○ Followers send a Commit message to run the transaction, which is similar to a two-phase Commit of 2PC
1. The transaction
- Operations that can change the Zookeeper server status: Create, delete, and update a Znode
- For each transaction request, a transaction ID(ZXID) is assigned to represent a -64 digit number that identifies the global order of the requests
2. Message broadcast process
- The Leader receives the client request, generates a Proposal for the transaction, and sends it to followers
- The Follower receives the Proposal, writes it to the hard disk as a transaction log, and sends an ACK to the Leader
- After receiving more than half of the ACK responses, the Leader broadcasts a Commit message to all followers
- The Follower receives the Commit message and commits the transaction
Crash recovery
1. The Leader election
Ensure that the new Leader elected has the maximum ZXID
-
Transactions that have been committed on the Leader server are eventually committed by all servers
-
Discard transactions that are committed (not committed) only on the Leader server
2. Data synchronization process
-
The Leader prepares a queue for each Follower
-
The Leader sends a Proposal message to the followers if the transaction is not synchronized by the followers
-
The Leader sends a Commit message immediately after each Proposal message
-
After the followers have synchronized all unsynchronized transactions from the Leader, the Leader adds the followers to the list of available followers