This section describes Zookeeper:
- The basic concept
- Zookeeper characteristics
  - Sequential consistency
  - Atomicity
  - Single system image
  - Reliability
- Zookeeper application scenarios
  - Distributed locks
  - Naming service
  - Data publishing and subscription
- Where Zookeeper is used
The basic concept
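- Zookeeper can serve as a service registry and as the basis for distributed locks
- Zookeeper is part of the Hadoop ecosystem
- A Zookeeper cluster is built from an odd number of servers: the cluster stays available only while a majority of its servers are up, so an even-sized cluster tolerates no more failures than the next smaller odd-sized one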
- Primitives:
  - A term from operating systems and computer networking
  - A procedure made up of several instructions that together perform one function
  - Primitives are indivisible: a primitive must execute continuously and cannot be interrupted partway through
- Zookeeper is an open-source distributed coordination service framework:
  - It encapsulates complex, error-prone distributed consistency services into an efficient set of primitives and exposes them to callers through a set of easy-to-use interfaces
- Zookeeper provides a highly available, high-performance, and stable solution for distributed data consistency. It is typically used for:
  - Data publishing and subscription
  - Load balancing
  - Naming service
  - Distributed coordination and notification
  - Cluster management
  - Master election
  - Distributed locks
  - Distributed queues
- Zookeeper keeps its data in memory, which gives it high performance. It performs best for applications that read far more often than they write, which is the typical pattern for a coordination service: every write has to be synchronized across the servers, whereas a read can be answered by the server the client is connected to
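The odd-number rule for cluster size mentioned in this section can be checked with a little arithmetic. The sketch below assumes the usual majority-quorum rule (a cluster keeps working while more than half of its servers are up). The function names `quorum_size` and `tolerated_failures` are made up for illustration and are not part of any Zookeeper API.

```python
def quorum_size(servers: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return servers // 2 + 1


def tolerated_failures(servers: int) -> int:
    """How many servers can fail while a majority is still running."""
    return servers - quorum_size(servers)


if __name__ == "__main__":
    # Compare odd and even cluster sizes side by side.
    for n in range(3, 8):
        print(f"{n} servers: quorum={quorum_size(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Under this assumption, 3 and 4 servers both survive one failure, and 5 and 6 both survive two, so an extra even-numbered server adds cost without adding fault tolerance.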
Zookeeper characteristics
Sequential consistency
- Transaction requests from the same client are applied to Zookeeper strictly in the order in which they were issued
Atomicity
- The result of a transaction request is applied consistently on every machine in the cluster
- That is, either every machine in the cluster applies a given transaction or none of them does
Single system image
- No matter which Zookeeper server a client connects to, it sees the same view of the server-side data model
Reliability
- Once a change request has been applied, its result persists until the next change overwrites it
Zookeeper application scenarios
- Zookeeper can store data, but it is not suited to storing large amounts of it
Distributed locks
- A client acquires a distributed lock by creating a node that can exist only once; whoever creates it holds the lock
- The lock is released when the holder finishes executing the protected code, or when the holder goes down
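The lock recipe above can be sketched without a running cluster. The `ToyZk` class below is a hypothetical in-memory stand-in, not the real Zookeeper client API. It models only the two properties the recipe relies on: a path can be created once, and a node created as ephemeral (an assumption about how the lock node is created) disappears when the session that owns it ends.

```python
class NodeExistsError(Exception):
    """Raised by the toy model when a path already exists."""


class ToyZk:
    """Hypothetical in-memory model of a node namespace; not a real client."""

    def __init__(self):
        self._owner = {}  # path -> id of the session that created the node

    def create_ephemeral(self, path, session):
        if path in self._owner:
            raise NodeExistsError(path)
        self._owner[path] = session

    def delete(self, path):
        self._owner.pop(path, None)

    def close_session(self, session):
        # Ephemeral nodes vanish together with the session that created them.
        self._owner = {p: s for p, s in self._owner.items() if s != session}


def try_lock(zk, path, session):
    """Return True if this session created the lock node, False otherwise."""
    try:
        zk.create_ephemeral(path, session)
        return True
    except NodeExistsError:
        return False


if __name__ == "__main__":
    zk = ToyZk()
    print(try_lock(zk, "/locks/job", "A"))  # A creates the node and holds the lock
    print(try_lock(zk, "/locks/job", "B"))  # B fails while the node exists
    zk.close_session("A")                   # A goes down; its node is removed
    print(try_lock(zk, "/locks/job", "B"))  # B can now take the lock
```

A normal release corresponds to `delete` after the protected code has run; `close_session` models the outage case.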
Naming service
- Generate globally unique IDs from Zookeeper sequential nodes
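As a rough illustration of ID generation, the toy class below mimics how a sequential node gets a monotonically increasing, zero-padded counter appended to its name. `ToySequenceParent` and `next_id` are hypothetical names. A real client would instead create a sequential node under a parent and read the suffix from the path it gets back.

```python
import itertools
import threading


class ToySequenceParent:
    """Hypothetical in-memory model of a parent node that numbers its children."""

    def __init__(self, prefix):
        self._prefix = prefix
        self._counter = itertools.count()
        self._lock = threading.Lock()  # stands in for the server ordering the creates

    def create_sequential(self):
        """Return a new child name with a 10-digit, zero-padded sequence suffix."""
        with self._lock:
            return f"{self._prefix}{next(self._counter):010d}"


def next_id(parent):
    """Derive a numeric ID from the suffix of a freshly created sequential node."""
    name = parent.create_sequential()
    return int(name[-10:])


if __name__ == "__main__":
    parent = ToySequenceParent("/ids/id-")
    print(parent.create_sequential())
    print(next_id(parent))
```

Because the counter lives in one place and each create takes the next value, two callers never receive the same ID in this model.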
Data publishing and subscription
- Zookeeper’s Watcher mechanism makes it easy to publish and subscribe to data
- Data is published to a watched Zookeeper node; other machines listen for changes to that node and update their configuration dynamically
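The publish/subscribe flow above can be sketched with a toy model. `ToyConfigNode` and `Subscriber` are hypothetical names, not the Zookeeper client API. The model assumes one-shot watches: a watch fires once on the next change, and the subscriber registers it again when it re-reads the data.

```python
class ToyConfigNode:
    """Hypothetical in-memory model of one watched node; not a real client."""

    def __init__(self, data):
        self._data = data
        self._watchers = []  # callbacks waiting for the next change

    def get(self, watcher=None):
        """Read the data and optionally register a one-shot watcher."""
        if watcher is not None:
            self._watchers.append(watcher)
        return self._data

    def set(self, data):
        """Publish new data and notify every watcher registered so far."""
        self._data = data
        pending, self._watchers = self._watchers, []
        for notify in pending:
            notify()


class Subscriber:
    """Keeps a local copy of the configuration in sync with the node."""

    def __init__(self, node):
        self._node = node
        self.config = None
        self.refresh()

    def refresh(self):
        # Re-read the data and re-register the watch in a single step.
        self.config = self._node.get(watcher=self.refresh)


if __name__ == "__main__":
    node = ToyConfigNode("timeout=5")
    worker = Subscriber(node)
    node.set("timeout=10")  # the publisher writes a new configuration
    print(worker.config)    # the subscriber has picked up the change
```

Re-registering the watch inside `refresh` keeps the subscriber informed of later changes as well, not just the first one.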
Where Zookeeper is used
- Kafka:
  - Zookeeper handles broker and topic registration for Kafka, as well as load balancing across multiple partitions
- HBase:
  - Zookeeper ensures that the HBase cluster has only one Master
  - It stores and serves RegionServer status information
- Hadoop:
  - Zookeeper provides high-availability support for the NameNode