This section introduces ZooKeeper

  • Basic concepts
  • ZooKeeper characteristics
    • Sequential consistency
    • Atomicity
    • Single system image
    • Reliability
  • ZooKeeper application scenarios
    • Distributed locks
    • Naming service
    • Data publishing and subscription
  • Projects that use ZooKeeper

Basic concepts

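  • ZooKeeper can serve as a service registry and as the basis for distributed locks

  • ZooKeeper is part of the Hadoop ecosystem

  • A ZooKeeper cluster is normally built from an odd number of servers, because the cluster needs a majority of servers to stay available and an extra even-numbered server does not increase the number of failures it can tolerate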

  • Primitives:

    • A term from operating systems and computer networking
    • A procedure made up of several instructions that performs a specific function
    • Primitives are indivisible: once a primitive starts, it runs to completion and cannot be interrupted
  • ZooKeeper is an open-source framework for distributed coordination services:

    • It encapsulates complex, error-prone distributed consistency services into an efficient set of primitives and exposes them to callers through a set of easy-to-use interfaces
  • ZooKeeper provides a highly available, high-performance, and stable solution for distributed data consistency. It is typically used for:

    • Data publishing and subscription
    • Load balancing
    • Naming service
    • Distributed coordination and notification
    • Cluster management
    • Master election
    • Distributed locks
    • Distributed queues
  • ZooKeeper keeps its data in memory, which gives it high performance. It performs best when reads outnumber writes, which is the typical workload of a coordination service, because every write has to be synchronized across the servers
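
The recommendation to use an odd number of servers follows from majority-quorum arithmetic. The sketch below is only that arithmetic in plain Python; the function names `quorum_size` and `tolerated_failures` are made up for illustration and are not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that can fail while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


# Print ensemble size, quorum size, and tolerated failures for sizes 3 to 6.
for n in range(3, 7):
    print(n, quorum_size(n), tolerated_failures(n))
```

A cluster of 4 tolerates one failure, the same as a cluster of 3, and a cluster of 6 tolerates two, the same as a cluster of 5. The extra even-numbered server adds cost without adding fault tolerance.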

ZooKeeper characteristics

Sequential consistency

  • Transaction requests from the same client are applied to ZooKeeper strictly in the order in which they were sent

Atomicity

  • The result of every transaction request is applied consistently across all machines in the cluster
  • That is, a transaction is either applied successfully on every machine in the cluster or on none of them

Single system image

  • No matter which ZooKeeper server a client connects to, it sees the same view of the server-side data

Reliability

  • Once a change request is applied, the result of the change is persisted until overwritten by the next change

ZooKeeper application scenarios

  • ZooKeeper stores data, but it is not suited to storing large amounts of it

Distributed locks

  • A client acquires a distributed lock by creating a node that can exist only once
  • The lock is released when the client that holds it finishes executing the protected code, or when that client crashes
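
The two rules above can be sketched with a toy in-memory model. This is not a real ZooKeeper client: `ToyZooKeeper`, `create_ephemeral`, `expire_session`, and the path `/locks/order-job` are all hypothetical names chosen for illustration. The model assumes the lock node is ephemeral, meaning it belongs to a client session and disappears when that session ends.

```python
class ToyZooKeeper:
    """In-memory stand-in for a znode tree; NOT a real ZooKeeper client."""

    def __init__(self):
        self._nodes = {}  # path -> id of the session that owns the node

    def create_ephemeral(self, path, session_id):
        """Create the node if it is absent. Return True on success."""
        if path in self._nodes:
            return False  # someone else already holds the lock
        self._nodes[path] = session_id
        return True

    def delete(self, path, session_id):
        """Release the lock explicitly, but only if this session owns it."""
        if self._nodes.get(path) == session_id:
            del self._nodes[path]

    def expire_session(self, session_id):
        """Drop every node owned by a session, as happens after a crash."""
        self._nodes = {p: s for p, s in self._nodes.items() if s != session_id}


zk = ToyZooKeeper()
LOCK_PATH = "/locks/order-job"  # hypothetical lock path

got_a = zk.create_ephemeral(LOCK_PATH, "client-a")        # first creator wins
got_b = zk.create_ephemeral(LOCK_PATH, "client-b")        # node exists, so this fails
zk.expire_session("client-a")                             # client-a crashes
got_b_retry = zk.create_ephemeral(LOCK_PATH, "client-b")  # lock is free again
```

Tying the node to the session is what prevents a crashed holder from blocking everyone else indefinitely. A production lock would also need waiting and notification, which this sketch leaves out.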

Naming service

  • Globally unique IDs can be generated from ZooKeeper sequential nodes
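
As a rough illustration of how sequential nodes yield unique IDs, the toy model below mimics the way ZooKeeper appends a zero-padded, monotonically increasing 10-digit counter to the requested path. `ToySequenceParent`, `create_sequential`, and the `/ids/order-` prefix are hypothetical names; a real client would ask the server to create the node instead.

```python
import itertools


class ToySequenceParent:
    """In-memory stand-in for one parent znode; NOT a real ZooKeeper client."""

    def __init__(self):
        self._counter = itertools.count()  # one counter per parent node

    def create_sequential(self, prefix: str) -> str:
        """Return the new child's path: the prefix plus a 10-digit counter."""
        return f"{prefix}{next(self._counter):010d}"


parent = ToySequenceParent()
first = parent.create_sequential("/ids/order-")   # hypothetical path prefix
second = parent.create_sequential("/ids/order-")
order_id = int(second.rsplit("-", 1)[1])          # numeric suffix used as the ID
```

Because the counter lives on the server side, two clients creating nodes under the same parent never receive the same suffix.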

Data publishing and subscription

  • ZooKeeper's Watcher mechanism makes it easy to publish and subscribe to data
  • Data is published to a watched ZooKeeper node, and other machines update their configuration dynamically by listening for changes to that node
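
The publish/subscribe flow can be sketched with a toy in-memory node. `ToyConfigNode`, its `get` and `set` methods, and the `timeout=...` values are hypothetical and only imitate the idea; they are not the real client API. The sketch assumes one-shot watches, so the subscriber re-registers its callback each time it is notified.

```python
class ToyConfigNode:
    """In-memory stand-in for one watched znode; NOT a real ZooKeeper client."""

    def __init__(self, data):
        self._data = data
        self._watchers = []

    def get(self, watch=None):
        """Read the data and optionally register a one-shot watch."""
        if watch is not None:
            self._watchers.append(watch)
        return self._data

    def set(self, data):
        """Publish new data, then fire and clear all pending watches."""
        self._data = data
        pending, self._watchers = self._watchers, []
        for callback in pending:
            callback(self)


seen = []  # configuration values observed by the subscriber


def on_change(node):
    # Re-read the data and re-arm the watch in the same call.
    seen.append(node.get(watch=on_change))


node = ToyConfigNode("timeout=30")
seen.append(node.get(watch=on_change))  # initial read plus first watch
node.set("timeout=60")                  # publisher updates the configuration
node.set("timeout=90")
```

After the two updates, `seen` holds the initial value followed by both published values, which is the behaviour a dynamically reloaded configuration relies on.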

Projects that use ZooKeeper

  • Kafka:

    • Kafka uses ZooKeeper to register brokers and topics and to balance load across multiple partitions
  • HBase:

    • ZooKeeper ensures that the HBase cluster has only one active Master
    • It also stores and serves RegionServer status information
  • Hadoop:

    • ZooKeeper provides high-availability support for the NameNode