Hello everyone, I am Glacier ~~

Starting today, we officially begin updating the “Mastering Zookeeper EPR Series” topic. First, we will briefly review and summarize the basics of Zookeeper. The overall content of this article is as follows.

What is Zookeeper?

To put it simply, Zookeeper is an open source distributed coordination service. Its design goal is to encapsulate complex and error-prone distributed coordination logic behind a small set of efficient, reliable primitives, and to expose them through simple interfaces that other services can call. Applications can build distributed features on top of these interfaces, for example distributed locks, distributed elections, and master/slave switchover. We will cover these in more detail in the hands-on part of the series.

The history of Zookeeper

Zookeeper was originally developed by Yahoo to solve the problem of coordination between multiple internal systems, and was later open-sourced and donated to the Apache Software Foundation. Since then, Zookeeper has been widely used in the open source community. Here are a few well-known open source projects that use Zookeeper.

  • Hadoop: Uses Zookeeper to provide the high availability mechanism of the NameNode.
  • HBase: Uses Zookeeper to ensure that there is only one Master node in the cluster, to store the list of RegionServers in the cluster, and to store the location of the hbase:meta table.
  • Kafka: Uses Zookeeper to manage group membership and to provide the controller node election mechanism.
  • Dubbo: Uses Zookeeper as the registry for distributed service governance.
  • SpringCloud: Uses Zookeeper to implement the microservice registry.

Many other open source projects also use Zookeeper for distributed coordination. There are too many to list here, so you can look them up online yourself.

Zookeeper application scenarios

To put it simply, Zookeeper can be used in the following scenarios.

  • Configuration management.
  • DNS service.
  • Group member management.
  • Various distributed locks.
  • Distributed elections.
  • Data consistency scenarios.

Note, however, that Zookeeper is only suitable for storing key data related to collaboration and is not suitable for storing large amounts of data.

Zookeeper service usage

Generally, an application connects to and uses Zookeeper through the Zookeeper client library. The client is responsible for interacting with the Zookeeper cluster.

Data model for Zookeeper

Essentially, Zookeeper’s data model is a hierarchical model, as shown below.

This kind of hierarchical model is common in file systems. The hierarchical model and the key-value model are the two mainstream data models. Zookeeper uses the file system model mainly for the following reasons.

  • The tree structure of a file system facilitates the representation of hierarchical relationships between data.
  • The tree structure of the file system facilitates the allocation of independent namespaces for different applications.

In Zookeeper, each node in the hierarchy is called a ZNode, which differs from a file system in that each node can hold data and each node has a version number that increases from 0.
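To make the idea of a node that holds data and carries a version number more concrete, here is a minimal in-memory sketch in Python. It is not the Zookeeper client API. The names `ZNode`, `ToyTree`, `create`, `set_data`, and `get` are made up for illustration, and the sketch only models what is described above: a tree of nodes, each with data and a version that starts at 0 and increases on every update.

```python
class ZNode:
    """Toy stand-in for a ZNode: holds data, a version, and child nodes."""

    def __init__(self, data=b""):
        self.data = data
        self.version = 0      # starts at 0, incremented on every data update
        self.children = {}    # child name -> ZNode


class ToyTree:
    """Hypothetical in-memory tree that mimics the hierarchical model."""

    def __init__(self):
        self.root = ZNode()

    def _find(self, path):
        node = self.root
        for part in [p for p in path.split("/") if p]:
            node = node.children[part]   # KeyError if the path does not exist
        return node

    def create(self, path, data=b""):
        parent_path, _, name = path.rpartition("/")
        parent = self._find(parent_path)
        if name in parent.children:
            raise KeyError(f"{path} already exists")
        parent.children[name] = ZNode(data)

    def set_data(self, path, data):
        node = self._find(path)
        node.data = data
        node.version += 1
        return node.version

    def get(self, path):
        node = self._find(path)
        return node.data, node.version


tree = ToyTree()
tree.create("/app1", b"config")      # a node with data AND children
tree.create("/app1/p_1")
tree.set_data("/app1", b"config-v2")
print(tree.get("/app1"))             # (b'config-v2', 1)
```

Note that `/app1` both stores data and has a child, which is the difference from a plain directory that the paragraph above points out.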

Next, let’s look at a concrete example of a Zookeeper node.

For example, in the figure above, there are three subtrees that belong to app1, app2, and app3. The app1 subtree implements a simple group membership protocol: each client process p_i creates a ZNode under the /app1 node, so the processes create the ZNodes /app1/p_1, /app1/p_2, …, /app1/p_n. If /app1/p_n exists, the process p_n is running properly.
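The group membership idea can be sketched in a few lines of Python. This is only an in-memory illustration, not code that talks to Zookeeper. The class `ToyGroup` and its methods are hypothetical names, and a plain set of paths stands in for the children of /app1.

```python
class ToyGroup:
    """Hypothetical model of group membership under a parent node."""

    def __init__(self, root="/app1"):
        self.root = root
        self.nodes = set()   # paths of the member ZNodes that currently exist

    def join(self, process_id):
        # Each process registers itself by creating its own node.
        path = f"{self.root}/p_{process_id}"
        self.nodes.add(path)
        return path

    def leave(self, process_id):
        # Removing the node signals that the process is gone.
        self.nodes.discard(f"{self.root}/p_{process_id}")

    def is_alive(self, process_id):
        # A process counts as running if and only if its node exists.
        return f"{self.root}/p_{process_id}" in self.nodes

    def members(self):
        return sorted(self.nodes)


group = ToyGroup("/app1")
group.join(1)
group.join(2)
group.leave(1)
print(group.members())    # ['/app1/p_2']
print(group.is_alive(1))  # False
```

In a real deployment the removal would not be an explicit `leave` call made by a healthy process only. The node types described in the next section are what let the node disappear when a process fails.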

Zookeeper node type

In general, Znode nodes can be divided into the following four categories.

A Znode node can be either persistent or temporary.

  • Persistent Znode: After the creation of the node, the node will not be lost even if the Zookeeper cluster or Zookeeper client fails.
  • Temporary (ephemeral) Znode: If the Zookeeper client that created the node goes down, or does not send any message to the Zookeeper cluster within the session timeout period, the node disappears.
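The difference between the two kinds of node can be sketched as follows. This is a toy in-memory model under stated assumptions, not Zookeeper's real session handling: `ToyServer`, `heartbeat`, `expire`, and the explicit `now` timestamps are all invented for the example, and time is passed in by hand so the behavior is easy to follow.

```python
class ToyServer:
    """Hypothetical server that tracks sessions and node ownership."""

    def __init__(self, timeout=10.0):
        self.timeout = timeout
        self.last_seen = {}   # session id -> time of the last message
        self.nodes = {}       # path -> owning session id (None = persistent)

    def heartbeat(self, session_id, now):
        self.last_seen[session_id] = now

    def create(self, path, session_id, now, ephemeral=False):
        self.heartbeat(session_id, now)
        # Only temporary nodes remember which session owns them.
        self.nodes[path] = session_id if ephemeral else None

    def expire(self, now):
        # Sessions that stayed silent longer than the timeout are dropped,
        # together with every temporary node they own.
        dead = {s for s, t in self.last_seen.items() if now - t > self.timeout}
        for s in dead:
            del self.last_seen[s]
        self.nodes = {p: o for p, o in self.nodes.items() if o not in dead}

    def exists(self, path):
        return path in self.nodes


server = ToyServer(timeout=10.0)
server.create("/app1", "s1", now=0.0)                      # persistent
server.create("/app1/p_1", "s1", now=0.0, ephemeral=True)  # temporary

server.expire(now=5.0)                # still within the timeout
print(server.exists("/app1/p_1"))     # True

server.expire(now=11.0)               # session s1 has been silent too long
print(server.exists("/app1/p_1"))     # False
print(server.exists("/app1"))         # True
```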

Znode nodes can also be sequential, which means that each node is associated with a unique monotonically increasing integer. This monotonically increasing integer is the suffix of Znode node names, such as /app1/p_1, /app1/p_2, etc. Therefore, Znode has the following two categories:

  • Persistent sequential ZNodes: In addition to the features of persistent ZNodes, Znode names are sequential.
  • Temporary sequential Znode: In addition to the features of temporary ZNodes, Znode names are sequential.
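As a rough illustration of sequential naming, the sketch below keeps one counter per parent node and appends it to the requested name prefix. `ToySequencer` and `create_sequential` are hypothetical names, and the short suffixes starting at 1 simply follow the /app1/p_1 style used in this article. As far as I know, the real Zookeeper server appends a zero-padded 10-digit counter instead, so actual node names look longer than these.

```python
from collections import defaultdict


class ToySequencer:
    """Hypothetical model of sequential node naming."""

    def __init__(self):
        self.counters = defaultdict(int)  # parent path -> last number issued
        self.nodes = []

    def create_sequential(self, prefix):
        # The counter belongs to the parent node, so siblings never collide.
        parent = prefix.rpartition("/")[0]
        self.counters[parent] += 1
        path = f"{prefix}{self.counters[parent]}"
        self.nodes.append(path)
        return path


seq = ToySequencer()
print(seq.create_sequential("/app1/p_"))  # /app1/p_1
print(seq.create_sequential("/app1/p_"))  # /app1/p_2
print(seq.create_sequential("/app2/p_"))  # /app2/p_1
```

Because the numbers only ever increase, clients can order themselves by suffix, which is the property that distributed locks and elections typically build on.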

OK, that’s all for today. I’m Glacier. See you next time!