01. An Overview of ZooKeeper
1.1 Distributed Applications
A distributed application is one whose functions are spread across different computers that cooperate over a network to complete a task. Take a Java EE e-commerce website as an example:
- Single (monolithic) application: all functions are written in one project, packaged as a single runnable WAR package, and deploying that WAR package delivers the functionality of the entire site.
- Distributed application: different functions are written in different projects, packaged into multiple runnable WAR packages, and multiple running services work together to deliver the functionality of the entire site.
As shown in the figure above, this e-commerce website contains four modules (also called services): user management, product management, order management, and payment management. In a distributed application, many functions are completed by several services cooperating, so coordinating services efficiently and in an orderly way becomes an important issue in distributed development.
Analogy:
- Single application –> All work done by one person
- Distributed applications –> Many tasks are performed by multiple people working together
1.2 What is ZooKeeper
ZooKeeper is an open-source, distributed coordination service for distributed applications. It is an open-source implementation of Google's Chubby and an important component of Hadoop and HBase. It is mainly used to solve problems that distributed applications commonly encounter.
There are many kinds of animals in the Zoo. The animals here can be compared to a variety of services in a distributed environment. ZooKeeper is responsible for managing these services.
1.3 Architecture
The architecture of ZooKeeper is shown as follows:
A ZooKeeper cluster consists of a group of Server nodes. Within the group, one node serves as the Leader and the other nodes serve as Followers. Key points of the ZooKeeper architecture (a client-side sketch follows the list):
- For read operations, the client reads directly from one of the Followers and the result is returned directly.
- For write operations, the ZooKeeper cluster does the following:
  - The request is forwarded to the Leader node
  - The data change on the Leader node is synchronized to the other Follower nodes in the cluster
- After receiving a data change request, the Leader node first writes the change to the local disk for recovery purposes; the change is applied to memory only after the write has been persisted to disk.
- Note: data persisted to disk is used only to recover data when the service restarts.
- If the Leader node fails, the cluster automatically elects a new Leader node.
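A minimal sketch, assuming the native ZooKeeper Java client and placeholder server addresses, of how a client talks to such a cluster: the connect string lists every node, reads are answered by the node the client happens to be connected to, and writes are forwarded to the Leader transparently.

```java
import org.apache.zookeeper.ZooKeeper;

public class ConnectDemo {
    public static void main(String[] args) throws Exception {
        // The connect string lists all nodes of the ensemble (placeholder addresses);
        // in production code, wait for the SyncConnected event before issuing requests.
        ZooKeeper zk = new ZooKeeper("zk1:2181,zk2:2181,zk3:2181", 30_000, event -> {});

        // "/zookeeper" is a node that exists on every installation, so this read always works.
        byte[] data = zk.getData("/zookeeper", false, null);
        System.out.println("read " + data.length + " bytes from the ensemble");

        zk.close();
    }
}
```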
1.4 Storage structure and hierarchical namespaces
ZooKeeper has a hierarchical namespace that is very simple and intuitive, similar to the directory structure of a file system. ZNode is the most important concept.
In ZooKeeper, each node in the namespace is called a ZNode. Each ZNode contains a path, the properties and data associated with it, and a list of its child nodes. Unlike a traditional file system, the data in ZooKeeper is kept in memory, which gives the distributed coordination service high throughput and low latency.
ZNode features:
1.4.1 Ephemeral and Persistent Nodes
- Persistent node (default): once it is created successfully, it does not disappear until it is explicitly deleted.
- Ephemeral node: after it is created, it is deleted automatically when the session between the client and the server ends (for example, when the connection is lost). Both types are illustrated in the sketch below.
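A minimal sketch with the native ZooKeeper Java client showing both node types; the connection string and paths are placeholders, and ZooDefs.Ids.OPEN_ACL_UNSAFE is used only to keep the example short:

```java
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class NodeTypesDemo {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("zk1:2181", 30_000, event -> {});

        // Persistent node (default): stays until it is explicitly deleted.
        zk.create("/app", "app data".getBytes(),
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

        // Ephemeral node: deleted automatically when this client's session ends.
        zk.create("/app/instance", "10.0.0.1:8080".getBytes(),
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

        zk.close(); // closing the session removes /app/instance but leaves /app in place
    }
}
```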
1.4.2 Ordered Nodes (Sequence Nodes)
ZooKeeper supports creating ordered nodes: an automatically increasing sequence number is appended to the ZNode name.
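A short sketch of sequence nodes with the native Java client (placeholder address; the persistent parent /queue is assumed to exist already):

```java
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class SequenceDemo {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("zk1:2181", 30_000, event -> {});

        // Each create() returns the real path with the sequence number appended,
        // e.g. /queue/task-0000000001, /queue/task-0000000002, ...
        String first  = zk.create("/queue/task-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT_SEQUENTIAL);
        String second = zk.create("/queue/task-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT_SEQUENTIAL);
        System.out.println(first + "\n" + second);

        zk.close();
    }
}
```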
1.4.3 Updates and Watches of Nodes
A ZooKeeper client can watch a ZNode for data changes. When the ZNode's data changes, the ZooKeeper server automatically notifies the watching client.
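A minimal sketch of a watch with the native Java client; the /app path and the address are placeholders. Note that ZooKeeper watches are one-shot, so a real listener re-registers after every notification:

```java
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class WatchDemo {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("zk1:2181", 30_000, event -> {});

        // Read /app and register a one-time watch; the server calls back on the next change.
        byte[] data = zk.getData("/app", event -> {
            if (event.getType() == Watcher.Event.EventType.NodeDataChanged) {
                System.out.println(event.getPath() + " changed");
                // a real listener would re-read the data and re-register the watch here
            }
        }, new Stat());
        System.out.println("initial value: " + new String(data));

        Thread.sleep(60_000); // keep the session open long enough to observe a change
        zk.close();
    }
}
```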
02. ZooKeeper Application Scenarios
2.1 Distributed Lock
2.1.1 Why Use Distributed Locks
Under high concurrency we sometimes need a method to be executed by only one thread at a time. In a traditional monolithic application deployed on a single node, the Java concurrency APIs (such as ReentrantLock or synchronized) are enough for this mutual-exclusion control. But as the business grows, the monolithic, single-node system evolves into a distributed system with multiple threads and processes spread across different machines, and the lock strategies that worked under a single deployment no longer apply. Solving this requires a mutual-exclusion mechanism that works across JVMs to control access to a shared resource; that is exactly the problem a distributed lock solves.
2.1.2 Distributed Lock based on ZooKeeper
A distributed lock corresponds to one ZNode in ZooKeeper. Every client thread that wants to acquire the lock creates an ephemeral sequential node under this ZNode, which leads to two cases (a code sketch follows the list):
- If the ephemeral sequential node it created is the first (lowest-numbered) child of that ZNode, the client is considered to have acquired the lock.
- If the node it created is not the first child, the lock is considered to be held by another client thread. The current thread blocks and watches the node immediately before it in the sequence; when that preceding node releases the lock (i.e., is deleted), the current thread is woken up and acquires the lock.
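This ephemeral-sequential-node scheme is what the InterProcessMutex recipe in Apache Curator (a ZooKeeper client library not mentioned above, used here purely for illustration) implements. A minimal sketch with a placeholder connection string and lock path:

```java
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

import java.util.concurrent.TimeUnit;

public class ZkLockDemo {
    public static void main(String[] args) throws Exception {
        // Placeholder ensemble address.
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "zk1:2181,zk2:2181,zk3:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        // The lock is one ZNode; every contender creates an ephemeral sequential child under it.
        InterProcessMutex lock = new InterProcessMutex(client, "/locks/order");
        if (lock.acquire(5, TimeUnit.SECONDS)) { // blocks until first in line, or times out
            try {
                // critical section: only one client thread across all JVMs runs this at a time
            } finally {
                lock.release(); // deletes this contender's child node, waking the next waiter
            }
        }
        client.close();
    }
}
```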
2.2 Configuration Center
For configuration information used in a project (such as the database address, user name, and password), there are two ways to handle it:
- Put it directly in the project's configuration file. Disadvantages:
  - Sensitive information (such as production user names and passwords) leaks easily.
  - If the configuration changes (for example, the database password changes), the project has to be repackaged and redeployed.
- Maintain the configuration in a configuration center. Process (a minimal sketch follows this list):
  - When a service starts, it fetches its configuration from the configuration center.
  - When the configuration changes, you only modify the data in the configuration center; the new configuration is pushed to the running Tomcat service without restarting it.
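A minimal sketch of this pull-plus-push behaviour with the native ZooKeeper Java client; the node path /config/db.url and the addresses are hypothetical, and the watch is re-registered on every change because watches fire only once:

```java
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class ConfigCenterDemo {
    private static final String CONFIG_PATH = "/config/db.url"; // hypothetical config node

    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("zk1:2181,zk2:2181,zk3:2181", 30_000, event -> {});
        readConfig(zk);                  // pull the configuration at startup
        Thread.sleep(Long.MAX_VALUE);    // keep the demo alive to receive pushed changes
    }

    private static void readConfig(ZooKeeper zk) throws Exception {
        Watcher onChange = (WatchedEvent event) -> {
            if (event.getType() == Watcher.Event.EventType.NodeDataChanged) {
                try {
                    readConfig(zk);      // re-read the new value and re-register the watch
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        };
        byte[] data = zk.getData(CONFIG_PATH, onChange, new Stat());
        System.out.println("current config: " + new String(data));
    }
}
```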
2.3 Registry
In a production environment, one business server (such as a Tomcat instance) often needs to call an interface (such as an HTTP interface) on another business server, so the caller needs to know the called party's IP address and port. There are three solutions:
- Hard-code the IP and port in the project (not recommended)
  - Advantage: simple and straightforward.
  - Disadvantage: if the called party's IP address or port changes, or the number of deployed nodes changes, the caller's code has to be modified and redeployed.
  - Analogy:
    - App1 –> student A
    - App2 –> student B
    - IP + port –> phone number
    - Making a call –> student A dials student B's number
- Use a reverse proxy server, such as Nginx
  - Advantage: the caller only needs to configure the Nginx address and does not care about the called party's real IP address and port.
  - Analogy:
    - Nginx –> class monitor
    - Forwarding by the reverse proxy –> student A asks the monitor to pass a message to student B
- Use a registry, such as ZooKeeper
  - Advantage: the caller does not need to configure the called party's IP address, and the called party's information is updated automatically whenever it changes.
  - Process (a minimal sketch follows this section):
    - Registration: when app2 starts, it sends its IP, port, and other information to the registry
    - Discovery: when app1 starts, it pulls app2's information (IP, port, etc.) from the registry
    - app1 initiates the call (such as an HTTP request)
  - The distributed framework Dubbo recommends ZooKeeper as its registry.
  - Analogy:
    - Registry –> head teacher
    - Registration –> every student gives their phone number to the head teacher on entering the school
    - Pulling the target server's information and calling –> student A asks the head teacher for student B's number and then calls
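Sticking with the ZooKeeper option, a minimal sketch of the registration and pull steps above with the native Java client; the paths and addresses are hypothetical, and the persistent parent /services/app2 is assumed to exist already:

```java
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

import java.util.List;

public class RegistryDemo {
    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper("zk1:2181,zk2:2181,zk3:2181", 30_000, event -> {});

        // Provider side (app2): register an ephemeral node named after its address.
        // If app2 goes down, the session ends and the node disappears automatically,
        // so consumers never see a stale address.
        zk.create("/services/app2/192.168.0.12:8080", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

        // Consumer side (app1): pull the current provider list and watch for changes.
        List<String> providers = zk.getChildren("/services/app2", true);
        System.out.println("available app2 instances: " + providers);

        zk.close();
    }
}
```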
2.4 Other Scenarios
In addition to the three common scenarios described above, ZooKeeper can also be used to implement naming services, queues, barriers, atomic types, and leader election; these are not covered here.