Mp.weixin.qq.com/s/QMOdpPzxc…

Original author: Tell your wish ~ https://juejin.cn/post/6844903873296105479

Preface

It has been a while since my last few articles, which were about Spring. High-concurrency distributed systems are a big topic with a lot to say, and also a deep hole that I will fill in slowly over the following articles. Like most people, I am a learner, not a preacher. What I write is my own study summary rather than anything authoritative, and my aim is to keep it as simple as possible. If there are mistakes I will correct them, and if there are none, so much the better. I hope we can make progress together.

The technology stack for high-concurrency distributed development is very large. Judging from what Chinese Internet companies use, RPC, Dubbo, and ZK are among the most basic skill requirements. Is your understanding of Zookeeper still stuck at "it is the Dubbo registry"? How does it work? What are its classic application scenarios? If you have no thoughts or understanding of your own when answering these three questions, I think this article can help you get started and deepen your knowledge, so that you can answer them better in an interview.

Without further ado, let’s get to the point

1. Challenges in a concurrent environment

Remember when we learned multithreading? There is a rather amusing picture about it on the Internet.

If we replace threads with processes, the situation is equivalent to running one program on each server. The same application runs on a cluster of several servers to solve the problem that a single server cannot handle high concurrency. In doing so, we run into a number of problems:

For example, we now have a cluster of three servers. How can we ensure that the configuration information shared by all the machines is consistent?

When one machine dies, how do the other machines sense the change and take over its tasks?

The number of users suddenly explodes and machines must be added to relieve the pressure. How can machines be added without restarting the cluster?

How can multiple services in a distributed system efficiently coordinate writes to the same network file (the network is not real-time; it is unreliable and has latency)?

This is where we need a tool that lets processes coordinate with each other, much as threads do.

2. Zookeeper

① How did Zookeeper get its name

Many Apache open source projects use animals as their icons, such as a cat for Tomcat and a bee for Hive. Zookeeper's job is to coordinate the actions of these animals.

② Introduction to Zookeeper

Zookeeper is a high-performance coordination service for distributed applications. Its data is kept in memory and persisted through logs. The in-memory data is organized as a tree, which gives high throughput and low latency and lets us build things such as a distributed unified configuration center, service registration, and distributed locks.

The servers that make up the ZooKeeper service must all know about each other. They maintain an in-memory image of the state, along with transaction logs and snapshots in persistent storage. The ZooKeeper service is available as long as a majority of the servers are available. A client connects to a single ZooKeeper server and maintains a TCP connection through which it sends requests, gets responses, gets watch events, and sends heartbeats. If the TCP connection to the server breaks, the client connects to a different server.
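The availability rule above ("the service is available as long as most servers are available") can be sketched with a little arithmetic. This is a minimal sketch that reads "most" as a strict majority; the function names are made up for illustration and are not part of any ZooKeeper API.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    if ensemble_size < 1:
        raise ValueError("an ensemble needs at least one server")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers may fail while a majority is still alive."""
    return ensemble_size - quorum_size(ensemble_size)


def is_available(ensemble_size: int, alive: int) -> bool:
    """True when the alive servers still form a majority."""
    return alive >= quorum_size(ensemble_size)


for n in (3, 4, 5):
    print(f"{n} servers: quorum={quorum_size(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

Under this reading, a fourth server does not let the cluster survive more failures than three servers do, which is a common argument for choosing an odd number of servers.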

③ Installing Zookeeper (in Linux)

1. The JDK version needs to be 1.6 or above
2. Download: https://archive.apache.org/dist/zookeeper/zookeeper-3.5.2/zookeeper-3.5.2.tar.gz
3. After decompression, add zoo.cfg to the conf directory
4. Start the server: bin/zkServer.sh start
5. Connect with the client: bin/zkCli.sh -server 127.0.0.1:2181

zoo.cfg has three key settings:
tickTime=2000: the basic time unit of a heartbeat, in milliseconds
dataDir: where data and logs are stored
clientPort: the port that clients connect to
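To make the three settings concrete, here is a minimal sketch that parses a sample zoo.cfg held in a string and checks that the three keys are present. The dataDir value and the helper names are assumptions for illustration. Treating the three keys as required is a choice of this sketch, not a rule taken from ZooKeeper.

```python
# Sample file content; /tmp/zookeeper is a hypothetical data directory.
SAMPLE_ZOO_CFG = """\
# basic time unit in milliseconds
tickTime=2000
dataDir=/tmp/zookeeper
clientPort=2181
"""

# The three settings the article calls out as key.
KEY_SETTINGS = ("tickTime", "dataDir", "clientPort")


def parse_zoo_cfg(text: str) -> dict:
    """Parse key=value lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a key=value line: {raw!r}")
        settings[key.strip()] = value.strip()
    return settings


def missing_settings(settings: dict) -> list:
    """Return the key settings that the file does not define."""
    return [key for key in KEY_SETTINGS if key not in settings]


cfg = parse_zoo_cfg(SAMPLE_ZOO_CFG)
print(cfg)
print("missing:", missing_settings(cfg))
```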

4. Characteristics of Zookeeper

1. Simple data structure

The structure is similar to the directory tree of the Unix file system, and every entry in it is called a Znode. Unlike a file system, though, a Znode can be treated both as a folder and as a file that stores data. We still call it a node rather than a folder, which would undersell it.

Note: Children of the same node cannot share a name, and names must follow the naming conventions. There is no concept of a relative path: every path is absolute and starts with "/". Finally, the size of the data a node can store is limited.

2. Data model characteristics

Hierarchical namespace: similar to the Unix file system, rooted at "/". A node can hold associated data as well as child nodes, and is addressed by an absolute path.
Znode: has a unique, canonical name and comes in four types: persistent, persistent sequential, temporary, and temporary sequential.

3. Naming conventions

Node names can use any Unicode character except for the following restrictions:

1. The null character (\u0000) cannot be part of a path name.
2. The following characters cannot be used because they do not display well or are rendered in confusing ways: \u0001-\u0019 and \u007F-\u009F.
3. The following characters are not allowed: \uD800-\uF8FF and \uFFF0-\uFFFF.
4. The "." character can be used as part of a name, but "." and ".." cannot be used alone to indicate a node along a path, because ZooKeeper does not use relative paths. The following are invalid: "/a/b/./c" and "/a/b/../c".
5. "zookeeper" is a reserved node name.
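The naming rules can be sketched as a small validator. This is an illustrative sketch, not ZooKeeper's own validation code. The character ranges are copied from the list in this article, and the function names are made up. Rejecting empty node names (as in "/a//b") is an extra assumption of this sketch, and the reserved name "zookeeper" is deliberately not checked.

```python
# Code point ranges taken from the rules listed in this article.
FORBIDDEN_RANGES = (
    (0x0001, 0x0019),
    (0x007F, 0x009F),
    (0xD800, 0xF8FF),
    (0xFFF0, 0xFFFF),
)


def validate_path(path: str) -> None:
    """Raise ValueError if the path breaks one of the naming rules."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute and start with '/'")
    if path == "/":
        return  # the root itself is always fine
    for name in path[1:].split("/"):
        if name == "":
            # Assumption of this sketch: empty names are rejected.
            raise ValueError("empty node name")
        if name in (".", ".."):
            raise ValueError("relative path elements are not allowed")
        for ch in name:
            code = ord(ch)
            if code == 0:
                raise ValueError("null character in path")
            if any(low <= code <= high for low, high in FORBIDDEN_RANGES):
                raise ValueError(f"forbidden character U+{code:04X}")


def is_valid_path(path: str) -> bool:
    """Boolean wrapper around validate_path."""
    try:
        validate_path(path)
    except ValueError:
        return False
    return True


for candidate in ("/a/b/c", "/a/.hidden", "/a/b/./c", "/a/b/../c", "a/b"):
    print(candidate, is_valid_path(candidate))
```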

4. Some commands

Since my computer runs Windows, I found a Windows version of ZooKeeper to demonstrate

First, the contents of each directory:

bin --> run directory, containing the scripts for Linux and Windows
conf --> zookeeper configuration (zoo.cfg)
contrib --> other components and distributions
dist-maven --> jar files built with Maven
docs, lib, recipes, src --> documentation, libraries, recipes, and source code, plus the zookeeper jar itself, because ZooKeeper is written in Java

Open cmd in the bin directory, then run zkCli.cmd. When you do not know where to start, you can usually type help, -help, or -h to get help.

Because the commands are fairly simple, I will not demonstrate them. The only things to watch are the "/" in paths (for example, ls / lists the root directory and create /zk 123 creates a node with data) and the preconditions of each command: create requires the parent node to exist, delete requires the node to have no children, and so on.
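The two preconditions mentioned above (create needs an existing parent, delete needs a node without children) can be sketched with a tiny in-memory tree. This is a toy model for illustration only. The class name and the exception types are assumptions and have nothing to do with the real client or server code.

```python
class TinyTree:
    """Toy model of a znode tree keyed by absolute path."""

    def __init__(self):
        self.nodes = {"/": b""}  # the root always exists

    @staticmethod
    def _parent(path: str) -> str:
        head = path.rsplit("/", 1)[0]
        return head or "/"

    def create(self, path: str, data: bytes = b"") -> None:
        if path in self.nodes:
            raise KeyError(f"{path} already exists")
        if self._parent(path) not in self.nodes:
            raise KeyError(f"parent of {path} does not exist")
        self.nodes[path] = data

    def ls(self, path: str) -> list:
        """Names of the direct children of path."""
        if path not in self.nodes:
            raise KeyError(f"{path} does not exist")
        prefix = path if path.endswith("/") else path + "/"
        children = []
        for other in self.nodes:
            if other != path and other.startswith(prefix):
                rest = other[len(prefix):]
                if "/" not in rest:
                    children.append(rest)
        return sorted(children)

    def delete(self, path: str) -> None:
        if path == "/":
            raise ValueError("the root cannot be deleted in this sketch")
        if self.ls(path):
            raise ValueError(f"{path} still has children")
        del self.nodes[path]


tree = TinyTree()
tree.create("/zk", b"123")      # like: create /zk 123
tree.create("/zk/child", b"")   # works because /zk exists
print(tree.ls("/"))             # like: ls /
```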

5. An important feature of Zookeeper — order

ZooKeeper assigns each update a number that reflects the order of all ZooKeeper transactions. This strict ordering means that complex synchronization primitives can be implemented on the client. The following explains the zxid, the version numbers, and the ticks configured in zoo.cfg.

  • Zxid: Each write request in Zookeeper corresponds to a unique transaction ID, called the zxid, which is globally ordered. If zxid1 is less than zxid2, then zxid1 happened before zxid2.

  • Version numbers: A write request to a node increases one of the node's three version numbers (the pattern is similar to optimistic locking): dataVersion (number of changes to the znode's data), cversion (number of changes to the znode's children), and aclVersion (number of changes to the znode's ACL).

  • Ticks: When using multi-server Zookeeper, the servers use ticks to define the timing of events such as status uploads and session timeouts. The tick time is only indirectly exposed through the minimum session timeout (2 times the tick time by default). If a client requests a session timeout below this minimum, the server tells the client that the session timeout is in fact the minimum.

  • Real time: Zookeeper does not use real time (clock time), except to put timestamps into the stat structure when a znode is created or modified.
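The optimistic-lock pattern behind the version numbers can be sketched as follows: a writer passes the dataVersion it last read, and the write is rejected if the node has changed in the meantime. This is a minimal in-memory sketch. The class and exception names are made up. Using -1 as "match any version" mirrors the convention of the real setData call.

```python
class VersionMismatch(Exception):
    """Raised when the expected version is stale."""


class VersionedNode:
    """Toy znode that tracks a dataVersion counter."""

    def __init__(self, data: bytes):
        self.data = data
        self.data_version = 0

    def get(self):
        """Return the data together with the version that was read."""
        return self.data, self.data_version

    def set(self, data: bytes, expected_version: int) -> int:
        """Write only if the caller has seen the latest version."""
        if expected_version != -1 and expected_version != self.data_version:
            raise VersionMismatch(
                f"expected {expected_version}, node is at {self.data_version}"
            )
        self.data = data
        self.data_version += 1
        return self.data_version


node = VersionedNode(b"config-v1")

# Two clients read the same version.
_, seen_by_a = node.get()
_, seen_by_b = node.get()

node.set(b"written-by-a", seen_by_a)  # succeeds, version becomes 1

try:
    node.set(b"written-by-b", seen_by_b)  # stale version, rejected
except VersionMismatch as error:
    print("client B must re-read and retry:", error)
    _, fresh = node.get()
    node.set(b"written-by-b", fresh)

print(node.get())
```

The rejected writer loses nothing but a round trip: it re-reads, decides again based on the new data, and retries.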

You can use stat path or ls2 path to view this information:

cZxid: zxid of the transaction that created the node
ctime: time when the node was created
mZxid: zxid of the transaction that last modified the node
mtime: time when the node was last modified
pZxid: zxid of the transaction that last changed the node's children
cversion: number of changes to the node's children
dataVersion: number of changes to the node's data
aclVersion: number of changes to the node's ACL
ephemeralOwner: session ID of the owner of a temporary node; 0 for non-temporary nodes
dataLength: length of the node's data
numChildren: number of child nodes

These fields show indirectly that ZooKeeper is a coordinator.

6. The second feature of ZooKeeper — replicability

Data can be replicated and backed up. Zookeeper provides tools and mechanisms to build a cluster quickly. Only a little configuration is needed to get a reliable service and avoid a single point of failure.


7. The third feature of ZooKeeper — speed

ZooKeeper is fast, especially for read-dominant workloads, which is why it can be applied to large distributed systems.

3. ZooKeeper theory

① Session mechanism of ZooKeeper

Session

1. Each client connection creates a session, and ZooKeeper assigns it a unique session ID.
2. The client sends heartbeats at a fixed interval to keep the session valid.
3. If no heartbeat from the client is received within the session timeout, the session is considered expired (the timeout is 2 times tickTime by default).
4. Requests within a session are executed in FIFO (first-in, first-out) order.
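The heartbeat and timeout rules can be sketched with an explicit clock, so nothing has to sleep. This is a minimal sketch with made-up names. The lower bound of 2 ticks comes from the text above, while the upper bound of 20 ticks is ZooKeeper's documented default for maxSessionTimeout and is not mentioned in this article.

```python
def negotiate_session_timeout(requested_ms: int, tick_time_ms: int = 2000) -> int:
    """Clamp the client's requested timeout into the server's allowed range."""
    minimum = 2 * tick_time_ms    # default minimum: 2 ticks
    maximum = 20 * tick_time_ms   # default maximum: 20 ticks (from the docs)
    return max(minimum, min(requested_ms, maximum))


class Session:
    """Toy session that expires when heartbeats stop arriving."""

    def __init__(self, session_id: int, timeout_ms: int, now_ms: int):
        self.session_id = session_id
        self.timeout_ms = timeout_ms
        self.last_seen_ms = now_ms

    def heartbeat(self, now_ms: int) -> None:
        """Record that the client was heard from at now_ms."""
        self.last_seen_ms = now_ms

    def is_expired(self, now_ms: int) -> bool:
        """True once more than timeout_ms passed without a heartbeat."""
        return now_ms - self.last_seen_ms > self.timeout_ms


timeout = negotiate_session_timeout(requested_ms=1000)  # too small, raised to 4000
session = Session(session_id=1, timeout_ms=timeout, now_ms=0)
session.heartbeat(now_ms=3000)
print(timeout, session.is_expired(now_ms=6000), session.is_expired(now_ms=7001))
```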

② Data structure of ZNode

Node data: basic stored information (status, configuration, location, etc.)
Node metadata: the data shown by the stat command
Data size: limited to 1 MB

③ Types of ZNode

1. Persistent node: create path value
2. Temporary node: create -e path value
3. Sequential node: create -s path value

Notes:
1. When the session expires, its temporary nodes are deleted.
2. Sequential nodes are created with a 10-digit decimal sequence number. Each parent node keeps its own counter, which is limited and overflows after 2147483647.
3. A sequential node still exists after the session ends.
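The naming of sequential nodes and the cleanup of temporary nodes can be sketched with a toy parent node. This is a simplified model with made-up names: the counter here only advances on sequential creates, which is an assumption of the sketch rather than a description of the server's bookkeeping, and overflow is not modeled.

```python
class ToyParent:
    """Toy parent znode that names sequential children and tracks owners."""

    def __init__(self):
        self.counter = 0
        self.children = {}  # child name -> owning session id, or None

    def create(self, name: str, sequential: bool = False, owner=None) -> str:
        """Create a child; owner is a session id for temporary nodes."""
        if sequential:
            # 10-digit, zero-padded decimal suffix.
            name = f"{name}{self.counter:010d}"
            self.counter += 1
        if name in self.children:
            raise KeyError(f"{name} already exists")
        self.children[name] = owner
        return name

    def close_session(self, session_id: int) -> None:
        """Delete every temporary child owned by the closed session."""
        self.children = {
            name: owner
            for name, owner in self.children.items()
            if owner != session_id
        }


parent = ToyParent()
print(parent.create("job-", sequential=True))             # persistent sequential
print(parent.create("lock-", sequential=True, owner=42))  # temporary sequential
print(parent.create("alive", owner=42))                   # temporary
parent.close_session(42)
print(sorted(parent.children))
```

Only the persistent sequential child survives the session close in this model, which matches the notes above.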

④ Watch monitoring mechanism

A client can set a watch on a znode to be notified when the znode changes, that is, when it is created, deleted, or modified, or when its children change. From the command line, watches can be set through the stat path, ls2 path, and get path commands.

Four kinds of change trigger a watch event: node creation, node deletion, data change, and child change.

Key features of Watch

1. One-shot: a watch is removed as soon as it is triggered. To keep monitoring changes, you must set the watch again each time. This is also one of the things to watch out for when using watches.
2. Ordering: the client sees the result of a change only after it has received the watch notification.

Notes for Watch

1. A watch is a one-time trigger, as described above.
2. There is latency between receiving an event and sending the request that sets a new watch, so you cannot reliably see every change to every node.
3. A watch object is notified only once per event. If the same watch object is registered through several calls (exists, getData) and the node is deleted, the deletion applies to both exists and getData, but the watch object is invoked only once.
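The one-shot behaviour and the "notified only once" note can be sketched with a toy node that forgets its watchers as soon as they fire. This is an in-memory illustration with made-up class and function names. Only the event name NodeDataChanged is borrowed from ZooKeeper's event types.

```python
class WatchedNode:
    """Toy znode whose data watches fire once and are then removed."""

    def __init__(self, data: bytes = b""):
        self.data = data
        self._watchers = []

    def get(self, watcher=None) -> bytes:
        """Read the data and optionally leave a one-shot watch."""
        if watcher is not None and watcher not in self._watchers:
            # The same watcher registered twice is still notified once.
            self._watchers.append(watcher)
        return self.data

    def set(self, data: bytes) -> None:
        """Change the data and fire (and forget) all pending watches."""
        self.data = data
        fired, self._watchers = self._watchers, []
        for watcher in fired:
            watcher("NodeDataChanged")


events = []


def record(event: str) -> None:
    events.append(event)


node = WatchedNode(b"v0")
node.get(watcher=record)
node.get(watcher=record)   # registering again does not duplicate the watch
node.set(b"v1")            # fires the watch
node.set(b"v2")            # nothing fires: the watch is already gone
print(events)

# To keep watching, the handler has to set the watch again itself.
seen = []
watched = WatchedNode(b"v0")


def rewatch(event: str) -> None:
    seen.append(watched.get(watcher=rewatch))


watched.get(watcher=rewatch)
watched.set(b"v1")
watched.set(b"v2")
print(seen)
```

In this single-threaded toy the handler sees every change. With a real network, a change can land between the notification and the request that sets the new watch, which is why note 2 says not every change can be observed.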

This resembles the mechanism for waking up a blocked thread: a client can passively receive notifications about the status of other client processes.

⑤ Features of ZooKeeper

1. Sequential consistency: a client's operations take effect in the order they were sent.
2. Atomicity: an update either succeeds or fails. There are no partial results.
3. Single system image: a client sees the same view of the service regardless of which server it connects to.
4. Reliability: once a change has been applied, it is not lost unless a client overwrites it.
5. Timeliness: the data a client reads is guaranteed to be up to date within a certain time bound.