What is a distributed system

The concept of distribution has always been a hurdle that back-end engineers cannot get around. Today, we will look at what distributed systems are and which distributed technologies we need to learn.

According to Baidu Baike, a distributed system is a software system built on top of a network. Because of the characteristics of software, a distributed system has a high degree of cohesion and transparency. As a result, the difference between a network and a distributed system lies more in the high-level software (especially the operating system) than in the hardware.

The following is reproduced from: www.cnblogs.com/dudu0614/p/…

Start with the birth of distributed systems

We often hear how awesome the server-side system of some Internet application is, such as QQ, WeChat, or Taobao. So what exactly is awesome about such a server-side system? Why does massive user access make a server-side system complicated? This article explores the basic concepts of server-side system technology from the ground up.

Carrying capacity is the reason for the existence of distributed systems

When an Internet service gains popularity, the most obvious technical problem is that the servers become extremely busy. When 10 million people visit your site every day, no single machine can host the service, no matter what server hardware you use. Therefore, when Internet programmers solve server-side problems, they must consider how to use multiple servers to serve the same Internet application. This is the origin of the so-called "distributed system".

However, the problems caused by large numbers of users accessing the same Internet service are not simple. On the surface, the most basic requirement for serving many users from the Internet is the so-called performance requirement: web pages respond to users slowly, actions in an online game lag, and so on. These requirements for "service speed" actually break down into the following parts: high throughput, high concurrency, low latency, and load balancing.

High throughput means that your system can serve a large number of users at the same time; what we measure is the number of users the entire system can serve simultaneously. This throughput is certainly not achievable with a single server, so multiple servers must work together to reach it. When multiple servers collaborate, how to use them effectively, so that no subset of servers becomes a bottleneck that limits the processing capacity of the whole system, is a problem that a distributed architecture must weigh carefully.

High concurrency is an extension of the high-throughput requirement. When hosting a large number of users, we want each server to work at full capacity, without unnecessary overhead or idle waiting. However, software systems cannot simply handle "as many" tasks at once as we would like: much of the time, our programs incur extra overhead just choosing which task to handle next. This, too, is a problem that distributed systems must solve.

Low latency is not a problem for a service that handles few requests, but returning results quickly even while a large number of users are accessing the system is much harder. Besides requests queuing up under heavy access, the queues themselves may grow too long, exhausting memory, saturating bandwidth, and causing other resource problems. If a retry policy kicks in when queued requests fail, overall latency grows even higher. So a distributed system applies a lot of request sorting and dispatching to get requests onto servers that can process them as quickly as possible. But because requests in a large distributed system must pass through several levels of dispatch, latency can rise precisely because of those dispatch and forwarding operations. So besides distributing requests, a distributed system must also try to reduce the number of dispatch levels, so that each request is processed as soon as possible.

Since Internet users come from all over the world, they may come from different networks and lines with different physical latencies, and from different time zones. To cope effectively with this complexity of user sources, servers must be deployed in multiple different locations. At the same time, we need simultaneous requests to be served efficiently by many different servers. This so-called load balancing is an inherent task of distributed systems.

Because distributed systems are almost the only basic way to solve the carrying-capacity problem of Internet services, mastering distributed system technology is extremely important for a server-side programmer. However, the problems of distributed systems cannot be solved simply by learning to use a few frameworks and libraries: when a program goes from running on one computer to running in coordination across countless computers at once, development, operations, and maintenance all change enormously.

Basic approaches to improving the carrying capacity of distributed systems

Hierarchical model (routing, proxy)

The simplest way to get multiple servers to collaborate on computing tasks is to make every server capable of completing every request, and then dispatch requests to the servers at random. The earliest Internet applications used DNS round-robin exactly this way: when a user entered a domain name to visit a web site, the name was resolved to one of several IP addresses, and the access request was then sent to the server at that address. In this way, multiple servers (multiple IP addresses) could handle a large number of user requests together.

However, simple random forwarding does not solve everything. For example, many Internet businesses require users to log in. After logging in to a particular server, a user will issue multiple requests; if we randomly forward these requests to different servers, the user's login state is lost and some requests fail. So it is not enough to rely on a single forwarding layer. Instead we add a batch of servers that forward each request, based on the user's cookie or login credentials, to the specific business server behind them.
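The forwarding idea above can be sketched in a few lines: hash the login credential so every request from the same user lands on the same back-end server. The server names here are hypothetical placeholders, not part of any real system.

```python
import hashlib

# Hypothetical pool of business (logic) servers behind the forwarding layer.
backends = ["logic-1", "logic-2", "logic-3"]

def route(user_id: str) -> str:
    """Map a login credential to a backend deterministically, so the
    in-memory login state on that server stays valid across requests."""
    digest = int(hashlib.md5(user_id.encode()).hexdigest(), 16)
    return backends[digest % len(backends)]
```

A real forwarding layer would route on a cookie or session token the same way; the key property is only that the mapping is stable.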

Besides login, we also find that a great deal of data has to be handled by a database, and our data can often only be centralized in one database; otherwise, a query on one server would miss the data stored on other servers. So we also tend to separate the databases out onto a batch of dedicated servers.

At this point, a typical three-tier structure emerges: access, logic, and storage. However, this three-tier structure is not a panacea. For example, when users need to interact online (online games are the typical case), the online-status data split across different logic servers cannot be seen across those servers. So we need a dedicated system, something like an interaction server: when a user logs in, a record is written to it noting which server the user is on, and all interactive operations go through this interaction server, so that messages can be correctly forwarded to the server hosting the target user.

For another example, in an online forum (BBS) system, we cannot write all articles into a single database, because so many read requests would drag the database down. We often write to different databases by forum section, or write to multiple databases simultaneously, so the article data is spread across different servers and can handle the large number of requests. But when a user reads an article, a dedicated program is needed to find out which server holds that specific article. So we set up a special proxy layer: all article requests go to it first, and it looks up the corresponding database according to our predefined storage plan.

Based on the examples above, distributed systems typically have three layers, but are often designed with more layers according to business requirements. To forward requests to the right process, we design many processes and servers dedicated to forwarding; these processes are often named Proxy or Router, and a multi-tier structure often contains several kinds of Proxy processes. These proxy processes usually connect to one another over TCP. However, while TCP is simple, recovering from failures on top of it is not easy, and TCP network programming is also somewhat complicated. So people devised a better mechanism for inter-process communication: message queues.

Although a powerful distributed system can be built entirely out of Proxy and Router processes, its management complexity is very high. So, on top of the layered model, people came up with more techniques to make layered programs simpler and more efficient.

Concurrency model (multi-threaded, asynchronous)

When we write server-side programs, we know that most of them handle multiple requests arriving at the same time. So we cannot, as in a simple HelloWorld, compute one output from one input: we receive many inputs at once and must return many outputs. While handling them, we often run into "wait" or "block" situations, such as waiting for the database to return results, or waiting for a reply from another process. If we processed requests strictly one after another, this idle waiting time would be wasted, increasing response latency for users and drastically lowering the overall throughput of the system.

So the industry has two typical solutions for handling multiple requests at the same time: multi-threading and asynchrony. In early systems, multi-threading (or multi-processing) was the most common technique. It is relatively easy to code, because the code within each thread executes in sequence. But since multiple threads run at the same time, you cannot guarantee the ordering of code across threads. This is a serious problem for logic that operates on the same data. Take the simplest case, displaying the view count of a news item: two ++ operations run at the same time, and the result may increase by only 1 instead of 2. So with multi-threading we often have to add many data locks, and those locks in turn can cause deadlocks.
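The view-counter race above, and the lock that fixes it, can be shown in a small sketch. The counter name and thread counts are illustrative only.

```python
import threading

views = 0                      # shared "view count" of a news item
views_lock = threading.Lock()  # guards the read-modify-write below

def bump(n: int) -> None:
    """Increment the shared counter n times, holding the lock so two
    threads' increments cannot interleave and lose updates."""
    global views
    for _ in range(n):
        with views_lock:
            views += 1

threads = [threading.Thread(target=bump, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# With the lock, views is exactly 40000; without it, interleaved
# load/add/store sequences could lose some of the increments.
```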

Therefore the asynchronous-callback model became more popular than multi-threading. Besides avoiding deadlocks, asynchrony also removes the unnecessary overhead that multi-threading incurs from repeated thread switching: each thread needs its own stack space, and when many threads run in parallel, stack data may be copied back and forth, consuming extra CPU. And because every thread occupies stack space, memory consumption is also huge when there are many threads. The asynchronous-callback model solves these problems well, but asynchronous callbacks are more like "manual" parallel processing, and developers have to work out for themselves how to carry them out.

Asynchronous callbacks are based on non-blocking I/O operations (network and file): when we call a read/write function, we do not get "stuck" in the call, but immediately get back a "there is data / there is no data" result. Linux's epoll mechanism uses facilities in the kernel so that we can quickly "find" the connections/files that have data ready to read or write. Because each operation is non-blocking, our program can handle a large number of concurrent requests with only one process. And because there is only one process, the order of all data processing is fixed; the multi-threaded situation where the statements of two functions interleave cannot occur, so no "locks" of any kind are needed. From this perspective, asynchronous non-blocking techniques greatly simplify development. With only one thread and no thread-switching overhead, asynchronous non-blocking is the preferred choice for many systems with high throughput and concurrency requirements.

int epoll_create(int size);  // create an epoll handle; size hints to the kernel how many descriptors will be watched

int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);  // add, modify, or remove a watched descriptor

int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);  // wait for I/O events on the registered descriptors

Buffering (caching)

In Internet services, most user interactions must return results immediately, so there are latency requirements. For services like online games, latency must be brought down to tens of milliseconds. To reduce latency, buffering (caching) is one of the most widely used techniques in Internet services.

In early WEB systems, if every HTTP request triggered a read and write to the database (MySQL), the database would quickly stop responding because its connections were exhausted. A typical database supports only a few hundred connections, while the concurrent requests of a WEB application easily reach several thousand. This is also the most direct cause of poorly designed web sites freezing up under load. To minimize connections and accesses to the database, many buffering systems were designed: store the result of a database query in a faster facility, and if nothing related has changed, read directly from there.

The most typical caching system for WEB applications is Memcache. Because PHP's request-handling model is stateless, and early PHP did not even have a way to keep "heap" memory across requests, persistent state had to be stored in another process. Memcache is a simple and reliable open source tool for storing such temporary state. Many PHP applications now use the following processing logic: read data from the database, then write it to Memcache. When the next request comes in, try to read the data from Memcache first; this can greatly reduce database accesses.
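That read-through pattern is often called cache-aside. Here is a sketch with plain dicts standing in for Memcache and MySQL (the key and record are invented for illustration; a real app would use a memcache client and a DB driver):

```python
database = {"user:1": {"name": "alice"}}  # slow, authoritative store
cache = {}                                # fast, temporary store
db_reads = 0                              # count how often we hit the DB

def get_user(key):
    """Cache-aside read: try the cache first; on a miss, fall back to
    the database and populate the cache for the next request."""
    global db_reads
    if key in cache:
        return cache[key]
    db_reads += 1
    value = database.get(key)
    cache[key] = value
    return value

first = get_user("user:1")   # miss: goes to the database
second = get_user("user:1")  # hit: served from the cache
```

Only the first request touches the database; every later request for the same key is absorbed by the cache.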

However, Memcache itself runs as a separate server process and has no special clustering capability; Memcache processes cannot be directly organized into a unified cluster. If one Memcache is not enough, we have to decide in code, by hand, which Memcache process each piece of data goes to. For a truly large distributed site, managing such a buffering system is a tedious task.

So people began to design more efficient buffering systems. In terms of performance, every Memcache request travels over the network just to pull data out of memory. That is somewhat wasteful, because the requester's own memory can also hold data. Hence many caching algorithms and techniques make use of the requester's memory; the simplest is the LRU algorithm, which puts the data into a hash-table structure in heap memory and evicts the least recently used entries.
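A minimal LRU cache of the kind just described can be built on an ordered hash table. This is a sketch of the algorithm, not any particular library's API:

```python
from collections import OrderedDict

class LRUCache:
    """Tiny LRU sketch: a hash table plus recency order, so the least
    recently used entry is evicted once capacity is exceeded."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()  # keys ordered oldest -> newest use

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict least recently used
```

With capacity 2, touching "a" keeps it alive while the untouched "b" is evicted when "c" arrives.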

Memcache's lack of clustering is also a pain point for users, so many people started to design ways to spread the data cache across different machines. The simplest idea is so-called read/write separation: every cache write goes to multiple buffer processes, while a read can go to any one of them at random. This works well when the business data shows a clear read/write imbalance.
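The write-to-all, read-from-any scheme is only a few lines. The three dicts below are hypothetical stand-ins for three cache processes:

```python
import random

replicas = [{}, {}, {}]  # three hypothetical cache processes

def cache_write(key, value):
    """A write is applied to every replica..."""
    for r in replicas:
        r[key] = value

def cache_read(key):
    """...so a read may be served by any replica chosen at random."""
    return random.choice(replicas).get(key)
```

Every write costs N network round-trips but reads spread evenly, which is exactly why this fits read-heavy workloads.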

However, not every business can simply solve its problems with read/write separation, for instance highly interactive online businesses such as communities and games. The read and write frequencies of their data are not very different, yet they demand low latency. So the idea arose to combine local memory with the memory caches of remote processes, caching data at two levels, and at the same time to avoid replicating each piece of data in every cache process at once: instead, the data is distributed across multiple processes according to some rule. The most popular algorithm for this distribution is known as "consistent hashing". Its advantage is that when a process fails, there is no need to relocate all of the cached data in the cluster. You can imagine the alternative: if we distributed data simply by taking the data's ID modulo the number of processes, then as soon as the number of processes changed, the storage location of almost every piece of data could change, which is bad for the fault tolerance of the servers.
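A minimal consistent-hashing ring looks like this (node names are invented; real systems also handle node removal, which this sketch omits):

```python
import bisect
import hashlib

class HashRing:
    """Consistent-hashing sketch: nodes are placed at many points on a
    ring of hash values, and a key is served by the first node clockwise
    from the key's own hash. When one node disappears, only the keys it
    owned move to a neighbour; everything else stays put."""

    def __init__(self, nodes, vnodes=50):
        self._hashes = []  # sorted ring positions
        self._owners = []  # node owning each position
        for node in nodes:
            for i in range(vnodes):  # virtual nodes smooth out the load
                h = self._hash(f"{node}#{i}")
                idx = bisect.bisect(self._hashes, h)
                self._hashes.insert(idx, h)
                self._owners.insert(idx, node)

    @staticmethod
    def _hash(s: str) -> int:
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def node_for(self, key: str) -> str:
        # First ring position at or after the key's hash, wrapping around.
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._hashes)
        return self._owners[idx]
```

Contrast this with `id % num_processes`: there, changing the process count remaps nearly every key; here, the mapping is stable except around the node that changed.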

Oracle has a product called Coherence, a well-designed caching system. It is a commercial product that supports collaboration between local memory caches and remote process caches. The cluster is fully self-managing, and it even supports user-defined computations (processor capability) running in the process where the cached data resides, which makes it not just a cache but a distributed computing system.

Storage Technology (NoSQL)

I believe everyone is familiar with the CAP theorem. But in the early days of the Internet, when everyone was still using MySQL, many teams racked their brains over how to make the database store more data and carry more connections. In many businesses, files were even the primary data store, with the database as a secondary facility.

Then NoSQL emerged, and people suddenly realized that the data formats of many Internet businesses are so simple that much of the time they do not need the complex tables of a relational database at all. Index lookups are often made only on the primary index, and more complex full-text search cannot be done by the database itself anyway. So today NoSQL is the preferred storage facility for a large number of highly concurrent Internet services. The earliest NoSQL databases include MongoDB, and the most popular now seems to be Redis. Some teams even consider Redis part of the buffering system, which in effect acknowledges Redis's performance advantage.

Besides being faster and carrying more load, NoSQL stores data in a form that can only be retrieved and written through a single primary index. The benefit of this constraint for distribution is that the process (server) storing each piece of data can be determined from the primary index alone, so the data of one logical database can conveniently be spread across different servers. In the inevitable trend toward distributed systems, the data storage layer has finally found its way to distribute.
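Routing by primary index can be sketched in two lines. The shard names are hypothetical; note this is the naive modulo scheme, which the consistent-hashing discussion above improves on when shard counts change:

```python
# Hypothetical database shards; a real system would hold connections.
shards = ["db-0", "db-1", "db-2", "db-3"]

def shard_for(primary_key: int) -> str:
    """Every read and write goes through the primary index, so the key
    alone determines which database owns the record."""
    return shards[primary_key % len(shards)]
```

Because all access paths pass through `shard_for`, no cross-shard query is ever needed, which is exactly the constraint NoSQL's single-index model imposes.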

Manageability problems caused by distributed systems

Distributed systems are not just a bunch of servers running together. Compared to a single machine or a cluster with a small number of servers, there are special problems waiting to be solved.

Hardware failure rate

A distributed system is, by definition, never just one server. Suppose the probability that a server fails on a given day is 1%: with 100 servers, on average about one is failing at any time. The analogy may not be exact, but as the amount of hardware grows, hardware failure goes from being an accident to being a certainty. Usually when we write functional code we do not think about what to do when the hardware fails; when writing a distributed system, you must deal with this problem, otherwise a single failed server may stop a whole cluster of hundreds of servers from working properly.
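A back-of-the-envelope calculation makes the claim concrete: assuming independent failures at 1% per server per day, the chance that at least one of 100 servers fails on a given day is already well over half.

```python
p_single = 0.01  # assumed per-server daily failure probability
n = 100          # servers in the cluster

# P(at least one failure) = 1 - P(no server fails)
p_any = 1 - (1 - p_single) ** n  # roughly 0.63
```

So at this scale a failure somewhere in the cluster is closer to an everyday event than an accident, which is why the architecture must tolerate it.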

Besides failures of a server's own memory and hard disks, network failures between servers are even more common. Such failures may also be intermittent, or may recover on their own. Faced with this, simply taking the "malfunctioning" machine offline is not enough: the network may come back after a while, and your cluster may have lost more than half of its processing capacity over a temporary failure.

How to keep a distributed system, under all the failures that may strike at any time, automatically maintaining itself and continuing to serve as much as possible, is a problem that must be considered while programming. With these failure scenarios in mind, we must consciously design the architecture for redundancy and self-maintenance. These are not business requirements from the product side; they are purely technical requirements. Raising the right requirements here and implementing them correctly is one of the most important responsibilities of a server-side programmer.

Resource utilization optimization

When the hardware capacity of a distributed cluster of many servers reaches its limit, the natural idea is to add more machines. However, it is not so easy for a software system to improve performance just by "adding" hardware. Because the software coordinates work across multiple servers, expansion requires complex and careful coordination: we often have to stop the entire cluster, modify various configurations, and finally restart a cluster that includes the new servers.

Since there may be some user data in the memory of each server, attempting to modify the configuration of services provided in the cluster at runtime is likely to cause memory data loss and errors. Therefore, run-time scaling is relatively easy for stateless services, such as adding Web servers. But for stateful services, such as online games, simple run-time scaling is almost impossible.

Besides expansion, distributed clusters also need to shrink. When user numbers drop and server hardware sits idle, we often want those idle resources reclaimed and put into new service clusters. Shrinking is similar to the disaster recovery required when part of a cluster fails; the difference is that the time and target of the shrink are predictable.

Because we want expansion and shrinking of a distributed cluster to happen online as far as possible, a series of complex technical issues must be handled: how to modify cluster-related configuration correctly and efficiently, how to operate on stateful processes, and how to keep cluster nodes communicating normally while the expansion or shrink is in progress. As a server-side programmer, it takes a lot of dedicated experience to handle the problems caused by changes in the state of a cluster of many processes.

Updating the content of software services

The Agile development model has popularized the term "iteration" for continuously updating a service to meet new requirements and fix bugs. If we manage only one server, updating its program is as simple as copying a package and changing a configuration. But to do the same on hundreds of servers, logging in to each one is impossible.

So every distributed system developer needs a tool for batch installation and deployment of server-side programs. However, installation involves much more than copying binaries and configuration files: opening firewall ports, creating shared-memory files, modifying database table structures, rewriting some data files, and so on; some deployments even require installing new software on the servers.

If we take software updates and version upgrades into consideration when developing server-side programs, we will make certain planning in advance for the use of configuration files, command line parameters, and system variables, which can make the installation and deployment tools run faster and more reliable.

Besides the installation and deployment process, there is the important issue of data across versions. When we upgrade, persistent data generated by the old version of the program is usually in the old format; if the upgraded version changes the data format, for example the structure of a data table, the old-format data must be converted and rewritten into the new version's format. This requires us, when designing data structures, to consider how they may evolve: use the simplest and most direct representation so future changes are easier, or anticipate the scope of changes early by reserving some fields, or store the data in other extensible formats.

Besides persistent data, if there are client programs (such as a mobile APP), upgrades of those clients often cannot be synchronized with the server. If an upgrade changes the communication protocol, we may be forced to run different server-side systems for different client versions. To avoid maintaining multiple sets of servers at once, we tend to adopt so-called "version-compatible" protocol definitions during development. Designing protocols with good compatibility is a problem that server programs need to consider carefully.
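One common shape of a version-compatible protocol is a version field plus server-side defaults for fields older clients do not send. The message fields and the JSON encoding here are invented for illustration:

```python
import json

# Hypothetical protocol: the "avatar" field was added in version 2.
DEFAULTS_SINCE_V2 = {"avatar": None}

def decode(raw: bytes) -> dict:
    """Parse a message and fill in defaults for fields that clients on
    older protocol versions do not know about, so one server build can
    serve both old and new clients."""
    msg = json.loads(raw)
    if msg.get("version", 1) < 2:
        for field, default in DEFAULTS_SINCE_V2.items():
            msg.setdefault(field, default)
    return msg
```

The server logic downstream then handles a single, normalized message shape regardless of which client version produced it.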

Data statistics and decision making

Generally, log data from a distributed system is gathered together and analyzed in a unified way. But once the cluster reaches a certain scale, the volume of these logs becomes terrifying: in many cases, computing one day's log statistics takes the machines more than a day. So log statistics has become a highly specialized activity.

The classic model for distributed statistics is Google's MapReduce. This model is flexible and can put a large number of servers to statistical work. Its disadvantage is that ease of use is often not good enough, because the statistics it expresses are very different from the familiar SQL over data tables; as a result, we often end up dumping the results back into MySQL for more detailed analysis.

Because distributed systems produce enormous volumes of logs, and the complexity of those logs keeps increasing, we must master technologies like MapReduce to truly do statistics on a distributed system, and we must keep finding ways to make that statistical work more efficient.

A fundamental approach to the manageability of distributed systems

Directory Service (ZooKeeper)

A distributed system is a whole composed of many processes, and each member of that whole has some state: which module it is responsible for, its current load, which data it holds, and so on. This data about the other processes becomes very important during fault recovery and capacity expansion.

Simple distributed systems can record such data through static configuration files: connection mappings between processes, their IP addresses and ports, and so on. However, a distributed system with high degree of automation requires that these state data be stored dynamically. In this way, applications can do their own disaster recovery and load balancing.

Some programmers write their own DIR service (directory service) to record the running state of the processes in the cluster. Processes in the cluster automatically register themselves with the DIR service, so that during disaster recovery, expansion, and load balancing, the destination of a request can be adjusted according to the data in the DIR service, bypassing faulty machines or connecting to new servers.

However, if we use just one process to do this job, that process becomes a "single point" of the cluster: if it fails, the whole cluster may fail. So the directory service that stores the cluster state must itself be distributed. Fortunately we have ZooKeeper, an excellent open source project that is exactly such a distributed directory service.

ZooKeeper can simply start an odd number of processes to form a small directory service cluster. This cluster provides all other processes with the ability to read and write its huge “configuration tree”. This data is not only stored in one ZooKeeper process, but is hosted by multiple processes based on a very secure algorithm. This makes ZooKeeper an excellent distributed data storage system.

Because ZooKeeper's data storage structure is a tree resembling a file-system directory, we often use it like this: bind each process to one of the "branches", then check those "branches" to decide which server a request should be forwarded to, which neatly answers the routing question of "who handles this request". We can also mark each process's load on its "branch", so that load balancing becomes easy as well.
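The branch-binding idea can be sketched with an in-memory stand-in for the configuration tree (a real deployment would use a ZooKeeper client such as kazoo talking to a ZK ensemble; paths and load values here are invented):

```python
tree = {}  # path -> node data; stand-in for ZooKeeper's configuration tree

def register(path: str, load: int) -> None:
    """Bind a process to a branch and record its current load there."""
    tree[path] = {"load": load}

def pick_server(prefix: str) -> str:
    """Route a request to the least-loaded process under a branch."""
    children = {p: d for p, d in tree.items() if p.startswith(prefix)}
    return min(children, key=lambda p: children[p]["load"])
```

In real ZooKeeper the registrations would be ephemeral nodes, so a crashed process's branch disappears automatically and routing skips it without any extra bookkeeping.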

Directory service is one of the most critical components in distributed system. ZooKeeper is a great open source software that does just that.

Message queue services (ActiveMQ, ZeroMQ, JGroups)

When two processes communicate across machines, we almost always use protocols like TCP/UDP. But writing cross-process communication directly with the network APIs is cumbersome: besides a lot of low-level socket code, we have to solve a series of problems such as how to find the process to exchange data with, how to ensure packets arrive complete and are not lost, and what to do if the peer process dies or needs to restart. These problems cover a series of requirements including disaster recovery, capacity expansion, and load balancing.

To solve the problem of inter-process communication in distributed systems, people distilled an effective model: the message queue model. It abstracts the interaction between processes into the handling of individual messages, and provides "queues", or pipes, to hold messages temporarily. Each process can access one or more queues, from which it can read messages (consume) or write messages (produce). Since the queue caches messages, we can change process state without fear: when a process comes back, it simply consumes the pending messages. How a message is routed is determined by the queue it is placed in, which turns a complex routing problem into the problem of managing static queues.
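The model boils down to a producer, a consumer, and a pipe between them. A single-process sketch using Python's thread-safe queue (the messages and sentinel shutdown convention are illustrative):

```python
import queue
import threading

mailbox = queue.Queue()  # the "pipe" that holds messages temporarily
processed = []

def consumer():
    """Read (consume) messages until a None sentinel arrives."""
    while True:
        msg = mailbox.get()      # "receive"
        if msg is None:          # sentinel: shut down
            break
        processed.append(msg.upper())

t = threading.Thread(target=consumer)
t.start()
for m in ["hello", "world"]:
    mailbox.put(m)               # "post" (produce)
mailbox.put(None)
t.join()
```

Because the queue buffers messages, producer and consumer never need to run in lockstep; a real message queue service extends exactly this decoupling across machines.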

A typical message queue service offers two simple interfaces, "post" and "receive", but the management of the queues themselves is more complex, and generally comes in two styles. Some message queue services advocate point-to-point queue management: a separate message queue between each pair of communicating nodes. The advantage is that messages from different sources do not affect one another, so no queue's cache space is eaten up by a flood of messages in another queue; moreover, the consuming program can define its own processing priorities, reading more from one queue and less from the others.

However, point-to-point queues multiply rapidly as the cluster grows, which becomes complicated for memory usage and operations management. Therefore more advanced message queue services let different queues share memory space, and automate the management of queue address information, creation, and deletion. This automation often relies on the "directory service" described above to register information such as the physical IP address and port corresponding to a queue ID. For example, many developers use ZooKeeper as the central node of their message queue service, while software like JGroups maintains its own cluster state to store the history and current status of each node.

The other kind of message queue resembles a public mailbox: the message queue service is a process to which any user can post messages, and from which any user can collect them. This makes the queues easier to use and easier to operate. However, latency is relatively higher in this scheme, because each message must be transferred at least twice before it is handled. And since there are no prearranged posting and collecting constraints, bugs are also easier to introduce.

Whatever message queue service is used, inter-process communication is a problem that every distributed server system must solve. So when a server-side programmer writes distributed system code, the code used most is driven by message queues. This is part of why EJB 3.0 added "message-driven beans" to the specification.

Transaction systems

Transactions are among the hardest technical problems in distributed systems. Because a piece of work can be spread across different processes, any one process can fail, and that failure must trigger a rollback, usually involving several other processes. This is a diffuse, multi-process coordination problem. Solving transactions on a distributed system requires two core tools: a stable state storage system, and a convenient, reliable broadcast mechanism.

The state of every step of a transaction must be visible to the whole cluster and must survive failures. This requirement is typically met by the cluster’s directory service: if the directory service is robust enough, we can synchronously write each transaction’s processing status into it. Here again, ZooKeeper can play an important role.

If a transaction is interrupted and must be rolled back, the rollback spans the steps already performed. It may be enough to roll back at the entry point (where the data needed for the rollback is stored), or every processing node may have to roll back. In the latter case, the node that hit the exception broadcasts a message such as “Rollback! Transaction ID XXXX” to the cluster. The underlying broadcast is typically carried by the message queue service; software such as JGroups provides the broadcast directly.
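The two tools named above, durable step state plus a rollback broadcast, can be sketched as follows. This is only an in-process illustration: `directory` stands in for ZooKeeper-like storage, `handlers` for message-queue/JGroups subscribers, and every name is hypothetical:

```python
directory = {}   # txn_id -> list of completed steps (stand-in for ZooKeeper)
handlers = []    # nodes subscribed to rollback broadcasts

def record_step(txn_id, step):
    # Synchronously record each transaction step so the whole
    # cluster can see how far the transaction got.
    directory.setdefault(txn_id, []).append(step)

def broadcast_rollback(txn_id):
    # Tell every subscribed node which steps must be undone.
    for on_rollback in handlers:
        on_rollback(txn_id, directory.get(txn_id, []))

rolled_back = []
handlers.append(lambda txn, steps: rolled_back.append((txn, list(steps))))

record_step("tx-1", "debit")
record_step("tx-1", "credit")
broadcast_rollback("tx-1")   # e.g. the node doing "credit" hit an error
```

Because the step list lives in shared state, any node receiving the broadcast knows exactly which of its own steps belong to the failed transaction.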

Although we are discussing transaction systems here, distributed systems often need a related, simpler facility that the same machinery can provide: the “distributed lock”. A distributed lock is a constraint that lets each node check a condition and then execute. If we have an efficient, atomic directory service, the lock state is essentially a “single-step transaction” record, and the rollback strategy defaults to “pause and try again later”. This kind of “lock” is simpler than full transaction processing and therefore more reliable, which is why more and more developers prefer a lock service over implementing a complete transaction system.
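The “check and then execute, else pause and retry” pattern can be sketched like this. A dict stands in for the directory service’s atomic key space; in a real system the check-and-set must be a single atomic operation on that service (all names here are illustrative):

```python
locks = {}  # stands in for the directory service's atomic key space

def try_lock(name, owner):
    # setdefault is the "check then set" step; a real directory service
    # must perform it as one atomic operation (e.g. an ephemeral znode).
    return locks.setdefault(name, owner) == owner

def unlock(name, owner):
    if locks.get(name) == owner:
        del locks[name]

# "Pause and try again later" instead of a full transactional rollback:
assert try_lock("job", "node-1")
assert not try_lock("job", "node-2")   # node-2 backs off and retries later
unlock("job", "node-1")
assert try_lock("job", "node-2")       # the retry now succeeds
```

The losing node does not roll anything back; it simply waits and tries again, which is exactly why this scheme is simpler and more reliable than a transaction system.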

Automatic Deployment Tool (Docker)

The biggest runtime requirement of a distributed system is to change service capacity on the fly (possibly with a service interruption): expanding or shrinking it. When nodes fail, new nodes are needed to restore service. If this were still handled like old-style server management, by filling in forms, filing requests, entering the machine room, installing servers, and deploying software by hand, it would be hopelessly inefficient.

In a distributed system we tend to manage services as a “pool”. We pre-provision a batch of machines, run service software on some of them, and keep the rest as spares. Obviously this batch of servers cannot serve only one business; it will host several different services, and the spare servers become a shared backup pool for all of them. As business needs change, some servers may “leave” service A and “join” service B.

Such frequent service changes depend on highly automated software deployment tools. Operations staff should perform them with deployment tools provided by developers, not thick manuals. More experienced development teams unify all underlying business frameworks so that most deployment and configuration can be managed by a common system. The open source community has made similar attempts, the best known being the RPM package format, but RPM packaging is still too cumbersome for server-side deployment needs. Hence the programmable, general-purpose deployment systems represented by Chef.

However, when NoSQL emerged, people suddenly realized that the data of many Internet businesses is so simple in structure that much of the time it does not need the complex tables of a relational database at all. Queries often hit only the primary index, and more complex full-text search is something the database cannot do itself anyway. NoSQL is therefore now the preferred storage facility for a large number of highly concurrent Internet services. One of the earliest NoSQL databases was MongoDB, and the most popular today seems to be Redis; some teams even count Redis as part of their caching layer, which in effect acknowledges its performance advantages.

Besides being faster and able to carry more load, NoSQL stores data in a way that can only be retrieved and written through a single primary index. The distribution benefit of this constraint is that the process (server) holding each record can be determined from that primary index, so one database’s data can conveniently be spread across different servers. In the inevitable trend toward distributed systems, the data storage layer has finally found its way to distribution.
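“Deciding the storage server from the primary index” is usually just a stable hash of the key. A minimal sketch, with hypothetical shard names (real systems add replication and rebalancing on top):

```python
import hashlib

SERVERS = ["db-0", "db-1", "db-2"]  # hypothetical shard names

def shard_for(key: str) -> str:
    # A stable hash of the primary index decides which server stores
    # the record; every client computes the same answer for the same key.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return SERVERS[h % len(SERVERS)]
```

Since the routing is a pure function of the key, no central lookup is needed on the read/write path; the cost is that changing the server list remaps keys, which is why consistent hashing is often used instead.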

To manage a large number of distributed server-side processes, we do need to invest heavily in optimizing deployment and management. Unified runtime specifications for server processes are the precondition for automated deployment management. We can adopt Docker, which uses the “operating system” as the specification; adopt PaaS platform technologies that use the “web application” as the specification; or define more specific specifications of our own and build a complete distributed computing platform.

Logging Service (log4j)

Server-side logging has always been important and easily overlooked. Many teams start out treating logs merely as an aid to debugging. But it soon becomes apparent that once a service is running, logs are almost the only effective way to know what is happening inside the server-side system at runtime.

Although we have a variety of profiling tools, most of them are unsuitable for services in production because they seriously degrade performance. So, more often, we analyze logs. Although logs are essentially just lines of text, their flexibility makes them highly valued by both development and operations staff.

Conceptually, a log is a very loose thing: you can open any file and write some information into it. Modern server systems, however, generally impose some standard requirements on logs: each record must fit on one line, which makes later statistical analysis much easier; every line should carry some common header fields, date and time being the bare minimum; output should be leveled, e.g. fatal/error/warning/info/debug/trace, with the level adjustable at runtime so that logging overhead can be reduced in production; the header usually also carries fields such as a user ID or IP address so that a batch of records can be located and filtered quickly to narrow the viewing scope (the so-called “dyeing” function); and log files need to “roll over”, keeping a number of fixed-size files so that a long-running service does not fill the disk.
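These requirements, one-line records, a timestamped header, runtime-adjustable levels, and size-based rollover, map directly onto Python’s standard `logging` module. A minimal sketch; the logger name, file name, and field values are all illustrative:

```python
import logging
from logging.handlers import RotatingFileHandler

logger = logging.getLogger("svc")
logger.setLevel(logging.INFO)   # level is adjustable at runtime

# "Rollback": keep up to 5 files of ~1 MB each so the disk never fills.
handler = RotatingFileHandler("svc.log", maxBytes=1_000_000, backupCount=5)
# Common header: timestamp, level, logger name, then the message body.
handler.setFormatter(logging.Formatter(
    "%(asctime)s %(levelname)s [%(name)s] %(message)s"))
logger.addHandler(handler)

logger.debug("dropped: below the current level")       # filtered out
logger.info("user=42 ip=10.0.0.1 login ok")            # "dyeing" fields
```

Fields like `user=42 ip=10.0.0.1` in the message header are what make the “dyeing” filter cheap: a plain `grep user=42 svc.log` narrows the view to one user’s records.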

Driven by these requirements, the open source community provides many logging libraries, such as the well-known Log4j and the large log4X family, which are widely used and well regarded.

However, compared with log printing, log collection and statistics are often neglected. As a distributed-system programmer, you will certainly want to collect and analyze the whole cluster’s logs from a central node, and some log statistics need to be recomputed at short intervals to monitor the health of the entire cluster. Doing this requires a distributed file system to hold the steady stream of incoming logs (often sent over UDP), and, on top of that file system, a statistics system with a MapReduce-like architecture to aggregate and alert on massive volumes of log data quickly. Some developers use Hadoop directly; others use Kafka as the log storage system and build their own statistics programs.
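The MapReduce-style aggregation is simple in miniature: each node maps its own log lines to partial counts, and a central node reduces them. A hedged sketch with made-up log lines, two “nodes” simulated as two slices of one list:

```python
from collections import Counter

lines = [
    "2024-01-01 ERROR [svc] db timeout",
    "2024-01-01 INFO  [svc] ok",
    "2024-01-01 ERROR [svc] db timeout",
]

def map_levels(chunk):
    # "Map": a node extracts the log level (2nd field) from its own lines.
    return Counter(line.split()[1] for line in chunk)

# "Reduce": the central node merges the per-node counters.
total = Counter()
for node_chunk in (lines[:2], lines[2:]):   # two nodes' shares of the logs
    total += map_levels(node_chunk)
```

Because `Counter` addition is associative, the merge order does not matter, which is exactly the property that lets the real systems scale the reduce step across many machines.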

The log service is the dashboard and periscope of distributed operations. Without a reliable logging service, the health of the whole system can slip out of control. So however many nodes your distributed system has, significant effort and dedicated development time must go into building automated statistical analysis of logs.

Distributed systems and development efficiency: problems and solutions

As described above, a distributed system adds many non-functional requirements on top of the business’s functional requirements, mostly designed and implemented so that the multi-process system runs stably and reliably. All this “extra” work tends to make your code more complex, and without good tools it can drag down your development efficiency.

Microservice framework: EJB, WebService

When we talk about distributing server-side software, communication between server processes is unavoidable. But that communication is not completed by simply sending a message: it also involves message routing, encoding and decoding, reading and writing service state, and so on. Developing all of this yourself would be exhausting.

So the industry produced distributed server-side development frameworks very early, the most famous being EJB, Enterprise JavaBeans. Any technology bearing the title “enterprise” tends to involve distribution, and EJB is indeed a technology for distributed object invocation. If several processes must cooperate to complete a task, the task is split into multiple “classes” whose objects live in the various process containers and provide services cooperatively. The approach is thoroughly object-oriented: each object is a “microservice” providing some distributed capability.

Others moved toward the basic model of the Internet itself: HTTP. Hence the many WebService frameworks; from open source to commercial software, everyone has their own WebService implementation. This model, which reduces complex routing, codec, and other operations to ordinary HTTP operations, is a very effective abstraction. Developers build and deploy multiple WebServices onto web servers to assemble a distributed system.

Whether we study EJB or WebServices, the real goal is to simplify the complexity of distributed calls. That complexity comes from having to fold disaster recovery, capacity scaling, load balancing, and other functions into cross-process calls. Using one common body of code for all cross-process communication, implementing disaster recovery, scaling, load balancing, overload protection, state cache hits, and the other non-functional requirements uniformly, can greatly simplify the whole distributed system.

A typical microservice framework observes the full state of the cluster’s nodes during the routing phase: which service processes run at which addresses, how loaded each is, and whether it is available. For stateful services it also uses something like a consistent hashing algorithm to maximize the cache hit ratio. When node states change, every node in the framework learns of the change as soon as possible and replans future routing based on the current state, achieving automatic routing that avoids overloaded or failed nodes.
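The consistent hashing mentioned above can be sketched as a hash ring: keys route to the first node clockwise from their hash, so removing one node only remaps the keys that lived on it. A minimal sketch with hypothetical node addresses; real frameworks also place many virtual nodes per server to even out the load:

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring (no virtual nodes, for illustration)."""

    def __init__(self, nodes):
        # Place each node on the ring at the hash of its name.
        self.ring = sorted((self._h(n), n) for n in nodes)

    @staticmethod
    def _h(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def route(self, key):
        # First node clockwise from the key's position (wrapping around).
        keys = [h for h, _ in self.ring]
        i = bisect.bisect(keys, self._h(key)) % len(self.ring)
        return self.ring[i][1]

ring = HashRing(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
target = ring.route("session:abc")   # same key always routes the same way
```

Because most keys keep their old owner when membership changes, the caches warmed on the surviving nodes stay useful, which is the cache-hit-ratio benefit the text describes.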

Some microservice frameworks also provide IDL-like tools that generate “skeleton” and “stub” code, so that when writing a remote caller you need not write any of the intricate network-related code: the transport-layer and codec code is generated for you. EJB, Facebook Thrift, and Google gRPC all have this capability. With code generation, writing a distributed functional module (perhaps a function or a class) becomes as easy as writing a local function. This is a very important efficiency gain in distributed systems.

Asynchronous programming tools: coroutines, Future, lambda

When programming in distributed systems, you inevitably encounter many “callback” APIs, because distributed systems involve a great deal of network communication: any service request may be decomposed into multiple steps stitched together by multiple network round trips. Under an asynchronous, non-blocking programming model, our code fills up with “callback functions”. But callbacks make asynchronous code very hard to read: you cannot read the code from beginning to end to understand how a business task is accomplished step by step, because the code belonging to one task is split across many callback functions chained together by successive non-blocking calls.

Moreover, we sometimes choose the “observer pattern”: we register a large number of “event-response functions” in one place and emit an event wherever a callback is needed. Such code is even harder to understand than plain callback registration, because the response function for an event usually cannot be found where the event is emitted. The functions live in separate files and sometimes change at runtime, and the event names themselves are often unintelligible; when a program needs hundreds or thousands of events, it is nearly impossible to give every one a self-explanatory name.

To repair the damage callbacks do to readability, many remedies have been invented, the most famous being the “coroutine”. We are used to solving problems with multiple threads and therefore to writing code synchronously, and coroutines preserve that habit; unlike threads, however, they do not run “at the same time”. A coroutine simply calls Yield() wherever it would block and is Resume()d when the blocking operation completes, which amounts to folding the callback’s contents into the code after the Yield() call. Written this way, the code looks synchronous and reads very well. The drawbacks are, first, that the Resume() code must still run in some “main thread”, which must call Resume() to wake the coroutine from its block; and second, that coroutines must save their stacks: after a switch, the temporary variables on a coroutine’s stack keep occupying space, which constrains how coroutine code is written and discourages large temporary variables.
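Python generators make the Yield()/Resume() pairing concrete: `yield` is the blocking point, and `send()` is the main thread resuming the coroutine with the result it was waiting for. A sketch with invented request/response strings:

```python
def fetch_user():
    # Reads top-to-bottom like synchronous code; each yield is the point
    # where the coroutine gives up control while "blocked" on I/O.
    uid = yield "read uid from socket"      # Yield(): wait for the network
    profile = yield f"query db for {uid}"   # Yield(): wait for the database
    return f"profile={profile}"

co = fetch_user()
request = next(co)          # start; the coroutine blocks at its first yield
reply1 = co.send("42")      # Resume() with the network result
try:
    co.send("Alice")        # Resume() with the database result
except StopIteration as done:
    result = done.value     # the coroutine ran to completion
```

Note how the “main thread” (the code at the bottom) drives every Resume(), exactly the limitation described above: the coroutine cannot wake itself.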

Another improvement on callbacks is the Future/Promise model, whose basic idea is to write all the callbacks together in one place. It is a very practical programming model: rather than eliminating callbacks, it gathers them from all over the program into one spot, where you can see clearly which asynchronous steps run in sequence and which run in parallel.
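Python’s `concurrent.futures` shows the idea in a few lines: the asynchronous work is submitted up front, and the places where results are consumed sit together in readable order. The `fetch` function is a stand-in for a real network call:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    return f"body of {url}"   # stands in for a real network request

with ThreadPoolExecutor(max_workers=2) as pool:
    # Both requests run in parallel; the "callbacks" are gathered here
    # in one place instead of being scattered through the code.
    f1 = pool.submit(fetch, "http://a.example")
    f2 = pool.submit(fetch, "http://b.example")
    merged = f1.result() + " + " + f2.result()   # explicit sequence point
```

Reading top to bottom, it is obvious that the two fetches are concurrent and that `merged` waits for both, which is precisely the readability the Future model buys.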

Finally, there is the lambda model, popular in many applications of the JavaScript language. In other languages, specifying a callback used to be laborious: Java required designing an interface and then implementing it, a five-star nuisance; C/C++ supports function pointers, which are simpler but easily make code unreadable; scripting languages are better but still require defining a named function. Writing the callback body directly at the call site is the easiest way both to develop and to read. More importantly, lambdas generally imply closures: the callback’s call stack is preserved automatically, so much of the state-keeping that asynchronous operations otherwise require, such as building a “session pool”, becomes unnecessary and falls out naturally. Coroutines offer a similar benefit.

Whichever asynchronous style is used, the code is always more complex than its synchronous equivalent. So when writing distributed server code we must plan the code structure carefully and avoid piling on functionality haphazardly until readability is destroyed. Unreadable code is unmaintainable code, and server-side code full of asynchronous callbacks is especially prone to this.

Cloud service model: IaaS/PaaS/SaaS

In developing and using complex distributed systems, how to operate and maintain large numbers of servers and processes is a problem that runs throughout. Microservice frameworks, unified deployment tools, and log monitoring services all help, but with so many servers, centralized management remains very difficult. The root cause is that masses of hardware and networks chop the logical computing power into many small pieces.

As computers grew more powerful and virtualization technology appeared, those units of computing power could be divided and unified more intelligently. The most common form is IaaS: once we can run several virtual server operating systems on one piece of server hardware, the amount of hardware we must maintain drops by an order of magnitude. The spread of PaaS goes further, letting us deploy and maintain the runtime environment for a specific programming model uniformly, with no need to install an operating system, configure a runtime container, or upload code and data machine by machine. Before unified PaaS existed, installing a fleet of MySQL databases alone used to consume a great deal of time and energy.

Once our business models mature enough to be abstracted into fixed pieces of software, our distributed systems become easier still to use. Computing power is no longer delivered as code and libraries but as a cloud, SaaS, which provides services over the network, so that users need do no maintenance or deployment at all: they apply for an interface, fill in the expected capacity, and use it directly. This not only saves the effort of developing the corresponding functionality, it also hands operations and maintenance over to the SaaS provider, who can do that work more professionally.

In this evolution of the operations model, from IaaS to PaaS to SaaS, the scope of application narrows at each step, but the convenience of use improves by leaps. It also shows that software work, like other labor, raises its efficiency through division of labor, moving in ever more specialized, finer-grained directions.

Conclusion

A summary of the solutions to distributed system problems

The purpose of building distributed systems

  • Increase the throughput of the overall architecture to serve more concurrency and traffic.
    • Heavy traffic handling: use clustering to spread the load of large-scale concurrent requests across different machines.
  • Improve system stability and raise availability.
    • Critical service protection: improve the availability of background services and isolate faults to prevent the domino (avalanche) effect. If traffic is too heavy, degrade services to protect the critical ones.

Improve system performance

  • Cache system: cache partition, cache update, cache hit
  • Load balancing system (gateway system) : load balancing, service routing, and service discovery
  • Asynchronous invocation: message queues, message persistence, asynchronous transactions
  • Data mirroring: Data synchronization, read/write diversion, and data consistency
  • Data partitioning: partitioning policy, data access layer, data consistency

Caching system

  • Improves the speed of data access.
  • Caches exist everywhere: front-end browsers, the network, back-end services, underlying databases, file systems, hard disks, and CPUs.
  • A distributed cache system first requires a cache cluster, with a proxy to handle cache sharding and routing.

Load balancing

  • The key technology for horizontal scaling.

Asynchronous invocation

  • Requests are queued through a message queue, shaving the peak off front-end traffic while the back end processes requests at its own pace.
  • Advantages: Increases system throughput
  • Disadvantages: Poor real-time performance, but also introduces the problem of message loss, so the message needs to be persisted, which will cause stateful nodes, thus increasing the difficulty of service scheduling.

Data partitioning and data mirroring

  • Data partitioning: data is divided into partitions in some way, with different partitions carrying different shares of the traffic. This requires data-routing middleware and makes cross-database joins and cross-database transactions very complex.
  • Data mirroring: back the database up onto multiple nodes, each able to serve reads and writes, with data synchronized between them. Disadvantage: data consistency problems.
  • Early on, read/write separation over data mirrors is used; later, databases and tables are split (sharded).
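The read/write separation listed above is, at its core, a small routing decision: writes go to the primary, reads are spread over the mirrors. A minimal sketch with hypothetical database names (real routers also handle transactions, replication lag, and failover):

```python
import random

class ReadWriteRouter:
    """Route writes to the primary and reads to mirrored replicas."""

    def __init__(self, primary, replicas):
        self.primary, self.replicas = primary, replicas

    def route(self, sql: str) -> str:
        # Writes must hit the primary so replication has one source of truth.
        if sql.lstrip().upper().startswith(("INSERT", "UPDATE", "DELETE")):
            return self.primary
        # Reads are spread across replicas to share the traffic.
        return random.choice(self.replicas)

router = ReadWriteRouter("db-primary", ["db-replica-1", "db-replica-2"])
```

The consistency disadvantage noted above shows up here: a read routed to a replica may not yet see a write that just went to the primary.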

Improve system stability

  • Service separation (service governance) : Service invocation, service dependency, service isolation
  • Service redundancy (service scheduling) : elastic scaling, failover, and service discovery
  • Traffic limiting degradation: asynchronous queue, degradation control, and service fusing
  • High availability architecture: multi-tenant systems, multi-site disaster recovery (active-active), and high-availability services
  • High availability O&M: full-stack monitoring, DevOps, automated O&M

Service split

  • Isolates failures
  • Allows service modules to be reused
  • After services are split, dependencies between service calls are introduced.

Service redundancy

  • Remove single points of failure and support elastic scaling and failover of services.
  • For stateful services, redundancy of stateful services can lead to increased complexity.
    • When one of them scales flexibly, data needs to be replicated or re-sharded, and data needs to be migrated to other machines when it is migrated.

Rate limiting and degradation

  • When traffic exceeds the system’s carrying capacity, the only options are to limit traffic or degrade functionality.
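A common way to implement the rate limiting mentioned above is a token bucket: tokens refill at a fixed rate, each admitted request spends one, and requests that find the bucket empty are rejected or degraded. A minimal single-process sketch; the rate and capacity values are arbitrary:

```python
import time

class TokenBucket:
    """Admit requests at roughly `rate` per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity):
        self.rate, self.capacity = rate, capacity
        self.tokens, self.last = capacity, time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens for the time elapsed, capped at the bucket size.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False   # caller degrades: serve cached data, drop the request, ...

bucket = TokenBucket(rate=1, capacity=5)
accepted = sum(bucket.allow() for _ in range(10))  # a burst of 10 requests
```

Of the burst of 10, only the bucket’s 5 stored tokens are spent; the rest are turned away, which is exactly the “limit traffic to protect critical services” behavior described above.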

Highly available architecture

  • Chiefly to avoid single points of failure.

High availability o&M

  • CI (continuous integration) / CD (continuous deployment) in DevOps.
  • There should be a smooth software release pipeline that includes adequate automated testing, gray (canary) releases, and automated control of online systems.

Reprinted from Tencent WeTest: www.cnblogs.com/wetest/p/68… (source: cnblogs)
