Distributed system

Before we look at distributed architecture, let’s look at distributed systems. According to Wikipedia, a distributed system is a group of computers that coordinate their behavior by passing messages over a network; its components interact with one another to achieve a common goal. In practice, this means splitting a computation-heavy task into small pieces, having multiple computers work on the pieces separately, and then collecting and merging the partial results into a final answer.

In other words, it is an architecture in which multiple components call on and cooperate with one another in order to deliver a service to the outside world.

So what are the benefits of such a system? Why do we need such a system?

The predecessor of distributed systems

The predecessor of the distributed system is the centralized system. It has a single central node, which may consist of one or more machines, and all data storage and computation happen on that host. Centralized systems do have advantages: simple deployment, high reliability, strong data consistency, and so on.

As customers and transactions keep growing, a centralized system built on a mainframe comes under increasing performance pressure. The first response is usually to upgrade the hardware configuration: add memory, expand the disks, upgrade the CPU, and so on. This kind of scaling is called scale-up (vertical scaling).

But hardware cannot be scaled up indefinitely: beyond a certain point the gains become marginal while the price keeps climbing. As computing evolved, this kind of architecture found it harder and harder to meet people’s needs. For example:

  1. Mainframes are complex, so training someone who can operate and maintain them skillfully is very expensive
  2. Mainframes themselves are expensive, and usually only deep-pocketed organizations (government, finance, telecom) can afford them
  3. Single point of failure: when the mainframe goes down, the entire system becomes unusable, and for mainframe users the losses caused by such downtime are huge
  4. Technology keeps advancing: PC performance continues to improve, and many enterprises have abandoned the mainframe in favor of minicomputers and ordinary PCs for building their system architecture

The significance of distributed architecture

  1. Upgrading a single machine’s processing capacity gives less and less value for money
    • A single machine’s processing capacity depends mainly on its CPU, memory, and disk, and swapping in better hardware to scale vertically becomes increasingly expensive.
  2. Single-node processing capacity has a hard ceiling
    • CPU and memory each have their own performance limits; in other words, even if money is no object, the pace of hardware progress and the performance it can deliver are still bounded.
  3. Stability and availability are hard to achieve on a single node
    • A standalone system struggles with availability and stability, and these are exactly the two metrics we need to address.

Evolution of distributed architecture

A mature, large-scale application architecture is not designed perfectly from day one, nor does it start out with high performance, high availability, security, and similar qualities; it improves gradually as business data volume grows. Along the way, the development model and the technical architecture go through fairly large changes. Systems for different businesses also have different focuses: an e-commerce platform has to solve massive-scale product search, ordering, and payment; communication software needs to handle real-time messaging for hundreds of millions of users; a search engine has to solve search over massive amounts of data. Each kind of business ends up with its own architecture.

Here we use a Java web e-commerce application to simulate the evolution of an architecture. The simulation focuses on how data and structure change as traffic grows, not on specific business features.

Suppose our system has the following functions:

  • User functions: user registration and management.
  • Product functions: product display and management.
  • Trading functions: order creation and payment settlement.

Single application architecture

In the early days of the web, we often deployed our programs and software on a single machine.

Keeping the entire application on one host makes for a simple system.

Separating the application service and the database

Once the website goes live and traffic gradually increases, the load on the server keeps climbing. When code-level optimization can no longer keep up, adding machines is a better option than upgrading the hardware of a single box, because the input-output ratio is much higher. At this stage the database is moved onto its own server, separate from the application service, which both raises the load each machine can handle and improves disaster-recovery capability.

Application Server Cluster

As traffic continues to grow, a single application server is no longer enough. Assuming the database has not yet become a bottleneck, we can keep adding machines as in the previous stage: add application servers and distribute user requests across them to increase load capacity. At this point the application servers do not interact with one another directly.

At this stage of the architecture, various issues begin to surface:

  • Which application server should a user request be forwarded to, and which component is responsible for the forwarding? (A minimal round-robin sketch follows this list.)
  • How are sessions maintained when a user’s requests land on different application servers?
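To make the first question concrete, here is a minimal round-robin sketch in Java. It only illustrates the idea of spreading requests across servers; the addresses are made up, and in practice the forwarding is done by a dedicated load balancer such as Nginx or HAProxy rather than hand-written code.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal round-robin selection: each incoming request is handed to the
// next application server in the list.
public class RoundRobinBalancer {
    private final List<String> servers;                       // hypothetical server addresses
    private final AtomicInteger counter = new AtomicInteger(0);

    public RoundRobinBalancer(List<String> servers) {
        this.servers = servers;
    }

    // Pick the server that should handle the next request.
    public String nextServer() {
        int index = Math.floorMod(counter.getAndIncrement(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args) {
        RoundRobinBalancer balancer =
                new RoundRobinBalancer(List.of("10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"));
        for (int i = 0; i < 5; i++) {
            System.out.println("request " + i + " -> " + balancer.nextServer());
        }
    }
}
```

The second question (session consistency) is usually answered by sticky sessions on the load balancer or by storing sessions in a shared store rather than in each server’s memory.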

Read-write separation

Clustering the application servers pulls up the performance of the application layer, but as business volume grows the database becomes the bottleneck. How can we increase capacity at the database level? With the experience above, the natural instinct is to add more database servers. However, if we simply add servers and load-balance requests across them, write requests will also land on different databases and the data will inevitably become inconsistent. So in this situation, read-write separation is usually introduced first: writes go to a master database and reads go to replicas (a minimal routing sketch follows the list below). Most of the problems encountered at this stage are database problems:

  • How to keep the master and slave databases in sync: MySQL’s built-in master-slave replication is used here.
  • How to select the right data source for each query: through third-party middleware such as MyCat or ShardingJDBC.
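As a rough illustration of data-source selection, here is a toy read/write router in Java. It is only a sketch of the idea; real projects usually leave this to middleware such as MyCat or ShardingJDBC, as noted above.

```java
import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.SQLException;

// Toy read/write routing: writes always go to the master DataSource,
// reads go to a slave (replica) kept in sync by master-slave replication.
public class ReadWriteRouter {
    private final DataSource master;  // handles INSERT / UPDATE / DELETE
    private final DataSource slave;   // handles SELECT

    public ReadWriteRouter(DataSource master, DataSource slave) {
        this.master = master;
        this.slave = slave;
    }

    public Connection getConnection(boolean isWrite) throws SQLException {
        // Note: replication lag means a read issued right after a write may
        // not see the new data; such reads can be forced to the master.
        return isWrite ? master.getConnection() : slave.getConnection();
    }
}
```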

Introduce caching mechanism to reduce database stress

As the number of visits increases, more and more users request the same content. If every request for this hot data goes to the database, it puts great pressure on the read library. This is where a caching mechanism, such as Redis, can be introduced as the application’s cache: on a read request, check the cache first, and only query the database if the data is not there.
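A minimal cache-aside sketch of this read path, assuming the Jedis client for Redis; findProductInDb() is a hypothetical placeholder for the real database query:

```java
import redis.clients.jedis.Jedis;

// Cache-aside read path: check Redis first, fall back to the database on a
// miss, then populate the cache with a TTL so hot data stops hitting the DB.
public class ProductCache {
    private final Jedis jedis = new Jedis("localhost", 6379);

    public String getProduct(String productId) {
        String cacheKey = "product:" + productId;
        String cached = jedis.get(cacheKey);
        if (cached != null) {
            return cached;                           // cache hit, no database access
        }
        String fromDb = findProductInDb(productId);  // cache miss, query the read library
        if (fromDb != null) {
            jedis.setex(cacheKey, 300, fromDb);      // cache for 5 minutes
        }
        return fromDb;
    }

    private String findProductInDb(String productId) {
        // placeholder for the real database query
        return "{\"id\":\"" + productId + "\",\"name\":\"demo\"}";
    }
}
```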

Use search engines to relieve database stress

A relational read library is not very efficient at fuzzy (keyword) search. For an e-commerce site, search is a core function, and even read-write separation does not solve this problem effectively. This is where a search engine comes in: it greatly improves query speed, but it also brings extra work, such as building and maintaining the index.
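As a rough sketch of what the query path might look like, the snippet below sends a keyword search to Elasticsearch over its REST API. The node address, index name, and field are assumptions for illustration only:

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Keyword search goes to the search engine instead of a slow
// "LIKE '%keyword%'" query against the read library.
public class ProductSearch {
    private final HttpClient client = HttpClient.newHttpClient();

    public String search(String keyword) throws Exception {
        String q = URLEncoder.encode("name:" + keyword, StandardCharsets.UTF_8);
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9200/products/_search?q=" + q))
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body(); // JSON hits; the index must be kept in sync with the database
    }
}
```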

Horizontal/vertical split of the database

At this point, user, product, and transaction data still live in the same database, even though caching and read-write separation are in place. As the data volume keeps increasing, the database remains a bottleneck, so the next step is to split it:

  • Vertical split: move different business domains (users, products, transactions) into different databases.
  • Horizontal split: spread the rows of a single table across several tables. Horizontal splitting is used when one business’s data volume has hit the limits of a single database; a minimal routing sketch follows this list.
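A toy example of horizontal-split routing: order rows are distributed across a fixed number of tables by user ID, so the application (or sharding middleware) can work out which table to query. The table names and count are made up for illustration:

```java
// Toy horizontal split: order records are spread across N tables
// (order_0 .. order_3) by hashing the user ID, so all orders of one user
// land in the same table.
public class OrderShardRouter {
    private static final int TABLE_COUNT = 4;

    public static String tableFor(long userId) {
        long index = Math.floorMod(userId, (long) TABLE_COUNT);
        return "order_" + index;
    }

    public static void main(String[] args) {
        System.out.println(tableFor(10001L)); // order_1
        System.out.println(tableFor(10002L)); // order_2
    }
}
```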

Splitting the application

As the business grows and becomes more complex, the pressure on the application also grows. At this point we can consider splitting the application into multiple subsystems by function. (Although users and products appear as separate parts in the figure above, that was only to show the app’s functions; up to this point there was really just one application.)

At this point we have arrived at a microservices-style architecture, but it brings a whole set of new problems:

  • How do the services communicate with each other after the split? (A minimal HTTP-call sketch follows this list.)
  • If multiple nodes are deployed for the same service, how can I ensure node configuration consistency?
  • After the split, a single request may pass through both node A and node B, which makes the logs inconvenient to trace.
  • When multiple nodes run scheduled service tasks at the same time, how can I ensure that the tasks are not repeatedly executed or missed?
  • Distributed consistency issues.
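As a tiny illustration of the first question, one common (though not the only) answer is plain HTTP calls between services. The service name and endpoint below are hypothetical; real systems usually add service discovery, retries, and so on, which the follow-up articles can cover:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// After the split, the trading service fetches user data from the user
// service over HTTP/JSON instead of reading the user tables directly.
public class UserServiceClient {
    private final HttpClient client = HttpClient.newHttpClient();

    public String getUser(long userId) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://user-service:8080/users/" + userId)) // hypothetical address
                .GET()
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```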

These questions will be addressed in a follow-up article.

Conclusion

We have transformed a centralized architecture into a distributed one. As the business evolves, a centralized architecture becomes slower and more expensive to upgrade, and its stability and availability are difficult to guarantee. We wanted a high-performance, highly available architecture, so we moved to a distributed architecture.