On December 21, 2017, Zhai Yongdong, a senior architect at Cloud Technology, delivered a speech titled "How to Achieve High Performance and High Availability in Cloud Architecture" at the event "Building Enterprise Architecture in the Cloud Era". As the exclusive video partner, IT DaKaShuo (WeChat ID: itdakashuo) is authorized to release the video with the approval of the host and the speaker.

Word count: 2,851 | Reading time: 8 minutes

Speech video and PPT replay:
suo.im/4sKQd8


Abstract

Building an architecture on the cloud requires attention to many factors. This talk focuses on two of them, high availability and high performance, and analyzes in depth how to build a solid cloud architecture from these two angles.

Overview of architecture on the cloud

When building an architecture on the cloud, you should consider not only performance and availability but also security, manageability, and resiliency. In practice, every one of these aspects must be taken into account.

Compared with architecture design on the cloud, traditional architecture design has a relatively long cycle. A typical enterprise architecture plans for the next three to five years and mainly solves the problem of going from nothing to something, from 0 to 1. The design cycle of an architecture on the cloud is relatively short: the requirements are clear, and the focus is on solving or optimizing existing problems.

High-performance architecture on the cloud

What is performance

Performance is difficult to measure. In the narrow sense, performance refers to speed of operation; in the broad sense, it covers more, such as power consumption, utilization, price-performance ratio, and speed. Different perspectives care about different aspects of performance. From the user's perspective, what matters is the whole interval between sending a request and receiving a response: the longer the time, the worse the perceived performance. From the architect's and developer's perspective, the focus is on response latency, system throughput, and concurrent processing capacity, in order to understand the root causes behind user feedback.

Basic steps for high-performance architecture design

There are four basic steps to building a high-performance architecture. First, define the performance goal; second, analyze every problem in the system that stands in the way of that goal; third, identify and address those problems; finally, verify the current performance metrics through a performance evaluation. If the evaluation result differs from the stated performance goal, some problems affecting performance have not yet been found, and you need to restart the earlier steps.

The whole process is actually a cycle. Even if a single evaluation meets the goal, new performance requirements will emerge as the business evolves over time.
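The measure-analyze-fix-evaluate cycle above can be sketched as a loop. This is a minimal illustration, not a real tuning API; `measure`, `fixes`, and the latency numbers are all assumptions for demonstration.

```python
def tune(measure, target_ms, fixes):
    """Apply candidate fixes one by one until the measured latency meets the target.

    `measure` returns the current latency in ms; `fixes` is a list of
    zero-argument callables, each addressing one suspected bottleneck.
    """
    for apply_fix in fixes:
        latency = measure()
        if latency <= target_ms:
            return latency  # goal met, stop tuning
        apply_fix()         # address the next suspected problem
    return measure()        # re-evaluate after all fixes are applied

# Toy demonstration: each "fix" shaves latency off a simulated system.
state = {"latency": 2500}
fixes = [lambda: state.update(latency=state["latency"] - 900) for _ in range(3)]
final = tune(lambda: state["latency"], target_ms=1000, fixes=fixes)
assert final <= 1000
```

If the loop exhausts all known fixes without reaching the target, that corresponds to the case in the text where the evaluation misses the goal and the analysis step must be restarted.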

Further analysis

A performance target is a concrete performance indicator, such as a page response time under 1 second, support for 10,000 concurrent users, or peak processing of 10,000 user requests per second.

Then, based on the performance goals, analyze the problems that affect these indicators at each layer of the current service system: bandwidth and latency at the network layer, CPU processing capacity at the computing layer, whether or not services are clustered, and other factors. System performance is determined by the processing capacity of the whole chain, not by any single factor.
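The point that end-to-end performance is bounded by the whole chain can be made concrete: for a request that must pass through every layer, sustained throughput is limited by the slowest layer. The layer names and numbers below are purely illustrative.

```python
# Requests/second each layer can sustain (illustrative figures only).
layers = {
    "network": 5000,
    "load_balancer": 8000,
    "app_servers": 3000,
    "database": 2000,
}

# A request traverses every layer, so the chain as a whole can sustain
# no more than the weakest layer's throughput.
system_throughput = min(layers.values())
bottleneck = min(layers, key=layers.get)
assert system_throughput == 2000 and bottleneck == "database"
```

This is why upgrading one fast layer often changes nothing: the fix must target the current bottleneck.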

After analyzing the problems, you can tackle them from two directions. The first, and the one most people think of first, is to upgrade the hardware configuration. If upgrading hardware resources solves the problem, adopt this approach directly; its biggest advantage is that no changes to existing code logic are needed. In most cases, however, hardware upgrades alone cannot solve every problem. You also need to start at the architecture level, reduce the pressure on the servers, and use a scalable architecture to improve performance.

Traditional testing can be done with tools like LoadRunner, while Alibaba Cloud's Performance Testing Service (PTS) can be used on the cloud. The biggest difference between PTS and traditional performance testing is that with LoadRunner you have to build the test environment yourself, and the test system you build is limited by its own ceiling: the number of load-generating servers determines how much test pressure can be simulated. PTS can quickly simulate a large number of concurrent requests because it runs in the cloud, so the PTS back end can generate the concurrency a user needs in a clustered manner.

The diagram above shows what we consider a relatively good architecture. A front-end load balancing service receives user requests, and a front-end cache sits before requests are forwarded to specific back-end servers, improving response time and reducing back-end pressure. The back-end servers respond to user requests as a cluster, and applications interact asynchronously. Requests are served from the cache first, and the database is accessed only on a cache miss.

One particular problem with caching is that the cache can become inconsistent with the data in the database. The solutions differ, and you should choose according to your needs. One approach, for example, is to update or invalidate the cache while writing to the database, so that when users read, they either get the latest data or are forced to re-read it from the database.
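The invalidate-on-write approach described above can be sketched in a few lines. This is a minimal cache-aside illustration with in-memory dicts standing in for the cache and database; it is not tied to any specific cache product.

```python
db = {}     # stands in for the relational database
cache = {}  # stands in for a cache layer in front of it

def write(key, value):
    db[key] = value
    cache.pop(key, None)   # invalidate the entry so no stale copy survives the write

def read(key):
    if key in cache:       # cache hit: fast path, database untouched
        return cache[key]
    value = db[key]        # cache miss: fall back to the database
    cache[key] = value     # repopulate so subsequent reads hit the cache
    return value

write("user:1", "alice")
assert read("user:1") == "alice"   # miss, loads from the database
write("user:1", "bob")             # write invalidates the cached entry
assert read("user:1") == "bob"     # the next read is guaranteed fresh
```

Invalidating rather than updating the cache on write is a common choice because it avoids racing two writers to decide whose value ends up cached.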

A customer’s high-performance architecture on Alibaba Cloud


The picture above is the on-cloud architecture of one of our customers. Front-end user requests are answered by the CDN service, which is mainly used for acceleration. Requests the CDN can satisfy are served directly; the rest are forwarded to the back-end SLB (Server Load Balancer).

As the figure shows, different applications use different numbers of servers. Here, all services are deployed on ECS instances mounted behind the SLB. In addition, there is an OCS data cache: data requested by users is read from the database only when it cannot be retrieved from the cache.

The database design is also quite involved. First, it implements read/write splitting. Second, there is DRDS, a distributed relational database service that can mount multiple RDS instances.

Several points about DRDS: first, DRDS must be used together with RDS, because DRDS itself does not store data; the data lives on RDS. Second, the RDS instances behind DRDS must be MySQL databases. Third, DRDS can be used in two ways, with table sharding or without. If a table is not sharded, DRDS stores the whole table on a single RDS instance.
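When a table is sharded, a middle layer like DRDS routes each row to one of the backing RDS instances by a shard key. The sketch below shows the general hash-routing idea only; the hashing scheme and instance names are assumptions for illustration, not how DRDS actually routes internally.

```python
import hashlib

# Hypothetical pool of backing MySQL (RDS) instances.
RDS_INSTANCES = ["rds-0", "rds-1", "rds-2"]

def route(shard_key: str) -> str:
    """Map a shard key (e.g. a user ID) to one backing instance.

    Hashing the key spreads rows evenly and keeps the mapping deterministic:
    the same key always lands on the same instance.
    """
    digest = hashlib.md5(shard_key.encode()).hexdigest()
    return RDS_INSTANCES[int(digest, 16) % len(RDS_INSTANCES)]

assert route("user_42") == route("user_42")   # deterministic routing
assert route("user_42") in RDS_INSTANCES
```

A query that carries the shard key can be sent to a single instance; a query without it must be fanned out to all shards, which is why choosing the shard key well matters.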

High availability of architectures on the cloud

Definition of high availability

High availability literally means reducing downtime and keeping services continuously available. Automatic detection, automatic switchover, and automatic recovery are the basic requirements for high availability.

Automatic detection: monitor operation through redundant detection and record the collected information for maintenance reference.

Automatic switchover: once the peer is confirmed to be faulty, a healthy host takes over from the faulty one.

Automatic recovery: after the faulty host is restored, the system automatically switches back to it.
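The three behaviours above can be modeled together in a small heartbeat-driven state machine. This is a toy illustration; the class, host names, and the three-missed-heartbeats threshold are all assumptions, not a real HA implementation.

```python
class FailoverPair:
    """Primary/standby pair driven by heartbeat observations of the primary."""

    def __init__(self, primary, standby, max_missed=3):
        self.primary, self.standby = primary, standby
        self.active = primary   # host currently serving traffic
        self.missed = 0         # consecutive missed heartbeats from the primary
        self.max_missed = max_missed

    def heartbeat(self, primary_alive: bool):
        if self.active == self.primary:
            self.missed = 0 if primary_alive else self.missed + 1
            if self.missed >= self.max_missed:   # automatic detection
                self.active = self.standby       # automatic switchover
        elif primary_alive:                      # primary is healthy again
            self.active = self.primary           # automatic recovery
            self.missed = 0

pair = FailoverPair("host-a", "host-b")
for _ in range(3):
    pair.heartbeat(primary_alive=False)
assert pair.active == "host-b"   # switched over after 3 missed heartbeats
pair.heartbeat(primary_alive=True)
assert pair.active == "host-a"   # switched back once the primary recovered
```

Requiring several consecutive missed heartbeats before switching is a common guard against transient network blips triggering spurious failovers.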

The premise of highly available design

During HA design, it is advisable to evolve the architecture in a hierarchical and modular way. HA design spans the application layer and the infrastructure layer, and modules are loosely coupled according to their functions, keeping them stable, reliable, easy to extend, and easy to maintain.

Highly available design approach

There are three high-availability design modes: active/standby, where the primary host serves traffic while the standby monitors it and stands ready; dual-machine mutual backup, where two hosts run their own services simultaneously and monitor each other; and cluster mode, where multiple hosts work together, each running one or more services.

High availability architecture design principles

– Design for failure: assume that any link can go wrong, and design backwards from that assumption;

– Multiple availability zone design: avoid single points of failure in the architecture as much as possible;

– Automatic scaling design: meet business growth without changing the design;

– Self-healing design: with built-in fault tolerance and inspection capabilities, the application can heal itself and continue to work when some components fail;

– Loose coupling design: the lower the coupling, the better the extensibility and fault tolerance.

Multiple availability zone design

Bind ECS instances in different availability zones under SLB instances to avoid external services becoming unavailable when a single availability zone fails. Multi-availability-zone RDS cloud databases can implement same-city disaster recovery. Data in OSS storage is stored across multiple availability zones by default.

Health checks and self-healing


If an unhealthy ECS instance causes the number of healthy instances to fall below the configured minimum, elastic scaling automatically creates a healthy ECS instance to replace the unhealthy one.
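The replacement rule can be expressed as a simple reconcile step: drop unhealthy instances, then top the group back up to the minimum. This is purely illustrative; a real elastic-scaling service performs this server-side, and the function and field names here are invented for the sketch.

```python
def reconcile(instances, min_healthy, launch):
    """Remove unhealthy instances and launch replacements up to `min_healthy`."""
    healthy = [i for i in instances if i["healthy"]]
    while len(healthy) < min_healthy:
        healthy.append(launch())   # create a fresh healthy instance
    return healthy

ids = iter(range(100, 200))
launch = lambda: {"id": next(ids), "healthy": True}

group = [{"id": 0, "healthy": True}, {"id": 1, "healthy": False}]
group = reconcile(group, min_healthy=2, launch=launch)
assert len(group) == 2 and all(i["healthy"] for i in group)
```

Running such a reconcile loop continuously is what makes the self-healing automatic: the desired state (at least `min_healthy` healthy instances) is restored without operator action.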

Loosely coupled design

Message-based decoupling splits the original application into independent modules. Because the modules affect each other only minimally, a partial failure does not make the whole system unavailable.
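The decoupling idea can be shown with a minimal producer/consumer sketch: the front-end module only enqueues work and returns immediately, while a worker module drains the queue on its own schedule, so a slow or temporarily failed worker does not block the producer. The function names are illustrative, and Python's in-process `queue.Queue` stands in for a real message-queue service.

```python
from queue import Queue

orders = Queue()  # stands in for a message-queue service between modules

def place_order(order_id):
    """Producer module: enqueue the work and return immediately."""
    orders.put(order_id)

def process_next(handled):
    """Consumer module: take the next message and process it independently."""
    handled.append(orders.get())

place_order(1)
place_order(2)   # producer is done even though nothing has been processed yet

handled = []
process_next(handled)
process_next(handled)
assert handled == [1, 2]   # consumer catches up in its own time, in order
```

The queue also absorbs bursts: the producer can run far ahead of the consumer, which smooths load spikes in addition to isolating failures.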