Table of contents

(1) Monolithic architecture

(2) Initial high availability architecture

(3) Pressure estimation for tens of millions of users

(4) Server pressure estimation

(5) Vertical business split

(6) Use a distributed cache to absorb read requests

(7) Read/write separation based on a master/slave database architecture

(8) Summary

This article follows the development of a large website and explores, step by step, how its architecture evolves from a monolith to a distributed system, and then to an architecture that can handle high concurrency.

1. Monolithic architecture

Generally, when a website has just been launched, it has very few users, perhaps tens or hundreds of thousands in total, with only a few hundred or a few thousand active users per day.

At this stage the website is usually built as a monolith and deployed on a total of three servers: one application server, one database server, and one image server.

The development team is usually no more than 10 people. Everyone writes code in a single application, merges it, and releases it to the application server, most likely by manually stopping Tomcat on the application server, replacing the system's WAR package, and restarting Tomcat.

The database is generally deployed on an independent server to store all the core data of the website.

NFS is deployed on a separate server as the image server, holding all the images of the site. The code on the application server connects to and manipulates both the database and the image server. As shown below:
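To make this concrete, here is a minimal sketch of the kind of code that lives in such a monolith, assuming a plain JDBC connection to the database and the image server's NFS export mounted as a local directory on the application server; the class, table, and path names are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

// Hypothetical slice of the monolith: one class talks directly to the
// database server and to the NFS-mounted image server.
public class ProductService {

    // The image server's NFS export, mounted locally on the application server.
    private static final Path IMAGE_DIR = Path.of("/mnt/nfs/images");

    public void saveProduct(String name, byte[] imageBytes) throws Exception {
        // 1. Store the picture on the image server via the NFS mount.
        Path imagePath = IMAGE_DIR.resolve(name + ".jpg");
        Files.write(imagePath, imageBytes);

        // 2. Store the core data on the database server.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://db-server:3306/shop", "app", "secret");
             PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO product (name, image_path) VALUES (?, ?)")) {
            ps.setString(1, name);
            ps.setString(2, imagePath.toString());
            ps.executeUpdate();
        }
    }
}
```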

2. Initial high availability architecture

This purely monolithic architecture, however, has availability problems: the biggest risk is that the application server or the database may fail.

So at this stage, a company with a slightly larger budget will typically put an initial high availability architecture in place.

Application servers are typically clustered. While the user base is still small, this "cluster" usually means just two application servers, plus one more server running a load balancer such as LVS that distributes user requests evenly across the two application servers.

If one application server fails, the other is still available, avoiding a single point of failure. As shown below:

For the database, a master/slave architecture is generally adopted at this point: a slave library is deployed to replicate data from the master library. If the master library has problems, the slave library can quickly take over and continue to provide database service, so a database failure does not bring down the whole system. The diagram below:

3. Pressure estimation for tens of millions of users

Assume the site has 10 million registered users. According to the 80/20 rule, about 20% of them visit every day, which means roughly 2 million daily visitors.

It is generally assumed that the average user clicks about 30 pages per visit, which gives a total of 60 million page views (PV) per day.

Applying the 80/20 rule to the 24-hour day, most user activity is concentrated in (24 hours * 0.2) ≈ 5 hours, and most of the clicks fall in that window (60 million clicks * 0.8 ≈ 50 million clicks).

That’s 50 million hits in five hours.

That averages out to around 3,000 requests per second during the five-hour active period, and within that period there are peaks where users are even more concentrated.

For example, a large influx of users within a concentrated half hour forms a traffic peak. Based on production experience, peak traffic is typically two to three times the average active-period traffic. Using a factor of three, there may be brief spikes of roughly 10,000 requests per second within those five hours.
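As a sanity check, the arithmetic above can be restated in a few lines. The numbers below are simply the article's assumptions (10 million users, the 80/20 rule, 30 page views per visit, a 3x peak factor), not measurements.

```java
// Back-of-the-envelope traffic estimate using the assumptions above.
public class TrafficEstimate {
    public static void main(String[] args) {
        long totalUsers      = 10_000_000;
        long dailyVisitors   = (long) (totalUsers * 0.2);        // 80/20 rule: 2 million/day
        long dailyPageViews  = dailyVisitors * 30;                // 60 million PV/day
        long activeWindowSec = (long) (24 * 0.2 * 3600);          // ~5 active hours
        long activePageViews = (long) (dailyPageViews * 0.8);     // ~48 million PV in that window
        long avgRps          = activePageViews / activeWindowSec; // ≈ 2,777, rounded to ~3,000
        long peakRps         = avgRps * 3;                        // ≈ 8,300, rounded to ~10,000

        System.out.printf("average: ~%d req/s, peak: ~%d req/s%n", avgRps, peakRps);
    }
}
```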

4. Server pressure estimation

With peak traffic estimated at around 10,000 requests per second, let's look at the pressure on each server in the system.

Generally speaking, an application server deployed on a virtual machine and running Tomcat can support at most a few hundred requests per second.

Assuming 500 requests per second per server, 20 application servers would be needed to support a peak of 10,000 requests per second.

Moreover, the load on the database is several times higher, because if the application servers receive 10,000 requests per second, processing each request may involve an average of three to five database accesses.

Based on three database accesses, that’s 30,000 requests to the database per second.

If one database server supports at most 5,000 requests per second, six database servers are required to handle 30,000 requests per second.
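The server counts follow the same pattern; the per-server capacities below (500 requests per second per Tomcat, three database accesses per request, 5,000 requests per second per database server) are the article's working assumptions.

```java
// Rough server-count estimate for a peak of ~10,000 requests per second.
public class CapacityEstimate {
    public static void main(String[] args) {
        int peakRps         = 10_000;
        int rpsPerAppServer = 500;
        int dbCallsPerReq   = 3;
        int rpsPerDbServer  = 5_000;

        int appServers = peakRps / rpsPerAppServer;  // 20 application servers
        int dbRps      = peakRps * dbCallsPerReq;    // 30,000 database requests/s
        int dbServers  = dbRps / rpsPerDbServer;     // 6 database servers

        System.out.printf("app servers: %d, db req/s: %d, db servers: %d%n",
                appServers, dbRps, dbServers);
    }
}
```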

The image server will also be under heavy pressure, because pages display large numbers of images that must be read. This is hard to estimate precisely, but it will likely be at least several thousand requests per second, so multiple image servers are needed to handle image requests.

5. Vertical business split

Usually the first thing to do at this stage is a vertical split of the business.

Because if all the business code is mixed and deployed together, it becomes difficult to maintain when multiple people collaborate. When the website reaches tens of millions of users, the R&D team usually has dozens or even hundreds of people.

Developing in one monolithic system therefore becomes very painful. What is needed at this point is a vertical split of the business: the monolith is divided into multiple business systems, and a small team of about 10 people takes responsibility for maintaining each business system. The following figure:

6. Use a distributed cache to absorb read requests

At this point the application server tier is generally not a big problem, because simply adding machines lets it handle more concurrent requests.

With peak traffic estimated at around 10,000 requests per second, deploying 20 or 30 application servers is not a problem.

However, the biggest pressure in this architecture is actually at the database level, since the database is estimated to receive about 30,000 read and write requests per second at peak times.

At this point a distributed cache needs to be introduced to absorb the database read pressure, which means introducing a Redis cluster.

Generally speaking, database traffic also follows the 80/20 rule, so of the 30,000 read and write requests per second, about 24,000 are reads.

Roughly 90% of these reads can be served by the distributed cache cluster, which means the Redis cluster can absorb about 20,000 read requests per second.

We can keep a copy of hot, frequently accessed data in the Redis cluster as a cache and serve reads from it.

When reading data, the cache is checked first; on a miss, the data is read from the database. As a result, about 20,000 reads per second go to Redis, and the remaining 10,000 reads and writes still go to the database.
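A minimal sketch of this cache-aside read path is shown below, assuming the Jedis client for Redis and plain JDBC for the database; the host names, key format, and 5-minute TTL are illustrative choices, not requirements.

```java
import redis.clients.jedis.Jedis;
import java.sql.*;

// Cache-aside read: try Redis first, fall back to the database on a miss,
// then populate the cache so later reads for the same hot data stay in Redis.
public class ProductReader {

    private final Jedis redis = new Jedis("redis-node-1", 6379);

    public String getProductName(long productId) throws SQLException {
        String cacheKey = "product:name:" + productId;

        // 1. Most hot reads are absorbed here and never reach the database.
        String cached = redis.get(cacheKey);
        if (cached != null) {
            return cached;
        }

        // 2. Cache miss: read from the database.
        String name = null;
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://db-master:3306/shop", "app", "secret");
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT name FROM product WHERE id = ?")) {
            ps.setLong(1, productId);
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    name = rs.getString(1);
                }
            }
        }

        // 3. Populate the cache with a TTL so the hot set stays fresh in Redis.
        if (name != null) {
            redis.setex(cacheKey, 300, name);
        }
        return name;
    }
}
```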

A single Redis server can generally handle tens of thousands of requests per second, so a Redis cluster deployed on 3 machines can absorb 20,000 read requests per second without difficulty. As shown below:

7. Read/write separation based on a master/slave database architecture

At this point the database still receives 10,000 requests per second, which is too much pressure for a single server.

However, databases generally support a master-slave architecture, where one slave is always synchronizing data from the master. At this point, read/write separation can be done based on the master/slave architecture.
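One simple way to do this at the application layer is to hand out different connections for reads and writes. The sketch below assumes two JDBC URLs with illustrative host names; in practice this routing is often delegated to a framework or database middleware rather than hand-rolled.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Minimal read/write split: writes go to the master library, reads go to the
// slave library that replicates from it. Host names are illustrative.
public class RoutingConnectionFactory {

    private static final String MASTER_URL = "jdbc:mysql://db-master:3306/shop";
    private static final String SLAVE_URL  = "jdbc:mysql://db-slave:3306/shop";

    /** Connection for INSERT/UPDATE/DELETE: always the master library. */
    public Connection writeConnection() throws SQLException {
        return DriverManager.getConnection(MASTER_URL, "app", "secret");
    }

    /** Connection for SELECT: the slave library, which may lag slightly. */
    public Connection readConnection() throws SQLException {
        return DriverManager.getConnection(SLAVE_URL, "app", "secret");
    }
}
```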

Concretely, of the 10,000 requests per second still hitting the database, about 6,000 writes go to the master library and about 4,000 reads go to the slave library, splitting the load between the two servers.

After this split, 6,000 writes per second on the master and 4,000 reads per second on the slave are both within what a single server can handle. The diagram below:

8. Summary

This article has discussed high concurrency architecture design for a large website with tens of millions of users: estimating the access pressure such a user base generates, and then designing the business systems, cache, and database tiers of the back-end so that they can withstand that concurrency.

But bear in mind that the technologies involved in large-scale website architecture go far beyond these: MQ, CDN, static page generation, database and table sharding, NoSQL, search, distributed file systems, reverse proxies, and many other topics are not covered here. This article has focused only on the high concurrency angle, analysing how the system withstands roughly 10,000 requests per second.