High concurrency is when a large volume of traffic (usually users) accesses the program's interfaces, pages, and other resources at the same time. Handling high concurrency means keeping the program stable when traffic peaks.

We usually use QPS (queries per second, also called requests per second) to measure the overall performance of the program. The higher the number, the better; the figure is usually obtained through load testing (for example, with the Apache Bench tool, ab).

Suppose one of our processes (or threads, or coroutines) takes 50 milliseconds to process a request (the industry norm is generally 20 to 60 milliseconds); then it can process 20 requests per second. A server can run many such processes in parallel, say 128, so the theoretical QPS of this machine is 128 × 20 = 2560.

Don't underestimate this number. A QPS that high corresponds to roughly 2560 × 200 = 512,000 daily active users, using the common rule of thumb of a 200× multiplier.
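The arithmetic above can be sketched in a few lines (the variable names are mine; the 200× daily-active-user multiplier is the rule of thumb the text mentions):

```python
# Theoretical QPS for one machine, following the reasoning above.
latency_ms = 50                       # time to handle one request
workers = 128                         # parallel processes/threads/coroutines
qps_per_worker = 1000 / latency_ms    # 20 requests per second per worker
theoretical_qps = int(workers * qps_per_worker)  # 2560

# Rough daily-active-user estimate using the 200x rule of thumb.
dau_estimate = theoretical_qps * 200  # 512,000

print(theoretical_qps, dau_estimate)
```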

The maximum QPS a server can achieve is influenced by many factors: machine configuration, server room location, CPU performance, memory size, disk performance, bandwidth, programming language, database performance, program architecture, and so on. Let's go through them in detail.

1. Configure machine parameters

This is easy to understand: for example, if the server can run a maximum of 128 processes but you have configured a limit of only 100, raising that limit is server tuning.

2. Location of the server room

If you serve overseas users, choose a server room abroad; otherwise, choose a domestic one. The closer the server room is to the user, the lower the transmission latency.

3. CPU performance

The better the CPU performance, the faster each request is processed; and the more cores, the more processes can run in parallel.

4. Memory size

The larger the memory, the more data the program can keep directly in memory, and reading data from memory is much faster than reading it from disk.

5. Disk performance

This goes without saying. Solid-state drives perform much better than mechanical drives, and the better the performance, the faster you can read and write data.

6. Bandwidth

Generally, server bandwidth refers to outgoing bandwidth, measured in Mb/s. For example, a bandwidth of 8 Mb/s is 1 MB/s. If the server provides file downloads, a single user's download may use up its entire bandwidth. Static resources such as images, videos, CSS files, and JavaScript scripts are therefore usually served from a third-party CDN and billed by traffic, so they do not consume the server's bandwidth.

If the user base is small, one server is basically enough; in this case, a fixed bandwidth plan is generally chosen.

If the number of users is large, a load balancer distributes traffic to different servers according to certain rules. Load balancers are generally billed by traffic.

If the average data size returned by a request is 50 KB, the peak bandwidth required to reach 1000 QPS is 1000 × 50 × 8 / 1024 ≈ 390.625 Mb/s.
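The bandwidth calculation above, as a runnable snippet (variable names are mine):

```python
avg_response_kb = 50   # average data size returned per request, in KB
target_qps = 1000

# KB -> Kb (multiply by 8), then Kb/s -> Mb/s (divide by 1024).
peak_mbps = target_qps * avg_response_kb * 8 / 1024

print(peak_mbps)  # 390.625
```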

The size of returned data can also be reduced: for example, a field name like user_id can be shortened to uid, and files such as images, videos, and CSS can be compressed.

7. Programming language

Compiled languages generally perform better than interpreted languages; Go, for example, performs better than PHP. When the language cannot be replaced in the short term, high concurrency can be handled by adding more machines.

8. Database performance

A database deployed on a single server always has bottlenecks, such as queries per second and writes per second. We can relieve the query bottleneck (SELECT statements) by adding many slave (replica) libraries, which is called the multi-slave model. Note that master-slave replication may lag; when data must be queried immediately after modification, the read must be forced to go to the master library.
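A minimal sketch of the read/write routing this describes, assuming hypothetical connection objects (here just strings) and my own class and method names:

```python
import random

class RoutedDB:
    """Toy master/slave router: writes go to the master, reads go to a
    random slave, and force_master lets a caller read from the master
    right after a write, before replication has caught up."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = slaves

    def pick(self, sql, force_master=False):
        is_write = sql.lstrip().split()[0].upper() in ("INSERT", "UPDATE", "DELETE")
        if is_write or force_master or not self.slaves:
            return self.master
        return random.choice(self.slaves)

db = RoutedDB("master", ["slave1", "slave2"])
print(db.pick("SELECT * FROM users"))               # one of the slaves
print(db.pick("UPDATE users SET name = 'x'"))       # master
print(db.pick("SELECT balance", force_master=True)) # master (read-after-write)
```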

We can also split the business so that some tables are stored on one database instance and others on other instances. Each instance still has its own bottleneck, but many instances stacked together greatly improve overall performance. This multi-instance solution is called the multi-master model, and its main purpose is to relieve write bottlenecks (INSERT, UPDATE, and DELETE statements).

If you have multiple master libraries and multiple slave libraries, you are implementing a multi-master, multi-slave model.

If a table stores a very large amount of data, consider sharding it (usually implemented with middleware), for example by time or by user. When keeping all of a table's shards on one database instance can no longer meet the requirements, move some shards to a new database instance, so that one table's data is distributed across different instances. This is called a distributed database scheme, and it forces you to deal with complicated issues such as distributed transactions.
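Sharding by user can be sketched as a simple routing function (the function and instance names are mine; real middleware also handles rebalancing when instances are added):

```python
def shard_for_user(uid, instances):
    """Route a user's rows to a database instance by user id.
    Sharding by time would instead key on e.g. the record's month."""
    return instances[uid % len(instances)]

instances = ["db0", "db1", "db2"]
print(shard_for_user(7, instances))  # db1, since 7 % 3 == 1
```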

The number of concurrent connections a database accepts is also limited. We can handle this with connection pooling: maintain a fixed number of long-lived connections to the database; when a connection is needed, take one from the pool and put it back when done. This, too, is generally implemented with middleware.

Good indexes can also improve database performance, sometimes more than adding more slave libraries.

If we can reduce reads and writes to the database, we indirectly improve its performance, for example by using Redis as a cache and by using message queues to write to the database asynchronously.
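The caching side of this is usually the cache-aside pattern; here is a minimal sketch where a plain dict stands in for Redis and the function names are mine:

```python
cache = {}  # stand-in for Redis

def slow_db_query(key):
    # Placeholder for an expensive SELECT against the database.
    return f"row-for-{key}"

def get_with_cache(key):
    """Cache-aside read: try the cache first, fall back to the database,
    then populate the cache so the next read skips the database."""
    if key in cache:
        return cache[key]
    value = slow_db_query(key)
    cache[key] = value
    return value

print(get_with_cache("user:1"))  # first call hits the DB and fills the cache
print(get_with_cache("user:1"))  # second call is served from the cache
```

A real implementation would also set an expiry on each cache entry and invalidate it when the underlying row changes.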

Sometimes certain results take the database a long time to compute, so the program can fetch the metadata (the smallest-granularity data) and do the computation itself, which is called trading memory for time.

9. Program architecture

For example, if a beginner writes a program that needs to loop 100 times while a senior programmer accomplishes the same thing in 10 iterations, the results will not be the same.

Conclusion

Generally, large projects separate the front end from the back end, which, performance-wise, moves page rendering to the client and reduces the load on the server. Bandwidth-wise, CSS, images, videos, JavaScript, and other file resources that can be served from a CDN should be; whatever can be compressed should be compressed; and interfaces should return as little data as possible.

To work around programming-language shortcomings or single-server bottlenecks, adding more machines is the first resort.

On the database side, it is easy to optimize performance with indexes, multi-master and multi-slave replication, distributed databases, caches, connection pooling, message queues, and so on.

Sometimes low coupling matters more than high performance; do not blindly pursue high performance.
