At 11 a.m. on November 11, 2018, I came home exhausted, ready for a good sleep to shake off the fatigue of working all night. Then I remembered that I had asked my girlfriend to help me grab some deals, and I had no idea how it had gone.


QPS, RT, the number of concurrent users, the optimal number of threads, and so on are all concepts related to how a system handles concurrency. They can be used to measure a system’s availability and other indicators.

RT

Response time (RT) is the time that elapses between the client sending a request and the client receiving the response from the server.

When we call a website “fast” or “slow”, we are really talking about whether its RT is short or long. When we say a website is “stuck”, what we really mean is that its RT is long.

A very long RT seriously hurts the user experience, so RT is a very important indicator, and every website needs to pay attention to optimizing it.

An amusement park example may make this easier to understand. When we go to Disneyland, most attractions require waiting in line.

To let visitors know how long they will have to wait for an attraction, Disney does several things: its app shows the estimated wait time for each attraction, and a small board at each queue displays the estimated wait as well.

However, this time is not made up out of thin air; it is “calculated”.

Disney’s method for calculating wait times:

1. Disney has staff at the entrance and exit of each attraction.
2. The staff at the entrance pick visitors at random and give each a slip of paper recording the time they started queuing.
3. The staff at the entrance remind those visitors to hand the slip to the staff at the exit when they finish.
4. On receiving a slip, the staff at the exit calculate: current time - time the visitor started queuing = wait time.
5. To make the data as accurate as possible, several wait times are usually collected and averaged.

That is how Disney calculates wait times, and it is also how RT is calculated: record the time when a request starts, record the time when it ends, and the difference between the two is the RT.
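
The slip-of-paper method maps directly onto code. Below is a minimal sketch, assuming a hypothetical `fake_request` function that stands in for a real client call; `measure_rt` is likewise an illustrative name, not part of any library. It records a timestamp before and after each call and averages several samples, as in step 5 above.

```python
import time
from statistics import mean

def measure_rt(handler, samples=5):
    """Return the average elapsed time, in seconds, over several calls."""
    durations = []
    for _ in range(samples):
        start = time.perf_counter()   # slip handed out at the entrance
        handler()                     # the request is sent and answered
        end = time.perf_counter()     # slip collected at the exit
        durations.append(end - start)
    return mean(durations)

def fake_request():
    # Hypothetical stand-in for a real request: just waits about 10 ms.
    time.sleep(0.01)

if __name__ == "__main__":
    print(f"average RT: {measure_rt(fake_request) * 1000:.1f} ms")
```

In a real measurement the handler would issue an actual request, and the result would then include network time as well as server processing time.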

The RT of a Disney attraction is made up of several periods: time spent queuing, time listening to the introduction, preparation time, the ride itself, and so on.

A request’s response time is likewise made up of several parts, generally including the time to send the request, network transmission time, and server processing time.

QPS

QPS (queries per second) is the number of queries a system can process per second. For web applications, we usually care about the number of requests the application can process per second. It is an important indicator of system performance, and is sometimes called throughput.
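
As a rough illustration, observed QPS can be estimated by counting how many requests complete in each one-second window. The sketch below assumes a hypothetical list of completion timestamps (`sample_times`); the function names are illustrative. Note that it measures what the system actually handled, which only approaches what it can handle when the system is under full load.

```python
import math
from collections import Counter

def requests_per_second(completion_times):
    """Group completion timestamps (in seconds) into one-second buckets."""
    return Counter(math.floor(t) for t in completion_times)

def peak_qps(completion_times):
    """Return the largest number of requests completed in any one-second bucket."""
    buckets = requests_per_second(completion_times)
    return max(buckets.values(), default=0)

# Hypothetical completion times, in seconds since the measurement started.
sample_times = [0.1, 0.4, 0.9, 1.2, 1.5, 2.3]

if __name__ == "__main__":
    print(dict(requests_per_second(sample_times)))
    print(peak_qps(sample_times))
```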

QPS and RT almost always come as a pair. When we evaluate a Disney attraction, there are several criteria: how fun it is, how long it takes, and how many people it can accommodate.

The number of people an attraction can accommodate at the same time can be loosely thought of as its QPS. To a large extent, that capacity determines how long visitors end up spending there.

Therefore, there is a certain relationship between QPS and RT:

RT = Concurrency / QPS
QPS = Concurrency / RT
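
To make the relationship concrete, here is a small worked example. The figures (100 concurrent requests, an RT of 0.25 seconds) are made up for illustration, and the function names are hypothetical.

```python
def qps_from(concurrency, rt_seconds):
    """QPS = Concurrency / RT"""
    return concurrency / rt_seconds

def rt_from(concurrency, qps):
    """RT = Concurrency / QPS"""
    return concurrency / qps

if __name__ == "__main__":
    # 100 requests in flight, each taking 0.25 s on average.
    print(qps_from(100, 0.25))   # 400.0 requests per second
    # The same system seen from the other side.
    print(rt_from(100, 400))     # 0.25 seconds
```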


From the equations above it might seem that, for a fixed level of concurrency, the only way to raise QPS is to reduce RT. That is not the case: the equations only describe how QPS is calculated. There are many ways to improve QPS.

For example, to increase the throughput of an attraction, the first approach that comes to mind is upgrading the facilities: enlarging the venue, adding seats, adding queue lanes, and so on.

In a computer system, improving QPS mostly comes down to hardware such as CPU and memory: improving CPU utilization, adding CPUs, adding memory, and so on.

Similar to QPS, there is also the concept of TPS (transactions per second), which we will not go into here. In this article, “throughput” refers to QPS and TPS in general, without distinguishing between them.

Number of concurrent users

In the Disney analogy, the number of concurrent users is the number of people queuing for an attraction at the same time.

There are two common misconceptions about the number of concurrent users.

One is to take the number of concurrent users to be the total number of users of the system. (For example, Disney’s Soaring Over the Horizon might have 500,000 visitors a day. We cannot say that 500,000 is the number of concurrent users.)

Another is to take the number of online users to be the number of concurrent users. (For example, at 6 p.m., Soaring Over the Horizon had 10,000 people either queuing or watching. We cannot say that 10,000 is the number of concurrent users.)

The number of concurrent users is properly defined as the number of online users interacting with the server at the same time. (Say that at 6 p.m., 8,000 people were waiting in line for Soaring Over the Horizon. That is the number of concurrent users.)

In system terms, when we talk about the number of concurrent users of the Taobao item detail page, we mean the number of users requesting the detail page at the same moment. Other users may be looking at a detail page too, but if they are not interacting with the system at that moment, they do not count.
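
The difference between total users and concurrent users can be sketched in a few lines. The example below assumes each request is recorded as a hypothetical `(start, end)` pair in seconds; the data and the name `concurrent_at` are invented for illustration.

```python
def concurrent_at(requests, instant):
    """Count the requests that are in flight at the given instant."""
    return sum(1 for start, end in requests if start <= instant < end)

# Hypothetical (start, end) times of four requests, in seconds.
requests = [(0.0, 2.0), (1.0, 3.0), (2.5, 4.0), (5.0, 6.0)]

if __name__ == "__main__":
    print(len(requests))                 # total requests: 4
    print(concurrent_at(requests, 1.5))  # in flight at t=1.5: 2
    print(concurrent_at(requests, 4.5))  # in flight at t=4.5: 0
```

Four requests were served in total, yet at no sampled instant were more than two of them interacting with the system at once.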

Optimal number of threads

In the Disney analogy, the optimal number of threads corresponds to the maximum number of people an attraction can hold, which can include those waiting in the queue.

Whenever Disney opens a new venue or attraction, it starts with a trial run. During the trial phase, staff keep adjusting the number of concurrent visitors and observe how the venue or attraction copes.

Besides new venues and attractions, similar experiments are sometimes also run before holidays.

This is very similar to stress testing in software: we keep increasing the number of requests and observe the system’s QPS along with other indicators such as CPU and memory.

During a stress test, QPS initially rises as the number of users grows, with little impact on CPU and other resources. Once a certain threshold is reached, QPS stops rising with more users, or rises only marginally, while CPU load soars and memory usage climbs. Response time then increases significantly as more requests arrive. This threshold is what we consider the optimal number of threads.

If the number of concurrent requests exceeds the system’s optimal number of threads, fierce contention for resources follows, and the whole system faces disaster as resources run short or are exhausted.
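
One simple way to locate that threshold in stress-test results is to look for the first step where QPS stops growing meaningfully. The sketch below is only one possible heuristic: the 5% cutoff (`min_gain`), the function name, and the sample data are all assumptions made for illustration, not measurements from a real system.

```python
def find_saturation_point(samples, min_gain=0.05):
    """Return the concurrency level after which QPS gains fall below min_gain.

    samples: list of (concurrency, qps) pairs sorted by concurrency.
    Returns None if QPS keeps growing across every step.
    """
    for (c_prev, q_prev), (_, q_next) in zip(samples, samples[1:]):
        if q_prev > 0 and (q_next - q_prev) / q_prev < min_gain:
            return c_prev
    return None

# Hypothetical stress-test results: (concurrent requests, measured QPS).
results = [(10, 500), (20, 1000), (40, 1900), (80, 1950), (160, 1900)]

if __name__ == "__main__":
    print(find_saturation_point(results))  # 40
```

Here QPS nearly doubles up to 40 concurrent requests, then gains under 3% at 80, so 40 is reported as the saturation point. A real analysis would also look at CPU load, memory, and RT, as described above.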

Having said all that to my girlfriend, I gave her a link (http://www.techug.com/post/10-tips-of-web-app-performance.html) and then lay down on the bed to sleep.

However, half asleep, I seemed to hear my girlfriend still complaining: why can some people change the address when I can’t?

Next time, I know, I’ll be lecturing her on circuit breaking, rate limiting, and degradation.