QPS and RT are terms most developers know, but many are at a loss when asked how to improve them. Let's look into it today.
Contents
Terminology
The relationship between QPS and number of threads
Optimal number of threads
Case study
Optimization directions
Relationship between QPS and RT
Conclusion
Terminology
RT (Response Time): the time it takes to complete one request
QPS (Queries Per Second): the number of requests completed within one second
The relationship between QPS and number of threads
For a single thread, QPS = 1000ms/RT
For example, a system with only one thread and a response time of 50ms would have a QPS of 1000/50=20
If it has two threads, its QPS is: 20*2=40
This assumes that CPU, I/O, and memory are not bottlenecks.
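The arithmetic above can be written as a small helper. This is a minimal sketch: the class and method names are illustrative and not from the original project, and it models only the ideal case in which no resource is a bottleneck.

```java
// Idealized model: assumes CPU, I/O, and memory never become a bottleneck.
// TheoreticalQps is a hypothetical name used only for this illustration.
final class TheoreticalQps {
    /** QPS of one thread that needs rtMillis per request. */
    static double singleThreadQps(double rtMillis) {
        return 1000.0 / rtMillis;
    }

    /** QPS of several independent threads, each needing rtMillis per request. */
    static double qps(int threads, double rtMillis) {
        return threads * singleThreadQps(rtMillis);
    }

    public static void main(String[] args) {
        System.out.println(qps(1, 50)); // one thread, RT = 50 ms: prints 20.0
        System.out.println(qps(2, 50)); // two threads, RT = 50 ms: prints 40.0
    }
}
```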
In theory, QPS is directly proportional to the number of threads: the more threads a server can run, the higher the QPS.
Of course, a server's resources are limited. In a real load test, QPS rises with the number of threads at first. Once the number of threads reaches a certain point and the CPU becomes the bottleneck, QPS stops increasing.
The reason is simple. While CPU resources are abundant, every thread gets enough CPU time to run. Once the number of threads reaches the point where CPU resources are exhausted, threads start competing for the CPU, context switches become frequent (and they are expensive), threads wait on one another, and response time naturally increases.
Optimal number of threads
The relationship between QPS and the number of threads leads directly to the following idea.
Optimal number of threads: the critical number of threads that just saturates the server's resources.
Formula: Optimal number of threads = ((thread wait time + thread CPU time) / thread CPU time) * number of CPUs
Equivalent to: Optimal number of threads = (thread wait time / thread CPU time + 1) * number of CPUs
The first form is the one found online. The equivalent form is my own rearrangement, because it makes the meaning easier to explain.
Of course, if you know a more precise interpretation of the first form, please tell me.
Explaining the formula
Suppose that, running alone, a thread spends 20ms on the CPU and 100ms waiting for each request. While that thread waits, the CPU is idle for 100ms, which is enough time to run 100/20 = 5 other threads. Adding the thread itself gives 6, so one CPU can keep at most 6 threads busy over this period. If the server has 2 CPUs, that is 6 * 2 = 12.
Applying the formula: (100/20 + 1) * 2 = 12
Of course, in practice the threads do not take turns in neat 20ms blocks. The CPU allocates time slices to the threads and runs them alternately.
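The equivalent form of the formula translates directly into code. The sketch below is only an illustration; `OptimalThreads` and its parameter names are made up for this example, and the result is rounded to the nearest whole thread.

```java
// Hypothetical helper illustrating the formula; not part of the original project.
final class OptimalThreads {
    /**
     * Optimal threads = (wait time / CPU time + 1) * number of CPUs.
     * Both times are per request and must use the same unit.
     */
    static long optimalThreads(double waitMillis, double cpuMillis, int cpus) {
        return Math.round((waitMillis / cpuMillis + 1) * cpus);
    }

    public static void main(String[] args) {
        // 100 ms wait, 20 ms CPU, 2 CPUs, as in the explanation above
        System.out.println(optimalThreads(100, 20, 2)); // prints 12
    }
}
```

A purely CPU-bound workload (wait time 0) gives one thread per CPU, which matches the intuition behind the formula.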
Characteristics
Once the optimal number of threads is reached, adding more threads leaves QPS unchanged while response time grows. If the number of threads keeps increasing, QPS begins to decrease.
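These characteristics can be reproduced with a deliberately simplified model. The sketch below assumes each request needs a fixed CPU time and a fixed wait time, caps QPS at what the CPUs can execute, and derives response time from Little's law (threads = QPS * RT). It ignores context-switch overhead, so it shows the plateau but not the eventual drop in QPS. All names are hypothetical.

```java
// Idealized saturation model; SaturationModel is a made-up name for this sketch.
final class SaturationModel {
    /** QPS grows linearly with threads until the CPUs are fully busy, then stays flat. */
    static double qps(int threads, double waitMillis, double cpuMillis, int cpus) {
        double perThread = 1000.0 / (waitMillis + cpuMillis); // QPS of one unblocked thread
        double cpuCeiling = cpus * 1000.0 / cpuMillis;        // requests/s the CPUs can execute
        return Math.min(threads * perThread, cpuCeiling);
    }

    /** Response time in ms implied by Little's law: RT = threads / QPS. */
    static double rtMillis(int threads, double waitMillis, double cpuMillis, int cpus) {
        return threads * 1000.0 / qps(threads, waitMillis, cpuMillis, cpus);
    }

    public static void main(String[] args) {
        // 100 ms wait, 20 ms CPU, 2 CPUs: the optimum from the formula is 12 threads
        for (int threads : new int[] {6, 12, 24}) {
            System.out.printf("threads=%d qps=%.1f rt=%.0fms%n", threads,
                    qps(threads, 100, 20, 2), rtMillis(threads, 100, 20, 2));
        }
    }
}
```

With these numbers the model stops at 100 QPS once 12 threads are reached, and doubling the threads to 24 only doubles the response time to 240ms.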
How to find the optimal number of threads
1. Gradually increase the number of threads in a load test and observe the results. Based on the characteristics above, the optimal number of threads is easy to spot.
2. Calculate it directly from the formula. This is difficult in practice, because the system's thread CPU time and thread wait time are hard to measure.
3. A refinement of the first method: run one load test, observe CPU usage, and compute number of threads * (expected CPU usage / current CPU usage). This gives an approximate value that needs only a little tuning to reach the optimal number of threads.
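The third method is a one-line calculation. In the sketch below the class name and the sample numbers (5 threads producing 75% CPU) are hypothetical and chosen only for illustration. It assumes CPU usage grows roughly linearly with the number of threads below saturation.

```java
// Hypothetical helper for method 3; the sample measurements are invented.
final class ThreadEstimate {
    /** Scales the current thread count by (target CPU % / observed CPU %). */
    static long estimate(int currentThreads, double observedCpuPercent, double targetCpuPercent) {
        return Math.round(currentThreads * targetCpuPercent / observedCpuPercent);
    }

    public static void main(String[] args) {
        // Invented run: 5 threads drive the CPU to 75%; the target is 190% on 2 CPUs
        System.out.println(estimate(5, 75, 190)); // prints 13
    }
}
```

The result is only a starting point; a follow-up load test around that value is still needed.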
Case study
To better understand the theory above and explore how to improve QPS, we build a test case with Spring Boot.
Define a method that simulates CPU work:
public long runCpu(int count) {
    long start = System.currentTimeMillis();
    // Keep the CPU busy with some arithmetic
    int a = 0;
    double b = 0;
    long c = 0;
    for (int i = 0; i < count; i++) {
        for (int j = 0; j < 100; j++) {
            a++; b++; c++;
            a = a * 2; b = b / 2;
            a = a / 2; b = b * 2;
            c = c * 2; c = c / 2;
            a--; b--; c--;
        }
        a++; b++; c++;
    }
    System.out.println(a);
    // Return the running time in milliseconds
    return System.currentTimeMillis() - start;
}
The count parameter controls how long the method runs.
Define the load-test endpoint:
/**
 * @param count number of loop iterations, used to simulate CPU time
 * @param sleep simulated I/O time in milliseconds
 */
@GetMapping("/benchmark")
public String qps(int count, long sleep) throws InterruptedException {
    long start = System.currentTimeMillis();
    // CPU run time
    long cpuTime = runCpu(count);
    long ioStart = System.currentTimeMillis();
    // Simulate I/O blocking
    Thread.sleep(sleep);
    long ioTime = System.currentTimeMillis() - ioStart;
    long total = System.currentTimeMillis() - start;
    return "total: " + total + " cpu-time:" + cpuTime + " io-time:" + ioTime;
}
To make testing easier, I packaged the application as an image and ran it with Docker.
Here is my docker-compose file, which gives the container 2 CPUs:
version: '3.5'
services:
  qps-test:
    image: qps-test:1.0.0
    container_name: qps-test
    ports:
      - 8080:8080
    deploy:
      resources:
        limits:
          cpus: '2.00'
For the first test, set count to 100000 (roughly 10-20ms of CPU time on my machine) and the I/O time to 80ms:
http://192.168.65.206:8080/qps/benchmark?count=100000&sleep=80
The results are as follows
RT (ms) | QPS | CPU | Optimal number of threads |
---|---|---|---|
103 | 125 | 190% | 13 |
Single-threaded QPS: 1000/103 = 9.7
Some readers may not know how to arrive at this result, so here is a brief explanation.
First, the server's bottleneck is the CPU, because this test case cannot hit a memory bottleneck. So we need to push CPU usage to about 190%, just short of the CPU limit. If it reaches 200%, the number of threads has probably overshot and CPU resources are exhausted, so reduce the number of threads. If it is below 190%, keep adding load-test threads until usage holds steady at about 190%.
I use JMeter as the load-testing tool.
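For readers without JMeter at hand, the same ramp-up procedure can be approximated in plain Java. The sketch below is not what was used for the measurements in this article. It is a hypothetical closed-loop load generator in which each thread issues one request after another, and `sleepTask` is a stand-in that should be replaced with a real HTTP call to the service under test. CPU usage still has to be observed separately, for example with docker stats.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for a JMeter thread group; not part of the original project.
final class MiniLoadTest {
    /** A fake request that only blocks for the given time (replace with a real HTTP call). */
    static Runnable sleepTask(long millis) {
        return () -> {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
    }

    /** Runs task in a closed loop on the given number of threads; returns {QPS, mean RT in ms}. */
    static double[] run(int threads, long durationMillis, Runnable task) {
        LongAdder completed = new LongAdder();
        LongAdder busyNanos = new LongAdder();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(durationMillis);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                while (System.nanoTime() - deadline < 0) {
                    long t0 = System.nanoTime();
                    task.run();
                    busyNanos.add(System.nanoTime() - t0);
                    completed.increment();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(durationMillis + 5_000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        long n = completed.sum();
        double qps = n * 1000.0 / durationMillis;
        double meanRtMillis = n == 0 ? 0.0 : busyNanos.sum() / 1e6 / n;
        return new double[] {qps, meanRtMillis};
    }

    public static void main(String[] args) {
        // Double the thread count each round and watch QPS and RT
        for (int threads = 1; threads <= 8; threads *= 2) {
            double[] r = run(threads, 1_000, sleepTask(20));
            System.out.printf("threads=%d qps=%.0f rt=%.1fms%n", threads, r[0], r[1]);
        }
    }
}
```

Because the fake request only sleeps, QPS here keeps scaling with the number of threads. Against a real CPU-bound endpoint the curve would flatten at the optimal number of threads.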
With this baseline in hand, we can now try to improve QPS.
Optimization directions
According to the formula: QPS = (1000/RT) * number of threads
Since the CPU is already saturated, adding threads will not help, so the only option is to reduce the response time.
Response time consists of two parts, CPU time and thread wait time, so we try reducing each in turn.
Reducing I/O wait time
We reduce the I/O time from 80ms to 40ms:
http://192.168.65.206:8080/qps/benchmark?count=100000&sleep=40
The load test results are as follows:
RT (ms) | QPS | CPU | Optimal number of threads |
---|---|---|---|
65 | 123 | 190% | 8 |
Single-threaded QPS: 1000/65 = 15.4
The response time dropped from 103ms to 65ms, but QPS barely changed, and the optimal number of threads fell from 13 to 8.
Conclusion: reducing I/O time does not improve QPS. Why?
We apply the principle that CPU resources are constant: CPU resource = CPU time per request * total number of threads * QPS per thread
So the CPU time consumed per second in the baseline equals the CPU time consumed per second after reducing the I/O wait:
23ms * 13 * 9.7 = 25ms * x * 15.4
x = 7.53
Here 23ms is RT (103) - IO (80), 25ms is RT (65) - IO (40), and 15.4 is 1000/65.
Number of threads | Single-thread QPS | RT (ms) | CPU processing time | QPS |
---|---|---|---|---|
13 | 9.7 | 103 | 23ms * 13 * 9.7 | 125 |
x ≈ 8 | 15.4 | 65 | 25ms * x * 15.4 | 123 |
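Solving for x is easy to check in code. The sketch below simply rearranges the constant-CPU equation using the measured values from the two test runs. The class and method names are hypothetical.

```java
// Hypothetical helper that rearranges the constant-CPU equation to solve for x.
final class CpuBudget {
    /** CPU milliseconds consumed per second: CPU time per request * threads * per-thread QPS. */
    static double cpuMillisPerSecond(double cpuMillis, double threads, double perThreadQps) {
        return cpuMillis * threads * perThreadQps;
    }

    /** Threads needed to consume the same CPU budget with new per-request numbers. */
    static double threadsFor(double budget, double cpuMillis, double perThreadQps) {
        return budget / (cpuMillis * perThreadQps);
    }

    public static void main(String[] args) {
        double budget = cpuMillisPerSecond(23, 13, 9.7); // baseline run, sleep = 80 ms
        double x = threadsFor(budget, 25, 15.4);         // run with sleep = 40 ms
        System.out.printf("x = %.2f%n", x);              // about 7.53
    }
}
```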
Reducing CPU execution time
We cut the CPU run time in half by changing count from 100000 to 50000:
http://192.168.65.206:8080/qps/benchmark?count=50000&sleep=80
The load test results are as follows:
RT (ms) | QPS | CPU | Optimal number of threads |
---|---|---|---|
101 | 244 | 190% | 25 |
Single-threaded QPS: 1000/101 = 9.9
The response time barely changed, but the QPS doubled and the number of optimal threads doubled
It is concluded that reducing CPU time can significantly improve QPS
Also according to the principle of constant CPU resource:
23ms * 13 * 9.7 = 21ms * x * 9.9
x ≈ 14
For some reason the calculation goes wrong here. The CPU time should normally be around 10ms, because count has been halved.
If the CPU time were between 10ms and 15ms, x would be close to 25 and match the load test. My guess is that the I/O time is off, which makes RT longer.
Summary
count | sleep | RT (ms) | QPS | CPU | Optimal number of threads |
---|---|---|---|---|---|
100000 | 80ms | 103 | 125 | 190% | 13 |
100000 | 40ms | 65 | 123 | 190% | 8 |
50000 | 80ms | 101 | 244 | 190% | 25 |
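As a sanity check, each row of the summary can be compared against the formula QPS = (1000/RT) * number of threads. The sketch below uses only the measured values from the table. The class name is hypothetical.

```java
// Hypothetical consistency check of the summary table against QPS = (1000/RT) * threads.
final class SummaryCheck {
    /** QPS predicted from the thread count and the response time in ms. */
    static double predictedQps(int threads, double rtMillis) {
        return threads * 1000.0 / rtMillis;
    }

    public static void main(String[] args) {
        // {threads, RT in ms, measured QPS}, taken from the summary table
        int[][] rows = { {13, 103, 125}, {8, 65, 123}, {25, 101, 244} };
        for (int[] r : rows) {
            System.out.printf("threads=%d predicted=%.1f measured=%d%n",
                    r[0], predictedQps(r[0], r[1]), r[2]);
        }
    }
}
```

The predictions (about 126, 123, and 248) land within roughly 2% of the measured QPS in all three runs.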
Relationship between QPS and RT
If we simply followed the formula QPS = 1000/RT, QPS would be inversely proportional to RT.
However, the case study shows that the real relationship is not that simple. RT is made up of two kinds of time, and they affect QPS differently:
When CPU execution time decreases, QPS improves significantly.
When I/O wait time decreases, QPS improves little or not at all.
Conclusion
Based on the analysis above, to improve RT:
1. Reduce I/O response time
2. Reduce CPU execution time
To improve QPS:
1. Reduce CPU execution time
2. Increase the number of CPUs
Tip: if QPS peaks during a load test before the CPU becomes the bottleneck, some other resource, such as memory, is the bottleneck.
References
https://www.docin.com/p-73662763.html?docfrom=rrela
For more content, follow the WeChat official account: programmer PURPLE
Personal blog: zijiancode.cn
If this article helped you, please like and share it. Your support is what motivates me to keep writing. Thank you very much!