First, we need to understand two questions:

  1. Why multithreading?
  2. What are the application scenarios of multithreading?

Why use multithreading?

Using multithreading is essentially about improving program performance. So what are the performance metrics?

  1. Latency: Latency is the time between sending a request and receiving a response. The shorter the latency, the faster the program executes and the better its performance.
  2. Throughput: Throughput refers to the number of requests that can be processed per unit of time. Higher throughput means the program can handle more requests in the same amount of time, which means better performance.

By improving performance, we mean lowering latency and raising throughput. To do that, you need to understand the scenarios in which multithreading is used.
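As a rough sketch of what these two metrics look like in practice (this code is not from the original article; handleRequest and its 10 ms cost are made-up stand-ins for real request handling), you could measure them like this in Java:

    public class Metrics {
        // Stand-in for real request handling (assumed to take ~10 ms).
        static void handleRequest() throws InterruptedException {
            Thread.sleep(10);
        }

        public static void main(String[] args) throws InterruptedException {
            int requests = 100;
            long start = System.nanoTime();
            for (int i = 0; i < requests; i++) {
                handleRequest();
            }
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;

            // Latency: average time per request. Throughput: requests per second.
            System.out.println("Average latency: " + (elapsedMs / (double) requests) + " ms");
            System.out.println("Throughput: " + (requests * 1000.0 / elapsedMs) + " req/s");
        }
    }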

Application scenarios of Multithreading

In the world of concurrent programming, improving performance is essentially improving hardware utilization, which in practice means improving CPU and I/O utilization. The operating system deals with the utilization of individual hardware devices, but a concurrent program usually needs the CPU and I/O devices to work together, so we have to consider their combined utilization.

Let’s look at how multithreading can improve CPU and I/O utilization. Assume that the program alternates between CPU calculation and I/O operations, and that the CPU calculation and the I/O operation each take the same amount of time (a 1:1 ratio).

As shown in the figure, when there is only one thread, the I/O device is idle while the thread performs CPU calculations, and the CPU is idle during I/O operations, so both are at 50% utilization.



If another thread is added, as shown in the figure, then when thread A performs CPU calculations, thread B performs I/O operations, and when thread B performs CPU calculations, thread A performs I/O operations. This achieves 100% utilization of both the CPU and I/O devices.



We just increased both CPU and I/O device utilization to 100% and doubled the number of requests per unit time, which means throughput doubled. Think about it the other way around: if CPU and I/O device utilization is low, you can try to increase throughput by adding threads.
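Here is a minimal sketch of that overlap, assuming a busy spin stands in for CPU work and Thread.sleep stands in for a blocking I/O call (both durations are arbitrary):

    public class OverlapDemo {
        // Simulated CPU work: spin for roughly the given number of milliseconds.
        static void cpuWork(long ms) {
            long end = System.nanoTime() + ms * 1_000_000;
            while (System.nanoTime() < end) { /* busy spin */ }
        }

        public static void main(String[] args) throws Exception {
            long start = System.currentTimeMillis();

            // Thread A computes while thread B is blocked on simulated I/O.
            Thread a = new Thread(() -> cpuWork(100));
            Thread b = new Thread(() -> {
                try { Thread.sleep(100); } catch (InterruptedException ignored) { }
            });
            a.start(); b.start();
            a.join();  b.join();

            // Roughly 100 ms, instead of the ~200 ms a single thread would need
            // to do the CPU work and then wait for the I/O.
            System.out.println("Overlapped: " + (System.currentTimeMillis() - start) + " ms");
        }
    }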

In the era of single-core CPUs, multithreading was mainly about balancing the CPU and I/O devices. If a program only does CPU computation and no I/O, multithreading does not improve performance but actually decreases it, because of the added cost of thread switching. In the multicore era, however, this purely computational scenario can also benefit from multiple threads, because spreading the work across cores reduces the response time.

For example: to calculate 1 + 2 + … + 10 billion on a 4-core CPU, the program can split the work into four threads: thread A computes the range [1, 2.5 billion], thread B computes (2.5 billion, 5 billion], thread C computes (5 billion, 7.5 billion], and thread D computes (7.5 billion, 10 billion], and the partial sums are added at the end. This would theoretically be about four times faster than a single thread computing the whole range [1, 10 billion], reducing the response time to 25%. A single thread uses only 25% of a 4-core CPU, while four threads can raise CPU utilization to 100%.
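A sketch of that split using a fixed thread pool (scaled down to summing 1 billion so the demo finishes quickly and the total fits comfortably in a long; the chunking scheme is just one possible choice):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class ParallelSum {
        public static void main(String[] args) throws Exception {
            final long n = 1_000_000_000L; // scaled-down upper bound
            final int threads = 4;         // one thread per core on a 4-core CPU

            ExecutorService pool = Executors.newFixedThreadPool(threads);
            List<Future<Long>> parts = new ArrayList<>();
            long chunk = n / threads;

            for (int i = 0; i < threads; i++) {
                final long from = i * chunk + 1;
                final long to = (i == threads - 1) ? n : (i + 1) * chunk;
                // Each thread sums its own sub-range, e.g. thread 0 sums [1, n/4].
                parts.add(pool.submit(() -> {
                    long sum = 0;
                    for (long v = from; v <= to; v++) sum += v;
                    return sum;
                }));
            }

            long total = 0;
            for (Future<Long> part : parts) total += part.get(); // combine partial sums
            pool.shutdown();
            System.out.println("Sum of 1.." + n + " = " + total);
        }
    }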

                 

How many threads should we create?

How many threads are appropriate depends on the context in which multithreading is used. There are two general cases:

  1. I/O intensive
  2. CPU intensive

The method of calculating the optimal number of threads is different for I/O intensive and CPU intensive programs. Let’s break it down.

♠ CPU intensive

For CPU-intensive computing, multithreading is essentially about improving CPU utilization. For a 4-core CPU, that means one thread per core, so theoretically it is enough to create 4 threads; creating more only adds thread-switching costs. Therefore, for CPU-intensive computing scenarios, “number of threads = number of CPU cores” is theoretically ideal. In practice, however, the number of threads is usually set to “number of CPU cores + 1”, so that if a thread is blocked by an occasional page fault or for some other reason, the extra thread can take over and keep the CPU fully utilized.
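In Java, this rule of thumb is commonly applied when sizing a fixed thread pool for CPU-bound work; a minimal sketch:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    public class CpuBoundPool {
        public static void main(String[] args) {
            // "Number of CPU cores + 1": the extra thread keeps a core busy
            // if another thread is briefly blocked, e.g. by a page fault.
            int poolSize = Runtime.getRuntime().availableProcessors() + 1;
            ExecutorService pool = Executors.newFixedThreadPool(poolSize);
            System.out.println("CPU-bound pool size: " + poolSize);
            pool.shutdown();
        }
    }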

♠ I/O intensive

For I/O intensive computing scenarios, such as the earlier example where CPU computation and I/O operation time are in a 1:1 ratio, two threads are most appropriate. What if the CPU computation and I/O operation take time in a 1:2 ratio? Then three threads are most appropriate, as shown in the figure: the CPU switches between threads A, B, and C, and by the time the CPU has served threads B and C and switches back, thread A has just finished its I/O operation. This achieves 100% utilization of both the CPU and I/O devices.



From the example above, we found that for I/O intensive computing scenarios, the optimal number of threads is related to the ratio of CPU computation to I/O operation time in the program.

A formula can be summed up: Optimal number of threads = 1 + (I/O time / CPU time)

Let R = I/O time / CPU time. Based on the figure above, it can be understood as follows: while thread A performs its I/O operation, the other R threads complete their CPU calculations, so the CPU is kept at 100% utilization.

For multi-core CPUs, the formula just needs to be scaled by the number of cores: Optimal number of threads = number of CPU cores * [1 + (I/O time / CPU time)]
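Here is a small sketch of how the formula could be turned into a pool-size calculation; the 50 ms and 5 ms figures are placeholder values that you would replace with measurements from your own program:

    public class IoBoundPoolSize {
        // Optimal number of threads = cores * (1 + ioTime / cpuTime).
        static int optimalThreads(int cores, double ioTimeMs, double cpuTimeMs) {
            return (int) (cores * (1 + ioTimeMs / cpuTimeMs));
        }

        public static void main(String[] args) {
            int cores = Runtime.getRuntime().availableProcessors();
            // With I/O : CPU = 50 ms : 5 ms (a 10:1 ratio), a 4-core machine
            // would get 4 * (1 + 10) = 44 threads.
            System.out.println("Suggested pool size: " + optimalThreads(cores, 50, 5));
        }
    }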

If you found this helpful, give it a thumbs up.