
You have one idea, I have one idea, and when we exchange them, each of us has two ideas






Why multithreading?

The best way to prevent concurrent programming errors is not to write concurrent programs



Since multithreaded programming is so error-prone, why has it persisted?

A: Because it is good at something. In a word: it is fast.

I agree with that answer, but it’s not specific enough

Where does concurrent programming fit in?

If the answer were just the word “fast”, interviews would not be so tricky. Have you ever asked yourself:

  1. Is concurrent programming fast in all scenarios?
  2. You know it’s fast. What’s fast? How do you measure it?

To get the answers to these two questions, we need an analysis process from qualitative to quantitative

Using multithreading means maximizing the speed of a program by setting the right number of threads in the right scenario (I know, it still sounds like I have said nothing).

At the hardware level, this means making full use of the CPU and I/O.



If both the scenario and the thread count are right, CPU and I/O utilization can be maximized. The key is how to get both of them right.

To discuss specific scenarios, we need the proper vocabulary. Here are two terms to add to your toolkit:

  • CPU-intensive programs
  • I/O-intensive programs

CPU intensive program

For a complete request, the I/O operations finish in a very short time while the CPU still has a lot of computation to do, which means CPU computation accounts for most of the request time.

Let’s say we want to calculate 1 + 2 + … + 10 billion. Obviously, this is a CPU-intensive program.

On a single-core CPU, suppose we create 4 threads to compute the sum in segments:

  1. Thread 1 computes [1, 2.5 billion)
  2. ... and so on
  3. Thread 4 computes [7.5 billion, 10 billion]

Let’s look at the picture below. What happens to them?



With a single-core CPU, all threads have to wait for a CPU time slice. Ideally, the total execution time of the four threads equals the time one thread would need to do the whole calculation alone, and that is before counting the context-switching overhead of the four threads.

So for CPU-intensive programs on a single-core CPU, multithreading is not a good fit.

What happens if you create the same four threads on a 4-core CPU and do the same piecewise calculation?



Each thread has a CPU to run, there is no waiting for CPU time slices, and there is no overhead of thread switching. In theory it’s four times more efficient

So, if a multi-core CPU is processing a CPU-intensive program, we can make full use of all the CPU cores and apply concurrent programming to improve efficiency.
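
The segmented calculation described above can be sketched in Java as follows. This is a minimal illustration, not the author's code: the class name `SegmentedSum` and method `parallelSum` are hypothetical, and the example sums to 1 billion rather than 10 billion because the full sum would overflow a `long`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper class, not from the original article.
class SegmentedSum {

    // Sums 1..n by splitting the range into `threads` contiguous segments.
    static long parallelSum(long n, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            long chunk = n / threads;
            List<Future<Long>> parts = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                final long from = i * chunk + 1;
                // The last segment absorbs any remainder of the division.
                final long to = (i == threads - 1) ? n : (i + 1) * chunk;
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (long v = from; v <= to; v++) {
                        sum += v;
                    }
                    return sum;
                };
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // blocks until that segment is done
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("segment failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Assumption: 1 billion instead of 10 billion, to stay within long range.
        long n = 1_000_000_000L;
        int threads = Runtime.getRuntime().availableProcessors();
        System.out.println(parallelSum(n, threads));
    }
}
```

On a machine with several cores, each segment can run on its own core, which is where the theoretical speedup comes from.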

I/O intensive programs

In contrast to CPU-intensive programs, for a complete request there are still many I/O operations to do after the CPU computation finishes, which means I/O operations account for most of the request time.

We know that the CPU sits idle while a thread waits on I/O, so we want to keep the CPU busy rather than let it idle.

Also with a single-core CPU:



As you can see from the figure above, each thread spends the same amount of time on CPU work and on I/O. If you extend the figure by a few more cycles, keep the CPU time fixed, and make the I/O time three times the CPU time, you will find that the CPU is idle again. At that point you can add another thread to keep maximizing CPU utilization.

To sum up the above two situations, we can make a conclusion as follows:

The higher the proportion of thread waiting time, the more threads are needed. The higher the percentage of thread CPU time, the fewer threads are required.
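
A rough way to see this effect is to simulate waiting with `Thread.sleep` and compare pool sizes. The sketch below is only an illustration under that assumption; `IoWaitDemo` and `elapsedMillis` are hypothetical names, and real I/O will behave less uniformly than a fixed sleep.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; Thread.sleep stands in for a real I/O wait.
class IoWaitDemo {

    // Runs `tasks` simulated I/O tasks on a pool of `threads` threads
    // and returns the elapsed wall-clock time in milliseconds.
    static long elapsedMillis(int tasks, int threads, long ioMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    Thread.sleep(ioMillis); // the CPU is free while this thread waits
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        System.out.println("1 thread:  " + elapsedMillis(8, 1, 100) + " ms");
        System.out.println("8 threads: " + elapsedMillis(8, 8, 100) + " ms");
    }
}
```

Because the tasks spend almost all their time waiting, the 8-thread run should finish far sooner than the 1-thread run even on a single core.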

At this point, you have seen the first “right”: the right scenario for multithreading. That leaves the second: what is the right number of threads?

How many threads are appropriate to create?

If this question is asked in an interview, it is a test of your theory and practice. To get it right, you must be proficient in elementary school arithmetic

We have CPU-intensive and I/O-intensive scenarios, and different scenarios call for different numbers of threads.

How many threads are appropriate for CPU intensive programs?

Some of you have already noticed that for CPU-intensive programs, the number of threads should in theory equal the number of (logical) CPU cores, yet in practice it is usually set to the number of (logical) CPU cores + 1. Why?

Java Concurrency in Practice explains:

Even a compute-intensive thread occasionally pauses because of a page fault or for some other reason. When that happens, an “extra” thread ensures that CPU cycles do not go unused.

So for CPU-intensive programs, number of (logical) CPU cores + 1 threads is a good rule of thumb.
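
In Java, this rule of thumb might be applied as in the sketch below. The class `CpuBoundPool` and method `cpuBoundThreads` are hypothetical names; `Runtime.availableProcessors()` reports the processors available to the JVM, which can differ from the physical machine, for example inside a container.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper class, not from the original article.
class CpuBoundPool {

    // Rule of thumb from the text: logical cores + 1.
    static int cpuBoundThreads(int logicalCores) {
        return logicalCores + 1;
    }

    public static void main(String[] args) {
        // Processors available to the JVM (logical cores).
        int cores = Runtime.getRuntime().availableProcessors();
        int size = cpuBoundThreads(cores);
        ExecutorService pool = Executors.newFixedThreadPool(size);
        System.out.println("cores = " + cores + ", pool size = " + size);
        pool.shutdown();
    }
}
```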

How many threads are appropriate for an I/O intensive program?

I’ve asked you to plot a few more cycles (you can manually increase the ratio of I/O to CPU time by, say, 6x or 7x), so that you can come to the conclusion that for I/O intensive programs:

Optimal number of threads = 1 / CPU utilization = 1 + (I/O time / CPU time)

In case this formula is not yet clear, let’s substitute the ratio from the figure above into it:



This is the optimal number of threads for one CPU core. With multiple cores, the optimal number of threads for an I/O-intensive program is:

Optimal number of threads = Number of CPU cores x (1 / CPU utilization) = Number of CPU cores x (1 + (I/O time / CPU time))
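
The formula translates directly into a small helper. This is a sketch only: `ThreadCountEstimator` and `optimalThreads` are hypothetical names, and rounding to the nearest whole thread is an assumption the text does not specify.

```java
// Hypothetical helper class, not from the original article.
class ThreadCountEstimator {

    // threads = cores * (1 + ioTime / cpuTime); both times must use the same unit.
    static int optimalThreads(int cores, double ioTime, double cpuTime) {
        if (cores <= 0 || cpuTime <= 0 || ioTime < 0) {
            throw new IllegalArgumentException("cores and cpuTime must be positive, ioTime non-negative");
        }
        // Assumption: round to the nearest whole thread.
        return (int) Math.round(cores * (1 + ioTime / cpuTime));
    }

    public static void main(String[] args) {
        // I/O time three times the CPU time on one core.
        System.out.println(optimalThreads(1, 3, 1)); // prints 4
        // The same ratio on four cores.
        System.out.println(optimalThreads(4, 3, 1)); // prints 16
    }
}
```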

Now some of you may be wondering: to size the thread pool for an I/O-intensive program I need to know the CPU utilization, so what initial value should I use if I don’t know it?

In that case, a common starting point for an I/O-intensive program is 2N (N = number of CPU cores); some sources say 2N + 1. If you’re interested, you can look into the reasoning yourself.



In theory, and only in theory, this gives you 100% CPU utilization.

If theory were enough, there would be no need for practice or tuning. Still, in the initial stage we can use the theory as a rough baseline; it is unlikely to be far off, which makes the later tuning easier.

That covers the theory, so let’s turn to practice. Suppose I understand the formula (end of the qualitative phase), but I still have two questions:

  1. How do I know the specific I/O and CPU time?
  2. How do I check CPU utilization?

Yes, we need quantitative analysis

Fortunately, we are not the first to face this problem. Many APM (Application Performance Management) tools can give us accurate data. Learning to use them, together with the theory, helps us arrive at a better number of threads during tuning. I will simply list a few here; which one to use depends on your application, so you will need to do your own research. Due to space limits, I will not go into them for now.

  1. SkyWalking
  2. CAT
  3. Zipkin

Now that you know the basics, what are the possible questions for an interview? How might the question be asked?

Quick interview questions

Quick question 1

Assume a system must reach a TPS (Transactions Per Second or Tasks Per Second) of at least 20, each transaction is handled by one thread, and the average time a thread takes to complete a transaction is 4s.

How should the number of threads be designed so that 20 transactions can be processed in 1s?
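
One way to work this out: a thread that needs 4s per transaction completes 1/4 of a transaction per second, so reaching 20 TPS takes 20 / (1/4) = 80 threads. The sketch below encodes that arithmetic; `TpsThreadEstimator` and `threadsForTps` are hypothetical names, and rounding up is an assumption for cases where the result is not a whole number.

```java
// Hypothetical helper class, not from the original article.
class TpsThreadEstimator {

    // Each thread finishes 1 / secondsPerTransaction transactions per second,
    // so reaching targetTps needs targetTps * secondsPerTransaction threads.
    static int threadsForTps(double targetTps, double secondsPerTransaction) {
        // Assumption: round up so the target is still met.
        return (int) Math.ceil(targetTps * secondsPerTransaction);
    }

    public static void main(String[] args) {
        System.out.println(threadsForTps(20, 4)); // prints 80
    }
}
```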



However, this does not take the number of CPUs into account. Nobody has money to burn: a typical server has 16 or 32 CPU cores, so 80 threads would cause a lot of unnecessary thread context-switching overhead (I hope you can point this out on your own), and tuning is needed to reach the best balance.

Quick question 2

The computation takes 5ms and the DB operation takes 100ms. How do you set the number of threads on an 8-CPU server?

If you don’t know, go back and retake the third-grade final exam (and stay for tonight’s study hall). The answer is:

Number of threads = 8 * (1 + 100/5) = 168

If the DB’s QPS (Queries Per Second) limit is 1000, what should the number of threads be set to?
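
One possible line of reasoning, assuming each task issues exactly one DB query: a task takes 5ms + 100ms = 105ms, so one thread produces about 1000 / 105 ≈ 9.5 queries per second, and 168 threads would produce about 1600 QPS, above the limit. Scaling down to stay within 1000 QPS gives 1000 x 0.105 = 105 threads. The sketch below encodes this; `QpsCappedThreads` and `threadsUnderQpsCap` are hypothetical names.

```java
// Hypothetical helper class, not from the original article.
class QpsCappedThreads {

    // Assumption: every task performs one DB query, so a thread issues
    // 1000 / (cpuMillis + dbMillis) queries per second.
    static int threadsUnderQpsCap(int formulaThreads, double cpuMillis,
                                  double dbMillis, double qpsLimit) {
        // Largest thread count whose combined query rate stays within the limit.
        int cap = (int) Math.floor(qpsLimit * (cpuMillis + dbMillis) / 1000.0);
        return Math.min(formulaThreads, cap);
    }

    public static void main(String[] args) {
        // 168 threads from the formula, 5ms CPU, 100ms DB, limit of 1000 QPS.
        System.out.println(threadsUnderQpsCap(168, 5, 100, 1000)); // prints 105
    }
}
```

If the limit were higher than what 168 threads can generate, the formula's value would be kept unchanged.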



Does increasing the number of CPU cores necessarily solve the problem?

Having read this far, some of you may think: even if I have calculated the theoretical number of threads, the actual number of CPU cores is not enough, which brings thread context-switching overhead, so the next step is to add CPU cores. But does blindly adding CPU cores necessarily solve the problem?

When talking about mutexes, I intentionally left out one thing:



How do you understand this formula?



This result tells us that if the serial portion of our program is 5%, then no matter what technology we use, performance can improve by at most a factor of 20.
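
This ceiling is consistent with Amdahl's law, commonly written as S = 1 / ((1 - p) + p / n), where p is the parallel fraction and n is the number of processors. The sketch below, with the hypothetical names `AmdahlSpeedup` and `speedup`, shows how the speedup for a 5% serial portion approaches 20 but never reaches it as cores are added.

```java
// Hypothetical helper class, not from the original article.
class AmdahlSpeedup {

    // Amdahl's law expressed with the serial fraction s = 1 - p:
    // speedup = 1 / (s + (1 - s) / n)
    static double speedup(double serialFraction, int processors) {
        double parallelFraction = 1.0 - serialFraction;
        return 1.0 / (serialFraction + parallelFraction / processors);
    }

    public static void main(String[] args) {
        int[] coreCounts = {1, 8, 64, 1024, 1_000_000};
        for (int n : coreCounts) {
            // The values grow with n but stay below 1 / 0.05 = 20.
            System.out.printf("n = %d -> speedup = %.3f%n", n, speedup(0.05, n));
        }
    }
}
```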

How can the serial percentage be understood in a simple, rough way (in practice, tools can measure it)? Here’s a tip:

Tip: critical sections are serial and non-critical sections are parallel. The serial percentage is the time a single thread takes to execute the critical section divided by the time a single thread takes to execute everything (critical section + non-critical section).

Now you know what I meant when I talked about synchronized:

Minimize the size of the critical section, because the size of the critical section is often the bottleneck. Don’t wrap everything in it in one go, the way try/catch is often misused.
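
As an illustration of this advice, the sketch below contrasts a broad and a narrow critical section. Everything in it is hypothetical (`NarrowCriticalSection`, `addBroad`, `addNarrow`, `expensiveFormat`, `runDemo`); the point is only that work touching no shared state can move outside the `synchronized` block.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, not from the original article.
class NarrowCriticalSection {
    private final List<String> results = new ArrayList<>();

    // Too broad: the slow computation runs while the lock is held.
    synchronized void addBroad(int input) {
        String value = expensiveFormat(input);
        results.add(value);
    }

    // Narrower: only the shared-state update is inside the lock.
    void addNarrow(int input) {
        String value = expensiveFormat(input); // uses local variables only
        synchronized (this) {
            results.add(value);
        }
    }

    synchronized int size() {
        return results.size();
    }

    // Stand-in for real work that touches no shared state.
    private static String expensiveFormat(int input) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1_000; i++) {
            sb.append((char) ('a' + Math.floorMod(input + i, 26)));
        }
        return sb.toString();
    }

    // Starts `threads` workers that each call addNarrow `perThread` times,
    // then returns how many results were recorded.
    static int runDemo(int threads, int perThread) {
        NarrowCriticalSection shared = new NarrowCriticalSection();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    shared.addNarrow(i);
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return shared.size();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 100)); // prints 400
    }
}
```

Both methods are thread-safe; the narrow version simply keeps the serial portion of each call as small as possible.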

Conclusion

Multithreading is not necessarily more efficient than single-threading. The famous Redis (more on that later) is an example: because it is memory-based, a single thread can use the CPU very efficiently. Multithreading is typically a good fit for scenarios with a significant proportion of I/O or network operations.

In addition, with the help of some primary school math, we have seen how to move from qualitative to quantitative analysis. Before we have any data, we can use the rules of thumb above as a baseline, and then tune step by step against reality (taking CPU, memory, disk read/write speed, network conditions, and so on into account).

Finally, blindly increasing the number of CPU cores does not necessarily solve the problem; it also requires us to write rigorous concurrent code.

Soul-searching questions

  1. Why create a thread pool when we already know how many threads are appropriate?
  2. What does it take to create a thread? Why is frequent thread creation expensive?
  3. Multithreading is usually concerned with shared variables, so why do local variables have no thread-safety problems?
  4. ...
