Before we get into thread monitoring, let's cover the basics of threads. Generally speaking, with a solid foundation we are unlikely to make mistakes when writing code. But once an Android team grows to dozens or even hundreds of people, we cannot guarantee that every engineer writes good code, so monitoring is still necessary. Monitoring alone is not enough, though: we also need the theoretical grounding to analyze and solve the problems it surfaces. Many of you may think there is nothing much to know about threads beyond synchronized, volatile, and new Thread(); some may never even have touched thread pools or Lock. To be honest, in earlier years I was the same, because my projects never forced me to study the underlying source code.

1. Context switch

In the old single-CPU era, a computer could execute only one program at a time. Then came multitasking, where the computer appeared to run multiple tasks or processes in parallel. This was not the "same point in time" in the true sense: multiple tasks or processes shared one CPU, and the operating system switched the CPU between them so that each task got a time slice in which to run. Later came multithreading, which allows multiple threads to execute in parallel within a single program. A thread's execution can be thought of as a CPU executing the program; when a program runs with multiple threads, it is as if several CPUs were executing the same program at once.

Multithreading is more challenging than multitasking. Because threads execute in parallel within the same program, they read and write the same memory concurrently, something that can never happen in a single-threaded program. Some of these errors are also unlikely to occur on a single-CPU machine, because two threads there are never truly executed in parallel. Modern computers, however, ship with multi-core CPUs, which means different threads can be executed in true parallel by different cores. Context switching is therefore unavoidable in multithreaded, multitasking systems, and you need to be familiar with the relevant concepts in CPU architecture to understand how it works.

Whether multi-process or multi-threaded, running programs are inseparable from one concept: scheduling. The JVM, while cross-platform, does not take over thread scheduling; that is decided by the operating system itself, as we will see next time when we look at the underlying source code for thread creation. Scheduling in turn involves context switching. Multitasking is essentially the rotation of CPU time slices: the CPU handles the operations of every program, and as the user switches among them, the system must record where each program left off. A context switch is exactly this process of saving and restoring the state of the running programs so that the CPU can complete the switch. Put more simply, the CPU switches from one process or thread to another.

During a context switch, the CPU stops processing the currently running program and saves its exact position so the program can resume later. In this sense, context switching is a bit like reading several books at the same time: as we switch back and forth, we need to remember the current page number of each book. In a program, this "page number" information is stored in the process control block (PCB), also often referred to as a switchframe, which stays in memory until the process is scheduled again. A PCB is typically a contiguous area of system memory that stores everything the operating system needs to describe the state of a process and control its execution. It is what turns a program, which cannot run on its own in a multiprogramming environment, into a basic unit that can execute concurrently with other processes.

For an executing process, the program counter, registers, and current values of variables live in the CPU's registers, and those registers are available only to the process currently using the CPU. Before switching, the data of the outgoing process must be saved, so that the next time it gets the CPU it can continue from where it broke off rather than starting over from the beginning; otherwise the process would redo the same work every time it regained the CPU and might never finish, since a process almost never completes all of its work before releasing the CPU. Only then can the data of the incoming process be loaded into the CPU registers, letting it continue its remaining work from its own last breakpoint.

Context switching has both direct and indirect costs, and both affect program performance. **Direct costs:** the CPU registers that must be saved and loaded, the system scheduler code that must be executed, the TLB entries that must be reloaded, and the CPU pipeline that must be flushed. **Indirect costs:** data shared between the caches of multiple cores; how much this affects the application depends on the size of the working set each thread touches. When writing multithreaded code we should therefore keep two goals in mind: first, minimize the number of context switches; second, maximize CPU utilization.
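You can feel this cost directly from Java. Below is a rough, illustrative micro-benchmark (not a rigorous JMH benchmark; the class name and iteration count are made up for demonstration): the same amount of work is done serially, then split across two threads that hand control back and forth via wait/notify, forcing a context switch on every step. On most machines the second version is dramatically slower.

```java
public class ContextSwitchDemo {
    static final long ITERS = 100_000L;
    static final Object lock = new Object();
    static boolean turnA = true;

    public static void main(String[] args) throws InterruptedException {
        // Serial baseline: one thread, no handoffs.
        long start = System.nanoTime();
        long counter = 0;
        for (long i = 0; i < 2 * ITERS; i++) counter++;
        System.out.printf("serial:    %d ms (counter=%d)%n",
                (System.nanoTime() - start) / 1_000_000, counter);

        // Ping-pong: the threads alternate strictly, so every iteration
        // forces the scheduler to switch threads.
        start = System.nanoTime();
        Thread a = new Thread(() -> pingPong(true));
        Thread b = new Thread(() -> pingPong(false));
        a.start(); b.start();
        a.join(); b.join();
        System.out.printf("ping-pong: %d ms%n", (System.nanoTime() - start) / 1_000_000);
    }

    static void pingPong(boolean isA) {
        synchronized (lock) {
            for (long i = 0; i < ITERS; i++) {
                while (turnA != isA) {
                    try { lock.wait(); } catch (InterruptedException e) { return; }
                }
                turnA = !isA;          // hand the turn to the other thread
                lock.notifyAll();      // wake it so it can run
            }
        }
    }
}
```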

2. Memory model

Before I introduce the Java memory model, let's first look at what a computer memory model is, and then at what the Java memory model adds on top of it. Let's start with why we need a memory model at all.

We all know that when a computer executes a program, every instruction runs on the CPU, and executing instructions means working with data. That data lives in main memory, the computer's physical RAM. At first this was fine, but as CPU technology developed, CPUs became faster and faster, while memory technology changed far less. The gap between the speed of reading and writing memory and the speed of CPU execution kept widening, leaving the CPU to spend a great deal of time waiting on every memory operation. So people came up with a good idea: add a cache between the CPU and memory. The concept of a cache is familiar, keeping a copy of the data somewhere fast; its characteristics are high speed, small capacity, and high cost. Execution then becomes: while a program runs, the data it needs is copied from main memory into the CPU's cache; during computation the CPU reads and writes that cached copy directly; and when the work is done, the cached data is flushed back to main memory.

As CPU capability kept improving, a single layer of cache gradually could no longer keep up, and multi-level caches evolved. By order of access and closeness to the CPU, the cache is divided into a level-1 cache (L1), a level-2 cache (L2), and, on some higher-end CPUs, a level-3 cache (L3). Everything stored at one level is a subset of the level below it. The technical difficulty and manufacturing cost decrease from L1 down to L3, so capacity increases in the same direction. With multi-level caching, execution becomes: when the CPU reads a piece of data, it looks in the level-1 cache first; on a miss it looks in the level-2 cache, and failing that, in the level-3 cache or main memory. A single-core CPU has a single set of L1, L2, and L3 caches; on a multi-core CPU, each core has its own L1 (and often L2) cache, while the L3 (or L2) cache is shared.
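The cache hierarchy is invisible in the Java language, but you can still observe it. A rough illustration (timings vary by machine; the class name and array size are made up for demonstration): traversing a 2D array row by row walks memory sequentially and stays in cache, while column-by-column traversal jumps across cache lines and is typically several times slower.

```java
public class CacheDemo {
    public static void main(String[] args) {
        int n = 4096;
        int[][] m = new int[n][n];   // ~64 MB of ints
        long sum = 0;

        long t = System.nanoTime();
        for (int i = 0; i < n; i++)        // row-major: sequential,
            for (int j = 0; j < n; j++)    // cache-friendly access
                sum += m[i][j];
        System.out.println("row-major:    " + (System.nanoTime() - t) / 1_000_000 + " ms");

        t = System.nanoTime();
        for (int j = 0; j < n; j++)        // column-major: every access
            for (int i = 0; i < n; i++)    // lands on a different cache line
                sum += m[i][j];
        System.out.println("column-major: " + (System.nanoTime() - t) / 1_000_000
                + " ms (sum=" + sum + ")");
    }
}
```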

The analysis above leads straight to a problem: CPU cache consistency across multiple threads, which shows up as **atomicity problems, visibility problems, ordering problems, and so on. In essence, these are the side effects of CPU cache optimization.** Problems have to be solved, and the obvious instinct, rolling back the version, scrapping the processor's optimization techniques, abolishing the CPU cache, and making the CPU interact directly with main memory, is clearly not an option. Instead, the memory model was born to address the side effects of CPU cache optimization, and the Java memory model is built on top of it.
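The visibility problem in particular is easy to demonstrate in Java. A minimal sketch (class name is illustrative): without `volatile`, the reader thread may keep working from its cached copy of `running` and spin forever even after the writer updates it; the memory model, via `volatile`, is what guarantees the write becomes visible.

```java
public class VisibilityDemo {
    // Remove `volatile` and this program may never terminate,
    // depending on the JIT and the hardware.
    private static volatile boolean running = true;

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (running) {
                // busy-wait: nothing here forces a re-read from main memory
            }
            System.out.println("reader saw the update and stopped");
        });
        reader.start();

        Thread.sleep(100);
        running = false;   // volatile write: guaranteed visible to the reader
        reader.join();
    }
}
```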

3. Analysis of common thread problems

Java backend engineers often run into problems such as high CPU usage, high load, and slow response times. As Android engineers we rarely deal with concurrent requests, so we rarely dig this deeply into threads. The symptoms vary, but the essence of the problem is the same, which is why I keep saying that you need to lay a solid foundation and spend real time on the Linux kernel and system source code. Suffice it to say that the threading problems we encounter in Android can be analyzed from both the Linux kernel and the JVM's memory model.

3.1. How to use thread pools

A thread pool has many parameters: core thread count, maximum thread count, the work queue, and so on. How do you choose them in practice? It really comes down to the two points mentioned above:

  • First, minimize the number of context switches by creating as few threads as possible
  • Second, maximize CPU utilization by creating as many threads as the workload needs to keep the CPU busy

This may seem contradictory at first glance, but in real scenarios it is not. For example, we have already analyzed the OkHttp source code when covering system architecture; let's take a look at the thread pool it uses internally:

```java
public synchronized ExecutorService executorService() {
    if (executorService == null) {
        executorService = new ThreadPoolExecutor(0, Integer.MAX_VALUE,
                60, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>(),
                Util.threadFactory("OkHttp Dispatcher", false));
    }
    return executorService;
}
```
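This shape makes sense for IO-bound work: the threads spend most of their time blocked on the network, so OkHttp keeps zero core threads, allows effectively unlimited threads, hands each task straight to a thread through a `SynchronousQueue`, and lets idle threads die after 60 seconds. CPU-bound work usually calls for the opposite shape, a small fixed pool sized to the core count, so threads do not evict each other from the CPU. A minimal sketch of the contrast (class and field names are illustrative, not from the original text):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class Pools {
    // IO-bound pool, shaped like OkHttp's dispatcher: many short-lived
    // threads, since they mostly block on the network rather than compute.
    static final ExecutorService IO = new ThreadPoolExecutor(
            0, Integer.MAX_VALUE, 60, TimeUnit.SECONDS,
            new SynchronousQueue<Runnable>());

    // CPU-bound pool: more threads than cores only adds context switches,
    // so cap the pool at the number of available processors.
    static final ExecutorService CPU =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

    public static void main(String[] args) {
        IO.submit(() -> System.out.println("network call here"));
        CPU.submit(() -> System.out.println("heavy computation here"));
        IO.shutdown();
        CPU.shutdown();
    }
}
```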

3.2. Differences between synchronized and Lock

The underlying implementation of synchronized was analyzed earlier, so I won't repeat it here; the Lock source code is something you should read for yourself, and there are plenty of articles online to help you understand it. Here is just one difference you may not have noticed: when a thread fails to acquire a synchronized lock, it can be suspended, which causes a context switch; this is why you should never take a lock where there is no real thread-safety concern. Lock, by contrast, is ultimately built on spinning while waiting for the shared value in main memory to change. It may seem like Lock is the better option for reducing context switches, but that is not entirely true.
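A minimal sketch of the two styles (the counter workload is made up for illustration): the monitor version may park the losing thread, costing a context switch, while a CAS loop like the one inside `AtomicInteger` keeps the thread on the CPU, re-reading the shared value until its update succeeds, which burns CPU while it spins. That trade-off is why neither style always wins.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LockStyles {
    private int counter;                                   // guarded by `this`
    private final AtomicInteger casCounter = new AtomicInteger();

    // Monitor-based: under contention the losing thread is eventually
    // parked by the OS, which costs a context switch.
    public synchronized void incrementWithMonitor() {
        counter++;
    }

    // CAS-based: the thread stays runnable and retries until its
    // compare-and-swap wins; no context switch, but it spins.
    public void incrementWithCas() {
        int current;
        do {
            current = casCounter.get();
        } while (!casCounter.compareAndSet(current, current + 1));
    }

    public static void main(String[] args) {
        LockStyles s = new LockStyles();
        s.incrementWithMonitor();
        s.incrementWithCas();
        System.out.println(s.counter + " / " + s.casCounter.get());
    }
}
```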

Video link: pan.baidu.com/s/1pZA2udae… Video password: 87uh