A process is the smallest unit of resource allocation and a thread is the smallest unit of CPU scheduling
CPU
Threads are the basic unit of CPU scheduling. Can a CPU distinguish between threads?
- The reality is that the CPU does not know the concepts of threads, processes, etc.
The CPU only knows two things:
- 1. Fetch an instruction from memory
- 2. Execute the instruction, then go back to step 1
How do I get the instructions?
- From the PC (program counter) register, which holds the memory address of the next instruction to execute.
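The fetch-and-execute cycle can be sketched as a toy interpreter. The three-instruction set below (`OP_ADD`, `OP_MUL`, `OP_HALT`) and the `run` function are invented for illustration and do not correspond to any real CPU; the point is only that the loop knows nothing beyond "read what `pc` points at, execute it, repeat".

```c
#include <stddef.h>

/* Hypothetical toy instruction set, invented for this sketch. */
enum opcode { OP_ADD, OP_MUL, OP_HALT };

struct instruction {
    enum opcode op;
    int operand;
};

/* Run a program held in memory, starting with accumulator value acc.
   Returns the accumulator when OP_HALT is reached. */
int run(const struct instruction *memory, int acc)
{
    size_t pc = 0;                              /* toy "PC register": index of the next instruction */
    for (;;) {
        struct instruction ins = memory[pc++];  /* 1. fetch, then advance pc */
        switch (ins.op) {                       /* 2. execute */
        case OP_ADD:  acc += ins.operand; break;
        case OP_MUL:  acc *= ins.operand; break;
        case OP_HALT: return acc;
        }
    }
}
```

For instance, running the program `{OP_ADD, 2}, {OP_MUL, 10}, {OP_HALT, 0}` with an initial accumulator of 1 returns 30. Nothing in `run` refers to threads or processes.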
Where do the instructions come from?
- The program or the function we write is compiled to form instructions.
How do we get the CPU to execute a function?
- Obviously we only need to find the first instruction that the function is compiled into, which is the function entry.
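In C, the closest source-level view of a function entry is a function pointer. The sketch below uses two illustrative functions, `square` and `call_entry`, which are not from the original text: calling through the pointer makes execution jump to the first instruction the function was compiled into.

```c
/* An ordinary function. After compilation it is a run of machine
   instructions, and the address of the first one is its entry. */
int square(int x)
{
    return x * x;
}

/* Call whatever function the given entry address points to. */
int call_entry(int (*entry)(int), int arg)
{
    return entry(arg);   /* execution jumps to the entry address */
}
```

`call_entry(square, 7)` returns 49. The caller never needs to know which function it is running, only where that function begins.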
What does the CPU have to do with threads?
Operating system and processes
From the conclusion above, if we want the CPU to execute a function, all we need to do is load the address of the function's first machine instruction into the PC register.
We need to:
- Find a memory region of the right size and load the program into it.
- Locate the function entry and set the CPU's PC register to it so the program starts executing.
Neither step is easy, and a programmer who had to do them by hand every time he ran a program would go crazy. A smart programmer would write a program to automate them: the operating system.
Machine instructions must be loaded into memory before they can run, so the start address and length of that memory region need to be recorded. The entry address of the function must also be found and written to the PC register. All of this calls for a data structure to hold the information, roughly as follows:
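struct *** {
    void* start_addr;   // start of the memory region holding the instructions
    int len;            // length of that region
    void* start_point;  // entry address of the function
    ...
};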
- This data structure needs a name. What does it record? The state of a program after it has been loaded from disk into memory. We call it a process.
The first function the CPU executes also gets a name: the main function.
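As a rough sketch of the two steps, the record above can be given the name `process` and filled in by a loader. `load_program` and `unload_program` are hypothetical helpers written for this illustration, and `malloc` stands in for whatever the operating system really does to find memory; a real loader does far more (mapping pages, setting permissions, and so on).

```c
#include <stdlib.h>
#include <string.h>

struct process {
    void *start_addr;   /* start of the region holding the program image */
    int len;            /* size of that region in bytes */
    void *start_point;  /* address of the entry instruction inside the region */
};

/* Hypothetical loader: copy an image into fresh memory and record where it
   lives. entry_offset is the byte offset of the entry instruction within the
   image. Returns 0 on success, -1 on failure. */
int load_program(struct process *p, const unsigned char *image,
                 int len, int entry_offset)
{
    if (len <= 0 || entry_offset < 0 || entry_offset >= len)
        return -1;
    void *region = malloc((size_t)len);      /* step 1: find a region of the right size */
    if (region == NULL)
        return -1;
    memcpy(region, image, (size_t)len);      /* load the program into it */
    p->start_addr = region;
    p->len = len;
    p->start_point = (unsigned char *)region + entry_offset;  /* step 2: locate the entry */
    return 0;
}

/* Release the region and clear the record. */
void unload_program(struct process *p)
{
    free(p->start_addr);
    p->start_addr = NULL;
    p->start_point = NULL;
    p->len = 0;
}
```

The sketch stops short of actually jumping to `start_point`; on a real system, that last step is the operating system setting the PC register.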
Did we forget about threads?
Multicore era
How do you use multicore?
Start multiple processes? However, there are several major problems:
- If multiple processes are based on the same executable, the contents of the memory area of these processes are almost identical, which obviously causes memory waste.
- The tasks a computer handles can be complex enough to require communication between processes. Because each process lives in its own memory address space, inter-process communication has to go through the operating system, which makes programming harder and adds system overhead.
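Both points can be seen in a small POSIX sketch. The function name `ask_child` and the global `counter` are illustrative. After `fork()`, the child has its own copy of `counter`, so the only way for it to tell the parent anything is through an operating system facility, here a pipe.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 0;   /* each process gets its own copy after fork() */

/* Fork a child that sets its copy of counter and reports it through a pipe.
   On success, stores the value received from the child in *from_child and
   the parent's own counter in *parent_copy, and returns 0; -1 on error. */
int ask_child(int *from_child, int *parent_copy)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {                        /* child process */
        close(fds[0]);
        counter = 42;                      /* changes the child's memory only */
        ssize_t w = write(fds[1], &counter, sizeof counter);
        _exit(w == (ssize_t)sizeof counter ? 0 : 1);
    }

    close(fds[1]);                         /* parent process */
    ssize_t n = read(fds[0], from_child, sizeof *from_child);
    close(fds[0]);
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    *parent_copy = counter;                /* still 0: the child's write was not shared */
    return n == (ssize_t)sizeof *from_child ? 0 : -1;
}
```

After a successful call, `*from_child` is 42 while `*parent_copy` is still 0. Every byte that crosses between the two processes goes through `write` and `read`, which are system calls.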
Process to thread
Let’s think about this carefully. A process is a region of memory divided into sections, which hold the CPU's machine instructions and the run-time stack of the executing functions. To get the process running, we write the address of the main function's first machine instruction into the PC register, and the process is up and running.
The disadvantage of a process is that it has only one entry function, the main function, so its machine instructions can only be executed by one CPU. Is there a way to have multiple CPUs execute machine instructions in the same process?
If we can write the address of main’s first instruction to a PC register, is there any real difference between main and other functions?
The answer is no. The main function is special only because it is the first function the CPU executes, nothing else. We can point the PC register to main, and we can just as well point it to any other function.
When we point the PC register to a function other than main, a thread is born.
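With POSIX threads, this idea shows up directly in the API: `pthread_create` takes the address of a function and starts a new flow of execution there. The names `worker` and `double_in_thread` are illustrative; the sketch assumes a POSIX system and may need the `-pthread` compiler flag.

```c
#include <pthread.h>

/* Any function with this signature can serve as a thread's entry. */
static void *worker(void *arg)
{
    int *slot = arg;
    *slot = *slot * 2;      /* writes to memory shared with the creating thread */
    return NULL;
}

/* Start a second flow of execution at worker() and wait for it to finish.
   Returns the doubled value, or -1 if the thread could not be created
   or joined. */
int double_in_thread(int value)
{
    pthread_t tid;
    int slot = value;
    if (pthread_create(&tid, NULL, worker, &slot) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return slot;
}
```

`double_in_thread(21)` returns 42. The new thread wrote straight into a variable owned by its creator, with no pipe and no copying, because both run inside the same process and share its memory.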
Remember that threads are the basic unit of CPU scheduling?
Afterword
Finally, an analogy from a well-known Zhihu user captures it well: process = train, thread = train car
- Threads run inside a process (a car cannot run on its own)
- A process can contain more than one thread (a train can have more than one car)
- It is difficult to share data between different processes (passengers on one train cannot easily move to another train; they need something like a station transfer)
- Data is easily shared between different threads in the same process (it is easy to switch from car A to car B)
- Processes consume more computer resources than threads (adding another train costs more than adding another car)
- Processes do not affect each other, but the failure of one thread brings down its entire process (one train does not affect another, but if the middle car of a train catches fire, every car on that train is affected)
- Processes can be spread across multiple machines, while threads can be spread across multiple cores at most (different trains can run on different tracks, but cars of the same train cannot)
- A memory address used by a process can be locked: while one thread is using some shared memory, other threads must wait until it is done before they can use it (like the bathroom on a train). This is a "mutex".
- Access to memory used by a process can be capped at a fixed number of users at a time (like the dining car on a train: only so many people are allowed in, and when it is full you wait at the door until someone comes out). This is a "semaphore".
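The last two points can be sketched with POSIX primitives. All names here (`passenger`, `run_demo`, `THREADS`, `ROUNDS`, `SEATS`) are illustrative, and the sketch assumes a system such as Linux that supports unnamed semaphores via `sem_init`. A mutex lets one thread at a time update a shared counter, and a semaphore initialised to `SEATS` lets at most that many threads into the "dining car" at once.

```c
#include <pthread.h>
#include <semaphore.h>

#define THREADS 4
#define ROUNDS  10000
#define SEATS   2

static long counter;                    /* shared by all threads in the process */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t seats;                     /* counts free seats in the "dining car" */
static int inside, max_inside;          /* both guarded by lock */

static void *passenger(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        pthread_mutex_lock(&lock);      /* mutex: one thread at a time */
        counter++;
        pthread_mutex_unlock(&lock);
    }

    sem_wait(&seats);                   /* semaphore: wait at the door if full */
    pthread_mutex_lock(&lock);
    if (++inside > max_inside)
        max_inside = inside;            /* record the peak occupancy */
    pthread_mutex_unlock(&lock);

    pthread_mutex_lock(&lock);
    inside--;
    pthread_mutex_unlock(&lock);
    sem_post(&seats);                   /* leave, freeing a seat */
    return NULL;
}

/* Run the demo. Stores the final counter and the largest number of threads
   seen inside at once; returns 0 on success, -1 on error. */
int run_demo(long *final_counter, int *peak_inside)
{
    pthread_t tids[THREADS];
    int started = 0;

    counter = 0;
    inside = 0;
    max_inside = 0;
    if (sem_init(&seats, 0, SEATS) != 0)
        return -1;
    for (; started < THREADS; started++)
        if (pthread_create(&tids[started], NULL, passenger, NULL) != 0)
            break;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    sem_destroy(&seats);
    if (started != THREADS)
        return -1;
    *final_counter = counter;
    *peak_inside = max_inside;
    return 0;
}
```

With the mutex in place, the final counter is always `THREADS * ROUNDS` (40000 here), and the peak occupancy never exceeds `SEATS`. Removing the lock around `counter++` would make the total unpredictable, since the increments from different threads could then overwrite each other.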