😋 My name is Ping Ye. “Go Home” is an open source project focused on the technical growth of Gophers.
Introduction
Many of you have heard that Go naturally supports high concurrency because of its built-in goroutines, which let a single process launch thousands of coroutines. So how does it achieve such high concurrency? To answer that, you first need to understand what a concurrency model is.
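To make the "thousands of coroutines in a single process" claim concrete, here is a minimal sketch. The helper name runMany and the count of 10000 are assumptions chosen for illustration; they do not come from the original article.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runMany (a hypothetical helper) launches n goroutines, each incrementing
// a shared counter once, and returns the counter after all have finished.
func runMany(n int) int64 {
	var wg sync.WaitGroup
	var counter int64
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&counter, 1) // atomic, so concurrent increments are safe
		}()
	}
	wg.Wait() // block until every goroutine has called Done
	return atomic.LoadInt64(&counter)
}

func main() {
	fmt.Println(runMany(10000)) // prints 10000
}
```

The sync.WaitGroup is only there so that main does not exit before the goroutines have run.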
Concurrency model
Herb Sutter, a noted C++ expert, once said that “the free lunch is over.” To make code run faster, faster hardware alone is no longer enough. We need multiple cores to exploit the value of parallelism, and the purpose of the concurrency model is to show you how different executing entities work together.
Of course, different concurrency models cooperate in different ways. There are seven common concurrency models:
- Threads and locks
- Functional programming
- The Clojure way
- Actors
- Communicating Sequential Processes (CSP)
- Data parallelism
- The Lambda architecture
Today we will only discuss CSP, the concurrency model related to Go. If you are interested in the others, see the book “Seven Concurrency Models in Seven Weeks”.
CSP
CSP, short for Communicating Sequential Processes, is one of the seven concurrency models. Its core idea is to connect two concurrent entities through a channel, and all messages are passed through that channel. The concept of CSP was first proposed by Tony Hoare in 1978, and it has recently gained popularity thanks to the rise of Go.
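The idea of two concurrent entities connected by a channel can be sketched in a few lines of Go. The function names produce, sum, and pipeline are made up for this illustration.

```go
package main

import "fmt"

// produce sends the integers 1..n on out, then closes the channel
// to signal that no more messages will follow.
func produce(n int, out chan<- int) {
	for i := 1; i <= n; i++ {
		out <- i
	}
	close(out)
}

// sum receives every message from in and returns their total.
func sum(in <-chan int) int {
	total := 0
	for v := range in {
		total += v
	}
	return total
}

// pipeline (hypothetical name) wires a producer goroutine to a consumer
// through a single channel; the two share no other state.
func pipeline(n int) int {
	ch := make(chan int) // unbuffered: each send waits for a matching receive
	go produce(n, ch)
	return sum(ch)
}

func main() {
	fmt.Println(pipeline(100)) // prints 5050
}
```

The producer and consumer never touch each other's variables; the channel is the only link between them, which is the point of the CSP style.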
So how does CSP relate to Go? Next, let’s look at how Go implements the CSP concurrency model: the GPM scheduling model.
GPM scheduling model
GPM represents three roles: Goroutine, Processor, and Machine.
- Goroutine: the execution unit we create with the go keyword. It corresponds to the struct g, which holds the goroutine's stack information
- Machine: an operating system thread
- Processor: the processor that connects G and M
Goroutine
A goroutine is the execution unit created with the go keyword in code. It is often called a “lightweight thread”, that is, a coroutine. The operating system does not know about coroutines; they are implemented at the programming-language level, so a context switch does not have to go through kernel mode. Coroutines also take up very little memory, so they have great potential.
go func(a int) {}(1)
In Go, a goroutine is represented by a very complex struct called runtime.g, which has more than 40 fields and stores the execution stack, the state, the thread it currently occupies, and scheduling-related data. It also holds the goroutine ID that you would really like to get at, but I am sorry to say that the Go team keeps it private for the sake of the language's evolution and will not hand it to you 😏.
type g struct {
    stack struct {
        lo uintptr
        hi uintptr
    } // stack memory: [stack.lo, stack.hi)
    stackguard0 uintptr
    stackguard1 uintptr
    _panic *_panic
    _defer *_defer
    m *m // the current m
    sched gobuf
    stktopsp uintptr // expected sp at the top of the stack, checked in tracebacks
    param unsafe.Pointer // argument passed when the goroutine is woken up
    atomicstatus uint32
    goid int64
    preempt bool // preemption signal; duplicates stackguard0 = stackpreempt
    timer *timer // cached timer for time.Sleep
}
A goroutine's scheduling-related data is stored in the sched field, which is used to save and restore the context when the coroutine is switched out and back in.
type gobuf struct {
    sp uintptr
    pc uintptr
    g guintptr
    ret sys.Uintreg
    ...
}
Machine
By default, GOMAXPROCS is set to the number of cores. On a four-core machine, for example, four threads are created by default, each represented by a runtime.m struct. The number of threads matches the number of CPUs so that each thread can stay on its own CPU without thread context switches, which keeps system overhead to a minimum.
type m struct {
    g0 *g
    curg *g
    ...
}
Two fields in m are especially important: g0 and curg.
- g0: deeply involved in runtime scheduling, such as goroutine creation and memory allocation
- curg: the goroutine currently executing on the thread
As mentioned earlier, P is responsible for connecting M and G, so m also stores data related to P.
type m struct {
    ...
    p puintptr
    nextp puintptr
    oldp puintptr
}
- p: the processor currently running code
- nextp: the temporarily stashed processor
- oldp: the processor the thread was using before the system call
Processor
The Processor is responsible for connecting the Machine and the Goroutine. It provides the context a thread needs, and it assigns each G to the thread that should execute it. With the Processor, every G gets scheduled fairly and no thread gets to slack off. It is truly a household essential.
Similarly, the number of processors is set by default to GOMAXPROCS, which corresponds to the number of threads.
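The GOMAXPROCS value mentioned above can be inspected from a program. This is a small sketch using runtime.NumCPU and runtime.GOMAXPROCS from the standard library; the helper names currentProcs and withProcs are assumptions introduced for illustration. Note that the two printed numbers can differ, for example when the GOMAXPROCS environment variable is set.

```go
package main

import (
	"fmt"
	"runtime"
)

// currentProcs reports the current GOMAXPROCS setting without changing it.
// Passing 0 to runtime.GOMAXPROCS only queries the value.
func currentProcs() int {
	return runtime.GOMAXPROCS(0)
}

// withProcs (hypothetical helper) sets GOMAXPROCS to n, reports the value
// seen while it is in effect, and restores the previous setting on return.
func withProcs(n int) int {
	prev := runtime.GOMAXPROCS(n) // returns the previous setting
	defer runtime.GOMAXPROCS(prev)
	return currentProcs()
}

func main() {
	fmt.Println("logical CPUs:", runtime.NumCPU())
	fmt.Println("GOMAXPROCS:", currentProcs())
	fmt.Println("while set to 2:", withProcs(2))
}
```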
type p struct {
    m muintptr
    runqhead uint32
    runqtail uint32
    runq [256]guintptr
    runnext guintptr
    ...
}
The p struct stores fields related to performance tracing, garbage collection, timers, and so on. It also stores the processor’s run queue, which holds the goroutines waiting to be executed.
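The runqhead, runqtail, and runq fields suggest a fixed-size ring buffer. The following is a simplified, single-threaded sketch of that idea, not the runtime's actual code: the type localQueue and its put and get methods are hypothetical, goroutines are stood in for by plain int IDs, and the real queue has to be safe for concurrent access, which this one is not.

```go
package main

import "fmt"

const queueSize = 256 // same capacity as the runq array in p

// localQueue is a toy version of a processor's private run queue.
type localQueue struct {
	head uint32 // counts items taken so far
	tail uint32 // counts items put so far
	buf  [queueSize]int
}

// put appends id and reports false when the queue is already full.
func (q *localQueue) put(id int) bool {
	if q.tail-q.head == queueSize {
		return false
	}
	q.buf[q.tail%queueSize] = id // wrap around the fixed array
	q.tail++
	return true
}

// get removes and returns the oldest id, or false when the queue is empty.
func (q *localQueue) get() (int, bool) {
	if q.head == q.tail {
		return 0, false
	}
	id := q.buf[q.head%queueSize]
	q.head++
	return id, true
}

func main() {
	var q localQueue
	accepted := 0
	for id := 0; id < 300; id++ {
		if q.put(id) {
			accepted++
		}
	}
	first, _ := q.get()
	fmt.Println(accepted, first) // prints 256 0
}
```

Because head and tail only ever grow and are reduced modulo the array size on access, the difference tail-head is always the number of queued items.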
The relationship between the three
First, on a four-core machine, four threads and four processors are started by default and bound to each other.
When a goroutine is created, a g struct is set up with the function's entry address, the start address and length of its arguments, and its scheduling properties. It is then placed in a processor's queue to wait to be scheduled.
What if another G is created? Then it is placed on the other Ps in turn. If you had taken a number and were waiting in line, you would also walk over to a counter with nobody queuing at it.
What if there are so many Gs that every queue fills up? Then a new G is not put in a processor’s private queue but in the global queue (the waiting hall).
While Gs are being put in, the Ms are busy taking them out. An M first takes a G from its processor’s private queue. If that is empty, it takes one from the global queue. If the global queue is empty too, it steals from another processor’s queue. So hungry, what a fiend!
What if it can’t find a G to execute anywhere? Then M is so disappointed that it disconnects from P and goes to sleep (idle).
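The put-and-fetch rules described above can be sketched as a toy model. Everything here is an assumption made for illustration: the names sched, enqueue, next, and pop do not exist in the Go runtime, goroutines are plain int IDs, and the model is single-threaded. The real scheduler is also more refined, for example it steals half of a victim's queue rather than one G at a time.

```go
package main

import "fmt"

// sched is a toy model of the queues: none of these names are runtime names.
type sched struct {
	local  [][]int // local[i] is the private queue of processor i
	global []int   // the shared global queue
	limit  int     // capacity of each private queue
}

// enqueue puts id on processor p's private queue, or on the global
// queue when the private one is full.
func (s *sched) enqueue(p, id int) {
	if len(s.local[p]) < s.limit {
		s.local[p] = append(s.local[p], id)
		return
	}
	s.global = append(s.global, id)
}

// pop removes and returns the first element of *q, if any.
func pop(q *[]int) (int, bool) {
	if len(*q) == 0 {
		return 0, false
	}
	id := (*q)[0]
	*q = (*q)[1:]
	return id, true
}

// next finds work for processor p: private queue first, then the global
// queue, then another processor's queue. It reports where the G came from.
func (s *sched) next(p int) (int, string, bool) {
	if id, ok := pop(&s.local[p]); ok {
		return id, "local", true
	}
	if id, ok := pop(&s.global); ok {
		return id, "global", true
	}
	for other := range s.local {
		if other == p {
			continue
		}
		if id, ok := pop(&s.local[other]); ok {
			return id, "stolen", true
		}
	}
	return 0, "", false // nothing anywhere: the M would go idle
}

func main() {
	s := &sched{local: make([][]int, 2), limit: 2}
	for id := 1; id <= 3; id++ {
		s.enqueue(0, id) // 1 and 2 fit on P0, 3 overflows to the global queue
	}
	s.enqueue(1, 4)
	for {
		id, from, ok := s.next(1)
		if !ok {
			fmt.Println("idle")
			break
		}
		fmt.Println(id, from)
	}
}
```

Running it for processor 1 prints "4 local", "3 global", "1 stolen", "2 stolen", and finally "idle", matching the order described in the text.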
What if two goroutines are doing something affectionate over a channel and get blocked? Is M going to wait until they’re done? Obviously not. M doesn’t care about the Go couple; it turns around and finds another G to run.
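A small experiment can make this visible. The sketch below limits the program to a single P, so if a goroutine blocked on a channel also blocked its thread's ability to run other goroutines, nothing else could make progress. The function name blockedThenWoken is hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
)

// blockedThenWoken runs with a single P. The receiver goroutine blocks on
// an empty channel; the sender still gets to run and unblocks it.
func blockedThenWoken() []string {
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev) // restore the original setting

	ch := make(chan string)
	done := make(chan []string)

	go func() {
		events := []string{"receiver: waiting"}
		msg := <-ch // parks this goroutine until a value arrives
		events = append(events, "receiver: got "+msg)
		done <- events
	}()

	go func() {
		ch <- "hello" // runs even if the receiver is already parked
	}()

	return <-done
}

func main() {
	for _, line := range blockedThenWoken() {
		fmt.Println(line)
	}
}
```

The program finishes instead of hanging, which shows that a goroutine waiting on a channel gives way to other goroutines rather than holding everything up.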
System calls
If a G makes a system call, its M enters the syscall state too, and the P bound to it would be wasted. The neat part is that P does not wait around for G and M to finish the system call; instead it finds another M to execute other Gs.
When G finishes the system call and needs to continue executing, it has to find a free processor again.
If there are no free processors, G is put back on the global queue for allocation.
sysmon
sysmon is our cleaning lady. It is an M, also called the monitor thread, and it runs on its own without needing a P. It wakes up every 20 µs to 10 ms to tidy up. Its main jobs are forcing garbage collection when needed, taking back Ps that have been blocked in system calls for a long time, and sending preemption requests to Gs that have been running for a long time.
Glossary
Tony Hoare
Tony Hoare is a British computer scientist and Turing Award winner who designed the quicksort algorithm, Hoare logic, and the CSP model. He was awarded the IEEE John von Neumann Medal in 2011.
Thank you for reading. If you found this article helpful, you are welcome to follow the official account “Ping Ye”, which focuses on the Go language and the principles behind the technology.