Go was born for concurrency: it is one of the few languages that support concurrency at the language level, and it is this concurrency story that has attracted countless developers around the world.
Concurrency and Parallelism
Concurrency: two or more tasks are in progress during the same period of time. We don’t care whether they run simultaneously at any single instant; we only care that two or more tasks are all making progress within some window of time, even a short one (a second or two).
Parallelism: two or more tasks execute simultaneously at the same instant.
Concurrency is a logical concept, whereas parallelism is a physical state of operation. Concurrency “includes” parallelism.
(See Rob Pike’s powerpoint for details)
CSP concurrency model for Go
Go implements two forms of concurrency. The first is the common one: multiple threads sharing memory, i.e. multithreaded development as in Java or C++. The other is the one Go is known for and recommends: the CSP (Communicating Sequential Processes) concurrency model.
The CSP concurrency model was proposed back in the 1970s, so it is not a new concept. Unlike the traditional model of threads communicating through shared memory, CSP emphasizes “sharing memory by communicating”.
“Don’t communicate by sharing memory. Instead, share memory by communicating.”
Common threaded concurrency models, as in Java, C++, or Python, use shared memory for communication between threads. Typically, access to shared data (such as arrays, maps, structures, or objects) is guarded by locks. From this, a convenient family of “thread-safe data structures” is derived, such as those in Java’s java.util.concurrent package. Go also supports this traditional threaded concurrency model.
The CSP concurrency model of Go is implemented through Goroutine and Channel.
goroutine
The unit of concurrent execution in Go. A bit abstract, but similar to the traditional concept of a “thread”; it can be understood as a lightweight thread.
channel
The communication mechanism between goroutines in Go. In layman’s terms, each channel is a “pipe” through which goroutines talk to each other, somewhat like a pipe in Linux.
The way to start a goroutine is very simple: prefix a function call with the go keyword, and it is created.
go f()
The communication mechanism, the channel, is also very convenient: use channel <- data to send data and <-channel to receive it.
During communication, the send (channel <- data) and the receive (<-channel) must be paired, because the two goroutines are communicating with each other. Whichever goroutine sends or receives first blocks until another goroutine performs the matching receive or send.
Suppose there are two goroutines, and one of them initiates a send operation on a channel. (In the figures, a goroutine is a rectangle and the channel is an arrow.)
The goroutine on the left starts blocking, waiting for someone to receive it.
At this point, the Goroutine on the right initiates the receive operation.
The goroutine on the right is also blocking, waiting for someone to send it.
At this point, both goroutines found each other, and the two goroutines began to pass and receive.
This is the most basic form of the Golang CSP concurrency model.
The implementation principle of Go concurrency model
Let’s start with threads. Whatever the language-level concurrency model, at the operating-system level it must ultimately run as threads. Based on access permissions to resources, the operating system is divided into user space and kernel space. Kernel space manages hardware resources such as the CPU, I/O, and memory, providing the most fundamental services to applications; user space is where applications run, and it cannot access those resources directly. To use resources provided by kernel space, a program must go through system calls, typically wrapped by library functions or shell commands.
Today’s programming languages can, narrowly speaking, be considered “software”. The “threads” they expose are often user-mode threads, which may differ from the kernel-mode threads (kernel scheduling entities, or KSEs) of the operating system itself.
The implementation of thread model can be divided into the following ways:
User-level threading model
As shown in the figure, multiple user-mode threads map onto a single kernel thread, and thread creation, termination, switching, and synchronization must all be handled by the program itself in user space.
Kernel-level threading model
This model maps threads directly onto kernel threads of the operating system; the creation, termination, switching, and synchronization of all threads are performed by the kernel. C++ threads work this way.
Two-level threading model
This model sits between the user-level and kernel-level thread models, and its implementation is complex. As in the kernel-level model, a process can correspond to multiple kernel-level threads, but the threads in the process are not mapped one-to-one to kernel threads. Instead, the program first creates several kernel-level threads and then maps its own user-level threads onto them: the user-level threads are scheduled by the program itself, while the kernel-level threads are scheduled by the operating-system kernel.
The thread model of Go is a special two-level thread model, commonly called the “MPG” model.
Go thread implements model MPG
M stands for Machine, and an M is directly associated with a kernel thread. P stands for “processor”, which represents the context required by M and the processor that handles user-level code logic. G stands for Goroutine, which is essentially a lightweight thread.
The relationship among the three is shown in the figure below:
The diagram above shows the case of two kernel threads. Each M corresponds to one kernel thread, and each M is also attached to a context P, which acts as a “processor” and is connected to one or more goroutines. The number of Ps is set at startup from the environment variable GOMAXPROCS, or at runtime by calling runtime.GOMAXPROCS(). A fixed number of processors means that only that many threads are running Go code at any one time. The goroutines (G) are the code we want to execute concurrently. The goroutine a P is currently executing is shown in blue; goroutines waiting in the queue are grey, and together the grey goroutines form the P’s runqueue.
A macroscopic picture of the relationship between the three:
P (Processor)
You may be wondering why the context is necessary: can’t we simply remove it and hang the goroutine runqueues directly off M? The answer is no. The purpose of the context is to let us hand the remaining goroutines over to another thread when the kernel thread blocks.
A very simple example is a system call (syscall): a thread cannot both execute code and be blocked in a system call at the same time. In this case, the thread M needs to give up its current context P so that other goroutines can be scheduled to run.
As you can see in the left image above, G0 on M0 performs a syscall; M0 then creates (or wakes) an M1 and gives up P while it waits for the syscall to return. M1 takes over P and continues executing the other goroutines in its runqueue.
When the syscall returns, M0 tries to “steal” a context. If that fails, M0 puts its goroutine G0 onto the global runqueue and either returns itself to the thread pool or goes to sleep. The global runqueue is where each P pulls new goroutines after exhausting its own local runqueue. Each P also periodically checks the global runqueue; otherwise, goroutines on it might never be executed and would starve.
Divide the work evenly
As stated above, each context P periodically checks the global goroutine queue so that it still has work after consuming its own queue. But what if the global queue is empty too? Then it steals from the runqueue of another running P.
The goroutines in each P differ, so Ps take different amounts of time to drain their queues. In an environment with many Ps and Ms, one P should not sit idle after running out of its own goroutines while other Ps still have long queues of goroutines waiting to run; the load needs to be balanced. How is this solved?
Go’s approach is straightforward: steal half of another P’s runqueue!
Reference: The Go Scheduler, First Edition of Go Concurrent Programming