Concurrency is one of the highlights of the Go language. In this article, we take a look at how to use Go to write better concurrent programs.

We all know that the CPU is the computing and control core of a computer, carrying out all of its computational tasks. Over the last half century, the rapid development of semiconductor technology has kept the number of transistors in integrated circuits growing, which in turn has greatly improved CPU performance. The famous Moore's Law, "the number of transistors on an integrated circuit chip doubles every 18 months," describes this trend.

Denser transistors improved CPU performance but also drove up heat and cost per chip, and as material technology approached its physical limits, the growth in transistor counts slowed. In other words, programs can no longer rely on hardware improvements alone to run faster. With the advent of multi-core CPUs, another way to speed up a program emerged: divide its work into parts that can be executed concurrently, run them simultaneously on different CPU cores, and finally combine the partial results into the final result.

Parallelism and concurrency are common concepts in the execution of computer programs. They differ as follows:

  • Parallelism means that two or more programs execute at the same instant;
  • Concurrency means that two or more programs execute within the same time period.

When a program executes in parallel, more than one program is running on the CPU at the same time, whether viewed from a macro or micro perspective. This requires a multi-core CPU, with the programs assigned to different cores for simultaneous execution.

Concurrent programs, on the other hand, only need to appear to execute simultaneously from a macro perspective. Even a single-core CPU can allocate time slices to multiple programs through time-sharing, rotating them through the CPU quickly enough to simulate simultaneous execution at the macro level. From a micro point of view, however, these programs actually execute sequentially on the CPU.

MPG thread model for Go

Go is considered a high-performance concurrency language thanks to its native support for coroutines. Let's begin by understanding the relationships and differences between processes, threads, and coroutines.

In a multi-programming system, a process is a dynamic execution of a program over a certain data set. It is the basic unit of resource allocation and scheduling in the operating system and the carrier in which an application runs.

A thread is a single sequential flow of control within a process and the basic unit of CPU scheduling and dispatch. It is a smaller unit of independent execution than a process: a process can have one or more threads, which share the resources held by the process and are scheduled on the CPU to jointly carry out the process's work.

In Linux, the operating system divides memory into kernel space and user space according to access permissions. Code in kernel space can directly access the underlying resources of the computer, such as CPU and I/O resources, and exposes that access to code in user space. User space is where application programs run; it cannot access the underlying resources directly and must instead go through system calls and library functions to use the resources provided by the kernel.

Similarly, threads can be divided into kernel threads and user threads. Kernel threads are managed and scheduled by the operating system as kernel scheduling entities. They can operate directly on the underlying resources of the computer and take full advantage of multi-core parallelism, but switching between them requires the CPU to switch into kernel mode, which carries a certain overhead. User threads are created, managed, and scheduled by user-space code and are invisible to the operating system. Their data lives in user space, so switching between them requires no transition into kernel mode; the switching cost is small, and the number of threads that can be created is in theory limited only by the amount of memory.

Coroutines are lightweight user threads. Their scheduling is controlled entirely by user-space code: each coroutine has its own register context and stack, stored in user space, and switching between coroutines requires no transition into kernel mode, so it is very fast. This, however, poses significant technical challenges for the developer, who must handle saving and restoring context during coroutine switches in user space, as well as managing stack sizes.

Go is one of the few languages that implements coroutine concurrency at the language level, using a special two-level threading model: the MPG threading model (shown below).

MPG thread model

  • M, for machine, is the Go process's representation of a kernel thread. It maps one-to-one to a kernel thread and represents the resource that actually performs computation. During its lifetime, an M is associated with exactly one kernel thread.
  • P, for processor, represents the context in which Go code executes. The combination of an M and a P provides an effective runtime environment for Gs, and the pairing between them is not fixed. The maximum number of Ps determines the degree of concurrency of a Go program and is set by the runtime.GOMAXPROCS function.
  • G, for Goroutine, is a lightweight user thread that encapsulates a piece of code along with the stack, state, and other information it needs at execution time.

During actual execution, an M and a P together provide a running environment for Gs (as shown in the figure below), and runnable Gs are queued in order on P's runnable G queue, waiting to be scheduled and executed. When an M blocks on an I/O system call made by some G, its P disconnects from it and acquires another M from the scheduler's idle M list, or creates a new one to pair with, ensuring that the other Gs in P's runnable queue can still execute. Because the number of Ms executing in parallel stays the same, the program maintains high CPU utilization.

Diagram of the combination of M and P

When the system call in G returns, M tries to capture a P context for G; if it fails, G is placed on the global runnable G queue to wait for another P. Newly created Gs are also placed on the global runnable G queue, waiting for the scheduler to dispatch them to a suitable P's runnable queue. When an M and a P are paired, the M takes Gs from P's runnable queue without locking. When P's runnable queue is empty, P takes a G from the global runnable queue under a lock, and when the global queue is empty too, P tries to "steal" Gs from the runnable queues of other Ps.

Goroutine and channel

Because of shared resources and race conditions, a concurrency model is needed to coordinate the execution of tasks across different threads. Go advocates the CSP concurrency model for coordinating work between threads; CSP advocates sharing memory between threads by communicating, rather than communicating by sharing memory.

Go implements the CSP concurrency model with Goroutine and channel:

  • Goroutines, the concurrent entities in Go, are lightweight user threads that send and receive messages;
  • Channels are the conduits Goroutines use to send and receive those messages.

The CSP concurrency model is similar to a common synchronous queue. It focuses on the way messages are transmitted and decouples the Goroutine that sends a message from the Goroutine that receives it. Channels can be created and accessed independently and used across different Goroutines.

Use the go keyword to run a piece of code concurrently in a goroutine, in the following form:

go expression

Channel, as a reference type, needs to specify the transmission data type. The declaration form is as follows:

var name chan T     // bidirectional channel
var name chan<- T   // channel that can only send messages
var name <-chan T   // channel that can only receive messages

Here, T is the type of data the channel transmits. A channel acts as a queue, following first-in, first-out message order, and guarantees that each send or receive is an atomic operation between Goroutines. Sending and receiving messages on a channel looks like this:

channel <- val       // send a message
val := <-channel     // receive a message
val, ok := <-channel // receive a message and check whether the channel is closed

A Goroutine blocks itself by sending to a channel whose buffer is already full, or by receiving from a channel that has no data. Note that the two-value receive form still blocks; the extra boolean ok reports whether the received value came from an actual send (true) or from a closed, drained channel (false). To create a channel, initialize it with the make function:

ch := make(chan T, sizeOfChan)

When initializing a channel, you can specify its capacity, indicating how many messages the channel can buffer. Let's demonstrate Goroutines and channels with a simple example:

package main

import (
    "fmt"
    "time"
)

// producer
func Producer(begin, end int, queue chan<- int) {
    for i := begin; i < end; i++ {
        fmt.Println("produce:", i)
        queue <- i
    }
}

// consumer
func Consumer(queue <-chan int) {
    for val := range queue { // loop until the channel is closed
        fmt.Println("consume:", val)
    }
}

func main() {
    queue := make(chan int)
    defer close(queue)
    for i := 0; i < 3; i++ {
        go Producer(i*5, (i+1)*5, queue) // multiple producers
    }
    go Consumer(queue) // single consumer
    time.Sleep(time.Second) // avoid the main goroutine terminating too early
}

This is a simple multi-producer, single-consumer example. The producer Goroutines send sequence numbers to the consumer Goroutine through a channel. The consumer uses for...range to read messages from the channel in a loop; the loop ends only when the channel is closed by the built-in close function. After a channel is closed it can no longer be used to send messages, but it can still be used to receive the messages that remain. Once a closed channel is drained, any receive on it (including by goroutines already blocked on it) returns immediately with the element type's zero value. Another important point: the main function runs in the main goroutine, and when the main goroutine finishes, the entire Go program exits, regardless of whether other goroutines have finished.

Select multiplexing

When you need to receive messages from multiple channels, you can use the select keyword provided by Go, which offers multiplexing-like capabilities that let a goroutine wait on reads and writes across multiple channels at the same time. The select statement is similar to a switch statement, but each case must be a channel operation.

package main

import (
    "fmt"
    "time"
)

func send(ch chan int, begin int) {
    // loop messages into a channel
    for i := begin; i < begin+10; i++ {
        ch <- i
    }
}

func receive(ch <-chan int) {
    val := <-ch
    fmt.Println("receive:", val)
}

func main() {
    ch1 := make(chan int)
    ch2 := make(chan int)
    go send(ch1, 0)
    go receive(ch2)
    // the main goroutine sleeps for 1s to ensure the others are scheduled
    time.Sleep(time.Second)
    for {
        select {
        case val := <-ch1: // read data from ch1
            fmt.Printf("get value %d from ch1\n", val)
        case ch2 <- 2: // send a message on ch2
            fmt.Println("send value by ch2")
        case <-time.After(2 * time.Second): // set a timeout
            fmt.Println("Time out")
            return
        }
    }
}

In the example above, we use the select keyword to both receive data from ch1 and send data on ch2. One possible output is:

get value 0 from ch1
get value 1 from ch1
send value by ch2
receive: 2
get value 2 from ch1
get value 3 from ch1
get value 4 from ch1
get value 5 from ch1
get value 6 from ch1
get value 7 from ch1
get value 8 from ch1
get value 9 from ch1
Time out

The message on ch2 is received only once, so "send value by ch2" is printed only once and subsequent sends on it would block. The select statement picks, among its three cases, one that is ready to proceed; when multiple cases are ready at the same time, select chooses one of them at random. If a select statement includes a default case, it becomes non-blocking: if every other case would block, select executes the default branch and returns immediately. In the example above, the last case uses <-time.After(2 * time.Second) to obtain a channel that delivers a value after a delay, a handy trick for timing out of an otherwise blocking select.

Context

When context information needs to be passed across multiple Goroutines, the Context type can be used. Besides carrying context values, a Context can also carry a signal that terminates the execution of a subtask, cancelling the Goroutines running it. Context provides the following interface:

type Context interface {
    Deadline() (deadline time.Time, ok bool)
    Done() <-chan struct{}
    Err() error
    Value(key interface{}) interface{}
}
  • The Deadline method returns the time at which the Context will be cancelled, i.e. the deadline for completing the work;
  • The Done method returns a channel that is closed when the current work is done or the Context is cancelled. Multiple calls to Done return the same channel;
  • The Err method returns the reason the Context ended. It returns a non-nil error only after the channel returned by Done is closed: a Canceled error if the Context was cancelled, or a DeadlineExceeded error if it timed out;
  • The Value method retrieves the key-value information carried by the Context.

When handling a web request, a single request may start multiple Goroutines that work together. These Goroutines may need to share information about the request, and when the request is cancelled or times out, all of its Goroutines need to be terminated quickly so resources can be released. Context was designed for exactly this scenario, which we demonstrate with the following example:

package main

import (
    "context"
    "fmt"
    "time"
)

const DB_ADDRESS = "db_address"
const CALCULATE_VALUE = "calculate_value"

func readDB(ctx context.Context, cost time.Duration) {
    fmt.Println("db address is", ctx.Value(DB_ADDRESS))
    select {
    case <-time.After(cost): // simulate a database read
        fmt.Println("read data from db")
    case <-ctx.Done():
        fmt.Println(ctx.Err()) // the reason the task was cancelled
        // some cleanup work
    }
}

func calculate(ctx context.Context, cost time.Duration) {
    fmt.Println("calculate value is", ctx.Value(CALCULATE_VALUE))
    select {
    case <-time.After(cost): // simulate a data calculation
        fmt.Println("calculate finish")
    case <-ctx.Done():
        fmt.Println(ctx.Err()) // the reason the task was cancelled
        // some cleanup work
    }
}

func main() {
    ctx := context.Background() // create an empty context
    // add context information
    ctx = context.WithValue(ctx, DB_ADDRESS, "localhost:10086")
    ctx = context.WithValue(ctx, CALCULATE_VALUE, 1234)
    // the child context times out after 2s
    ctx, cancel := context.WithTimeout(ctx, time.Second*2)
    defer cancel()
    // each subtask would take 4s to complete
    go readDB(ctx, time.Second*4)
    go calculate(ctx, time.Second*4)

    // wait long enough for both subtasks to finish or be cancelled
    time.Sleep(time.Second * 5)
}

In the example above, we simulate a request that accesses a database and performs a calculation at the same time, shutting down the unfinished Goroutines when the request times out. We start by attaching context values with the context.WithValue method; a Context is safe for concurrent use, so reading its values from multiple Goroutines is fine. We then use context.WithTimeout to set a 2s timeout on the child Context and pass it to the readDB and calculate subtask Goroutines, each of which monitors the Context's Done channel in a select statement. Since the child Context times out after 2s, its Done channel is closed at that point; the subtasks, which would take 4s, have not yet returned, so their work is cancelled and the cleanup case runs, ending each Goroutine's task. The expected output is:

calculate value is 1234
db address is localhost:10086
context deadline exceeded
context deadline exceeded

Using Context, you can effectively pass shared values, cancellation signals, deadlines, and other information across a set of Goroutines, closing goroutines that are not needed in a timely manner.

Summary

This article mainly introduces the concurrency features of Go language, including:

  • MPG thread model of Go;
  • Goroutine and channel;
  • Select multiplexing;
  • Context.

In addition to the CSP concurrency model, Go also supports the traditional threads-and-locks model, providing a series of synchronization tools such as mutexes, read-write locks, concurrent wait groups, and condition variables. These synchronization primitives live in the sync package, and their use is similar to synchronization tools in other languages. Go supports coroutine concurrency at the language level and performs excellently under concurrency, fully exploiting the computing power of multi-core CPUs. Hopefully, this article has improved your understanding of Go's concurrency design and programming.
