If I had to pick one great feature of Go, it would be the built-in concurrency model. Go not only supports concurrency, it makes concurrency easier to use. The goroutine is to concurrent programming what Docker is to virtualization.

What is concurrency?

In computer programming, concurrency refers to a computer's ability to handle multiple tasks at once. For example, while surfing the Web in a browser, a lot may be going on at the same time: say you're downloading some files while scrolling the page and listening to music. The browser needs to do all of these at once. If it couldn't, users would have to wait for every download to finish before they could return to the page, which would be painful.

A general-purpose PC may have only one CPU core for all tasks, and one core can only do one thing at a time. When we talk about concurrency, we are talking about allocating slices of CPU time to the things that need to be processed, so that it feels as if a lot is happening at the same time.

Let's see how a Web browser running on a single CPU might handle the workload in our sample diagram.

In the figure above, you can see that a single-core processor divides its time according to the priority of each task. For example, listening to music may take a lower priority while the page is scrolling, so your music may occasionally stutter on a slow Internet connection, but you can still scroll the page. A single processor schedules task execution by switching time slices, giving users the impression that multiple tasks are running at the same time.

What is parallelism?

Then the question becomes: what if our CPU has multiple cores? In fact, modern CPUs are almost all multi-core. If a CPU has more than one processing core, we call it a "multi-core processor." You may have heard the term when buying a computer or smartphone; for example, the laptop I'm writing on has 2 cores, which is fairly low-end compared to current PC CPUs. Commercial servers commonly reach 64 cores and can genuinely handle multiple tasks at the same time.

In the earlier Web browser example, our single-core processor had to allocate CPU time among the different tasks. With a multi-core processor, we can run different tasks simultaneously on different cores, as shown below. Running multiple tasks at literally the same time is called parallelism. When our CPU has multiple cores, we can use different cores to execute multiple tasks simultaneously and finish a multi-task workload faster.

Concurrency vs parallelism

Go can run all goroutines on a single core, but we can also configure a Go program to run goroutines across different processor cores.

There are several differences between concurrency and parallelism. Concurrency is handling multiple things by alternating between them; parallelism is doing multiple things at literally the same time. Is parallelism necessarily better than concurrency? Not necessarily. We'll discuss that in a future article.

Now, there are probably a lot of questions flying through your head. You may have a grasp of parallelism and concurrency, but you might want to know how to achieve them with Go's concurrency features. Before we do that, let's look at computer processes.

What is a computer process?

When you write a computer program in C, Java, or Go, it is just a text file. Since computers only understand binary instructions made of zeros and ones, this code must be translated into machine language; that is where compilers come in. (In scripting languages such as Python and JavaScript, an interpreter does the same job.) When a compiled program is handed to the operating system to run, the operating system allocates various resources for it, such as a memory address space (where the process's heap and stack reside), a program counter, a process ID (PID), and other essentials. A process has at least one thread, called the main thread, which can create other threads. When the main thread finishes, the process exits.

So we can think of a process as a container that holds the compiled code, memory, various operating system resources, and everything else made available to its threads. In short, a process is a program in memory.

What is a computer thread?

A thread is the actual executor of a piece of code. Threads can access the memory, operating system resources, and other things provided by their process. While executing program code, each thread stores its variables (data) in an area of memory called the stack, where temporary variables live. The stack is created at run time and usually has a fixed size, typically 1–2 MB. A thread's stack can only be used by that thread and is not shared with other threads. The heap, by contrast, belongs to the process and can be used by any of its threads: it is a shared memory space, so data written by one thread can also be accessed by other threads.

Now we have an overview of processes and threads. But what use are they?

When you launch a Web browser, some code must invoke the operating system's process-creation routines. That means a process is created, and that process may in turn ask the OS to create another process for each new tab. While a browser tab is open and you go about your daily tasks, the tab creates different threads for different activities (such as page scrolling, downloading, and playing music), as we saw in the earlier diagrams of two processes handling tasks. Below is a task diagram of the Chrome app on macOS. The figure shows the different processes Google Chrome uses for open tabs and internal services. Since each process has at least one thread, the number of threads is greater than the number of processes.

In a multithreaded process, a thread with a memory leak may exhaust resources needed by the other threads and cause the whole process to become unresponsive. When using a browser or any other program, you have probably encountered an unresponsive process that the task manager prompted you to kill.

Thread scheduling

When multiple threads run serially or in parallel, they may share some data, and then only one thread should access a given piece of data at a time to keep execution safe. We call the execution of multiple threads in some order scheduling. Operating system threads are scheduled by the kernel, while some threads are managed by a programming language's runtime environment (for example, Java's runtime, the JRE). A race condition occurs when multiple threads access the same data at the same time, corrupting it or producing unexpected results.

When we design concurrent Go programs, the key is to find such race conditions and take appropriate measures so the multithreaded program runs safely in their presence.

Use concurrency in Go

Next, let's discuss how to implement concurrency in Go code. We know that OOP languages such as Java and C++ have a thread class that lets us create multiple thread objects within the current process. Since Go has no traditional OOP syntax, it instead provides the `go` keyword to create goroutines: when the go keyword is placed before a function call, that call becomes a goroutine and is scheduled for execution by the Go runtime.

In a future article, we will discuss goroutines on their own. For now, you can think of a goroutine as a thread; technically, a goroutine behaves like a thread, and it is an abstraction over threads.

When we run a Go program, the Go runtime creates a certain number of threads on a core, and all goroutines are multiplexed onto them. At any point in time, a thread executes one goroutine, and if that goroutine is blocked, it is swapped out for another goroutine to execute on that thread. This is somewhat similar to kernel thread scheduling, but it is handled by Go's runtime and is faster than kernel scheduling.

In most cases, goroutines run fine on a single core, but if you need goroutines to be scheduled across your system's multiple cores, you can control this with the GOMAXPROCS environment variable. You can also call runtime.GOMAXPROCS(n) (https://golang.org/pkg/runtime/#GOMAXPROCS) at run time, where n is the number of cores to use.

You may feel that setting GOMAXPROCS to 1 would slow your program down, but that depends on the nature of the program: the communication overhead between cores can exceed the cost of the work itself, in which case OS threads and processes suffer a performance hit, and so does your Go program.

Go has an M:N scheduler that schedules Go code across multiple processors. At any given time, M goroutines are scheduled onto N OS threads running on at most GOMAXPROCS processors. At most one thread runs per core at any moment, though the scheduler can create more threads if needed; this rarely happens. If you never start a goroutine in your code, your application runs on a single thread on a single core, no matter how many cores are available.

Threads vs. coroutines

Since there are clear differences between threads and goroutines, let's compare them point by point to explain why threads carry more overhead than goroutines and why goroutines are the key to our applications' highly concurrent behavior.

These are a few of the important differences, and I recommend you dig deeper into the implementation of Go's concurrency model; it will change how you think about concurrent programming. To highlight the power of the goroutine model, consider a case study. Suppose you have a Web server that handles 1,000 requests per minute. If you have to handle every request concurrently, you need to create 1,000 threads or split them across different processes; this is how the classic Apache (https://www.apache.org/) server works. If each thread consumes a 1 MB stack, that means 1 GB of memory just to handle this traffic. Apache does provide the ThreadStackSize directive to manage the per-thread stack size, but the underlying problem remains.

In Go, since the stack can grow dynamically, you can spawn 1,000 goroutines without any trouble. A goroutine's initial stack is small, starting at 8 KB [1] (later Go versions made it smaller still), so it consumes little memory. And when a goroutine needs to recurse deeply, Go can grow its stack up to 1 GB. It does the same work at a lower cost.

As mentioned above, one goroutine executes on a thread at a time, and goroutines are coordinated by the Go runtime. A goroutine keeps its thread until it blocks, at which point another goroutine is scheduled onto that thread. The following can block a goroutine:

  • Network input

  • Sleeping

  • Channel operations

  • Blocking on primitives from the sync package (https://golang.org/pkg/sync/)

When a goroutine blocks in one of these ways, the runtime simply schedules other goroutines onto the thread. But if a goroutine blocks outside these cases, it can block the thread it is running on and starve other goroutines waiting to be scheduled, so careful programming is needed to prevent this. Channels and synchronization primitives play an important role in Go's concurrent programming; we will analyze their principles and caveats in detail in later articles.

In this article we covered the concept of thread scheduling, as well as the use of concurrency in Go and the goroutine scheduling model. Finally, we compared threads and goroutines in detail; I hope these comparisons help you make better decisions in Go concurrent programming and achieve better performance. In future articles, we'll explore the mysteries of Go concurrent programming with some actual code.

References

[1] Goroutine 8 KB stack design: https://golang.org/doc/go1.2#stack_size

[2] Achieving concurrency in Go

[3] The sync package: https://golang.org/pkg/sync/