Article source: www.cnblogs.com/yuxingfirst…

Introduction

 

As multicore processors become more common, is there an easy way to write software that exploits their power? The answer is: yes. With the rise of programming languages designed for concurrency, such as Golang, Erlang, and Scala, new concurrency patterns are becoming clear. As with procedural programming and object orientation, a good programming pattern requires an extremely concise kernel and, on top of it, rich extensions that can solve a wide variety of real-world problems. This article takes the Go language as an example to explain the kernel and its extensions.

 

The kernel of the concurrency pattern

 

The kernel of this concurrency pattern needs only coroutines and channels: coroutines execute code, and channels pass events between coroutines.

  

Concurrent programming has always been very difficult. To write a good concurrent program, we have to understand threads, locks, semaphores, barriers, and even how CPUs update their caches, and every one of these has its own quirks and pitfalls. I would never manipulate these low-level concurrency elements myself unless I had to. A clean concurrency pattern does not require these complex underlying elements; coroutines and channels are enough.

 

Coroutines are lightweight threads. In procedural programming, a procedure call waits for the procedure to finish before returning. A coroutine call, by contrast, returns immediately without waiting for the coroutine to finish. Coroutines are so lightweight that Go can run tens of thousands of them in a single process while still maintaining high performance; on an ordinary platform, a process with thousands of threads keeps the CPU busy with context switching, and performance deteriorates dramatically. Creating threads casually is not a good idea, but we can use coroutines freely.

A channel is a conduit for transmitting data between coroutines. A channel can pass data, either values or references, among many coroutines. Channels can be used in two ways:

· A coroutine can try to put data into a channel; if the channel is full, the coroutine is suspended until the channel has room for the data.

· A coroutine can try to take data from a channel; if the channel has no data, the coroutine is suspended until the channel returns data.

In this way, a channel controls the running of coroutines while passing data between them. It behaves partly like an event dispatcher and partly like a blocking queue. These two concepts are very simple, and every language platform has its own implementation; there are libraries for Java and C as well.

  

With only coroutines and channels, concurrency problems can be solved elegantly; no other concurrency concepts are needed. So how do we solve all kinds of practical problems with just these two tools?

 

Extensions of the concurrency pattern

 

Coroutines, in contrast to threads, can be created in large numbers. This opens the door to new uses: generators, functions that return "services", loops that execute concurrently, and shared variables. But these new uses also bring thorny problems: coroutines can leak, and improper use can hurt performance. The various usages and their problems are described below. The demo code is written in Go, because it is concise and fully functional.

 

Generators

 

Sometimes we need a function that continuously produces data. Such a function might read from a file or the network, generate a self-incrementing sequence, or produce random numbers. These behaviors share a trait: the function takes some known parameters, such as a file path, and then we call it repeatedly to obtain new data.

Taking random number generation as an example, let us build a random number generator that executes concurrently.

The non-concurrent approach looks like this:

// rand_generator_1 returns an int
func rand_generator_1() int {
	return rand.Int()
}

 

Above is a function that returns an int. If the call to rand.Int() takes a long time, the caller hangs. So we can create a coroutine dedicated to executing rand.Int().

 

// rand_generator_2 returns a channel
func rand_generator_2() chan int {
	// Create the channel
	out := make(chan int)
	// Create the coroutine
	go func() {
		for {
			// Write data into the channel; wait here if no one reads
			out <- rand.Int()
		}
	}()
	return out
}

func main() {
	// Generate random numbers as a service
	rand_service_handler := rand_generator_2()
	// Read a random number from the service and print it
	fmt.Printf("%d\n", <-rand_service_handler)
}

Now rand.Int() is executed concurrently. It is worth noting that the function's return value can be understood as a "service": whenever we need random data, we can call on this service, and it has already prepared the data for us, ready at any moment with no waiting. If we call the service infrequently, one coroutine suffices. But what if we need heavy access? Using the multiplexing technique described below, we can start several generators and combine them into one large service.

Calling a generator returns a "service" that can be used to retrieve data continuously. It works for reading data, generating IDs, and even timers. This is a very clean way of thinking about concurrency.

Multiplexing

Multiplexing is a technique for processing multiple queues at once. Apache uses one process per connection, so its concurrency performance is not very good. Nginx, on the other hand, uses multiplexing to let a single process handle multiple connections, so its concurrency performance is better. In the world of coroutines, multiplexing is also useful, though with a difference: multiplexing combines several similar small services into one large service.

So let’s use multiplexing to make a higher concurrency random number generator.

// rand_generator_3 returns a channel
func rand_generator_3() chan int {
	// Create two random number generator services
	// (named gen1/gen2 to avoid shadowing the functions above)
	gen1 := rand_generator_2()
	gen2 := rand_generator_2()

	// Create the channel
	out := make(chan int)

	// Create the coroutines
	go func() {
		for {
			// Read data from generator 1
			out <- <-gen1
		}
	}()
	go func() {
		for {
			// Read data from generator 2
			out <- <-gen2
		}
	}()
	return out
}

This is a higher-concurrency version of the random number generator built with multiplexing. By integrating two random number generators, this version has twice the generating capacity. Although coroutines can be created in large numbers, the coroutines here compete for the output channel. The Go language provides the select keyword for this situation; other platforms have their own tricks. Increasing the buffer size of the output channel is a common solution.

Multiplexing integrates multiple channels, improving performance and ease of operation. Combined with the other patterns, it is very powerful.

Futures

Futures are a useful technique that we often use with threads: create a thread, get back a Future, and later wait on it for the result. In a coroutine environment, however, Futures can go further: the input parameters can be Futures too.

When an ordinary function is called, its parameters must be ready beforehand, and the same holds when a coroutine is called. But if the parameters we pass are channels, then we can call the function before the parameters are prepared. This design grants great freedom and concurrency: the invocation of a function and the preparation of its parameters are completely decoupled. Below is an example that accesses a database with this technique.

// A query structure
type query struct {
	// Parameter channel
	sql chan string
	// Result channel
	result chan string
}

// Query execution
func execQuery(q query) {
	// Start the coroutine
	go func() {
		// Get the input
		sql := <-q.sql
		// Access the database and write to the result channel
		q.result <- "get " + sql
	}()
}

func main() {
	// Initialize the query
	q := query{make(chan string, 1), make(chan string, 1)}
	// Execute the query; note that no parameter is prepared at this point
	execQuery(q)

	// Prepare the parameter
	q.sql <- "select * from table"
	// Get the result
	fmt.Println(<-q.result)
}

With the Future technique above, not only the result but also the parameters are retrieved "in the future": once the parameters are ready, execution proceeds automatically. The difference between a Future and a generator is that a Future returns one result, whereas a generator can be called repeatedly. It is also worth noting that the parameter channel and the result channel are defined inside a structure passed as a parameter, rather than the result channel being returned; this increases cohesion, with the advantage that it can be combined with the multiplexing technique.

The Future technique can be combined with many others. Multiplexing can monitor multiple result channels and return automatically when a result appears. It can also be combined with generators: a generator continuously produces data, and Futures process it piece by piece. Futures can even be connected end to end to form a concurrent pipe filter, which can read, write, and manipulate data streams.

Futures are a very powerful technique: you can issue a call without worrying about whether the data is ready or the return value has been computed, letting the components of a program run automatically as soon as their data becomes available.

Concurrent loops

Loops are often a performance hotspot. If the performance bottleneck is on the CPU, the hot spot is 90% likely to lie inside a loop body, so if the loop body can execute concurrently, performance improves greatly.

Making a loop concurrent is simple: start a coroutine in each iteration, so the loop bodies execute concurrently. Set up a counter before the loop starts, have each loop body add an element to the counter when it finishes, and after the loop wait for its completion by draining the counter.

// Set up the counter
sem := make(chan int, N)

// The FOR loop
for i, xi := range data {
	// Start a coroutine for this iteration
	go func(i int, xi float64) {
		doSomething(i, xi)
		// Count one completion
		sem <- 0
	}(i, xi)
}

// Wait for the loop to finish
for i := 0; i < N; i++ {
	<-sem
}

Above is an example of a concurrent loop, using a counter to wait for the loop to finish. Combined with the Future technique mentioned earlier, you do not even have to wait: you can defer the completeness check until the moment you actually need the results.

Concurrent loops improve performance by utilizing multiple cores and addressing CPU hot spots. Because coroutines can be created in large numbers, they can be used directly in loop bodies like this; with threads, you would need to introduce something like a thread pool to avoid creating too many threads, whereas coroutines are much simpler.

The chain filter technique

As mentioned earlier, Futures can be connected end to end to form a concurrent pipe filter. This approach can do many things, and if every filter is built from the same function, there is an easy way to link them together.

Since each filter coroutine can run concurrently, this structure benefits multi-core environments. Here is an example of using this pattern to produce prime numbers.

// A concurrent prime sieve
package main

// Send the sequence 2, 3, 4, ... to channel 'ch'.
func Generate(ch chan<- int) {
	for i := 2; ; i++ {
		ch <- i // Send 'i' to channel 'ch'.
	}
}

// Copy the values from channel 'in' to channel 'out',
// removing those divisible by 'prime'.
func Filter(in <-chan int, out chan<- int, prime int) {
	for {
		i := <-in // Receive value from 'in'.
		if i%prime != 0 {
			out <- i // Send 'i' to 'out'.
		}
	}
}

// The prime sieve: daisy-chain Filter processes.
func main() {
	ch := make(chan int) // Create a new channel.
	go Generate(ch)      // Launch the Generate coroutine.
	for i := 0; i < 10; i++ {
		prime := <-ch
		print(prime, "\n")
		ch1 := make(chan int)
		go Filter(ch, ch1, prime)
		ch = ch1
	}
}

The above program creates 10 filters, one sieving out the multiples of each prime found, so it prints the first 10 primes.

The chain filter builds a chain of concurrent filters with simple code. Another advantage of this approach is that each channel is accessed by only two coroutines, so there is no fierce contention and performance is better.

Shared variables

Communication between coroutines should happen only over channels. But we are used to shared variables, and in many cases shared variables make code cleaner. For example, a server may hold two states, on and off, which other parties merely want to read or change. Such a variable can be owned by a single coroutine and maintained through channels.

The following example describes how to implement a shared variable in this way.

// A shared variable consists of a read channel and a write channel
type shared_var struct {
	reader chan int
	writer chan int
}

// The watchdog coroutine that maintains the shared variable
func shared_var_watchdog(v shared_var) {
	go func() {
		// Initial value
		var value int = 0
		for {
			// Listen on the read and write channels to serve requests
			select {
			case value = <-v.writer:
			case v.reader <- value:
			}
		}
	}()
}

func main() {
	// Initialize and start the maintenance coroutine
	v := shared_var{make(chan int), make(chan int)}
	shared_var_watchdog(v)
	// Read the initial value
	fmt.Println(<-v.reader)
	// Write a value
	v.writer <- 1
	// Read the newly written value
	fmt.Println(<-v.reader)
}

In this way, a coroutine-safe shared variable is implemented on top of coroutines and channels: define a write channel into which a new value is written whenever the variable needs updating, define a read channel from which the value is read whenever it is needed, and have a single coroutine maintain both channels to guarantee data consistency.

In general, shared variables are not the recommended way for coroutines to interact, but this technique may be desirable in some cases. Many platforms support shared variables natively, so which implementation is better is a matter of opinion. In addition, various common concurrent primitives, such as locks, can also be built from coroutines and channels, which will not be elaborated here.

 

Coroutine leaks

Coroutines, like memory, are a system resource. Memory has automatic garbage collection, but coroutines have no corresponding reclamation mechanism. Will coroutine leaks become as common as memory leaks in a few years? In general, a coroutine is destroyed after it finishes executing. Coroutines also consume memory, so if coroutines leak, the impact is just as bad as a memory leak: at best it slows the process down, at worst it overwhelms the machine.

C and C++ are both languages without automatic memory reclamation, yet with good programming habits the problem can be contained. The same goes for coroutines: it is all a matter of good habits.

There are only two situations in which a coroutine cannot terminate. In one, the coroutine wants to read from a channel to which no one is writing; perhaps the channel has simply been forgotten. In the other, the coroutine wants to write to a channel that no one is reading, so the coroutine can never proceed. The following shows how to avoid each.

For the first case, a coroutine reading from a channel no one writes to, the solution is simple: add a timeout mechanism. Whenever there is uncertainty about whether a reply will come, add a timeout to avoid waiting forever. You need not always use a timer to terminate a coroutine; you can also expose an exit alert channel, through which any other coroutine can tell this coroutine to terminate.

For the second case, a coroutine writing to a blocked channel, the solution is also simple: buffer the channel. This works as long as the channel will receive a bounded number of writes: if you know a channel will receive data at most N times, set its buffer to N; the channel will then never block, and the coroutine cannot leak. You could also make the buffer unlimited, but then you run the risk of unbounded memory use. Once the coroutine finishes executing, the channel's memory becomes unreferenced and is garbage collected automatically.

func never_leak(ch chan int) {
	// Initialize the timeout channel with buffer 1
	timeout := make(chan bool, 1)
	// Start the timeout coroutine; since the buffer is 1, it cannot leak
	go func() {
		time.Sleep(1 * time.Second)
		timeout <- true
	}()
	// Listen on the channel; because there is a timeout, we cannot leak
	select {
	case <-ch:
		// a read from ch has occurred
	case <-timeout:
		// the read from ch has timed out
	}
}

The above is an example of avoiding leaks: use a timeout to avoid read blocking, and a buffer to avoid write blocking.

As with objects in memory, we need not worry much about long-lived coroutines: they exist for a long time and are few in number. What deserves attention are the temporary coroutines, which are numerous and short-lived, often created inside loops; apply the techniques above to keep them from leaking. Coroutines are also a double-edgedged sword: misused, they crash the program rather than improving its performance. But just like memory, they carry a leak risk, and the more you use them, the more naturally you handle them.

 

Implementations of the concurrency pattern

In today's world of concurrent programming, support for coroutines and channels has become part of every platform. Although each has its own terminology, they can all meet the basic requirements of concurrent programming: concurrent execution and mass creation. The author has summarized the ways in which they are implemented.

Here are some common languages and platforms that already support coroutines.

Golang and Scala, the newest of these languages, were born with sophisticated coroutine-based concurrency. Erlang, the veteran of concurrent programming languages, has been rejuvenated. Almost every other second-tier language has added coroutines in a recent version.

It is surprising that C/C++ and Java, the world's three most dominant platforms, provide no language-level native support for coroutines. They all carry historical baggage that cannot be changed and does not need to be. But there are other ways for them to use coroutines.

The Java platform has a number of ways to implement coroutines:

· Modify the virtual machine: patch the JVM to implement coroutines. This works well but loses the cross-platform benefit.

· Modify bytecode: enhance the bytecode after compilation, or use a new JVM language. This makes compilation slightly harder.

· Use JNI: call JNI from jar packages. This is easy to use but not cross-platform.

· Emulate coroutines with threads: this makes coroutines heavyweight and entirely dependent on the JVM's thread implementation.

Among these, modifying bytecode is the most common approach, since it balances performance and portability. Scala, the most representative JVM language, supports coroutine-style concurrency very well. Akka, the popular Actor-model library for the JVM, also implements coroutines through bytecode modification.

For C, coroutines and threads are handled similarly, and can be implemented with various system calls. Coroutines, being a relatively high-level concept, have too many implementations to discuss; mainstream ones include libpcl, coro, and lthread.

For C++, there is the Boost implementation, along with several other open-source libraries. There is also a language called μC++ that provides concurrency extensions on top of C++.

This programming model is widely supported across language platforms and is no longer niche. If you want to use it, you can add it to your toolbox at any time.

 

Conclusion

 

This article explores an extremely concise concurrency model: with only two basic components, coroutines and channels, it provides rich functionality and solves all kinds of practical problems, and its wide adoption has become a trend. The power of this model goes well beyond what is covered here, and there are surely more concise uses of it waiting to be found. Perhaps one day the number of CPU cores will rival the number of neurons in the human brain, and then we will have to rethink concurrency models once again.
