preface
I’m sure you’ve all heard of coroutines.
But some of you are vaguely familiar with the concept, wondering how to implement it, how to use it, where to use it, and even thinking that yield is a coroutine!
I have always believed that if you cannot express a point of knowledge accurately, I can assume that you do not understand it.
If you’ve learned about using PHP for coroutines, you must have seen bird elder brother of the article, using coroutines in PHP realize the task scheduling | the corner of the snow
Bird brother translated this article from foreign authors, the translation is concise, but also gave specific examples.
The purpose of my writing this article is to make a more adequate supplement to brother Bird’s article. After all, some students’ basic knowledge is not good enough, and their understanding is also unclear.
Personally, I don’t like to write long articles and follow me on Weibo
@ yards cloudAnd share knowledge on weibo every day. The article was also recorded on my blog:
bruceit.com/p/A4kSfE
What is a coroutine
So let’s be clear about what a coroutine is.
You’ve probably heard the terms “process” and “thread.”
A process is a running instance of a binary executable file in computer memory, just like your.exe file is a class, and the process is the new instance.
A process is the basic unit of resource allocation and scheduling in a computer system. Only one process can be processed per CPU at a time.
The so-called parallelism is just the appearance of parallelism, and the CPU is actually switching between different processes at very fast speeds.
The process switch requires a system call. The CPU saves the information about the current process and disables the CPUCache.
So don’t do it unless you have to.
So how do you implement “not switching processes unless you have to”?
The process is switched when the process is complete, the CPU time slice allocated to the process ends, the system is interrupted, or the process waits for necessary resources (the process is blocked). You think, the first few cases of nature no words to say, but if it is blocking waiting, is not wasted.
In fact, if blocked, our program has other executable places to execute, do not have to wait silly!
So there are threads.
A thread is simply a “microprocess” that runs a single function (logical flow).
So we can write programs in the process of running functions can be represented by threads.
There are two types of threads, one managed and scheduled by the kernel.
We say that anything that involves the kernel being involved in managing scheduling is costly. This kind of threading actually solves the problem that when one of the executing threads in a process is blocked, we can schedule another runnable thread to run, but it is still in the same process, so there is no process switching.
There is another kind of thread whose scheduling is managed by the programmer himself and is not visible to the kernel. Such threads are called “user-space threads”.
Coroutines can be understood as user-space threads.
Coroutines have several characteristics:
- Synergy, because it is the scheduling policy written by the programmer, switches through collaboration rather than preemption
- Create, switch, and destroy in user mode
- ⚠️ From a programming perspective, the idea of coroutines is essentially a yield and resume mechanism for controlling flow
- Generators are often used to implement coroutines
At this point, you understand the basic concept of coroutines?
PHP implements coroutines
Step by step, start by explaining the concept!
iterable
PHP5 provides a way to define an object so that it can be iterated through a list of cells, for example with a foreach statement.
If you want to implement an iterable, you implement the Iterator interface:
<?php
class MyIterator implements Iterator
{
private $var = array();
public function __construct($array)
{
if (is_array($array)) {
$this->var = $array;
}
}
public function rewind() {
echo "rewinding\n";
reset($this->var);
}
public function current() {
$var = current($this->var);
echo "current: $var\n";
return $var;
}
public function key() {
$var = key($this->var);
echo "key: $var\n";
return $var;
}
public function next() {
$var = next($this->var);
echo "next: $var\n";
return $var;
}
public function valid() {
$var = $this->current() !== false;
echo "valid: {$var}\n";
return $var;
}
}
$values = array(1,2,3);
$it = new MyIterator($values);
foreach ($it as $a => $b) {
print "$a: $b\n";
}Copy the code
The generator
In the past, you had to implement a bunch of methods in order to have an object that could be traversed by foreach, so the yield keyword was used to simplify the process.
Generators provide an easier way to implement simple object iterations with significantly less performance overhead and complexity than defining classes to implement the Iterator interface.
<? php function xrange($start, $end, $step = 1) { for ($i = $start; $i <= $end; $i += $step) { yield $i; } } foreach (xrange(1, 1000000) as $num) { echo $num, "\n"; }Copy the code
Remember, in a function, if you useyield
, it is a generator, directly call it is useless, cannot be equal to a function to execute!
So, yield is yield, and the next time you say yield is a coroutine, I’ll XXXX you.
PHP coroutines
Coroutines require programmers to write their own scheduling mechanisms. Let’s see how to write this mechanism.
0) Generator used correctly
Since generators cannot be called directly like functions, how can they be?
The method is as follows:
- The foreach he
- send($value)
- current / next…
1) Task implementation
A Task is an abstraction of a Task. We just said that a coroutine is a user-space coroutine. A thread can be understood as running a function.
So the constructor of a Task receives a closure function, which we’ll call coroutine.
/** * class Task {protected $taskId; protected $coroutine; protected $beforeFirstYield = true; protected $sendValue; /** * Task constructor. * @param $taskId * @param Generator $coroutine */ public function __construct($taskId, Generator $coroutine) { $this->taskId = $taskId; $this->coroutine = $coroutine; } @return mixed */ public function getTaskId() {return $this->taskId; } @return bool */ public function isFinished() {return! $this->coroutine->valid(); } /** * Sets the value to be passed to the coroutine next time, such as $id = (yield $XXXX), * * @param $value */ public function setSendValue($value) {$this->sendValue = $value; } /** * @return mixed */ public function run() { If ($this->beforeFirstYield) {$this->beforeFirstYield = false; return $this->coroutine->current(); $retval = $this->coroutine->send($this->sendValue); $this->sendValue = null; return $retval; }}}Copy the code
2) Scheduler implementation
Next comes the key core of the Scheduler, which acts as the dispatcher.
/** * Class Scheduler */ Class Scheduler { /** * @var SplQueue */ protected $taskQueue; /** * @var int */ protected $tid = 0; /** * public function __construct() {/** public function __construct() {/** public function __construct() { $this->taskQueue = new SplQueue(); $this->taskQueue (); } public function addTask(Generator $task) {$tid = $this->tid; $task = new Task($tid, $task); $this->taskQueue->enqueue($task); $this->tid++; return $tid; $this-> Task ->enqueue($Task); $this-> Task ->enqueue($Task); } /** * public function run() {while (! $this->taskQueue->isEmpty()) {$this->taskQueue->dequeue(); $res = $task->run(); // Run the task until yield if (! $task->isFinished()) { $this->schedule($task); // If the task is not completed, join the team and wait for the next execution}}}}Copy the code
So we’ve basically implemented a coroutine scheduler.
You can test this by using the following code:
<? php function task1() { for ($i = 1; $i <= 10; ++$i) { echo "This is task 1 iteration $i.\n"; yield; }} function task2() {for ($I = 1; $i <= 5; ++$i) { echo "This is task 2 iteration $i.\n"; yield; }} $scheduler = new scheduler; $scheduler->addTask(task1()); $scheduler->addTask(task2()); $scheduler->run();Copy the code
The key is where you can get PHP coroutines.
Function task1() {/* Remote_task_commit (); /* remote_task_commit(); // Instead of waiting for the request to be sent, we will yield the CPU execution to task2. yield (remote_task_receive()); . } function task2() { for ($i = 1; $i <= 5; ++$i) { echo "This is task 2 iteration $i.\n"; yield; // Make the execution of the CPU active}}Copy the code
This improves the efficiency of the program.
As for the implementation of “system call”, Bird has been very clear, I will not explain here.
3) Coroutine stack
There is another example of a coroutine stack.
As we said above, if you use yield in a function, you can’t use it as a function.
So you have a coroutine function nested within another coroutine function:
<? php function echoTimes($msg, $max) { for ($i = 1; $i <= $max; ++$i) { echo "$msg iteration $i\n"; yield; } } function task() { echoTimes('foo', 10); // print foo ten times echo "---\n"; echoTimes('bar', 5); // print bar five times yield; // force it to be a coroutine } $scheduler = new Scheduler; $scheduler->addTask(task()); $scheduler->run();Copy the code
EchoTimes cannot be implemented here! So you need a coroutine stack.
But never mind, let’s change the code that we just did.
Change the initialization method in Task, because when we run a Task, we need to figure out which subcoroutines it contains, and then store the subcoroutines on a stack. (Those of you who are good at C will understand this, but those of you who are not, I suggest you take a look at how the memory model of a process handles function calls.)
/** * Task constructor. * @param $taskId * @param Generator $coroutine */ public function __construct($taskId, Generator $coroutine) { $this->taskId = $taskId; // $this->coroutine = $coroutine; $this->coroutine = stackedCoroutine($coroutine); }Copy the code
When Task->run(), a loop is used to analyze:
/** * @param Generator $gen */ function stackedCoroutine(Generator $gen) { $stack = new SplStack; // Iterate over the passed generator for (; ;) $value = $gen->current(); If ($value instanceof Generator) {$stack->push($gen); if ($value instanceof Generator) {$stack->push($gen); $gen = $value; // Pass the subcoroutine function to gen and continue, noting that the process of executing the subcoroutine continues; $isReturnValue = $value instanceof CoroutineReturnValue; // the subcoroutine returns' $value 'and needs the main coroutine to handle if (! $gen->valid() || $isReturnValue) { if ($stack->isEmpty()) { return; $gen = $stack->pop(); $gen = $stack->pop(); $gen->send($isReturnValue? $value->getValue() : NULL); // Call the output value of the main coroutine processing child coroutine continue; } $gen->send(yield $gen->key() => $value); // continue executing the subcoroutine}}Copy the code
Then we added the end tag for echoTime:
class CoroutineReturnValue { protected $value; public function __construct($value) { $this->value = $value; Public function getValue() {return $this->value; return $this->value; } } function retval($value) { return new CoroutineReturnValue($value); }Copy the code
Then modify echoTimes:
function echoTimes($msg, $max) { for ($i = 1; $i <= $max; ++$i) { echo "$msg iteration $i\n"; yield; } yield retval(""); // Add this as an end tag}Copy the code
The Task becomes:
function task1()
{
yield echoTimes('bar', 5);
}Copy the code
This has implemented a coroutine stack, and now you can make an analogy.
4) The yield from keyword in PHP7
PHP7 added yield from, so we don’t need to implement the Ctrip stack ourselves, which is great.
Change the Task constructor back to:
public function __construct($taskId, Generator $coroutine) { $this->taskId = $taskId; $this->coroutine = $coroutine; // $this->coroutine = stackedCoroutine($coroutine); // Do not need to implement, change to the previous}Copy the code
EchoTimes function:
function echoTimes($msg, $max) { for ($i = 1; $i <= $max; ++$i) { echo "$msg iteration $i\n"; yield; }}Copy the code
Task1 generator:
function task1()
{
yield from echoTimes('bar', 5);
}Copy the code
This makes it easy to call subcoroutines.
conclusion
So now you know how to implement PHP coroutines?
End…