Before you learn how to use coroutines, you need to know what a coroutine is. Coroutines are also called microthreads. A program can contain more than one coroutine, much as a process can contain more than one thread. Threads are relatively independent, each has its own context, and switching between them is controlled by the system. Coroutines are also relatively independent and have their own context, but they control their own switching. With coroutines, subroutines cooperate inside a single thread to complete a task. Running a coroutine resembles calling a subroutine, except that execution can be suspended partway through one subroutine to run another and then resumed at the appropriate time, which an ordinary function call cannot do.

All right, without further ado, let’s go straight to an example and use it to make sense of this not-so-easy concept. From there, we will work step by step toward the core of coroutines.

Let’s start with a simple crawler example:

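    import time

    def crawl_page(url):
        print('crawling {}'.format(url))
        time.sleep(2)  # simulate a blocking network request
        print('OK {}'.format(url))

    a = ['url1', 'url2', 'url3', 'url4']
    start = time.time()
    for url in a:
        crawl_page(url)
    end = time.time()
    print('use {}'.format(end - start))

    ########## output ##########
    # crawling url1
    # OK url1
    # crawling url2
    # OK url2
    # crawling url3
    # OK url3
    # crawling url4
    # OK url4
    # use 8.007500886917114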

This is a very simple crawler: each call to crawl_page() performs the network request, receives the result after 2 seconds, and only then does the next call start. Crawling four URLs takes 8 seconds. Now let’s look at the coroutine version.

    import asyncio
    import time

    async def crawl_page(url):
        print('crawling {}'.format(url))
        await asyncio.sleep(2)  # non-blocking sleep that stands in for a network request
        print('OK {}'.format(url))

    async def run(urls):
        for url in urls:
            await crawl_page(url)  # wait for each page before starting the next

    start = time.time()
    asyncio.run(run(['url1', 'url2', 'url3', 'url4']))
    end = time.time()
    print('use {}s'.format(end - start))

    ########## output ##########
    # crawling url1
    # OK url1
    # crawling url2
    # OK url2
    # crawling url3
    # OK url3
    # crawling url4
    # OK url4
    # use 8.012088060379028s

You might notice that it still takes 8 seconds, no different from sequential execution. What is going on? You’re right, and we will come back to that question shortly. First, let’s walk through this code, starting with import asyncio: the asyncio package contains most of the tools we need to work with coroutines.

The async keyword declares an asynchronous function, so both crawl_page and run here are asynchronous functions. Calling an asynchronous function does not run its body; it returns a coroutine object.
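As a minimal sketch of this point, the snippet below calls an asynchronous function and inspects what comes back. The function greet is a hypothetical example introduced only for illustration; it is not part of the crawler code.

```python
import asyncio

async def greet(name):
    # Hypothetical coroutine function, used only for illustration.
    return 'hello {}'.format(name)

coro = greet('url1')  # nothing inside greet() has run yet
is_coroutine = asyncio.iscoroutine(coro)
print(type(coro).__name__, is_coroutine)

result = asyncio.run(coro)  # the body runs only once the event loop drives it
print(result)
```

The body of greet executes only when the coroutine object is handed to the event loop, here through asyncio.run.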

Next, let’s talk about how to execute a coroutine. There are three commonly used ways.

First, we can call it with await. From the caller’s point of view, await behaves like an ordinary sequential call in Python: the current coroutine pauses here, execution enters the awaited coroutine function, and control returns only after that coroutine has finished, which is the literal meaning of await. So await asyncio.sleep(sleep_time) rests here for a few seconds, and await crawl_page(url) executes the crawl_page() function.

Second, we can create a task with asyncio.create_task(), which we’ll cover in more detail in the next lesson.

Third, we need asyncio.run to start everything running. asyncio.run has been available since Python 3.7. It makes Python’s coroutine interface very simple, and you don’t have to worry about how the event loop is defined and used (we’ll get to that later).
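The three ways can be seen side by side in a short sketch. The names fetch_length and main are hypothetical, and the 0.1 second sleep is only a stand-in for real network latency.

```python
import asyncio

async def fetch_length(url):
    # Hypothetical stand-in for a network call.
    await asyncio.sleep(0.1)
    return len(url)

async def main():
    # 1. await: main() pauses until fetch_length() finishes, then receives its return value.
    n = await fetch_length('url1')
    # 2. create_task: schedules the coroutine on the event loop; awaiting the task yields its result.
    task = asyncio.create_task(fetch_length('url22'))
    m = await task
    return n, m

# 3. asyncio.run: starts an event loop, runs main() to completion, and returns its result.
lengths = asyncio.run(main())
print(lengths)
```

Note that create_task has to be called while an event loop is running, which is why it appears inside main() rather than at module level.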

At this point, the code above should make sense. If it does not yet, don’t worry; let’s keep analyzing. Remember that await waits for the awaited coroutine to finish, so run() does not start the next crawl_page(url) until the current call has completed. The behavior is therefore the same as in the first example: we have written synchronous code with an asynchronous interface. Now let’s write code that actually runs asynchronously.

    import asyncio
    import time

    async def crawl_page(url):
        print('crawling {}'.format(url))
        await asyncio.sleep(2)
        print('OK {}'.format(url))

    async def run(urls):
        tasks = []
        for url in urls:
            # create_task schedules the coroutine without waiting for it
            tasks.append(asyncio.create_task(crawl_page(url)))
        for task in tasks:
            await task  # wait until every task has finished

    start = time.time()
    urls = ['url1', 'url2', 'url3', 'url4']
    asyncio.run(run(urls))
    end = time.time()
    print('use {}s'.format(end - start))

    ########## output ##########
    # crawling url1
    # crawling url2
    # crawling url3
    # crawling url4
    # OK url1
    # OK url2
    # OK url3
    # OK url4
    # use 2.0025851726531982s

As you can see, once we have a coroutine object, we can create a task from it with asyncio.create_task. A task is scheduled to run soon after it is created, so our code does not block at the point where the task is created. That is why we then have to wait for all tasks to finish, which is what “for task in tasks: await task” does.
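As an alternative to the explicit loop, the standard library also offers asyncio.gather, which waits for several awaitables at once. This is a sketch of a different technique, not the code used above: crawl_page here returns a string instead of printing, and the sleep is shortened to 0.2 seconds so the example finishes quickly.

```python
import asyncio
import time

async def crawl_page(url):
    await asyncio.sleep(0.2)  # shortened stand-in for the 2-second request
    return 'OK {}'.format(url)

async def run(urls):
    tasks = [asyncio.create_task(crawl_page(url)) for url in urls]
    # gather waits for every task and returns the results in the order the tasks were passed in
    return await asyncio.gather(*tasks)

start = time.perf_counter()
results = asyncio.run(run(['url1', 'url2', 'url3', 'url4']))
elapsed = time.perf_counter() - start
print(results)
print('use {:.2f}s'.format(elapsed))
```

Because the four tasks sleep concurrently, the elapsed time should be close to one sleep interval rather than four.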

Let’s take a closer look at the execution of coroutines.

    import asyncio
    import time

    async def worker_1():
        print('worker_1 start')
        await asyncio.sleep(1)
        print('worker_1 done')

    async def worker_2():
        print('worker_2 start')
        await asyncio.sleep(2)
        print('worker_2 done')

    async def run():
        task1 = asyncio.create_task(worker_1())
        task2 = asyncio.create_task(worker_2())
        print('before await')
        await task1
        print('awaited worker_1')
        await task2
        print('awaited worker_2')

    start = time.time()
    asyncio.run(run())
    end = time.time()
    print('use {}s'.format(end - start))

    ########## output ##########
    # before await
    # worker_1 start
    # worker_2 start
    # worker_1 done
    # awaited worker_1
    # worker_2 done
    # awaited worker_2
    # use 2.001868963241577s

Let’s perform an analysis.

  1. asyncio.run(run()) is called, the program enters the run() function, and the event loop starts.
  2. task1 and task2 are created and placed in the event loop, waiting to run. Execution reaches print and outputs ‘before await’.
  3. await task1 executes: the user chooses to give up control from the current main task, and the event scheduler starts scheduling worker_1.
  4. worker_1 starts running, prints ‘worker_1 start’, then reaches await asyncio.sleep(1) and gives up control; the event scheduler starts scheduling worker_2.
  5. worker_2 starts running, prints ‘worker_2 start’, then reaches await asyncio.sleep(2) and gives up control.
  6. One second after the start, worker_1’s sleep completes, and the event scheduler passes control back to task1, which prints ‘worker_1 done’. task1 has finished and exits the event loop.
  7. await task1 completes, so the event scheduler passes control to the main task, which prints ‘awaited worker_1’ and then waits again at await task2.
  8. Two seconds after the start, worker_2’s sleep completes, and the event scheduler passes control back to task2, which prints ‘worker_2 done’. task2 has finished and exits the event loop.
  9. The main task prints ‘awaited worker_2’. All coroutine tasks are complete, and the event loop ends.
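The steps above can be checked with a small sketch that records each event in a list instead of printing it directly. The shared worker function, the events list, and the shortened delays of 0.1 and 0.2 seconds are assumptions made for this illustration; the ordering logic is the same as in the example being analyzed.

```python
import asyncio

events = []  # records the order in which things happen

async def worker(name, delay):
    events.append('{} start'.format(name))
    await asyncio.sleep(delay)  # gives control back to the event loop
    events.append('{} done'.format(name))

async def run():
    task1 = asyncio.create_task(worker('worker_1', 0.1))
    task2 = asyncio.create_task(worker('worker_2', 0.2))
    events.append('before await')  # the workers have not started yet
    await task1
    events.append('awaited worker_1')
    await task2
    events.append('awaited worker_2')

asyncio.run(run())
for event in events:
    print(event)
```

The recorded order matches the nine steps: neither worker starts until the main task reaches its first await.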

Finally, let’s sum up what we’ve done today.

Coroutines differ from multithreading in two main ways. First, coroutines run in a single thread. Second, with coroutines the user decides where to hand over control and move on to the next task.
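The single-thread point can be observed directly. In this sketch, report_thread and main are hypothetical names; each task returns the identifier of the thread it ran on, using threading.get_ident from the standard library, and asyncio.gather collects the results.

```python
import asyncio
import threading

async def report_thread():
    await asyncio.sleep(0.05)  # let the tasks interleave
    return threading.get_ident()  # identifier of the thread running this coroutine

async def main():
    tasks = [asyncio.create_task(report_thread()) for _ in range(3)]
    return await asyncio.gather(*tasks)

main_thread = threading.get_ident()
thread_ids = asyncio.run(main())
# Every task reports the same thread: the one that called asyncio.run.
print(set(thread_ids) == {main_thread})
```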

Coroutine code is also more concise: combining async, await, and create_task is enough to handle small and medium-scale concurrency needs comfortably.

Feel free to leave a comment and share your thoughts.