Background

Interviewers like to ask about multithreading and coroutines. Why is this topic so popular? First, it is relatively abstract and not particularly easy to learn. Second, these questions reveal how well you understand programming. Third, the technique is critical to program optimization. Given these points, it is not hard to see why interviewers like to test it.

Python has also kept improving its support for concurrent programming: beyond the familiar multithreading (the threading module) and multiprocessing, it gradually introduced yield-based coroutines. Recent versions of Python have greatly improved the way coroutines are written, and many of the earlier ways of writing coroutines are no longer officially recommended.

Coroutines in Python went through roughly three stages:

  • The original generator-based variant, yield/send
  • yield from
  • The async/await keywords, introduced in Python 3.5

This article starts directly with the async/await syntax of Python 3.5. How Python implemented coroutines before the async/await keywords is not covered here; interested readers can look into it separately.

Content

Synchronous and asynchronous

Synchronous means that when code starts an I/O operation, the call does not return until the operation completes. Multiple tasks must run in sequence: a task can start only after the previous one has finished.

Asynchronous means that when code starts an I/O operation, the call returns without waiting for the operation to complete. Multiple tasks can be in progress at the same time, and they do not have to finish in any particular order.
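
The difference can be sketched in a few lines. This is only an illustration under stated assumptions: `sync_io`, `async_io`, `run_sync` and `run_async` are hypothetical names, `time.sleep` and `asyncio.sleep` stand in for real I/O, and `asyncio.gather` (covered later in this article) is used to start both asynchronous waits together.

```python
import asyncio
import time


def sync_io(name, delay):
    """Blocking call: the caller is stuck until the simulated I/O finishes."""
    time.sleep(delay)  # stand-in for a blocking I/O operation
    return name


async def async_io(name, delay):
    """Non-blocking call: while this waits, other coroutines may run."""
    await asyncio.sleep(delay)  # stand-in for a non-blocking I/O operation
    return name


def run_sync():
    """Run two simulated I/O operations one after the other."""
    start = time.perf_counter()
    results = [sync_io("a", 0.2), sync_io("b", 0.2)]
    return results, time.perf_counter() - start


async def _both():
    # Both waits are in progress at the same time.
    return await asyncio.gather(async_io("a", 0.2), async_io("b", 0.2))


def run_async():
    """Run the same two operations asynchronously."""
    start = time.perf_counter()
    results = asyncio.run(_both())
    return results, time.perf_counter() - start


sync_results, sync_elapsed = run_sync()
async_results, async_elapsed = run_async()
print(f"sync:  {sync_results} in {sync_elapsed:.2f}s")
print(f"async: {async_results} in {async_elapsed:.2f}s")
```

The synchronous version should take roughly the sum of the two delays, while the asynchronous version should take roughly the longer of the two.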

Parallelism and concurrency

Take a single-core CPU as an example. A single core can do only one thing at a time. So how can several programs appear to run at the same time on such a computer?

The CPU is given to QQ for a short period of time, then to WeChat for a short period of time, and so on; in the end all of the programs seem to run together. This CPU scheduling strategy is called time slice rotation.

Parallel: there are no fewer CPU cores than currently running tasks, so the tasks really run at the same moment.

Concurrent: there are fewer CPU cores than current tasks, so the tasks take turns using the CPU.
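
Time slice rotation can be simulated with a toy scheduler. This is a simplified sketch, not how an operating system is implemented: `program` and `round_robin` are hypothetical names, each `yield` marks the end of one time slice, and the programs hand the CPU back voluntarily, whereas a real operating system interrupts them with a timer.

```python
from collections import deque


def program(name, steps):
    """A hypothetical program that needs `steps` time slices to finish."""
    for i in range(1, steps + 1):
        yield f"{name} step {i}"  # one time slice is used up here


def round_robin(programs):
    """Run one slice of each program in turn on a single simulated CPU."""
    queue = deque(programs)
    timeline = []
    while queue:
        current = queue.popleft()
        try:
            timeline.append(next(current))  # run the program for one slice
            queue.append(current)  # not finished: back to the end of the queue
        except StopIteration:
            pass  # finished: the program leaves the queue
    return timeline


timeline = round_robin([program("QQ", 2), program("WeChat", 3)])
print(timeline)
# ['QQ step 1', 'WeChat step 1', 'QQ step 2', 'WeChat step 2', 'WeChat step 3']
```

Only one program runs at any instant, yet the timeline shows both making progress, which is what concurrency on a single core looks like.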

Coroutines, threads versus processes

  1. Suppose there is one production line for making scissors. Hiring more workers for this line, so that they make scissors together, multiplies the output. This is the single-process, multi-threaded approach.

  2. The boss found that more workers on one production line is not always better, because the resources and materials of a single line are limited. So the boss spent money on another production line and hired more workers for it, which improved efficiency further. This is the multi-process, multi-threaded approach.

  3. The boss found that there were already many production lines, each with many workers. To improve efficiency yet again, the boss came up with a sneaky trick: whenever a worker has nothing to do for the moment because they are waiting for some condition (for example, waiting for another worker to finish the previous step), they use that time to do other work. In other words, if a thread is waiting for some condition, it can use the waiting time to do something else. This is the coroutine approach.
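
The third point, one thread doing other work while it waits, can be checked with a small sketch. The names `worker` and `main` are hypothetical, and `asyncio.sleep` stands in for "waiting for some condition". Each coroutine records the identifier of the thread it runs on.

```python
import asyncio
import threading


async def worker(name, wait):
    """A worker that waits; during the wait the thread serves other workers."""
    await asyncio.sleep(wait)
    return name, threading.get_ident()


async def main():
    return await asyncio.gather(
        worker("w1", 0.1),
        worker("w2", 0.1),
        worker("w3", 0.1),
    )


results = asyncio.run(main())
thread_ids = {thread_id for _, thread_id in results}
print(len(thread_ids))  # 1: all three coroutines ran on the same thread
```

Unlike the first two approaches, no extra worker (thread) or production line (process) was added here; the single thread simply switched between coroutines while each one was waiting.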

Basic use of coroutines

import asyncio
import time


async def visit_url(url, response_time):
    """Access the url."""
    # asyncio.sleep stands in for the time the server takes to respond
    await asyncio.sleep(response_time)
    return f"Access {url}, return result"


start_time = time.perf_counter()
task = visit_url('http://wangzhen.com', 2)
asyncio.run(task)
print(f"Elapsed time: {time.perf_counter() - start_time}")

  • Add the async keyword in front of a normal function definition to turn it into a coroutine function.
  • await means execution waits at this point for the awaited coroutine to complete before continuing. (In concurrent operation, it hands control back to the event loop, which can then schedule other coroutines.) await can only be used inside functions defined with the async keyword.
  • asyncio.run() (available since Python 3.7) runs the program.
  • This program takes about 2s.
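
One detail worth seeing in code: calling a coroutine function does not run its body, it only creates a coroutine object. A minimal sketch, reusing visit_url from above with a shortened delay (0.1s is an assumption chosen to keep the example quick):

```python
import asyncio
import inspect


async def visit_url(url, response_time):
    """Access the url."""
    await asyncio.sleep(response_time)
    return f"Access {url}, return result"


# Calling the function only builds a coroutine object; nothing has run yet.
coro = visit_url('http://wangzhen.com', 0.1)
print(inspect.iscoroutine(coro))  # True

# asyncio.run drives the coroutine to completion and returns its result.
result = asyncio.run(coro)
print(result)  # Access http://wangzhen.com, return result
```

The return value of asyncio.run() is the value returned by the coroutine, which the basic example above simply discards.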

Adding more coroutines

Add another task:

task2 = visit_url('http://another.com', 3)
asyncio.run(task2)


The two runs take about 5s in total, so this does not take advantage of concurrent programming.

import asyncio
import time


async def visit_url(url, response_time):
    """Access the url."""
    await asyncio.sleep(response_time)
    return f"Access {url}, return result"


async def run_task():
    """Collect subtasks"""
    task = visit_url('http://wangzhen.com', 2)
    task_2 = visit_url('http://another', 3)
    # Awaiting the coroutines one after the other still runs them in sequence.
    # (asyncio.run cannot be called from inside a running event loop.)
    await task
    await task_2


start_time = time.perf_counter()
asyncio.run(run_task())
print(f"Elapsed time: {time.perf_counter() - start_time}")


With concurrent programming, the program takes only about 3s, which is the wait time of the second task. To run the two coroutines concurrently, the code above needs to change: asyncio.gather creates two subtasks, and the event loop switches between them whenever one of them reaches an await.

async def run_task():
    """Collect subtasks"""
    task = visit_url('http://wangzhen.com', 2)
    task_2 = visit_url('http://another', 3)
    # gather wraps both coroutines in tasks and runs them concurrently
    await asyncio.gather(task, task_2)

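
For reference, here is a complete version of the gather example that can be run on its own. It is a sketch with two assumptions: the delays are scaled down to 0.2s and 0.3s so that it finishes quickly, and run_task returns the gathered results so that they can be inspected.

```python
import asyncio
import time


async def visit_url(url, response_time):
    """Access the url."""
    await asyncio.sleep(response_time)
    return f"Access {url}, return result"


async def run_task():
    """Collect subtasks and return their results."""
    # gather returns a list of results in the order the awaitables were passed
    return await asyncio.gather(
        visit_url('http://wangzhen.com', 0.2),
        visit_url('http://another.com', 0.3),
    )


start_time = time.perf_counter()
results = asyncio.run(run_task())
elapsed = time.perf_counter() - start_time
print(results)
print(f"Elapsed time: {elapsed:.2f}s")
```

The elapsed time should be close to the longer delay (0.3s) rather than the sum (0.5s), and the results come back in argument order even though the first task finishes earlier.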

create_task

In addition to asyncio.gather, subtasks can also be created with asyncio.create_task.

async def run_task():
    coro = visit_url('http://wangzhen.com', 2)
    coro_2 = visit_url('http://another.com', 3)

    task1 = asyncio.create_task(coro)
    task2 = asyncio.create_task(coro_2)

    await task1
    await task2

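
The reason the two awaits above do not run one after the other is that create_task schedules a coroutine on the event loop immediately; await only waits for a task that is already in progress. A small runnable sketch (the shortened delays and the returned tuple are assumptions made for illustration):

```python
import asyncio


async def visit_url(url, response_time):
    """Access the url."""
    await asyncio.sleep(response_time)
    return f"Access {url}, return result"


async def run_task():
    # Both tasks are scheduled as soon as they are created.
    task1 = asyncio.create_task(visit_url('http://wangzhen.com', 0.1))
    task2 = asyncio.create_task(visit_url('http://another.com', 0.2))

    done_before = (task1.done(), task2.done())  # scheduled, not yet finished
    first = await task1   # task2 keeps running while we wait here
    second = await task2
    done_after = (task1.done(), task2.done())
    return done_before, done_after, [first, second]


done_before, done_after, results = asyncio.run(run_task())
print(done_before)  # (False, False)
print(done_after)   # (True, True)
print(results)
```

A Task object also lets you check its state with done() or read its value with result() once it has finished, which a bare coroutine object does not offer.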

The main usage scenarios of coroutines

Coroutines are mainly used for I/O-intensive tasks. Several common application scenarios are summarized as follows:

  • Network requests, such as crawlers, which make heavy use of aiohttp
  • File reading and writing: aiofile
  • Web frameworks: aiohttp, FastAPI
  • Database queries: asyncpg, databases
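
To see why I/O-intensive work benefits so much, here is a sketch of a crawler-style workload. It does not use any of the libraries listed above: `fetch` is a hypothetical stand-in that simulates a 0.1s network round trip with asyncio.sleep, the URLs are made up, and the status code is hard-coded.

```python
import asyncio
import time


async def fetch(url):
    """Hypothetical request; a real crawler would call an async HTTP client."""
    await asyncio.sleep(0.1)  # simulated network latency
    return url, 200  # hard-coded status, for illustration only


async def crawl(urls):
    """Fetch all urls concurrently and map each url to its status."""
    pages = await asyncio.gather(*(fetch(url) for url in urls))
    return dict(pages)


urls = [f"http://example.com/page/{i}" for i in range(20)]
start = time.perf_counter()
statuses = asyncio.run(crawl(urls))
elapsed = time.perf_counter() - start
print(f"{len(statuses)} pages in {elapsed:.2f}s")
```

Fetched one at a time, the 20 simulated requests would take about 2s; run concurrently, they should finish in roughly the time of a single request, because almost all of the time is spent waiting rather than computing.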