Python has long supported concurrent programming: multithreading through the threading module, multiprocessing through the multiprocessing module, and, later, yield-based coroutines. Recent versions of Python have greatly improved the way coroutines are written, and many of the older coroutine styles are no longer officially recommended. If you learned about Python coroutines some time ago, it is worth reviewing the current usage.

Concurrent, parallel, synchronous, and asynchronous

Concurrency means that a single CPU handles multiple programs over the same period of time, but runs only one of them at any given instant. The core of concurrency is program switching.

Because switching between programs is very fast, many switches can happen within one second, and the user does not notice them.

Parallelism means that multiple CPUs process multiple programs at the same instant, so several programs really do run simultaneously.

Synchronous: when an I/O operation is performed, the program waits until the operation completes and its result is returned before continuing. Asynchronous: an I/O operation can be started without waiting for its result, and the program can do other work in the meantime.
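The difference shows up in the order in which things happen. The following is a minimal sketch, not taken from the original article: slow_io is a hypothetical stand-in for a real I/O call, simulated with asyncio.sleep, and asyncio.create_task (covered later in this article) is used to start the operation without waiting for it.

```python
import asyncio

async def slow_io(log):
    """Hypothetical I/O operation, simulated with a short sleep."""
    log.append("io started")
    await asyncio.sleep(0.1)
    log.append("io finished")
    return "data"

async def synchronous_style():
    """Wait for the I/O result before doing anything else."""
    log = []
    await slow_io(log)
    log.append("other work")
    return log

async def asynchronous_style():
    """Start the I/O, do other work, and collect the result afterwards."""
    log = []
    pending = asyncio.create_task(slow_io(log))
    await asyncio.sleep(0)      # yield once so the task gets a chance to start
    log.append("other work")
    await pending               # now wait for the result
    return log

print(asyncio.run(synchronous_style()))
print(asyncio.run(asynchronous_style()))
```

In the synchronous style, "other work" is logged only after "io finished". In the asynchronous style, it is logged between "io started" and "io finished".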

Coroutines, threads and processes

Multiprocessing typically takes advantage of multi-core CPUs to run several computing tasks simultaneously. Each process has its own independent memory, so exchanging data between processes is cumbersome.

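Multithreading creates multiple subtasks within a single process; when one subtask is idle, for example while waiting on I/O, another takes over. In standard CPython the global interpreter lock lets only one thread execute Python code at a time, so the threads effectively share a single CPU, and the moment of each switch is decided by the interpreter and the operating system rather than by the program's own code. Memory is shared between threads, so no additional communication mechanism is needed. However, shared data brings synchronization problems, so locks are required.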
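A minimal sketch of shared memory protected by a lock, using the standard threading module. The names counter, counter_lock and add_many are illustrative and do not come from the original article.

```python
import threading

counter = 0                       # shared by all threads
counter_lock = threading.Lock()   # guards every update to counter

def add_many(times):
    """Increment the shared counter `times` times, one locked update at a time."""
    global counter
    for _ in range(times):
        with counter_lock:        # only one thread may update at a time
            counter += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)
```

With the lock in place, the four threads always add up to 40000. Without it, the read-modify-write in counter += 1 is not guaranteed to be atomic, so updates could be lost.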

Coroutines run inside a single thread, which works rather like a pipeline: one coroutine runs until it has to wait, then hands over to the next. Because switching between threads is comparatively costly, coroutines are often preferred for I/O-bound concurrent programming.
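The single-thread behaviour can be observed directly. This is a small sketch under stated assumptions: worker and main are hypothetical names, and asyncio.sleep(0) is used only to hand control back to the event loop at a known point.

```python
import asyncio
import threading

async def worker(name, steps, log):
    """Record each step together with the id of the thread it ran on."""
    for step in range(steps):
        log.append((name, step, threading.get_ident()))
        await asyncio.sleep(0)    # hand control back to the event loop

async def main():
    log = []
    await asyncio.gather(worker("A", 2, log), worker("B", 2, log))
    return log

log = asyncio.run(main())
order = [(name, step) for name, step, _ in log]
thread_ids = {ident for _, _, ident in log}

print(order)
print(len(thread_ids))
```

The two workers take turns, giving [('A', 0), ('B', 0), ('A', 1), ('B', 1)], and only one thread id is recorded: the switching happens at each await, inside one thread.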

Basic use of coroutines

The following shows the basic use of coroutines as of Python 3.7. This syntax is now stable, and the earlier syntax is no longer recommended.

import asyncio
import time

async def visit_url(url, response_time):
    """Simulate visiting a URL."""
    # asyncio.sleep stands in for the time a real request would take
    await asyncio.sleep(response_time)
    return f"Visited {url}, result returned"

start_time = time.perf_counter()
task = visit_url('http://wangzhen.com', 2)
asyncio.run(task)
print(f"Elapsed time: {time.perf_counter() - start_time}")

  • 1. Add the async keyword in front of an ordinary function definition to turn it into a coroutine function.
  • 2. await means waiting at that point for the awaited coroutine to finish before continuing. (When several coroutines run concurrently, await hands control back to the event loop, which can schedule other coroutines in the meantime.) await can only be used inside functions defined with the async keyword.
  • 3. Use asyncio.run() to run the program.
  • 4. This program takes about 2 s.
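Points 1 to 3 can be checked with a small sketch. The function add is a hypothetical example, not part of the original article. It shows that calling a coroutine function only creates a coroutine object, and that nothing runs until asyncio.run() drives it.

```python
import asyncio

async def add(a, b):
    await asyncio.sleep(0)    # stand-in for real asynchronous work
    return a + b

coro = add(1, 2)                            # nothing has run yet
is_coroutine = asyncio.iscoroutine(coro)    # the call returned a coroutine object
result = asyncio.run(coro)                  # run it to completion, get its return value

print(is_coroutine, result)
```

This prints True 3. The return value of the coroutine is what asyncio.run() returns.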

Adding more coroutines

Add another task:

task2 = visit_url('http://another.com', 3)
asyncio.run(task2)

Run one after the other, the two tasks take about 5 s in total, so this does not take advantage of concurrency. Complete code:

import asyncio
import time

async def visit_url(url, response_time):
    """Simulate visiting a URL."""
    await asyncio.sleep(response_time)
    return f"Visited {url}, result returned"

async def run_task():
    """Collect the subtasks and run them one after the other."""
    task = visit_url('http://wangzhen.com', 2)
    task_2 = visit_url('http://another.com', 3)
    # asyncio.run() cannot be called inside a running event loop, so await directly
    await task
    await task_2

start_time = time.perf_counter()
asyncio.run(run_task())
print(f"Elapsed time: {time.perf_counter() - start_time}")

With concurrency, the program takes only about 3 s, which is the wait time of the longer task. To run the coroutines concurrently, the code above needs to change: asyncio.gather schedules the two coroutines as subtasks, and the event loop switches between them whenever one of them reaches an await.

async def run_task():
    """Collect the subtasks and run them concurrently."""
    task1 = visit_url('http://wangzhen.com', 2)
    task2 = visit_url('http://another.com', 3)
    await asyncio.gather(task1, task2)
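asyncio.gather also hands back the return values. The sketch below is an illustration rather than the article's own code: this visit_url variant takes an extra finished list to record completion order, the example URLs are placeholders, and the delays are shortened.

```python
import asyncio

async def visit_url(url, response_time, finished):
    """Simulate a request and record when it completes."""
    await asyncio.sleep(response_time)
    finished.append(url)
    return f"Visited {url}"

async def main():
    finished = []
    results = await asyncio.gather(
        visit_url("http://slow.example", 0.2, finished),
        visit_url("http://fast.example", 0.1, finished),
    )
    return results, finished

results, finished = asyncio.run(main())
print(results)     # in the order the coroutines were passed to gather
print(finished)    # in the order they actually completed
```

The fast request finishes first, yet results still lists the slow one first: gather returns results in argument order, not completion order.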

create_task

In addition to asyncio.gather, subtasks can also be created with asyncio.create_task.

async def run_task():
    coro = visit_url('http://wangzhen.com', 2)
    coro_2 = visit_url('http://another.com', 3)

    # Wrapping a coroutine in a task schedules it on the event loop
    task1 = asyncio.create_task(coro)
    task2 = asyncio.create_task(coro_2)

    await task1
    await task2
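It is the create_task call, not the await, that makes the two subtasks overlap. The following sketch compares both forms under stated assumptions: the delays are shortened to 0.1 s and 0.2 s, and the function names sequential and with_tasks as well as the example URLs are hypothetical.

```python
import asyncio
import time

async def visit_url(url, response_time):
    await asyncio.sleep(response_time)
    return url

async def sequential():
    """Await the coroutines directly: the second starts after the first ends."""
    start = time.perf_counter()
    await visit_url("http://a.example", 0.1)
    await visit_url("http://b.example", 0.2)
    return time.perf_counter() - start

async def with_tasks():
    """Create both tasks first, so both are scheduled before either is awaited."""
    start = time.perf_counter()
    task1 = asyncio.create_task(visit_url("http://a.example", 0.1))
    task2 = asyncio.create_task(visit_url("http://b.example", 0.2))
    await task1
    await task2
    return time.perf_counter() - start

sequential_time = asyncio.run(sequential())
task_time = asyncio.run(with_tasks())
print(f"sequential: {sequential_time:.2f}s, tasks: {task_time:.2f}s")
```

The sequential version should take roughly the sum of the two delays, while the task version should take roughly the longer of the two. Exact timings vary from machine to machine.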

The main usage scenarios of coroutines

Coroutines are mainly suited to I/O-bound tasks. Some common use cases are summarized below:

  • Network requests, such as web crawlers, which make heavy use of aiohttp
  • File reading: aiofile
  • Web frameworks: aiohttp, FastAPI
  • Database queries: asyncpg, databases

Further learning directions (next article)

  • When to use coroutines, when to use multithreading, and when to use multiprocessing
  • The Future object
  • The low-level asyncio API
  • The event loop
  • Using the third-party Trio library
