Asynchronous I/O
A CPU is far faster than disk or network I/O. Within a thread, the CPU executes code extremely quickly, but as soon as it reaches an I/O operation, such as reading or writing a file or sending data over the network, it has to wait for that operation to finish before moving on. This is called synchronous I/O. While the I/O operation is in progress the current thread is suspended, so it cannot run any other code that needs the CPU.

Because a single I/O operation blocks the current thread, we have to use multiple threads or multiple processes to run code concurrently and serve multiple users. Each user is assigned a thread; if that thread is suspended on I/O, the other users’ threads are unaffected.

The multi-thread and multi-process models solve the concurrency problem, but the system cannot add threads indefinitely. Switching between threads is itself expensive, so once there are too many threads the CPU spends much of its time switching and less of it running actual code, and performance drops sharply.

The underlying problem is the severe mismatch between the CPU's execution speed and the slowness of I/O devices, and multi-threading and multi-processing are only one way to address it. The other is asynchronous I/O: when code needs to perform a slow I/O operation, it only issues the I/O request and, instead of waiting for the result, goes on to run other code. Some time later, when the I/O result is ready, the CPU is notified and processes it.

The asynchronous I/O model requires a message loop, in which the main thread repeats “read message – process message” over and over.
In the synchronous I/O model, the main thread can do nothing but stay suspended between “sending the I/O request” and “receiving the I/O completion”. In the asynchronous I/O model the main thread does not rest; it keeps processing other messages in the message loop. A single thread can therefore handle many I/O requests concurrently, with no thread switching. For most I/O-intensive applications, asynchronous I/O greatly improves the system's ability to multitask.
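As a rough illustration of the message loop described above, the sketch below uses a simulated clock instead of real devices. The function name `run_message_loop`, the request names, and the latencies are made up for this example; a real event loop would receive its completion notifications from the operating system.

```python
import heapq


def run_message_loop(requests):
    """Toy message loop over simulated I/O.

    `requests` is a list of (name, latency) pairs. Time is simulated:
    a request issued at time 0 completes at time `latency`.
    Returns a log of (event, name, time) tuples.
    """
    pending = []  # min-heap of (completion_time, name)
    log = []

    # Issue every I/O request up front, without waiting for any result.
    for name, latency in requests:
        heapq.heappush(pending, (latency, name))
        log.append(('issued', name, 0))

    # The message loop: read the next completion message, process it, repeat.
    while pending:
        now, name = heapq.heappop(pending)
        log.append(('completed', name, now))

    return log


log = run_message_loop([('disk', 3), ('network', 5), ('cache', 1)])
for event in log:
    print(event)
```

All three requests are in flight together, so the last completion arrives at simulated time 5, not at 3 + 5 + 1 = 9 as it would if a single thread waited for each request in turn.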
Coroutines
Coroutines are also known as microthreads or fibers.

Subroutines, or functions, are called hierarchically in every language: A calls B, B calls C while it runs, C returns, then B returns, and finally A finishes. Subroutine calls are therefore implemented with a stack, and a thread simply executes one subroutine. A subroutine call always has one entry and one return, and the order of calls is unambiguous.

Coroutines are invoked differently. A coroutine looks like a subroutine, but its execution can be interrupted partway through so that another subroutine runs, and it is resumed later at an appropriate time. Note that this means being interrupted inside one subroutine in order to execute another; it is not a function call, and is somewhat similar to a CPU interrupt.

For example, take two subroutines A and B. A can be interrupted partway through to run B, and B can in turn be interrupted to continue A, yet A's code contains no call to B. This is why coroutines are harder to understand than function calls. The interleaved execution of A and B looks a bit like multithreading, but the defining feature of coroutines is that everything is executed by one thread.

What advantages do coroutines have over multithreading? The biggest is execution efficiency. Switching between subroutines is not a thread switch; it is controlled by the program itself, so there is no thread-switching overhead, and the more threads a multithreaded design would need, the greater the coroutines' performance advantage. The second advantage is that no multithreaded locking is needed. Because there is only one thread, there are no conflicting simultaneous writes to a variable; shared resources can be accessed in a coroutine without locks, and checking state is enough, so execution is much more efficient than with multiple threads.

Since coroutines run in a single thread, how do you take advantage of a multi-core CPU?
The simplest approach is multiple processes plus coroutines, which makes full use of all cores while keeping the high efficiency of coroutines, and can achieve very high performance.

Python's support for coroutines is implemented through generators. In the following producer-consumer example, once the producer has produced a message it switches directly to the consumer; when the consumer has finished, control switches back to the producer, which continues producing. This is very efficient:
```python
def consumer():
    r = ''
    while True:
        # Hand the previous result back and wait for the next message.
        n = yield r
        if not n:
            return
        print('[CONSUMER] Consuming %s...' % n)
        r = '200 OK'

def produce(c):
    # Start the generator so that it runs up to its first yield.
    c.send(None)
    n = 0
    while n < 5:
        n = n + 1
        print('[PRODUCER] Producing %s...' % n)
        r = c.send(n)
        print('[PRODUCER] Consumer return: %s' % r)
    c.close()

c = consumer()
produce(c)
```
Execution Result:
[PRODUCER] Producing 1...
[CONSUMER] Consuming 1...
[PRODUCER] Consumer return: 200 OK
[PRODUCER] Producing 2...
[CONSUMER] Consuming 2...
[PRODUCER] Consumer return: 200 OK
[PRODUCER] Producing 3...
[CONSUMER] Consuming 3...
[PRODUCER] Consumer return: 200 OK
[PRODUCER] Producing 4...
[CONSUMER] Consuming 4...
[PRODUCER] Consumer return: 200 OK
[PRODUCER] Producing 5...
[CONSUMER] Consuming 5...
[PRODUCER] Consumer return: 200 OK
Notice that the consumer function is a generator. After a consumer is passed to produce:
- produce first calls c.send(None) to start the generator;
- then, whenever it has produced something, it switches to the consumer with c.send(n);
- the consumer receives the message through yield, processes it, and passes the result back through yield;
- produce receives the consumer's result and goes on to produce the next message;
- when produce decides to stop producing, it closes the consumer with c.close(), and the whole process ends.
The whole process is lock-free and executed by a single thread. produce and consumer cooperate to complete the task, which is why this is called a “coroutine”, as opposed to the preemptive multitasking of threads.
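The interleaving of two subroutines A and B described earlier in this section can be sketched with two generators and a small round-robin driver. The names `a`, `b`, `trace`, and `run_round_robin` are invented for this illustration; each `yield` marks a point where a subroutine gives up control voluntarily.

```python
trace = []  # records the order in which the two subroutines make progress


def a():
    for item in ('1', '2', '3'):
        trace.append(item)
        yield  # pause A here so that something else can run


def b():
    for item in ('x', 'y', 'z'):
        trace.append(item)
        yield  # pause B here


def run_round_robin(*generators):
    """Resume each generator in turn until all of them are exhausted."""
    active = list(generators)
    while active:
        for gen in list(active):
            try:
                next(gen)
            except StopIteration:
                active.remove(gen)


run_round_robin(a(), b())
print(' '.join(trace))  # prints "1 x 2 y 3 z"
```

Neither function calls the other, and everything runs in one thread; the interleaved order comes purely from where each generator yields.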
asyncio
asyncio is a standard library module introduced in Python 3.4, with built-in support for asynchronous I/O. Its programming model is a message loop: we get a reference to an EventLoop from the asyncio module and then hand the coroutines that need to run to that EventLoop, which gives us asynchronous I/O. The following code uses asyncio's asynchronous network connections to fetch the home pages of Sina, Sohu, and 163:
```python
import asyncio

# Generator-based coroutine style of Python 3.4. @asyncio.coroutine was
# deprecated in Python 3.8 and removed in Python 3.11.
@asyncio.coroutine
def wget(host):
    print('wget %s...' % host)
    connect = asyncio.open_connection(host, 80)
    reader, writer = yield from connect
    header = 'GET / HTTP/1.0\r\nHost: %s\r\n\r\n' % host
    writer.write(header.encode('utf-8'))
    yield from writer.drain()
    while True:
        line = yield from reader.readline()
        if line == b'\r\n':
            break
        print('%s header > %s' % (host, line.decode('utf-8').rstrip()))
    # Ignore the body, close the socket
    writer.close()

loop = asyncio.get_event_loop()
tasks = [wget(host) for host in ['www.sina.com.cn', 'www.sohu.com', 'www.163.com']]
loop.run_until_complete(asyncio.wait(tasks))
loop.close()
```
The result is as follows:
wget www.sohu.com...
wget www.sina.com.cn...
wget www.163.com...
www.sohu.com header > HTTP/1.1 200 OK
www.sohu.com header > Content-type: text/html
...
www.sina.com.cn header > HTTP/1.1 200 OK
www.sina.com.cn header > Date: Wed, 20 May 2015 04:56:33 GMT
...
www.163.com header > HTTP/1.0 302 Moved Temporarily
www.163.com header > Server: Cdn Cache Server V2.0
...
@asyncio.coroutine marks a generator as a coroutine, and we then hand that coroutine to the EventLoop to run. The yield from syntax lets us conveniently call another generator. Because the I/O operations used here (asyncio.open_connection(), writer.drain(), and reader.readline()) are themselves coroutines, the thread does not wait for them: it suspends the current coroutine and goes on to the next iteration of the message loop. When the yield from completes, the thread receives its return value and continues with the next statement. In the meantime, rather than waiting, the main thread runs whichever other coroutines in the EventLoop are ready, so the three coroutines wrapped in tasks are executed concurrently by a single thread.
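How yield from hands control to another generator and passes its return value back can be seen without asyncio at all. The following is a minimal sketch using plain generators; `inner` and `outer` are hypothetical names chosen for this illustration.

```python
def inner():
    # Pause and hand a value out; whatever is sent in becomes `received`.
    received = yield 'inner is waiting'
    # A generator's return value becomes the value of `yield from`.
    return received * 2


def outer():
    # Delegate to inner(); yielded values and send() calls pass straight through.
    result = yield from inner()
    yield 'outer got %s' % result


g = outer()
first = next(g)       # runs until inner() pauses at its yield
second = g.send(21)   # resumes inner(), which returns 42 to outer()
print(first)          # inner is waiting
print(second)         # outer got 42
```

The caller only ever talks to `outer`, yet the value it sends arrives inside `inner`. This pass-through is what lets an event loop drive a whole chain of nested coroutines from the outside.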
async/await
To simplify asynchronous I/O code and make it easier to recognize, Python 3.5 introduced the new async and await syntax, which makes coroutine code more concise and readable. Switching to the new syntax takes only two simple substitutions:
- replace @asyncio.coroutine with async;
- replace yield from with await.
Rewrite the code from the previous section with the new syntax as follows:
```python
import asyncio

async def wget(host):
    print('wget %s...' % host)
    connect = asyncio.open_connection(host, 80)
    reader, writer = await connect
    header = 'GET / HTTP/1.0\r\nHost: %s\r\n\r\n' % host
    writer.write(header.encode('utf-8'))
    await writer.drain()
    while True:
        line = await reader.readline()
        if line == b'\r\n':
            break
        print('%s header > %s' % (host, line.decode('utf-8').rstrip()))
    # Ignore the body, close the socket
    writer.close()

async def main():
    hosts = ['www.sina.com.cn', 'www.sohu.com', 'www.163.com']
    # Run the three coroutines concurrently and wait for all of them.
    await asyncio.gather(*(wget(host) for host in hosts))

asyncio.run(main())
```

The body of wget() is otherwise unchanged. Only the code that starts the event loop is updated here to asyncio.run() and asyncio.gather() (Python 3.7+), because asyncio.wait() no longer accepts bare coroutines as of Python 3.11.
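To see the single-threaded concurrency without depending on external web sites, the following sketch replaces the network calls with asyncio.sleep(). The coroutine name `fake_fetch` and the 0.2-second delay are assumptions made for this example, not part of the original code.

```python
import asyncio
import threading
import time


async def fake_fetch(name, delay):
    # asyncio.sleep() stands in for a slow network round trip.
    await asyncio.sleep(delay)
    # Report which thread resumed this coroutine.
    return name, threading.get_ident()


async def main():
    start = time.perf_counter()
    results = await asyncio.gather(
        fake_fetch('a', 0.2),
        fake_fetch('b', 0.2),
        fake_fetch('c', 0.2),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = asyncio.run(main())
names = [name for name, _ in results]
thread_ids = {thread_id for _, thread_id in results}
print(names)             # gather() keeps the order of its arguments
print(len(thread_ids))   # number of distinct threads that ran the coroutines
print('%.2f s' % elapsed)
```

The three waits overlap, so the elapsed time should be close to 0.2 seconds rather than 0.6, and all three coroutines report the same thread.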
aiohttp
asyncio implements TCP, UDP, SSL, and other protocols; aiohttp is an HTTP framework built on asyncio. Install aiohttp:
pip install aiohttp
Then write an HTTP server that handles the following URLs:
- / – the home page, returns b'<h1>Index</h1>';
- /hello/{name} – returns the text hello, %s! based on the URL parameter.
The code is as follows:
```python
import asyncio

from aiohttp import web

# Written against the early aiohttp API. Newer aiohttp releases deprecate
# the loop= argument and app.make_handler() in favor of application runners.

async def index(request):
    await asyncio.sleep(0.5)
    return web.Response(body=b'<h1>Index</h1>')

async def hello(request):
    await asyncio.sleep(0.5)
    text = '<h1>hello, %s!</h1>' % request.match_info['name']
    return web.Response(body=text.encode('utf-8'))

async def init(loop):
    app = web.Application(loop=loop)
    app.router.add_route('GET', '/', index)
    app.router.add_route('GET', '/hello/{name}', hello)
    srv = await loop.create_server(app.make_handler(), '127.0.0.1', 8000)
    print('Server started at http://127.0.0.1:8000...')
    return srv

loop = asyncio.get_event_loop()
loop.run_until_complete(init(loop))
loop.run_forever()
```
Note that aiohttp's initialization function init() is also a coroutine, and loop.create_server() uses asyncio to create the TCP service.
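To show the kind of TCP service that sits underneath an HTTP framework, here is a rough sketch of the same two routes served with nothing but asyncio's stream API. It is an illustration only, not how aiohttp is implemented: the helper names `route`, `handle`, and `fetch` are invented here, HTTP parsing is reduced to reading the request line, and the server binds to an OS-chosen port so that the example can run and exit on its own.

```python
import asyncio


def route(path):
    """Map a request path to (status, body bytes)."""
    if path == '/':
        return '200 OK', b'<h1>Index</h1>'
    prefix = '/hello/'
    if path.startswith(prefix) and len(path) > len(prefix):
        name = path[len(prefix):]
        return '200 OK', ('<h1>hello, %s!</h1>' % name).encode('utf-8')
    return '404 Not Found', b'<h1>Not Found</h1>'


async def handle(reader, writer):
    # The request line looks like: GET /hello/Bob HTTP/1.0
    request_line = (await reader.readline()).decode('utf-8')
    parts = request_line.split()
    path = parts[1] if len(parts) >= 2 else '/'
    # Skip the remaining request headers up to the blank line (or EOF).
    while (await reader.readline()) not in (b'\r\n', b''):
        pass
    status, body = route(path)
    head = ('HTTP/1.0 %s\r\n'
            'Content-Type: text/html\r\n'
            'Content-Length: %d\r\n\r\n' % (status, len(body)))
    writer.write(head.encode('utf-8') + body)
    await writer.drain()
    writer.close()


async def fetch(port, path):
    """Tiny client, used only to exercise the server in this sketch."""
    reader, writer = await asyncio.open_connection('127.0.0.1', port)
    writer.write(('GET %s HTTP/1.0\r\n\r\n' % path).encode('utf-8'))
    await writer.drain()
    response = await reader.read()  # read until the server closes the socket
    writer.close()
    return response.split(b'\r\n\r\n', 1)[1]  # keep only the body


async def main():
    # Port 0 asks the operating system for any free port.
    server = await asyncio.start_server(handle, '127.0.0.1', 0)
    async with server:
        port = server.sockets[0].getsockname()[1]
        return await asyncio.gather(fetch(port, '/'),
                                    fetch(port, '/hello/Bob'))


bodies = asyncio.run(main())
print(bodies)
```

Both requests are served by one thread. A framework such as aiohttp adds full HTTP parsing, routing, and response objects on top of this kind of asyncio TCP service.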