At a meeting a few days ago, the department lead brought up a problem the project ran into over the weekend: faced with a large number of HTTP requests, the service was set up as a distributed, multi-process, multi-threaded accept-and-respond system, but the request volume was not forecast well, so a large number of requests got blocked and the online business was affected.
Solution: of course, more servers and more threads can relieve the pressure, but keeping extra servers around for occasional spikes is a waste.
So the boss brought up this thing called coroutines. To be honest, I had never really looked into them; I only roughly knew what a process, a thread, and a coroutine were…
After the meeting I went off to study them, and I'd like to share some of what I learned with everyone.
1. Process, thread, coroutine
A process is a running instance of an application, with its own code, open file resources, data resources, and a separate memory space.
Threads belong to a process and are the actual executors of the program. A process contains at least a main thread and may have more child threads; each thread has its own stack space.
Coroutines are more lightweight than threads. Just as a process can have multiple threads, a thread can have multiple coroutines.
In simple terms, a process is a factory: it has the floor space and the assembly lines. A thread is one assembly line, and threads can communicate with each other while they work. Coroutines are the workers on an assembly line. A coroutine works like this: on one line there are several jobs, say the conveyor belt and the packaging station. While waiting for the conveyor belt to deliver the next item, a worker can do some packaging, and as soon as the conveyor belt arrives, he switches back to it. In other words, two functions can execute alternately: while one function is waiting, the other one can run first.
The problem in the project above is that a thread performs accept -> response as a single blocking step: it only accepts the next request after the current response is finished. With a coroutine, it could accept request B while still waiting for the response to request A, which improves efficiency a lot.
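As a rough simulation of that blocking pattern (the one-second sleep just stands in for waiting on a slow downstream response; this is not the project's real code):

import time

def handle(request):
    time.sleep(1)  # stands in for waiting on a slow response
    return 'response to {}'.format(request)

def serve(requests):
    # each request is handled start to finish; B cannot even be accepted until A's wait is over
    for r in requests:
        print(handle(r))

serve(['A', 'B', 'C'])  # takes about 3 seconds, strictly one after another

In the gevent example further down, the wait inside such a handler becomes exactly the point where another request can be picked up.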
2. Using yield
The code below is adapted from https://www.liaoxuefeng.com/wiki/897692888725344/923057403198272 for reference.
import time

def consumer():
    r = ''
    while True:
        n = yield r  # receive the next item from the producer
        if not n:
            return
        print('[CONSUMER] Consuming %s...' % n)
        time.sleep(5)
        r = '200 OK'

def produce(c):
    c.send(None)  # prime the generator (c.next() only exists in Python 2)
    n = 0
    while n < 5:
        n = n + 1
        print('[PRODUCER] Producing %s...' % n)
        r = c.send(n)  # hand n to the consumer and wait for its reply
        print('[PRODUCER] Consumer return: %s' % r)
    c.close()

c = consumer()
produce(c)
Python coroutines are implemented primarily with yield, and the simple example above is a producer and a consumer. When the producer produces a message, it send()s it to the consumer to be consumed; after consuming it, the consumer hands the result back to the producer through the same yield, and the producer then produces the next message.
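With the two driver lines at the end of the block above, the run should print something along these lines, the producer and consumer strictly taking turns (each round pausing for the consumer's five-second sleep):

[PRODUCER] Producing 1...
[CONSUMER] Consuming 1...
[PRODUCER] Consumer return: 200 OK
[PRODUCER] Producing 2...
[CONSUMER] Consuming 2...
[PRODUCER] Consumer return: 200 OK

and so on up to 5.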
3. Using gevent coroutines
Install it directly with pip: pip install gevent
from gevent import monkey
monkey.patch_all()  # patch time/socket/etc. so blocking calls yield to other greenlets

import time
import gevent

def play(name):
    for i in range(5):
        print('play {} {}'.format(name, i))
        time.sleep(1.5)  # patched: sleeping here switches to the other greenlet

def play_1(name):
    for i in range(5):
        print('play_1 {} {}'.format(name, i))
        time.sleep(3)

if __name__ == '__main__':
    g1 = gevent.spawn(play, 'test1')
    g2 = gevent.spawn(play_1, 'test2')
    gevent.joinall([g1, g2])
monkey.patch_all() is added to replace the blocking pieces of the standard library (threads, sockets, time.sleep, and so on) with gevent-friendly versions. That way we can keep using sockets as usual, without changing any code, and they become non-blocking. Alternatively, you can import gevent.socket manually.
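A minimal sketch of what the patched sockets give you, assuming any reachable HTTP endpoint (http://example.com below is only a placeholder):

from gevent import monkey
monkey.patch_all()  # must run before the network calls are made

import gevent
from urllib.request import urlopen

def fetch(url):
    # looks blocking, but with patched sockets the network wait yields to other greenlets
    resp = urlopen(url)
    print(url, resp.status)

if __name__ == '__main__':
    urls = ['http://example.com'] * 3  # placeholder URLs
    gevent.joinall([gevent.spawn(fetch, u) for u in urls])

The three requests then overlap instead of running one after another, which is the same idea the accept/response loop in the project needed.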
The next example adds a task that reads an Excel file, which takes about 5 seconds to load:
from gevent import monkey; monkey.patch_all()
import time
import gevent
import pandas as pd

def eat():
    for i in range(5):
        print('open_read {}'.format(i))
        df = pd.read_excel('test.xlsx')  # the slow (~5 s) file read mentioned above
        print('len ', len(df))
        time.sleep(2)

def play(name):
    for i in range(5):
        print('play {} {}'.format(name, i))
        time.sleep(1.5)

if __name__ == '__main__':
    # run both greenlets together, same pattern as the previous example
    gevent.joinall([gevent.spawn(eat), gevent.spawn(play, 'test2')])
The result is as follows
4. Summary
In simple terms, a process can run multiple threads, and on a multi-core CPU those threads can run in parallel. Inside a single thread you can open multiple coroutines, but coroutines are always serial: no matter what, only one coroutine in a thread is running at any given moment, and the other functions are suspended.
The main reason coroutines are so efficient is that switching between them is decided by the program itself, and the switch happens entirely inside the program, in user space. This is different from threads, which rely on the operating system to do the switching.
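A toy illustration of that user-space switching, using nothing but plain generators (this is only a sketch of the idea, not how gevent is implemented):

def task(name, steps):
    for i in range(steps):
        print(name, 'step', i)
        yield  # the task itself decides where it hands control back

def run(tasks):
    # a tiny round-robin loop: every switch happens right here in our own code,
    # without asking the operating system to schedule anything
    while tasks:
        t = tasks.pop(0)
        try:
            next(t)
            tasks.append(t)
        except StopIteration:
            pass

run([task('A', 3), task('B', 3)])

A and B print alternately, yet everything runs in a single ordinary thread; the "scheduler" is just our own while loop.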
I am an ant moving forward, and I hope we can move forward together.
If this helped you even a little, a like is enough. Thank you!
Note: if there are any mistakes or suggestions about this blog, please point them out. Thank you very much!