Background

Multithreading plays an important role in almost every programming language. What is multithreading? In everyday life we can watch a TV show while cooking, or hum a song while taking a shower. For simple tasks like these we are naturally able to split our attention; it is something our brains are built to do.

However, for complex tasks that demand precision, we still have to do one thing at a time. If you doubt it, try telling someone a joke while they are reciting a text from memory.

The same is true for multithreading on a computer. Because the computer runs very fast, and the operating system uses time slicing together with various scheduling algorithms, every program gets its turn to execute, and the delay between turns is too short for us to notice. For example, if several programs each get to run more than 200 times within a single second, we perceive each of them as running continuously.

So why do crawlers need multiple threads? One of the main reasons is efficiency. Without threads, crawling is synchronous: when crawling pictures, for example, we must finish downloading one before starting the next. Multithreading lets you crawl multiple pages at the same time. But every coin has two sides. Going too fast is not always a good thing, since a flood of requests can overload the target site or get the crawler blocked.
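To make the efficiency argument concrete, here is a minimal sketch that compares sequential and threaded "crawling". It makes no real network requests: `fetch_page` is a hypothetical stand-in that sleeps for 0.2 seconds to simulate waiting on a server, and the page names are made up.

```python
import threading
import time

def fetch_page(page, results):
    # Hypothetical stand-in for a real download: sleeping simulates network wait.
    time.sleep(0.2)
    results[page] = 'content of %s' % page

def crawl_sequential(pages):
    # Fetch the pages one after another.
    results = {}
    for page in pages:
        fetch_page(page, results)
    return results

def crawl_threaded(pages):
    # Fetch every page in its own thread.
    results = {}
    threads = [threading.Thread(target=fetch_page, args=(p, results)) for p in pages]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait until every simulated download has finished
    return results

if __name__ == '__main__':
    pages = ['page-%d' % i for i in range(5)]  # made-up page names
    for crawl in (crawl_sequential, crawl_threaded):
        start = time.perf_counter()
        results = crawl(pages)
        elapsed = time.perf_counter() - start
        print('%s: %d pages in %.2f s' % (crawl.__name__, len(results), elapsed))
```

The sequential version needs roughly five times 0.2 seconds, while the threaded version finishes in roughly the time of a single wait, because the threads spend almost all of their time waiting rather than computing.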

Creation of multiple threads

  • Created by a function
import time
import threading

def demo():
    print('hello ')

if __name__ == '__main__':
    for i in range(5):
        t = threading.Thread(target=demo)
        time.sleep(1)
        t.start()

  • Created by a class
import threading
class A(threading.Thread):

    def run(self) -> None:
        for i in range(5):
            print(i)

if __name__ == '__main__':
    a = A()
    a.start()

Viewing the number of threads

To check how many threads are alive, we can call threading.enumerate(), which returns a list of all Thread objects that are currently alive. The following code demonstrates this.

import threading
import time

def demo1():
    for i in range(5):
        print('demo1--%d'%i)
        time.sleep(1)

def demo2():
    for i in range(10):
        print('demo2--%d' % i)
        time.sleep(1)

def main():
    t1 = threading.Thread(target=demo1)
    t2 = threading.Thread(target=demo2)
    t1.start()
    t2.start()
    while True:
        print(threading.enumerate())
        if len(threading.enumerate()) <= 1:
            break
        time.sleep(1)

if __name__ == '__main__':
    main()

Output

While thread 1 is still alive, there are three threads: thread 1, thread 2, and the main thread. Once thread 1 finishes, only thread 2 and the main thread are left. Finally thread 2 finishes as well, and only the main thread remains.
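Polling in a loop works for a demonstration, but a possible alternative is to wait for the threads with join() and read the count with threading.active_count(). The sketch below assumes a made-up `worker` function whose short sleep stands in for real work.

```python
import threading
import time

def worker(seconds):
    # Hypothetical task: sleeping stands in for real work.
    time.sleep(seconds)

def count_threads_while_running():
    t1 = threading.Thread(target=worker, args=(0.2,))
    t2 = threading.Thread(target=worker, args=(0.4,))
    before = threading.active_count()
    t1.start()
    t2.start()
    during = threading.active_count()  # both workers are still sleeping
    t1.join()                          # block until thread 1 has finished
    t2.join()                          # block until thread 2 has finished
    after = threading.active_count()
    return before, during, after

if __name__ == '__main__':
    before, during, after = count_threads_while_running()
    print('before start:', before)
    print('while running:', during)
    print('after join:', after)
```

When this runs as a plain script, where only the main thread is alive at the beginning, the count goes from 1 to 3 and back to 1.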

Execution and creation of child threads

import time
import threading

def demo():
    # Loop count chosen for illustration
    for i in range(5):
        print('demo--%d' % i)
        time.sleep(1)

def main():
    print(threading.enumerate())
    t1 = threading.Thread(target=demo)
    print(threading.enumerate())
    t1.start()  # the thread is created and starts executing only when start() is called
    print(threading.enumerate())

if __name__ == '__main__':
    main()

The results

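The output shows that constructing the Thread object does not add a thread to the list; the new thread appears only after start() has been called. In other words, start() is what actually creates and runs the thread.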
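The same point can be checked on the Thread object itself. This is a small sketch, not part of the original example: `thread_states` is a helper name invented here, and it records is_alive(), whether ident is still None, and whether the thread is listed by threading.enumerate() at three moments.

```python
import threading
import time

def demo():
    time.sleep(0.2)  # keep the thread alive long enough to observe it

def thread_states():
    t = threading.Thread(target=demo)
    states = []
    # Before start(): the object exists, but no thread is running yet.
    states.append((t.is_alive(), t.ident is None, t in threading.enumerate()))
    t.start()
    # After start(): the thread is running and has an identifier.
    states.append((t.is_alive(), t.ident is None, t in threading.enumerate()))
    t.join()
    # After join(): the thread has finished and is no longer listed.
    states.append((t.is_alive(), t.ident is None, t in threading.enumerate()))
    return states

if __name__ == '__main__':
    for state in thread_states():
        print(state)
```

Each printed tuple reads (alive, ident is None, listed): the thread is neither alive nor listed before start(), both alive and listed after start(), and gone again after join().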

Resource contention between threads

So what is resource contention? It is actually easy to understand. Threads in the same process share memory. When several threads access the same block of memory at the same time and one or more of them modify it, the result can be unexpected. Errors caused by this lack of thread safety are often very difficult to detect.

import threading
import time

num = 0
def demo1(nums):
    global num
    for i in range(nums):
        num += 1
    print('demo1-num-%d'%num)

def demo2(nums):
    global num
    for i in range(nums):
        num += 1
    print('demo2-num-%d' % num)


def main():
    t1 = threading.Thread(target=demo1,args=(1000000,))
    t2 = threading.Thread(target=demo2,args=(1000000,))
    t1.start()
    t2.start()
    time.sleep(3)
    print('main-num-%d' % num)

if __name__ == '__main__':
    main()

The execution result

The result is not what we expect. What caused the problem? The CPU switches back and forth between the two threads, and the statement num += 1 is not a single step: a thread reads num, adds 1, and then writes the result back. Thread 1 may have just computed num + 1 but not yet assigned it when the CPU switches to thread 2. Thread 2 then reads the old value, increments it, and writes it back. When thread 1 resumes and completes its unfinished assignment, it overwrites thread 2's update. One of the increments is lost, so the value we print is not correct.
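The overwrite can be replayed by hand. The sketch below uses no threads at all: it walks through one possible interleaving step by step in a single thread, with `t1_read` and `t2_read` as illustrative names for the value each thread has read.

```python
def lost_update_demo():
    num = 0
    # `num += 1` is really three steps: read, add, write back.
    t1_read = num      # thread 1 reads 0
    t2_read = num      # thread 2 is scheduled and also reads 0
    num = t1_read + 1  # thread 1 writes 1
    num = t2_read + 1  # thread 2 writes 1, overwriting thread 1's update
    return num

if __name__ == '__main__':
    print(lost_update_demo())  # two increments, but the result is 1
```

Whether and how often a real run hits such an interleaving depends on the interpreter version and on scheduling, which is part of why these bugs are hard to reproduce.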

So how do we solve this? We can use a thread lock.

import threading
import time

num = 0
lock = threading.Lock()

def demo1(nums):
    global num
    # Acquire the lock before modifying the shared variable
    lock.acquire()
    for i in range(nums):
        num += 1
    # Release the lock so that the other thread can proceed
    lock.release()
    print('demo1-num-%d' % num)

def demo2(nums):
    global num
    lock.acquire()
    for i in range(nums):
        num += 1
    lock.release()
    print('demo2-num-%d' % num)

def main():
    t1 = threading.Thread(target=demo1, args=(1000000,))
    t2 = threading.Thread(target=demo2, args=(1000000,))
    t1.start()
    t2.start()
    time.sleep(3)
    print('main-num-%d' % num)

if __name__ == '__main__':
    main()
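A variation on the same idea, offered as a sketch rather than as the article's own method: a lock can be used in a with statement, which releases it automatically even if the block raises, and join() can replace the fixed time.sleep(3). The function names `add` and `run` and the smaller loop count are chosen here for illustration.

```python
import threading

num = 0
lock = threading.Lock()

def add(times):
    global num
    for _ in range(times):
        with lock:  # acquired here, released automatically at the end of the block
            num += 1

def run(times=100000):
    global num
    num = 0
    threads = [threading.Thread(target=add, args=(times,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()    # wait for both threads instead of sleeping a fixed 3 seconds
    return num

if __name__ == '__main__':
    print('main-num-%d' % run())  # prints main-num-200000
```

This version locks each increment separately, so the two threads can take turns, at the cost of acquiring the lock many more times. Holding the lock around the whole loop, as in the code above, is faster but makes the second thread wait until the first has finished.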

Output

With the lock in place, the threads no longer compete for the shared variable, and the final count is correct. This article has given a quick introduction to Python multithreading. For follow-up content, keep an eye on future articles.