This article was first published on Zhihu
Understanding multithreading
Multithreading is a way of running multiple tasks at the same time. For example, if each iteration of a loop is treated as a task, we want to start the second iteration before the first has finished, and save time that way.
In Python, the point of running tasks concurrently like this is to make better use of the CPU by filling time that would otherwise be spent waiting. This also means that multithreading will not shorten the running time if the program is slow because it has a lot to compute rather than because it spends time waiting.
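To make the last point concrete, here is a minimal sketch that times a purely computational task run four times in a row and then in four threads. The names count_down, timed and N are made up for this illustration, and join() (covered later in this article) is used only to wait for the threads. On standard CPython builds with the global interpreter lock the two timings are usually close, but the exact numbers depend on the machine and the interpreter, so no output is shown.

```python
import time
from threading import Thread

N = 200_000  # hypothetical workload size, chosen only to keep the demo short

def count_down():
    # Pure computation: the CPU is busy the whole time, there is no waiting.
    n = N
    while n > 0:
        n -= 1

def timed(run):
    # Return how many seconds one call to run() takes.
    start = time.perf_counter()
    run()
    return time.perf_counter() - start

def sequential():
    for _ in range(4):
        count_down()

def threaded():
    threads = [Thread(target=count_down) for _ in range(4)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()  # wait until every thread has finished

sequential_time = timed(sequential)
threaded_time = timed(threaded)
print("sequential:", sequential_time)
print("threaded:  ", threaded_time)
```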
See the resources below for more background on multithreading
- Liao Xuefeng tutorial
- Zhihu answer
- There are many more explanations, which I will not repeat here
Basic usage
Let's look at the following function
import time

def myfun():
    time.sleep(1)   # simulate one second of waiting
    a = 1 + 1
    print(a)
If we run this function several times, almost all of its running time is the one-second sleep in each call; computing 1 + 1 takes practically no time. In this case, multithreading can improve efficiency.
Let's compare how long it takes without multithreading and with multithreading
Without multithreading
t = time.time()
for _ in range(5):
    myfun()
print(time.time() - t)
The result is 5.002434492111206
Next we use multithreading
from threading import Thread

for _ in range(5):
    th = Thread(target=myfun)
    th.start()
Multithreading really is this simple to use. You will notice that after about one second the five 2s are printed at almost the same time, which means the five iterations really do run concurrently: the five one-second waits overlap, so the total wait is only about one second.
Using multithreading here takes only two steps
- Use Thread to create a thread. In this case each iteration creates a new thread, and each thread executes the myfun function once.
- Use start() to start the thread. Every thread must be started explicitly this way before it runs. Once a thread has been started, the program goes on to the code that follows (the next iteration) without waiting for the thread to finish, so a second thread is created and started before the first has finished, then a third, and so on.
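The two steps can be sketched as follows, with a list that records the order of events to show that start() returns right away. The names worker and events are made up for this illustration, the sleep is shortened to half a second, and the args parameter and join() are described later in this article.

```python
import time
from threading import Thread

events = []  # records the order in which things happen

def worker(i):
    time.sleep(0.5)  # stands in for the waiting inside myfun
    events.append("done {}".format(i))

threads = []
for i in range(3):
    th = Thread(target=worker, args=(i,))  # step 1: create the thread
    th.start()                             # step 2: start it; this returns immediately
    events.append("started {}".format(i))
    threads.append(th)

for th in threads:
    th.join()  # only here so that events is complete before printing

print(events)
```

All three "started" entries come first, because the loop finishes long before any of the workers wakes up.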
Note that the threads are created inside the loop, one per iteration. You cannot write an ordinary loop first and then turn it into a multithreaded one from the outside.
Readers may have noticed that the running time was measured in the version without multithreading but not in the multithreaded one. Measuring it there takes some extra code, which would get in the way of showing the simplest form of multithreading, so the time was left out. Next we look at how to use join() and measure the time.
Using join()
Calling a thread's join() method means the program will not continue until that thread has finished running. Let's look at the following example
from threading import Thread

t = time.time()
for _ in range(5):
    th = Thread(target=myfun)
    th.start()
    th.join()   # wait for this thread before starting the next one
print(time.time() - t)
# result: 5.0047078132629395 seconds
Here join() is called immediately after start(), so each thread must finish before the next iteration begins, and the threads run one after another. However, join() is exactly what you need in order to measure the running time of multithreaded code.
Let’s first look at the case where join() is not used
from threading import Thread

t = time.time()
for _ in range(5):
    th = Thread(target=myfun)
    th.start()
print(time.time() - t)   # runs at once, without waiting for the threads
# result: 0.0009980201721191406 seconds
The elapsed time was printed without waiting a second, and the five 2s were printed after it. This is because print(time.time() - t) runs in the main thread, which is in effect a sixth thread alongside the five started in the loop, and it does not wait for those five to finish. So the running time of the five threads cannot be obtained this way. We need join() to wait for all five threads to finish.
The code is as follows
from threading import Thread

t = time.time()
ths = []
for _ in range(5):
    th = Thread(target=myfun)
    th.start()
    ths.append(th)   # keep the thread so it can be joined later
for th in ths:
    th.join()        # wait for every thread to finish
print(time.time() - t)
# result: 1.0038363933563232 seconds
The ths list defined above stores the threads. A final loop calls join() on each of them, which ensures that every thread has finished before the time difference is calculated.
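The two ways of placing join() can be put side by side in a small sketch. The function names wait_a_bit, join_inside_loop and join_after_loop are made up for this illustration, and the sleep is shortened to 0.2 seconds so the comparison runs quickly. Exact timings vary from run to run, so none are shown.

```python
import time
from threading import Thread

def wait_a_bit():
    time.sleep(0.2)  # shorter than the one-second sleep used in the article

def join_inside_loop(n):
    # join() right after start(): the threads run one after another.
    start = time.perf_counter()
    for _ in range(n):
        th = Thread(target=wait_a_bit)
        th.start()
        th.join()
    return time.perf_counter() - start

def join_after_loop(n):
    # Start every thread first, then join them all: the waits overlap.
    start = time.perf_counter()
    threads = []
    for _ in range(n):
        th = Thread(target=wait_a_bit)
        th.start()
        threads.append(th)
    for th in threads:
        th.join()
    return time.perf_counter() - start

serial_time = join_inside_loop(5)
overlapped_time = join_after_loop(5)
print("join inside the loop:", serial_time)
print("join after the loop: ", overlapped_time)
```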
join() is not only useful in this case. Call join() whenever a step in the code depends on earlier threads having finished.
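As an illustration of such a dependency, the sketch below has two threads fill a shared dictionary, and the main thread adds up the values only after joining both. The names load and results are hypothetical, and the short sleep merely stands in for real waiting such as a download.

```python
import time
from threading import Thread

results = {}  # filled in by the worker threads

def load(name):
    time.sleep(0.1)            # stands in for a slow operation
    results[name] = len(name)  # store something computed for this name

threads = [Thread(target=load, args=(name,)) for name in ("alpha", "beta")]
for th in threads:
    th.start()
for th in threads:
    th.join()  # the sum below needs both results to be present

total = sum(results.values())
print(total)  # 9
```

Without the second loop, the sum would most likely be computed while results is still empty.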
Now that we have covered the general usage of multithreading, it can be applied in most scenarios. Below are some further details
Other details
(1) Thread name
Let’s just look at the code below
import threading

print(threading.current_thread().getName())

def myfun():
    time.sleep(1)
    print(threading.current_thread().name)
    a = 1 + 1

for i in range(5):
    th = threading.Thread(target=myfun, name='thread {}'.format(i))
    th.start()

# output result (the order of the last five lines can vary)
# MainThread
# thread 0
# thread 1
# thread 4
# thread 3
# thread 2
Explanation
- threading.current_thread() represents the current thread; its name can be read with name or getName().
- Any process starts one thread by default, named MainThread. In other words, the main program occupies a thread. The threads created later with Thread are independent of it, and the main thread does not wait for the other threads to finish. That is why the running time could not be calculated earlier without join(): the main thread got to the end first.
- Thread means running the function in a new thread. The name argument specifies the name of that thread, and printing the thread name inside the function shows the value of this name argument.
- Printing inside the loop can give two kinds of result. The first, print(threading.current_thread().name), prints MainThread; the second, print(th.name), prints thread 0, thread 1, and so on.
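The difference between the two kinds of name can be checked with a small sketch. The names report, seen_inside, names_in_loop and caller are made up for this illustration, and the names are collected into lists instead of being printed from inside the threads.

```python
import threading

seen_inside = []  # names reported from inside the worker function

def report():
    seen_inside.append(threading.current_thread().name)

threads = []
names_in_loop = []
for i in range(3):
    th = threading.Thread(target=report, name="thread {}".format(i))
    names_in_loop.append(th.name)             # name of the new thread
    caller = threading.current_thread().name  # name of the thread running this loop
    th.start()
    threads.append(th)

for th in threads:
    th.join()

print(sorted(seen_inside))  # ['thread 0', 'thread 1', 'thread 2']
print(names_in_loop)        # ['thread 0', 'thread 1', 'thread 2']
print(caller)               # MainThread when this is run as a plain script
```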
(2) The Thread constructor
We used the target and name parameters of Thread above. Here are its other parameters
- args: the arguments for the function given as target, passed in as a tuple, such as args = (3, )
- daemon: the main thread's value is False by default; if daemon is not specified, the new thread inherits the value of its parent thread. With True, the thread stops running when the main thread finishes. With False, the thread keeps running until it is done, regardless of the main thread. (To see this effect, write the code in a .py file and run it from CMD rather than in Jupyter Notebook, where other threads interfere.)
- group: a parameter reserved for a future ThreadGroup class extension; currently unused
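A minimal sketch of args and daemon follows. The function add and the variable names are made up for this illustration; the two threads at the end are created only to inspect their daemon attribute and are never started.

```python
import threading

collected = []

def add(x, y):
    collected.append(x + y)

# args is a tuple holding the positional arguments for target.
th = threading.Thread(target=add, args=(3, 4))
th.start()
th.join()

# daemon can be set explicitly when the thread is created...
background = threading.Thread(target=add, args=(1, 1), daemon=True)
# ...otherwise the value is inherited from the thread that creates it.
default = threading.Thread(target=add, args=(1, 1))
inherits = default.daemon == threading.current_thread().daemon

print(collected)          # [7]
print(background.daemon)  # True
print(inherits)           # True
```

For a function with a single parameter the trailing comma matters: (3, ) is a tuple, while (3) is just the number 3.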
(3) Thread objects
Both threading.Thread and threading.current_thread() above give you a Thread object, which has the following attributes and methods
- getName(), .name: get the thread name (getName() is deprecated since Python 3.10 in favor of the name attribute)
- setName(): set the thread name (also deprecated since Python 3.10; assign to name instead)
- start(), join(): both were covered earlier. join() has a timeout parameter, the maximum time to wait for the thread to end. If the wait exceeds it, join() stops waiting and the code continues, but the thread itself is not interrupted
- run(): also runs the thread's function, but the caller must wait until it finishes before continuing with the following code (if every start above were replaced with run, it would be equivalent to not using multithreading at all)
- is_alive(): True if the thread has not finished running, otherwise False
- daemon: returns the thread's daemon value
- setDaemon(True): set the thread's daemon value (deprecated since Python 3.10; assign to daemon instead)
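Several of these methods can be tried together in one short sketch. The function work and the variable names are made up for this illustration, and the sleep length is arbitrary.

```python
import threading
import time

ran_in = []  # name of the thread that actually executed work()

def work():
    ran_in.append(threading.current_thread().name)
    time.sleep(0.3)

th = threading.Thread(target=work, name="worker")
alive_before_start = th.is_alive()   # False: not started yet
th.start()
th.join(timeout=0.05)                # give up waiting after 0.05 seconds
alive_after_timeout = th.is_alive()  # True: the thread was not interrupted
th.join()                            # now wait until it really finishes
alive_after_join = th.is_alive()     # False

# run() executes the function in the calling thread and blocks until it returns.
direct = threading.Thread(target=work, name="never-started")
caller_name = threading.current_thread().name
direct.run()

print(alive_before_start, alive_after_timeout, alive_after_join)
print(ran_in)  # "worker", then the name of the calling thread
```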
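(4) Functions called directly on threading
- threading.currentThread(): returns the current thread object (an older spelling of threading.current_thread(), deprecated since Python 3.10)
- threading.enumerate(): returns a list of the running threads
- threading.activeCount(): returns the number of running threads, the same result as len(threading.enumerate()) (an older spelling of threading.active_count(), deprecated since Python 3.10)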
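The sketch below checks these counts, using the current spellings threading.active_count() and threading.enumerate(). A threading.Event, which this article does not cover, is used only to keep the worker threads alive while they are counted; the names release and wait_for_release are made up for this illustration.

```python
import threading

release = threading.Event()  # keeps the workers alive until we are done counting

def wait_for_release():
    release.wait()

before = threading.active_count()
threads = [threading.Thread(target=wait_for_release) for _ in range(3)]
for th in threads:
    th.start()

during = threading.active_count()     # three more than before
listed = len(threading.enumerate())   # same number as active_count()

release.set()                         # let the workers finish
for th in threads:
    th.join()
after = threading.active_count()      # back to the starting value

print(before, during, listed, after)
```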
Welcome to my Zhihu column
Column home: Programming in Python