This article was first published on Zhihu

Understanding multithreading

Multithreading is a way of running multiple tasks at the same time. For example, if each iteration of a loop is treated as a task, we want to start the second iteration before the first has finished, and save time that way.

In Python, the point of running tasks concurrently like this is to keep the CPU busy during time that would otherwise be spent waiting. It follows that multithreading will not shorten the running time when a program is slow because it has a lot to compute rather than because it spends time waiting.
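To make the difference concrete, here is a minimal sketch that times a waiting task and a computing task, first one after another and then in threads. The names wait_task, compute_task, run_sequential and run_threaded are made up for this illustration, and the join() call it uses is explained later in this article. On a standard CPython build you should see the waiting task speed up with threads while the computing task does not, but the exact numbers depend on your machine.

```python
import time
from threading import Thread


def wait_task():
    # Waiting-bound: the thread is idle while it sleeps
    time.sleep(0.2)


def compute_task(n=300_000):
    # Compute-bound: pure Python arithmetic, no waiting involved
    total = 0
    for i in range(n):
        total += i
    return total


def run_sequential(task, count):
    # Run the task `count` times, one after another, and return the elapsed time
    start = time.time()
    for _ in range(count):
        task()
    return time.time() - start


def run_threaded(task, count):
    # Run the task in `count` threads and return the elapsed time
    start = time.time()
    threads = [Thread(target=task) for _ in range(count)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()  # wait for every thread before stopping the clock
    return time.time() - start


for label, task in [("waiting", wait_task), ("computing", compute_task)]:
    print(label, "sequential:", round(run_sequential(task, 4), 3))
    print(label, "threaded:  ", round(run_threaded(task, 4), 3))
```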

See the resources below for more background on multithreading:

  • Liao Xuefeng tutorial
  • Zhihu answer
  • There are many more explanations that I will not repeat here

Basic usage

Let's look at the following function.

import time

def myfun():
    time.sleep(1)   # simulate one second of waiting
    a = 1 + 1
    print(a)

Suppose we run this function several times. Its running time is dominated by the one-second sleep in each call; computing 1 + 1 takes almost no time. In a case like this, multithreading can improve efficiency.

Let's compare how long it takes without multithreading and with it.

Without multithreading

t = time.time()
for _ in range(5):
    myfun()
print(time.time() - t)

The result is 5.002434492111206

Next we use multithreading

from threading import Thread

for _ in range(5):
    th = Thread(target=myfun)   # create a thread that will run myfun
    th.start()                  # start it and move on without waiting

That is all it takes to use multiple threads. After about one second, five 2s are printed at nearly the same time. This shows that the five iterations really run concurrently: the five one-second waits overlap, so the total wait is only about one second.

Multithreading here involves only two steps:

  • Use Thread to create a thread. In this case each iteration creates a new thread, and each thread executes the myfun function once.
  • Use start() to start the thread. Each thread must be started explicitly in this way. Once a thread has been started, the program carries on with the code that follows (the next iteration) without waiting for the thread to finish, so a second thread is created and started, then a third, before the first has finished running.

One thing to note is that the threads are created inside the loop. You cannot write an ordinary loop and then turn it into a multithreaded one from the outside.

Readers may have noticed that we timed the version without multithreading but not the multithreaded one. Timing the multithreaded version takes some extra code, which would have obscured the simplest possible example, so we left it out. Next we look at join() and use it to measure the time.

Using join()

Calling a thread's join() method means the program does not continue until that thread has finished running. Let's look at the following example.

from threading import Thread

t = time.time()
for _ in range(5):
    th = Thread(target=myfun)
    th.start()
    th.join()   # wait for this thread before starting the next one
print(time.time() - t)
# Result: 5.0047078132629395

Here join() is called immediately after start(), so each thread must finish before the next iteration begins, and the threads end up running one after another. Still, join() is what we need in order to measure the running time of multithreaded code.

Let’s first look at the case where join() is not used

from threading import Thread

t = time.time()
for _ in range(5):
    th = Thread(target=myfun)
    th.start()
print(time.time() - t)
# Result: 0.0009980201721191406

The elapsed time was printed without waiting a second, and the five 2s appeared only afterwards. This is because print(time.time() - t) runs in the main thread, which does not wait for the five threads to finish. So this code cannot measure the running time of the five threads. We need join() to wait until all five threads have finished.

The code is as follows

from threading import Thread

t = time.time()
ths = []
for _ in range(5):
    th = Thread(target=myfun)
    th.start()
    ths.append(th)   # keep a reference so we can join it later
for th in ths:
    th.join()        # wait until every thread has finished
print(time.time() - t)
# Result: 1.0038363933563232

The ths list defined above stores the threads. A final loop calls join() on each of them, so every thread has finished running before the time difference is calculated.

join() is not limited to timing. Add join() whenever a later step in the code depends on earlier threads having finished.
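As a rough illustration of such a dependency, the sketch below has each thread store a result in a shared list, and the main thread sums the list only after joining every thread. The fetch function and the results list are invented for this example, and args (covered further down) is how the argument i is passed to the function. Without the join() loop, the sum would most likely be taken while the list is still empty.

```python
import time
from threading import Thread

results = []


def fetch(i):
    # Stand-in for a slow operation such as a network request (hypothetical)
    time.sleep(0.1)
    results.append(i * i)  # in CPython, list.append is safe to call from several threads


threads = [Thread(target=fetch, args=(i,)) for i in range(5)]
for th in threads:
    th.start()
for th in threads:
    th.join()  # block until this thread has finished

# Only now can we rely on the results: all five threads are done
total = sum(results)
print(total)  # 0 + 1 + 4 + 9 + 16 = 30
```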

That covers the general use of multithreading, which is enough for most scenarios. Some details follow.

Other details

(1) Thread name

Let’s just look at the code below

import threading
import time

print(threading.current_thread().getName())

def myfun():
    time.sleep(1)
    print(threading.current_thread().name)
    a = 1 + 1

for i in range(5):
    th = threading.Thread(target=myfun, name='thread {}'.format(i))
    th.start()

# Output (the order of the five threads may vary)
# MainThread
# thread 0
# thread 1
# thread 4
# thread 3
# thread 2

Explanation

  • threading.current_thread() returns the current thread; use its name attribute or getName() method to get the thread name.
  • Every process starts one thread by default, named MainThread; that is, the main program occupies a thread. Threads created later with Thread are independent of it, and the main thread does not wait for the other threads to finish. This is why, without join(), the running time could not be measured earlier: the main thread ran ahead first.
  • Thread creates a new thread that runs the given function. The name argument sets the thread's name, and printing the thread name inside the function shows that value.
  • There are two things you could print inside the loop. print(threading.current_thread().name) gives MainThread, while print(th.name) gives thread 1 and so on.
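The last point can be checked with a small sketch like the one below. It records the names instead of printing them from several threads at once, so the output stays tidy; the lists seen_inside and seen_in_loop are names made up for this example. When the file is run as a script, the first element of each pair in seen_in_loop is MainThread.

```python
import threading

seen_inside = []


def myfun():
    # Runs in the new thread, so this is the name given to Thread(...)
    seen_inside.append(threading.current_thread().name)


seen_in_loop = []
threads = []
for i in range(3):
    th = threading.Thread(target=myfun, name='thread {}'.format(i))
    th.start()
    # The loop body runs in the thread executing this script, not in th
    seen_in_loop.append((threading.current_thread().name, th.name))
    threads.append(th)

for th in threads:
    th.join()

print(seen_in_loop)
print(sorted(seen_inside))
```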

(2) The Thread class

Above we used the target and name parameters of Thread. Here are its other parameters:

  • args specifies the arguments passed to the target function, given as a tuple, for example args = (3, ).
  • daemon is False for the main thread; if not specified, a thread inherits the value of the thread that created it. If True, the thread is stopped when the main thread finishes. If False, the thread keeps running until it is done, regardless of the main thread. (To see the effect, write the code in a .py file and run it from CMD rather than in a Jupyter notebook, where additional threads interfere.)
  • group is reserved for a future extension, a ThreadGroup class, and currently has no use.
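Here is a short sketch of args and daemon in use. The two-argument myfun and the results list are made up for this example. It only inspects the daemon attribute; to actually watch a daemon thread being cut off when the main thread ends, you would need a longer-running function in a script of its own, as noted above.

```python
import threading

results = []


def myfun(a, b=0):
    results.append(a + b)


# args is a tuple of positional arguments; note the trailing comma for a single item
th1 = threading.Thread(target=myfun, args=(3,))
th2 = threading.Thread(target=myfun, args=(3, 4))
# daemon=True marks a thread that will not keep the program alive on its own
th3 = threading.Thread(target=myfun, args=(10,), daemon=True)

# th1 was given no daemon value, so it inherits the creating thread's value
print(th1.daemon, th3.daemon)

for th in (th1, th2, th3):
    th.start()
for th in (th1, th2, th3):
    th.join()

print(sorted(results))  # [3, 7, 10]
```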

(3) Thread objects

Both threading.Thread and threading.current_thread() above give us a Thread object, which has the following attributes and methods:

  • getName() and .name get the thread name.
  • setName() sets the thread name.
  • start() and join() were covered above.
  • join() has a timeout parameter, the maximum time in seconds to wait for the thread to terminate. If the wait exceeds it, join() stops waiting and the code that follows continues, but the thread itself is not interrupted.
  • run() also runs the thread's function, but the calling code must wait for it to finish before continuing (replacing every start above with run is equivalent to not using multithreading).
  • is_alive() returns True while the thread is running (started and not yet finished), otherwise False.
  • daemon returns the thread's daemon value.
  • setDaemon(True) sets the thread's daemon value. (Since Python 3.10, getName(), setName() and setDaemon() are deprecated in favor of the name and daemon attributes.)
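The following sketch exercises join(timeout=...), is_alive() and run(). It uses threading.Event, which this article does not cover, purely as a switch so that we control when the worker finishes; the names release, worker and ran_in are made up for this example.

```python
import threading

release = threading.Event()  # lets the main thread decide when the worker may finish


def worker():
    release.wait()  # block until the main thread calls release.set()


th = threading.Thread(target=worker, name='worker')
print(th.is_alive())        # False: not started yet
th.start()

th.join(timeout=0.1)        # stop waiting after 0.1 s; the thread keeps running
still_running = th.is_alive()

release.set()               # allow the worker to finish
th.join()                   # wait without a timeout
finished = not th.is_alive()

print(still_running, finished)  # True True

# run() executes the target in the calling thread instead of a new one
ran_in = []
direct = threading.Thread(target=lambda: ran_in.append(threading.current_thread().name))
direct.run()
print(ran_in[0] == threading.current_thread().name)  # True
```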

(4) The threading module

Some functions that can be called directly on the module:

  • threading.currentThread(): returns the current Thread object (the same as threading.current_thread())
  • threading.enumerate(): returns a list of the threads that are currently alive
  • threading.activeCount(): returns the number of threads that are currently alive, the same result as len(threading.enumerate())
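A small sketch of these module-level functions follows. It uses the newer spelling threading.active_count(), since the camelCase names currentThread() and activeCount() are deprecated as of Python 3.10. As in the previous example, threading.Event is only there to keep the workers alive while we look at them, and the names release, worker and ours are invented for this illustration.

```python
import threading

release = threading.Event()


def worker():
    release.wait()  # stay alive until released


threads = [threading.Thread(target=worker, name='worker {}'.format(i)) for i in range(3)]
for th in threads:
    th.start()

alive = threading.enumerate()      # Thread objects that are currently alive
count = threading.active_count()   # number of threads that are currently alive
ours = sorted(t.name for t in alive if t in threads)
print(ours)
print(count == len(alive))

release.set()
for th in threads:
    th.join()

# Finished threads no longer appear in enumerate()
remaining = [t for t in threading.enumerate() if t in threads]
print(remaining)  # []
```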

Welcome to my Zhihu column

Column home: Programming in Python
