Follow the "Water Drop and Silver Bullet" public account to get high-quality technical articles as soon as they are published. Written by a senior back-end developer with 7 years of experience who explains technology in plain, simple terms.
This article takes about 9 minutes to read.
In Python development, you have surely heard of the GIL. Python programmers often joke that because of the GIL, Python's multithreading is useless and cannot be used to improve performance.
But is this really the case?
In this article, let's look at what Python's GIL actually is, and how its presence affects our programs.
What is the GIL?
GIL stands for Global Interpreter Lock. The official definition reads:
In CPython, the global interpreter lock, or GIL, is a mutex that protects access to Python objects, preventing multiple threads from executing Python bytecodes at once. This lock is necessary mainly because CPython’s memory management is not thread-safe. (However, since the GIL exists, other features have grown to depend on the guarantees that it enforces.)
From this definition, we can see several important points:
- The GIL exists in the CPython interpreter; it is an interpreter-level implementation detail, not a feature of the Python language. In other words, if you implemented a Python interpreter yourself, you would not need a GIL at all
- The GIL keeps memory management safe by allowing only one thread at a time to execute Python bytecode
- For historical reasons, many Python projects now depend on the GIL (developers assume Python is thread-safe and do not lock access to shared resources when writing code)
What I want to emphasize here is that because Python's default interpreter is CPython, the GIL lives in the CPython interpreter. We usually describe the GIL as a Python problem, but strictly speaking that is not accurate.
In addition to the CPython interpreter, there are several common Python interpreters:
- CPython: the official default interpreter, written in C and by far the most widely used
- IPython: an interactive interpreter built on top of CPython, with enhanced interactivity; execution behavior is identical to CPython
- PyPy: aims for faster execution by JIT-compiling Python code (rather than interpreting it), which can significantly speed up programs, though results may occasionally differ from CPython. It has a GIL
- Jython: a Python interpreter that runs on the Java platform and compiles Python code to Java bytecode; it depends on the JVM and has no GIL
- IronPython: similar to Jython, a Python interpreter that runs on Microsoft's .NET platform and compiles Python code to .NET bytecode; it has no GIL
Although there are many Python interpreters, the most widely used is still the officially provided CPython, which has a GIL by default.
So what problems does the GIL cause? Why do developers keep complaining that Python multithreading doesn't make their programs faster?
Problems with GIL
To understand GIL’s impact on Python multithreading, let’s take a look at an example.
```python
import threading

def loop():
    count = 0
    while count <= 1000000000:
        count += 1

# 2 threads execute the loop function
t1 = threading.Thread(target=loop)
t2 = threading.Thread(target=loop)
t1.start()
t2.start()
t1.join()
t2.join()
```
In this example, although we start two threads to run loop, watching CPU usage shows that the program only saturates one CPU core; it never uses multiple cores.
That’s the problem with GIL.
The reason is that a Python thread must acquire the GIL before it is allowed to execute bytecode, which means that even though we use multiple threads, only one of them is actually executing at any given moment.
But let's think further: even with the GIL, if the lock were released and reacquired quickly enough, shouldn't multithreaded execution at least approach the efficiency of single-threaded execution? In reality, multithreading performs even worse than we might expect.
Let's look at another experiment with the same CPU-intensive task: which is faster, running it twice in one thread, or running it once in each of two threads simultaneously?
A single thread executes two CPU-intensive tasks:
```python
import time

def loop():
    count = 0
    while count <= 1000000000:
        count += 1

# Perform 2 CPU-intensive tasks in a single thread
start = time.time()
loop()
loop()
end = time.time()
print("execution time: %s" % (end - start))
# execution time: 89.63111019134521
```
From the results, the execution took 89 seconds.
Now let's run the same CPU-intensive task in two threads at the same time:
```python
import time
import threading

def loop():
    count = 0
    while count <= 1000000000:
        count += 1

# 2 threads executing CPU-intensive tasks simultaneously
start = time.time()
t1 = threading.Thread(target=loop)
t2 = threading.Thread(target=loop)
t1.start()
t2.start()
t1.join()
t2.join()
end = time.time()
print("execution time: %s" % (end - start))
# execution time: 92.29994678497314
```
The execution took 92 seconds.
Judging from the results, multithreaded execution is actually less efficient than single-threaded execution!
Why does this happen? Let’s see what happened to GIL.
GIL principle
In fact, Python threads are real operating-system threads (pthreads on POSIX systems), so they are scheduled by the operating system's own scheduler.
In Python 2.x, scheduling was based on opcode counting: after a certain number of bytecode instructions had executed, or whenever a thread blocked on system IO, the GIL was forcibly released and the operating system triggered a thread switch.
Python 3.x improved this: scheduling is based on a fixed time slice. After bytecode has been executing for a fixed interval (5 ms by default), or when system IO is encountered, the GIL is forcibly released and the system's thread scheduler takes over.
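In CPython 3, this time slice is exposed through the standard sys module, so we can inspect and tune it at runtime. A quick sketch:

```python
import sys

# CPython 3.2+ releases the GIL on a fixed time interval (5 ms by default)
print("switch interval: %s seconds" % sys.getswitchinterval())

# A smaller interval makes threads switch more often (better responsiveness,
# more switching overhead); a larger one does the opposite
sys.setswitchinterval(0.001)
print("new switch interval: %s seconds" % sys.getswitchinterval())
```

Tuning this value rarely helps CPU-intensive workloads, though: it changes how often the GIL is handed over, not the fact that only one thread holds it at a time.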
However, this kind of scheduling still means only one thread runs at a time. Moreover, the cost of switching depends on the CPU environment: the overhead differs between a single-core CPU and a multi-core CPU.
In a single-core environment, if thread A releases the GIL during execution, the awakened thread B can acquire it immediately and continue executing seamlessly, as shown in the following figure:
On a multi-core CPU, however, when thread A on CPU0 releases the GIL, threads on the other cores compete for it, but thread B on CPU0 may reacquire it immediately. The awakened threads on the other cores then have to wait until they are scheduled out again. The result is that a multi-core CPU performs frequent, wasted thread switches and burns resources, a situation also known as CPU thrashing. The whole execution process is shown below:
In the figure, the green segments indicate that a thread holds the GIL and is doing useful CPU work, while the red segments indicate that an awakened thread failed to win the GIL and can only wait idly, unable to exploit the CPU's parallel computing capability.
This is why, on a multi-core CPU, multithreading can be even less efficient than a single thread or than running on a single-core CPU.
At this point, we can conclude that Python multithreading is not efficient if it is used to run a CPU-intensive task.
But don't worry, this is not the end of the story.
We also need to consider another scenario: what if multithreading runs not a CPU-intensive task but an IO-intensive task?
The answer is that multithreading can significantly improve performance!
The reason is simple: IO-intensive tasks spend most of their time waiting for IO and do not occupy the CPU the whole time, so they do not suffer the wasteful thread switching described above.
For example, suppose we want to download the data of two web pages, i.e. issue two network requests. In single-threaded mode they can only run sequentially, so the total time is the sum of the two requests' wait times.
With two threads, both requests are sent at once and wait for data concurrently (IO wait); the total time is roughly that of the slowest request, which is far more efficient than serial execution.
So if you need to run IO intensive tasks, Python multithreading can be efficient.
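A minimal sketch of this effect, using time.sleep to stand in for a network request (like real socket IO, sleep releases the GIL while waiting):

```python
import threading
import time

def fake_io_task():
    # time.sleep simulates waiting on a network response; the GIL is
    # released during the wait, so other threads can run
    time.sleep(0.5)

# Sequential: total time is the sum of both waits
start = time.time()
fake_io_task()
fake_io_task()
sequential = time.time() - start

# Two threads: both waits overlap, so total time is roughly the longest wait
start = time.time()
threads = [threading.Thread(target=fake_io_task) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.time() - start

print("sequential: %.2fs, threaded: %.2fs" % (sequential, threaded))
```

On a typical machine the sequential version takes about 1 second and the threaded version about 0.5 seconds, matching the reasoning above.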
Why is there GIL?
We have learned that GIL is not efficient for multi-threading scenarios where CPU intensive tasks are handled.
Given GIL’s influence, why was the Python interpreter CPython designed this way?
This is for historical reasons.
Before 2000, CPU manufacturers improved computer performance by raising the clock frequency of a single core. In the following years, however, that approach hit a ceiling and single-core performance could no longer improve much, so after 2000 the industry shifted toward multi-core CPUs.
To make effective use of multi-core CPUs, many programming languages added multithreaded programming support, but with multithreading came a new problem: it is hard to keep data and state consistent across threads.
Python's designers probably didn't expect CPU development to shift to multi-core so quickly when they designed the interpreter. At the time, a single global lock was the simplest and cheapest way to protect the consistency of shared resources across threads.
However, when the multi-core era arrived and people tried to split up or remove the GIL, they found that a large amount of library code and many developers already depended heavily on it (Python's built-in objects are assumed to be thread-safe, so code is written without extra locks), which made removing the GIL complicated and hard to achieve.
So the GIL persists largely for historical reasons. Python 3 optimized the GIL but did not remove it. The Python designers explained that removing it would break existing C extension modules, which depend heavily on the GIL, and that removing it could make execution even slower than Python 2.
Python carries a lot of historical baggage it cannot put down. If they could start over, Python's designers would surely design this part more elegantly.
The solution
Since the existence of GIL can cause so many problems, what should we pay attention to during development to avoid being affected by GIL?
I summarized the following scenarios:
- For IO-intensive tasks, multithreading can improve efficiency
- For CPU-intensive tasks, don't use multithreading; deploy with multiple processes instead
- Switch to a Python interpreter without a GIL, but evaluate in advance whether its results are consistent with CPython
- Write a C extension module and move CPU-intensive work into C; the downside is more complex code
- Switch to another language 🙂
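For the CPU-intensive case, here is a minimal sketch of the multi-process approach using the standard multiprocessing module (the loop count is reduced from the article's example so the demo finishes quickly):

```python
import multiprocessing
import time

def loop(n):
    # Same CPU-intensive counting task as before, parameterized
    count = 0
    while count <= n:
        count += 1
    return count

if __name__ == "__main__":
    N = 10_000_000  # smaller than the article's 1 billion, for a quick demo

    # Run the task twice in the current process, back to back
    start = time.time()
    loop(N)
    loop(N)
    print("single process: %.2fs" % (time.time() - start))

    # Run the task in two worker processes: each process has its own
    # interpreter and its own GIL, so the work runs on two cores in parallel
    start = time.time()
    with multiprocessing.Pool(processes=2) as pool:
        pool.map(loop, [N, N])
    print("two processes: %.2fs" % (time.time() - start))
```

On a multi-core machine the two-process version should finish in roughly half the time of the single-process version, since the processes do not share a GIL. The trade-off is higher memory usage and the cost of passing data between processes.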
Conclusion
This article focuses on Python GIL related issues.
First, we learned that the GIL lives at the interpreter level and is not a feature of the Python language itself; the two must not be confused. The GIL allows only one thread at a time to execute Python bytecode, in order to keep memory management safe.
Then, through an example, we observed that for CPU-intensive tasks Python's multithreaded execution can be even slower than single-threaded execution. The reason is that on a multi-core CPU, the GIL causes wasteful thread switching, which reduces program efficiency.
However, if multithreading is used for IO-intensive tasks, threads spend most of their time waiting on IO rather than consuming CPU, so in that case multithreading does improve program efficiency.
Finally, we analyzed why the GIL exists; the reasons are mainly historical. Because of the GIL, many Python developers assumed Python is thread-safe by default, which indirectly made removing the GIL even harder.
Based on these premises, we generally prefer to deploy Python programs in a multi-process manner to avoid the impact of the GIL.
Any programming language has its advantages and disadvantages, and we need to understand its implementation mechanism and play to its strengths to better serve our needs.
My advanced Python series:
- Python Advanced – How to implement a decorator?
- Python Advanced – How to use magic methods correctly? (Part 1)
- Python Advanced – How to use magic methods correctly? (Part 2)
- Python Advanced – What is a metaclass?
- Python Advanced – What is a context manager?
- Python Advanced – What is an iterator?
- Python Advanced – How to use yield correctly?
- Python Advanced – What is a descriptor?
- Python Advanced – Why does the GIL make multithreading so useless?
Crawler series:
- How to build a crawler proxy service?
- How to build a universal vertical crawler platform?
- Scrapy source code analysis (1): architecture overview
- Scrapy source code analysis (2): how does Scrapy run?
- Scrapy source code analysis (3): what are Scrapy's core components?
- Scrapy source code analysis (4): how is a scraping task completed?