Why can't Python multithreading take advantage of multicore processors

1. The Global Interpreter Lock explained

Why can’t Python multithreading take advantage of multi-core processors?

A Global Interpreter Lock (GIL) is a mechanism used by programming language interpreters to synchronize threads so that only one thread executes at any one time.

Even on multi-core processors, an interpreter that uses a GIL allows only one thread to execute at a time. Common interpreters with a GIL include CPython and Ruby MRI.

As you can see, the GIL is not unique to Python. It is a mechanism that some interpreters use to handle multithreading, not a feature of the language itself.

2. The Python interpreter

Python is an interpreted language: code is executed by an interpreter. Python has several interpreters, implemented in different languages, and each has its own characteristics.

[Figure: Python program interpretation and execution diagram]

CPython

CPython is the mainstream interpreter. Written in C, it is the most widely used implementation and interacts easily with C/C++ libraries.

Jython

A Python interpreter written in Java. It compiles Python code to Java bytecode and then executes it, which allows easy interaction with Java class libraries.

IronPython

Similar to Jython, IronPython compiles Python code to bytecode that runs on the .NET platform, which allows easy interaction with .NET class libraries.

IPython

An enhanced interactive shell built on CPython. The interaction is improved, but code execution and functionality are the same as in CPython.

PyPy

An interpreter with a just-in-time (JIT) compiler that focuses on execution speed: it compiles Python code dynamically at run time to make it run faster.

It is important to understand the difference between PyPy and CPython before using PyPy to improve the execution efficiency of a project.
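Since the GIL belongs to the interpreter rather than to the language, it can be useful to check which interpreter a program is actually running on. The following is a minimal sketch using the standard library; the helper name `describe_interpreter` is made up for this illustration.

```python
import platform
import sys


def describe_interpreter() -> str:
    """Return the implementation name and version of the running interpreter."""
    # platform.python_implementation() returns names such as
    # 'CPython', 'PyPy', 'Jython' or 'IronPython'.
    return f"{platform.python_implementation()} {platform.python_version()}"


if __name__ == "__main__":
    print(describe_interpreter())
    # sys.implementation.name gives the same information in lowercase form.
    print(sys.implementation.name)
```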

3. CPython threads are not safe

CPython threads are native operating system threads. On Linux they are pthreads, and their scheduling is left entirely to the operating system.

Pthreads do not protect shared data by themselves; the user has to make multithreaded code safe with locks. Python threads under the CPython interpreter therefore inherit the same thread-safety problem.

CPython’s answer was the GIL, and this is what puts the GIL in an awkward position in the multi-core era.
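The need for explicit locks also applies to Python code itself: the GIL protects the interpreter's internal state, but a statement such as `value += 1` spans several bytecode instructions and is not atomic. The sketch below shows the usual pattern of guarding shared data with `threading.Lock`; the class and function names are illustrative only.

```python
import threading


class SafeCounter:
    """A counter whose increments are protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Read-modify-write must happen as one unit, so hold the lock.
        with self._lock:
            self.value += 1


def count_with_threads(num_threads=4, increments=10_000):
    """Increment one shared counter from several threads and return the total."""
    counter = SafeCounter()

    def worker():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place the result is always `num_threads * increments`; without it, updates can be lost depending on the interpreter version and timing.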

4. Background and challenges of the GIL

Guido van Rossum began developing Python in 1989 (the first public release followed in 1991), when processor clock rates were still far below 1 GHz and programs ran on single-core computers. Intel did not ship its first multi-core desktop processors until 2005.

[Figure: Python release timeline]

4.1 Impact of multicore on software systems

Gordon Moore observed in 1965 that the number of components per integrated circuit was doubling every year, and in 1975 revised the rate to every two years (it is often quoted as every 18 to 24 months). The trend was expected to hold until about 2015-2020.

While Moore’s Law held and clock rates kept rising, software could improve its performance simply by relying on hardware advances, or take a leap in performance with only minor changes.

Since 2005, however, the rise in clock rates has fallen out of step with the rise in transistor counts.

Because of the physical limits of processor materials, clock rates have stopped growing or even fallen, and processor manufacturers have begun to pack more execution cores into a single chip.

This trend puts increasing pressure on application development and programming language design.

Programmers and language designers have to consider how to adapt quickly to multicore hardware, both to improve software performance and to keep their languages competitive. Python is no exception.

4.2 Impact of multicore on CPython

In the single-core era, Guido van Rossum, who advocated beautiful, clear, and simple design, chose to implement a global mutex at the interpreter level to protect Python objects. This made good use of a single-core CPU and worked well at the time.

Without a GIL, developers would have had to manage locking themselves, which on a single core is not the best way to improve CPU utilization.

[Figure: Guido van Rossum, the father of Python]

With the advent of multicore, however, the effective way to use CPU cores is parallelism. Multithreading is normally a good way to achieve parallelism, but CPython’s GIL prevents threads from using more than one core at a time.

4.3 The GIL: pain and pleasure

CPython’s GIL brings convenience to users, and many important packages and language features were developed on top of it.

However, the ubiquity of multi-core CPUs and the competition from other languages have made the GIL look primitive and crude, and the inability to use multi-core processors effectively has become a drawback.

5. Problems the GIL exposes in the multi-core era

To understand the impact of GIL on multi-threaded programs, we need to understand the basic operating principles of GIL.

Single-core CPU

CPython’s pthreads are scheduled by the operating system’s scheduling algorithm.

Each time the Python interpreter has executed a certain amount of bytecode, or encounters blocking system I/O, it forces the running thread to release the GIL, which lets the operating system schedule another thread. This keeps a single-core CPU fully utilized, and on a single core the interval between releasing the GIL and running again is very short.

Multi-core CPU

In the multi-core case, when the thread running on CPU-A releases the GIL, the threads on the other CPUs all compete for it, but the thread on CPU-A may immediately acquire it again.

The threads that were woken up on the other CPUs can then only watch the thread on CPU-A run again, and wait until they are switched back to a waiting state.

As a result, multi-core CPUs switch threads frequently and consume resources, yet only one thread at a time holds the GIL and actually executes Python code. Multithreaded execution on a multi-core CPU can therefore be less efficient than single-threaded execution.

This situation is very similar to the thundering herd problem that occurs in network programming when multiple threads listen on the same port, except that it happens at the CPU level and is even more wasteful.

6. Actual impact of GIL

I/O intensive

When multithreading on a single-core CPU, the interpreter’s efficient thread switching is helpful.

In I/O-intensive programs such as web crawlers, threads spend most of their time waiting, so the performance of multithreaded programs under the GIL is not as bad as you might expect.
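The I/O-bound case can be sketched with a thread pool. In this illustration `time.sleep` stands in for a blocking network request (like real blocking I/O, it releases the GIL while waiting); `fake_download` and `download_all` are hypothetical names, not part of any crawler library.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(url, delay=0.2):
    """Stand-in for a network request: the thread just waits for `delay` seconds."""
    time.sleep(delay)  # the GIL is released while the thread sleeps
    return f"done: {url}"


def download_all(urls, delay=0.2):
    """Run one simulated download per thread and return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(urls)) as pool:
        # pool.map preserves the order of the input URLs.
        results = list(pool.map(lambda u: fake_download(u, delay), urls))
    return results, time.perf_counter() - start
```

Because the waits overlap, four simulated downloads take roughly as long as one, instead of four times as long.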

CPU-intensive

The GIL is a big problem for CPU-intensive programs. Such programs spend little time waiting, so there are few natural points for the interpreter to switch threads. All tasks end up waiting for a single core while the other cores sit idle, so multicore utilization is very poor.
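A simple way to observe this is to time the same pure-Python loop run twice in sequence and then in two threads. This is only a sketch: the function names are made up, and the actual numbers depend on the machine and interpreter version, so no expected timings are given.

```python
import threading
import time


def count_down(n):
    """Pure-Python busy loop: CPU-bound, never waits for I/O."""
    while n > 0:
        n -= 1


def time_serial(n, repeats=2):
    """Run count_down `repeats` times one after another; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def time_threaded(n, num_threads=2):
    """Run count_down once in each of `num_threads` threads; return elapsed seconds."""
    threads = [threading.Thread(target=count_down, args=(n,))
               for _ in range(num_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 2_000_000
    print(f"serial:   {time_serial(n):.2f}s")
    print(f"threaded: {time_threaded(n):.2f}s")
```

On a CPython build with the GIL, the threaded run is typically no faster than the serial one, even on a multi-core machine.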

7. Discard and optimize the GIL

The GIL has always been controversial. There have been many attempts and proposals (PEPs) to remove or optimize it, but the complexity of the interpreter itself and the large number of libraries that rely on the GIL make removal a remote prospect.

Remove the GIL

In 1999, Greg Stein’s “free threading” patch for Python 1.5 attempted to implement this idea.

In this patch, the GIL was removed completely and replaced with fine-grained locks. However, removing the GIL came at a cost to the execution speed of single-threaded programs.

Single-threaded execution was about 40% slower. Two threads ran faster than one, but the gain did not scale linearly with the number of cores. Because of the slowdown, the patch was rejected and almost forgotten.

Multicore was still a fantasy back in 1999. Today, removing the GIL is very difficult: the effect of removing it is unknown, and once it is done it would be very hard to go back.

Optimization of GIL

In 2009, Antoine Pitrou implemented a new GIL for Python 3.2, with some positive results.

This was a major change to the GIL. The old GIL counted Python instructions to decide when to give up the lock.

A single Python instruction can contain a lot of work, so instruction counts are a poor measure of time. In the new implementation, a fixed timeout (5 ms by default) tells the current thread to give up the lock, which makes switching between threads more predictable.
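In CPython 3.2 and later this timeout is exposed as the "switch interval" through `sys.getswitchinterval()` and `sys.setswitchinterval()`. The helper below is a hypothetical convenience wrapper written for this illustration; it temporarily changes the interval and restores the previous value afterwards.

```python
import sys


def with_switch_interval(seconds, func, *args):
    """Call func(*args) with a temporary GIL switch interval, then restore the old one."""
    previous = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        return func(*args)
    finally:
        # Always restore the previous interval, even if func raises.
        sys.setswitchinterval(previous)


if __name__ == "__main__":
    # The current interval in seconds (the CPython default is 0.005).
    print(sys.getswitchinterval())
```

A shorter interval makes threads hand over the lock more often at the cost of more switching overhead; in most programs the default is a reasonable choice.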

8. Solutions for GIL defects

As a popular language with a long life ahead of it, Python is certainly not going to sit idle in the multi-core era. Even with the GIL’s limitations, there are many ways for programs to embrace multicore.

Multiple processes

Python 2.6 introduced the multiprocessing library to compensate for the GIL limitation of the threading library, making it possible to write multi-process programs. Each process has its own GIL, so processes do not compete for a lock and multiple cores can be used. The price is extra synchronization and communication between processes, which is unavoidable.
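A minimal sketch of the multi-process approach, using a process pool from the standard `multiprocessing` module. The function name `parallel_factorials` and its `context` parameter are assumptions made for this example; `math.factorial` is used as the worker only because a standard-library function can always be sent to worker processes.

```python
import math
import multiprocessing


def parallel_factorials(numbers, workers=2, context=None):
    """Compute factorials in a pool of worker processes, each with its own GIL."""
    # Use the platform's default start method unless a context is supplied.
    ctx = context or multiprocessing.get_context()
    with ctx.Pool(processes=workers) as pool:
        # Arguments and results are pickled and passed between processes,
        # which is the communication cost mentioned above.
        return pool.map(math.factorial, list(numbers))
```

On platforms where processes are started with the spawn method (for example Windows), the call should be made under an `if __name__ == "__main__":` guard so that worker processes do not re-run the calling code when they import the main module.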

ctypes

One advantage of CPython is how well it combines with C modules, so computation can be handed off to a C dynamic library with the help of ctypes. Code in a C dynamic library is not constrained by the GIL, so it can use multiple cores.
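The calling mechanism can be sketched as follows. The example assumes a POSIX system, where `ctypes.CDLL(None)` gives access to symbols already loaded into the process, including the C library; `strlen` is used only because it is available everywhere, not because it is heavy computation. Functions loaded through `ctypes.CDLL` release the GIL for the duration of the foreign call, so a long-running C function called this way does not block other Python threads.

```python
import ctypes

# Assumption: POSIX platform. On Windows a specific DLL would have to be loaded instead.
libc = ctypes.CDLL(None)

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Return the length of a byte string as computed by the C library."""
    return libc.strlen(data)
```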

Coroutines

Coroutines are also a good tool. Before Python 3.4 the standard library had no coroutine framework, and third-party libraries such as gevent and Tornado filled the gap.

Python 3.4 added the asyncio library to the standard library to provide this feature.
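A minimal asyncio sketch, written with the `async`/`await` syntax and `asyncio.run` of later Python versions (3.7+). Here `asyncio.sleep` stands in for a non-blocking network call, and the function names are illustrative. Note that coroutines run in a single thread: they help with I/O-bound concurrency by avoiding thread switching, but they do not by themselves spread work across cores.

```python
import asyncio


async def fake_request(name, delay=0.1):
    """Stand-in for a non-blocking network call."""
    await asyncio.sleep(delay)  # yields control to the event loop while waiting
    return f"{name}: ok"


async def fetch_all(names, delay=0.1):
    """Run all simulated requests concurrently and return results in input order."""
    return await asyncio.gather(*(fake_request(n, delay) for n in names))


if __name__ == "__main__":
    print(asyncio.run(fetch_all(["a", "b", "c"])))
```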

9. Summary

The GIL is still one of the most difficult technical challenges in Python. The underlying problem is not specific to Python as a language: other languages simply push the problem to the user level, whereas Python’s authors tried to handle it in the interpreter so as to present an elegant language to the user.

While the advent of the multicore era has exposed the GIL’s flaws, Python’s decision makers and community developers have taken many other steps to embrace multicore, and it would be unwise to criticize the GIL without knowing this background.

Just as the relations of production should adapt to the development of productive forces, it is biased to judge the merits and demerits of a mechanism without regard to its historical background. The GIL should therefore be judged dialectically.
