In short, Python manages memory through three mechanisms:
1) Reference counting
2) Garbage collection
3) Memory pools
Let’s take a closer look at each of these three mechanisms.
1. Reference counting
Reference counting is a very efficient memory management technique. Each time a Python object gains a reference, its reference count increases by one; each time a reference to it goes away, the count decreases by one. When the count reaches zero, the object is deleted.
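To watch the count change, here is a minimal sketch that assumes CPython and uses `sys.getrefcount`. The names `data`, `alias`, and `holder` are made up for illustration. The call itself temporarily holds one extra reference, so the sketch compares differences rather than absolute values.

```python
import sys

data = []                       # the name `data` holds one reference to the list
base = sys.getrefcount(data)    # includes a temporary reference held by the call itself

alias = data                    # a second name bound to the same list
after_alias = sys.getrefcount(data)

holder = {"key": data}          # a container that stores the list adds another reference
after_holder = sys.getrefcount(data)

del alias                       # unbinding a name removes one reference
del holder                      # destroying the dict releases the reference it held
after_del = sys.getrefcount(data)

# Differences relative to the starting count
print(after_alias - base, after_holder - base, after_del - base)
```

On CPython the three differences should come out as 1, 2, and 0: two references were added and then both were removed.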
2. Garbage collection (this is an important point)
① Reference counting
Reference counting is also a garbage collection mechanism, and one of the simplest and most intuitive garbage collection techniques.
At the core of every Python object is a PyObject structure that holds an internal reference count, ob_refcnt. When an object's reference count drops to zero, nothing refers to it any more, and the object is garbage to be collected.
For example, when an object is created, its reference count is set to 1. Each new reference to the object increases the count, and each reference that is deleted decreases it. Once the count reaches zero, the object is reclaimed. Objects in a circular reference, however, keep each other's counts above zero, so they require a different approach.
② Mark and sweep
Mark and sweep solves the problem of circular references. Circular references can only arise among container objects such as dictionaries, tuples, and lists. To keep track of these objects, each container object carries two additional pointers that link it into a doubly linked list of container objects: one pointer points to the previous container object and the other to the next. With every container object on this list, the collector can discount the references that container objects hold to one another, which leaves each object's effective reference count.
A code example:
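To see a circular reference being reclaimed, here is a minimal sketch using the standard `gc` and `weakref` modules. The `Node` class and the variable names are hypothetical, invented only for this illustration. Automatic collection is switched off during the demo so that the only collection is the explicit one.

```python
import gc
import weakref

class Node:
    """Hypothetical container type used only for this illustration."""
    def __init__(self, name):
        self.name = name
        self.partner = None

gc.disable()   # keep automatic collection from running in the middle of the demo
gc.collect()   # start from a clean state

a = Node("a")
b = Node("b")
a.partner = b   # a -> b
b.partner = a   # b -> a, which closes the cycle

probe = weakref.ref(a)   # a weak reference observes the object without keeping it alive

del a, b   # the names are gone, but each object still holds a reference to the other
alive_after_del = probe() is not None

found = gc.collect()   # the cycle detector finds and frees the unreachable pair
alive_after_collect = probe() is not None

gc.enable()
print(alive_after_del, alive_after_collect, found >= 2)
```

Reference counting alone leaves the pair alive after `del`, because each count is still 1. Only the explicit `gc.collect()` reclaims them.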
Q&A: Why two linked lists?
The collector splits the tracked objects into two linked lists, root and unreachable, for the following reason: the unreachable list may at first contain objects that are directly or indirectly referenced by objects in the root list. These objects must not be reclaimed. Whenever the marking process finds such an object, it moves the object from the unreachable list to the root list. When marking is complete, every object left in the unreachable list is truly garbage, and the subsequent collection only needs to work on the unreachable list.
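The two-list idea can be sketched as a toy model in pure Python. This is not CPython's actual implementation: the `Obj` class, the `link` helper, and `find_garbage` are all hypothetical, and the reference counts are simulated by hand.

```python
class Obj:
    """Hypothetical stand-in for a container object tracked by the collector."""
    def __init__(self, name):
        self.name = name
        self.refs = []       # outgoing references to other Obj instances
        self.refcount = 0    # simulated ob_refcnt

def link(src, dst):
    """Record that src references dst and bump dst's simulated count."""
    src.refs.append(dst)
    dst.refcount += 1

def find_garbage(tracked):
    # Step 1: copy each count, then subtract references that come from other
    # tracked containers. What remains counts external references only.
    gc_refs = {id(o): o.refcount for o in tracked}
    for o in tracked:
        for target in o.refs:
            gc_refs[id(target)] -= 1

    # Step 2: objects with external references form the root list;
    # everything else is tentatively unreachable.
    root = [o for o in tracked if gc_refs[id(o)] > 0]
    unreachable = [o for o in tracked if gc_refs[id(o)] == 0]

    # Step 3: anything referenced from the root list, directly or indirectly,
    # is moved from the unreachable list to the root list.
    i = 0
    while i < len(root):
        for target in root[i].refs:
            if target in unreachable:
                unreachable.remove(target)
                root.append(target)
        i += 1
    return root, unreachable

a, b, c, d = (Obj(n) for n in "abcd")
a.refcount += 1    # simulate one external reference (such as a variable) to a
link(a, b)         # a -> b: b is reachable only through a
link(c, d)
link(d, c)         # c <-> d: a cycle with no external reference

root, unreachable = find_garbage([a, b, c, d])
print([o.name for o in root], [o.name for o in unreachable])
# → ['a', 'b'] ['c', 'd']
```

Here `b` starts out in the unreachable list because its only reference comes from another tracked object, and it is rescued in step 3. The cycle `c`/`d` stays behind as garbage.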
③ Generational collection
To understand generational collection, we must first understand the GC threshold. A threshold is simply a critical value that triggers an action when it is crossed.
As your program runs, the Python interpreter keeps track of the number of newly allocated objects and the number of objects deallocated because their reference count dropped to zero. In theory the two numbers should be equal. With circular references, however, allocations outnumber deallocations. When the difference between the number of allocations and the number of deallocations exceeds the specified threshold, generational collection kicks in.
Generational collection divides objects into three generations (generations 0, 1, and 2):
0 holds the youngest objects,
1 holds the middle-aged objects,
2 holds the oldest objects.
This division follows the weak generational hypothesis: younger objects are more likely to die, and older objects generally live longer.
A new object is placed in generation 0. If it survives a garbage collection of generation 0, it is promoted to generation 1. If an object in generation 1 survives a garbage collection of generation 1, it is promoted to generation 2.
If the number of allocated objects minus the number of freed objects since the last generation 0 collection is greater than threshold0, the objects in generation 0 are examined for garbage collection.
If generation 0 has been collected more than threshold1 times since the last generation 1 collection, the objects in generation 1 are examined for garbage collection.
If generation 1 has been collected more than threshold2 times since the last generation 2 collection, the objects in generation 2 are examined for garbage collection.
You can set the threshold that triggers collection for each generation yourself.
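The thresholds can be read and changed through the standard `gc` module. The sketch below is only an illustration: the values passed to `gc.set_threshold` are arbitrary, and how the second and third thresholds are interpreted can differ between CPython versions.

```python
import gc

old_thresholds = gc.get_threshold()   # tuple: (threshold0, threshold1, threshold2)
counts = gc.get_count()               # current per-generation counters

gc.set_threshold(1000, 15, 15)        # arbitrary example values: collect generation 0 less often
new_thresholds = gc.get_threshold()

gc.set_threshold(*old_thresholds)     # restore the original settings
print(old_thresholds, counts, new_thresholds)
```

Raising threshold0 means more allocations are allowed between generation 0 collections, which trades memory for fewer collector runs.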
3. Memory pools
- Python’s memory architecture is layered like a pyramid, with the operating system working at layers -1 and -2
- Layer 0 is where C's malloc, free, and other memory allocation and release functions operate
- Layers 1 and 2 are the memory pool, exposed through Python interface functions such as PyMem_Malloc; small requests (at most 512 bytes in current CPython, 256 bytes in older versions) are served directly from this layer
- Layer 3 is the top layer, where we work directly with Python objects
Python performs a large number of malloc and free operations at runtime. Going to the system allocator for each of them can mean frequent switches between user mode and kernel mode, which seriously hurts execution efficiency. To speed things up, Python introduced a memory pool mechanism to manage the allocation and release of small blocks of memory.
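The pool layout itself is not visible from Python code, but CPython does expose `sys.getallocatedblocks()`, which reports how many memory blocks the interpreter currently has allocated. The sketch below is a rough, CPython-specific way to watch many small allocations come and go. The object count of 10,000 is arbitrary, and the exact numbers printed vary from run to run.

```python
import sys

before = sys.getallocatedblocks()

small_objects = [object() for _ in range(10_000)]   # many tiny allocations
during = sys.getallocatedblocks()

del small_objects                                    # release them all at once
after = sys.getallocatedblocks()

# Change in block count while the objects exist, and after they are freed
print(during - before, after - during)
```

The first number should be positive and the second negative, showing the blocks being handed out and then returned.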
4. Tuning methods
1. Trigger garbage collection manually
2. Avoid circular references (break cycles by hand or use weak references)
3. Raise the garbage collection thresholds
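As an illustration of the first two points, here is a minimal sketch in which a child node keeps only a weak reference to its parent, so no cycle is formed. The `TreeNode` class is hypothetical and exists only for this example.

```python
import gc
import weakref

class TreeNode:
    """Hypothetical tree node; a child keeps only a weak reference to its parent."""
    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        self._parent = weakref.ref(parent) if parent is not None else None
        if parent is not None:
            parent.children.append(self)   # strong reference: parent -> child

    @property
    def parent(self):
        # Returns None once the parent has been freed
        return self._parent() if self._parent is not None else None

gc.disable()                  # rule out the cycle collector for this demonstration
root = TreeNode("root")
child = TreeNode("child", parent=root)
parent_name = child.parent.name

del root                      # no cycle, so reference counting alone frees the parent
parent_gone = child.parent is None
gc.enable()

gc.collect()                  # manual collection, for example at a quiet point in the program
print(parent_name, parent_gone)
```

Because the child-to-parent link is weak, deleting `root` frees the parent right away without any help from the cycle collector.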
If this article helped you, please give it a like!