This article covers Python's memory management mechanism.

Reference counter mechanism

An object's reference count increases by 1 each time the object is referenced, and decreases by 1 when a reference to it goes away or when the object holding that reference is released. When the count reaches 0, the object itself is released.

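Use sys.getrefcount(obj) to view the current reference count of an object. The value it reports is 1 higher than the actual count, because the argument passed to getrefcount is itself a temporary reference. When an object is passed into an ordinary Python function, the function call held two extra references to it in the author's interpreter (the exact number is an implementation detail and can differ between versions); sys.getrefcount(obj) is special and adds only one.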

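import sys

class Person:
    pass

def log(obj):
    # Passing p into this function added 2 references in the author's interpreter.
    print(sys.getrefcount(obj))  # the getrefcount argument adds 1 more

p = Person()  # p = 1
log(p)  # prints 4 in the author's run; the exact value depends on the interpreter version

print(sys.getrefcount(p))  # prints 2: the name p plus the getrefcount argument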

Once the function returns, the references held by the call are dropped, so getrefcount(p) reports 2: one for the name p and one for the temporary argument of getrefcount itself.
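
As a minimal sketch of the +1 / -1 rule above, the following compares counts relative to a baseline rather than relying on absolute values, which vary between interpreter versions. The Node class and the variable names are hypothetical and exist only for this illustration.

```python
import sys


class Node:
    """Hypothetical class used only for this illustration."""


node = Node()
base = sys.getrefcount(node)        # baseline (includes the temporary argument)

alias = node                        # a second name refers to the object: +1
after_alias = sys.getrefcount(node)

holder = [node]                     # a container refers to the object: +1
after_list = sys.getrefcount(node)

del alias                           # the extra name goes away: -1
holder.clear()                      # the container drops its reference: -1
after_release = sys.getrefcount(node)

# Differences relative to the baseline
print(after_alias - base, after_list - base, after_release - base)  # 1 2 0
```

Working with differences keeps the example independent of how many temporary references a particular CPython version adds.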

A circular reference

Simply put, an object should be released once it is no longer in use, but it cannot be released if its reference count is still above 0 after its name has been deleted.

class Person:
    def __del__(self):
        print("Person({0}) freed".format(id(self)))

class Dog:
    def __del__(self):
        print("Dog({0}) is released".format(id(self)))

p = Person()  # p = 1
dog = Dog()  # dog = 1

# loop reference
p.pet = dog  # dog = 2
dog.master = p  # p = 2

# __del__() is not called until the program ends.
# Because of the circular reference, the p and dog objects are not really removed; del only removes the names.
del p, dog  # p, dog = 1, 1

At the syntactic level, the names p and dog can no longer be used after they are deleted, so the pet and master attributes can no longer be reached through them either. The names p and dog are therefore called reachable references, while pet and master are called unreachable references. In other words, after p and dog are deleted, the Dog and Person objects referenced by pet and master are still in memory but cannot be accessed by normal means, and reference counting alone cannot free them.

When the names are deleted but the objects still have a nonzero reference count, the reference counter mechanism cannot reclaim them from memory. This is the memory leak caused by circular references.

"""
Error! p and dog are undefined.
print(p)
print(dog)
"""
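
One way to observe this leak directly is to watch the objects through weak references while the cycle detector is switched off. The sketch below assumes CPython, where an object is freed as soon as its count reaches 0. The Node class and all variable names are hypothetical.

```python
import gc
import weakref


class Node:
    """Hypothetical class used only for this illustration."""


gc.disable()                         # keep the cycle detector out of the picture

# No cycle: the object is freed as soon as its last reference goes away.
single = Node()
single_ref = weakref.ref(single)
del single
single_freed = single_ref() is None
print(single_freed)                  # True

# Cycle: each object keeps the other's count above 0.
a, b = Node(), Node()
a.other, b.other = b, a
a_ref = weakref.ref(a)
del a, b
leaked = a_ref() is not None
print(leaked)                        # True: the object is still in memory

# Clean up so the example does not leave garbage behind.
gc.collect()
cleaned = a_ref() is None
gc.enable()
```

The weak references do not affect the counts, so they can report whether each object is still alive without keeping it alive.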

Garbage collection mechanism

Python has two memory management mechanisms: reference counting and garbage collection. Reference counting performs better than garbage collection, but it cannot reclaim circular references. The main job of the garbage collector is therefore to find circular references among the objects that reference counting failed to release, and to release them.

The underlying mechanism of garbage collection (how do I find circular references?)

  1. Collect all container objects (list, dict, tuple, instances of custom classes, ...) and link them in a doubly linked list;
  2. For each container object, use a variable gc_refs to record a copy of its current reference count;
  3. For each container object, find the container objects it refers to and decrement their gc_refs by 1;
  4. After step 3, a container object whose gc_refs is 0 is only kept alive by references from other containers. If it also cannot be reached from a container whose gc_refs is still above 0, it can be reclaimed: only "circular references" keep it alive.
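
The four steps above can be sketched as a small simulation. This is a toy model under stated assumptions, not CPython's actual implementation: objects are plain strings, and the function find_unreachable, the refcounts table, and the references table are all hypothetical names introduced for this illustration.

```python
def find_unreachable(refcounts, references):
    """Toy model of cycle detection.

    refcounts:  name -> total reference count of that container
    references: name -> list of containers it refers to
    """
    # Step 2: copy each reference count into gc_refs.
    gc_refs = dict(refcounts)

    # Step 3: subtract every reference that comes from another container.
    for targets in references.values():
        for target in targets:
            gc_refs[target] -= 1

    # Step 4: containers with gc_refs > 0 are referenced from outside
    # (for example by a variable). Everything reachable from them is alive.
    alive = set()
    stack = [name for name, count in gc_refs.items() if count > 0]
    while stack:
        name = stack.pop()
        if name in alive:
            continue
        alive.add(name)
        stack.extend(references.get(name, ()))

    return set(refcounts) - alive


# person <-> dog form a cycle with no outside references.
# p2 is held by a variable and refers to toy.
refcounts = {"person": 1, "dog": 1, "p2": 1, "toy": 1}
references = {"person": ["dog"], "dog": ["person"], "p2": ["toy"], "toy": []}

print(sorted(find_unreachable(refcounts, references)))  # ['dog', 'person']
```

Note that toy also ends up with gc_refs equal to 0, but it survives because it is reachable from p2, whose gc_refs stays at 1.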

Generational recycling (how can I improve the performance of finding circular references?)

If a program creates many objects and every one of them has to take part in each detection pass, the cost can be very high. To address this, Python makes an assumption: the longer an object has already survived, the longer it is likely to go on living.

For example, if an object has gone through 10 detection passes without being released, it is assumed to be long-lived, and it is checked less frequently.

Generational detection (a set of detection mechanisms designed based on hypotheses)

  1. By default, when an object is created, it belongs to generation 0.
  2. If an object is still alive after its generation has been collected, it is moved to the next generation.

Sequence of garbage collection cycles

  • After generation 0 has been collected a certain number of times, a collection of generations 0 to 1 is triggered.
  • After generation 1 has been collected a certain number of times, a collection of generations 0 to 2 is triggered.

The main purpose of generational collection is to reduce how often garbage detection runs. Strictly speaking, one more condition limits it: the garbage collector only runs a check when "number of objects added – number of objects destroyed" reaches the specified threshold (the gc module documentation states the condition as exceeding it).

Trigger garbage collection

  1. Automatic recovery

    The trigger condition is that the garbage collection mechanism is enabled (enabled by default) and the threshold of garbage collection is reached.

    Note that a triggered collection does not examine every object; it works generation by generation.

  2. Manual reclamation (default: 0 to 2)

    Simply run gc.collect(n), where n can be 0, 1, or 2, to collect garbage in generations 0 to n.
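
A minimal sketch of manual collection, assuming CPython: gc.collect returns the number of unreachable objects it found, which makes it easy to confirm that a cycle was picked up. The Node class and variable names are hypothetical. The exact number returned can vary between interpreter versions, so it is printed here rather than assumed.

```python
import gc


class Node:
    """Hypothetical class used only for this illustration."""


gc.disable()             # avoid an automatic pass interfering with the result
gc.collect()             # start from a clean state

a, b = Node(), Node()
a.other, b.other = b, a  # circular reference
del a, b                 # the objects are now unreachable but still in memory

found_young = gc.collect(0)  # collect generation 0 only
found_full = gc.collect()    # no argument: collect all generations
gc.enable()

print(found_young, found_full)
```

Because the two objects were created just before the call, they are still in generation 0, so collecting only that generation is enough to find them.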

The gc module

The gc module can view or modify some of the garbage collector's settings.

import gc
  • gc.isenabled()

    Check whether the garbage collector mechanism is enabled.

  • gc.enable()

    Enable the garbage collector mechanism (it is enabled by default).

  • gc.disable()

    Turn off the garbage collector mechanism.

  • gc.get_threshold()

    Gets the threshold that triggers the execution of garbage detection. The return value is a tuple (threshold, n1, n2).

    • threshold

      Garbage detection runs once when the number of newly added objects minus the number of destroyed objects exceeds threshold.

    • n1

      Indicates that when the garbage detection of generation 0 reaches n1 times, garbage collection of generation 0 to 1 is triggered.

    • n2

      Indicates that garbage collection of generations 0 to 2 is triggered when garbage detection of generation 1 reaches n2 times.

  • gc.set_threshold(1000, 15, 15)

    Change the garbage detection frequency. Typically, these values are increased for program performance.
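
The calls listed above can be tried together in a short session. This is a sketch that saves the current thresholds and restores them afterwards so it has no lasting effect. The default values and the meaning of the later tuple entries differ between CPython versions, so they are printed rather than assumed.

```python
import gc

print(gc.isenabled())          # whether automatic collection is on

original = gc.get_threshold()  # tuple of three thresholds
print(original)
print(gc.get_count())          # current per-generation counters

# Raise the thresholds so detection runs less often.
gc.set_threshold(1000, 15, 15)
raised = gc.get_threshold()
print(raised)

# Restore the previous settings.
gc.set_threshold(*original)
restored = gc.get_threshold()
```

Saving and restoring the thresholds is a reasonable habit when experimenting, since the setting is global to the interpreter.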

Test automatic collection 1

import gc

# Automatic collection is triggered when "number of objects created - number of objects destroyed = 2".
gc.set_threshold(2, 10, 10)

class Person:
    def __del__(self):
        print(self, "Set free")

class Dog:
    def __del__(self):
        print(self, "Set free")

p = Person()  # p = 1
dog = Dog()  # dog = 1

# loop reference
p.pet = dog  # dog = 2
dog.master = p  # p = 2

# Create a Person object to test whether the program can trigger automatic collection after the objects are deleted.
p2 = Person()

# __del__() is not called until the program ends.
del p
del dog

Three objects were created, one object was destroyed, and 3-1=2. In theory, automatic collection should be triggered at this point, but the __del__() function is not called until the end of the program. Why is this?

To explain this problem, it is important to understand why garbage detection has a qualification of “number of objects added – number of objects destroyed = specified threshold”.

This is because when objects are left in memory and cannot be freed, it is usually because too many objects have been created and not destroyed in time.

Based on this, the program can follow a simple rule: run garbage detection when the number of created objects exceeds the number of destroyed objects by at least the specified threshold, and otherwise do not trigger detection.

Destroying an object only moves that difference further away from the threshold, so there is no reason to run detection at that moment.

So strictly speaking, this qualification should be changed to: when creating objects, “number of objects added – number of objects destroyed = specified threshold”, trigger garbage detection.

Now that you know this, you can see why this object can’t be released. Three objects are created first, then del p, del dog, and garbage detection is not triggered when the destruction operation is performed, so the objects are not released.

Note

This conclusion is my own speculation and may not be accurate. I spent a long time wondering why the objects were not released, and this is the most reasonable explanation I could come up with.

Test automatic collection 2

import gc
gc.set_threshold(2, 10, 10)

class Person:
    def __del__(self):
        print(self, "Set free")

class Dog:
    def __del__(self):
        print(self, "Set free")

p = Person()  # p = 1
dog = Dog()  # dog = 1

# loop reference
p.pet = dog  # dog = 2
dog.master = p  # p = 2

# Check whether the real objects are reclaimed after the reachable references are removed.
del p, dog

# Create a Person object to test whether the program can trigger automatic collection after the objects are deleted.
p2 = Person()
print("p2 =", p2)

print("----------------------- the end -----------------------")

"""
<__main__.Person object at 0x0000000002C28190> Set free
<__main__.Dog object at 0x0000000002CF33D0> Set free
p2 = <__main__.Person object at 0x0000000002CF3350>
----------------------- the end -----------------------
<__main__.Person object at 0x0000000002CF3350> Set free
"""

A total of 5 objects are created and 3 are destroyed, so 5-3=2, which triggers automatic detection. The names p and dog have already been deleted (the real objects are still in memory), so the collector finds the objects they reference, decrements their counts by 1, and the Person and Dog objects are freed.

Note: p and dog are released first; p2 is released at the end of the program.

Manual recovery

import gc

class Person:
    def __del__(self):
        print(self, "Set free")

class Dog:
    def __del__(self):
        print(self, "Set free")

p = Person()  # p = 1
dog = Dog()  # dog = 1

# loop reference
p.pet = dog  # dog = 2
dog.master = p  # p = 2

del p  # p = 1
del dog  # dog = 1

# Execute garbage detection on the program (regardless of whether the collection mechanism is enabled), manually reclaim memory.
gc.collect()

# <__main__.Person object at 0x109CB0110> Set free
# <__main__.Dog object at 0x109CB0190> Set free

A weak reference

import weakref
import sys

class Person:
    def __del__(self):
        print(self, "Set free")

class Dog:
    def __del__(self):
        print(self, "Set free")

p = Person()  # p = 1
dog = Dog()  # dog = 1

p.pet = dog  # dog = 2
# weakref.ref does not strongly reference the specified object (that is, does not increase the reference count).
dog.master = weakref.ref(p)  # p = 1

# When p is completely destroyed, the count of every object it references drops by 1.
del p  # p = 0, dog = 1
del dog  # dog = 0

# <__main__.Person object at 0x109CB0110> Set free
# <__main__.Dog object at 0x109CB0190> Set free

To check that destroying an object decrements the count of the objects it references, we run an experiment: look at the reference count of dog after p is destroyed.

p.pet = dog  # dog = 2
dog.master = weakref.ref(p)  # p = 1

del p  # p = 0, dog = 1

"""
Check whether the count of dog, which p references, drops by 1 when p is destroyed.
sys.getrefcount returns the current reference count of an object; the returned value is 1 more than the actual value.
"""
print(sys.getrefcount(dog))  # 2

del dog  # dog = 0

When p is destroyed, its pet attribute is destroyed with it, so the reference created by the statement p.pet = dog no longer exists and dog's reference count drops by 1.

Under strong references, del p does not destroy the Person object (dog.master still holds it), so dog's reference count remains the same.

p.pet = dog  # dog = 2
dog.master = p  # p = 2

del p  # p = 1, dog = 2
print(sys.getrefcount(dog))  # 3, the actual value is 2.
del dog  # dog = 1
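
One detail the snippets above do not show is how to get the object back from a weak reference. As a minimal sketch, assuming CPython: calling the object returned by weakref.ref yields the original object while it is alive and None once it has been destroyed. So with dog.master = weakref.ref(p), the owner is reached with dog.master() rather than dog.master.

```python
import weakref


class Person:
    pass


p = Person()
master = weakref.ref(p)          # does not increase p's reference count

same_object = master() is p      # calling the weak reference returns the object
print(same_object)               # True

del p                            # last strong reference removed
gone = master() is None          # the weak reference now returns None
print(gone)                      # True
```

Code that uses a weak reference should therefore always check for None before using the result.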

To weakly reference objects in a collection, use the weakref.Weak… classes.

# Weakly reference the objects stored in a dictionary
# pets = weakref.WeakValueDictionary({"dog": d1, "cat": c1})
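
A runnable version of the dictionary case might look like the sketch below. The Pet class and the names d1 and c1 are assumptions made for this illustration. In CPython, an entry disappears from a weakref.WeakValueDictionary as soon as the last strong reference to its value is gone.

```python
import weakref


class Pet:
    """Hypothetical class used only for this illustration."""

    def __init__(self, name):
        self.name = name


d1, c1 = Pet("dog"), Pet("cat")

# The dictionary holds its values weakly, so it does not keep them alive.
pets = weakref.WeakValueDictionary({"dog": d1, "cat": c1})
before = sorted(pets.keys())
print(before)            # ['cat', 'dog']

del d1                   # drop the only strong reference to the dog
after = sorted(pets.keys())
print(after)             # ['cat']
```

This makes such a dictionary suitable for caches and registries that should not extend the lifetime of the objects they index.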

Manually break circular references

class Person:
    def __del__(self):
        print(self, "Set free")

class Dog:
    def __del__(self):
        print(self, "Set free")

p = Person()  # p = 1
dog = Dog()  # dog = 1

p.pet = dog  # dog = 2
dog.master = p  # p = 2

"""
Manually break the circular reference before deleting.
This means breaking the reference from p.pet to dog by hand; once dog is no longer referenced by p, its count naturally drops by 1.
"""
p.pet = None  # dog = 1
del p  # p = 1 (still held by dog.master), dog = 1
del dog  # dog = 0; releasing dog also drops dog.master, so p = 0
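
To confirm that breaking the cycle by hand is enough for reference counting alone, the check below turns the cycle detector off and watches both objects through weak references. It is a sketch assuming CPython, and the Node class and variable names are hypothetical.

```python
import gc
import weakref


class Node:
    """Hypothetical class used only for this illustration."""


gc.disable()                      # only reference counting is at work

a, b = Node(), Node()
a.other, b.other = b, a           # circular reference
a_ref, b_ref = weakref.ref(a), weakref.ref(b)

a.other = None                    # break the cycle by hand
del a                             # a is still held by b.other
del b                             # b reaches 0, which releases b.other and then a

both_freed = a_ref() is None and b_ref() is None
gc.enable()

print(both_freed)                 # True
```

Only one side of the cycle needs to be cleared; once the loop is open, the remaining references unwind in sequence.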