The introduction
This is the fifth day of my participation in the November Gwen Challenge. Check out the details: The last Gwen Challenge 2021
Knowledge is not isolated, and should not be isolated, but you have to try to find connections between them to make better use of it, and that’s what I encourage myself to say to you.
Today we talk about the garbage collection for the basis of the design theory, this is to know the garbage collection algorithm, the understanding of the mechanism of GC in the premise, is also carries on the JVM tuning each programmer required course, you want it to help you do, you must first understand what it needs, after reading this article, you can realize recycling the basis of the design principle of the algorithm.
Generational collection theory
Most of our garbage collectors are designed in accordance with generational collection theory, which is a law that people discover based on the behavior of real programs. This law is related to objects, and it has two hypotheses:
- The vast majority of objects are ephemeral
- If an object can survive multiple GCS, the harder it is to die.
Imagine if we could divide objects into generations and put the new ones in the new generation because they die quickly, so we only needed to mark a few surviving objects and recycle them at a low cost, and then put together objects that had been GC multiple times and let the VM recycle them at a very low frequency. In this way, both the time cost of garbage collection and the efficient use of memory space are taken into account.
This allows us to develop Full GC, Young GC, Major GC, and garbage collection algorithms that match the survival characteristics of objects in each region: mark-copy, mark-clean, and mark-clean.
Cross-generational citation hypothesis
Despite the convenience of generational collection theory for memory management, there is an obvious problem with partitioning Java regions: objects are not isolated, and there are cross-generation references between objects. In this way, in addition to traversing GC ROOTS, objects in the old age should also be traversed before garbage collection for the new generation to ensure the accuracy of the reachable analysis results. However, this undoubtedly causes performance burden for memory collection.
So we need to update a across generations reference hypothesis, which is across generations reference hypothesis, we can assume that two objects across generations between the reference, then the relationship between them is either born with, or to die, because if it is old that the object, the object reference of the young generation that will lead to the object of the young generation is hard to die, That would necessarily go into the old age after multiple GCS, so cross-generation references would not exist.
Therefore, we don’t need to scan the entire old generation for a few generations, we just need to add a global data structure for the young generation: The memory set is used to record which patches of the old era have cross-generation references. Then, when a Minor GC occurs, we simply add the corresponding regions with cross-generation references to the GC Roots for scanning, so that we don’t have to scan the whole old era.
Object decision method
Reference counting method
Algorithm principle
The algorithm principle of reference counting method is relatively simple. If there is an object reference, the number of times will be updated in the position allocated by the object header for reference count. If there is an object reference, the number of times will be added forward; if there is no reference, the number of times will be subtracted backward.
For example: I create two objects A and B, and the objects that point to them are A and B,
And then, I change the reference relationship, I change the reference relationship that used to point to A to point to B
Reference counting method when implemented, if the current object reference to 0, is to recycle, but will find all the objects, and it is their reference count minus 1, if there are objects also recycle when reduced to 0, but the reference counting method could not handle the circular reference. In fact this is also very good understanding, reference counting method are all external references, That is, when an external object is lost, the reference count is reduced by 1, but the internal object, because of the logic of circular references, will always have a reference count of 1 no matter how you reduce it, causing a memory leak.
Accessibility analysis
Reachability analysis means that we have a GC Roots, and then we start from this object, and we search for a series of reference chains, and objects on these chains that have object relationships prove that the object is reachable, and conversely, objects that are not reachable, prove that the object should be recycled.
In the Java architecture, GC Roots objects are fixed as follows: objects referenced in the virtual machine stack objects referenced in the method area static properties objects in the method area constants referenced in the local method stack JNI referenced objects inside the Java Virtual machine all objects held by synchronous locks