This article has participated in the third “topic writing” track of the Denver Creators Training Camp. For details, check out: Digg Project | Creators Training Camp third is ongoing, “write” to make a personal impact.
Three kinds of garbage collection algorithms in Java JVM are introduced in detail: tag clearing algorithm, copy algorithm, tag collation algorithm and generation partition collection algorithm.
Java’s garbage collection algorithm does not use reference counting to determine garbage, but is based on reachability garbage analysis algorithm, resulting in several common garbage collection algorithms. Basically, there are mainly mark-clearing algorithm, copy algorithm, mark-sorting algorithm, etc., in addition, there are generation, partition and other comprehensive algorithms algorithm. For garbage collectors, see this article: An in-depth understanding of the seven JVM garbage collectors in Java.
1 Basic Algorithm
1.1 Mark-Sweep Algorithm
1.1.1 principle
Mark-sweep algorithm is the most basic collection algorithm and the ideological basis of modern garbage collection algorithm. It is divided into two stages: the marking stage and the cleaning stage. In the marking phase, all reachable objects from the root node are marked first through the root node. Therefore, unmarked objects are unreferenced junk objects. Then, in the purge phase, all unmarked objects are cleared.
1.1.2 the pros and cons
Advantages:
- The token clearing algorithm solves the problem of circular references in the reference counting algorithm (which is based on the reachability analysis algorithm). Objects that are not referenced from the root node are reclaimed.
- This algorithm is relatively simple, and many subsequent algorithms are improved based on this algorithm.
Disadvantages:
- It is inefficient, with both mark and clear actions traversing all objects and stopping the application (STW) during GC, which is a poor experience for highly interactive applications.
- The memory cleared by the mark clearing algorithm is more fragmented, because the objects to be reclaimed may exist in various corners of the memory, so the cleared memory is not coherent. This can result in the inability to find sufficient contiguous memory and having to trigger GC actions early when large objects need to be allocated later in the program’s run.
1.2 Copying Algorithms
1.2.1 principle
The core of the replication algorithm is to divide the original memory space into two parts, and use only one part at a time. During garbage collection, the object being used is copied to another memory space, and then the memory space is emptied, and the roles of the two memory Spaces are swapped to complete garbage collection.
If there are many junk objects in the memory, fewer objects need to be copied. In this case, this method is suitable and efficient; otherwise, it is not suitable.
1.2.2, advantages and disadvantages of
Advantages:
- In the case of a large number of junk objects, fewer objects need to be copied. In this case, this method is suitable and efficient.
- After the clearing, the memory is not fragmented.
Disadvantages:
- In the case of fewer garbage objects, more objects need to be copied, which is not applicable.
- Only half of the two allocated memory Spaces can be used at the same time, resulting in low memory usage.
1.2.3 JVM improved implementation
Today’s commercial virtual machines all use this collection algorithm to recover the new generation. According to the special research of IBM, 98% of the objects in the new generation are “live and die”. Therefore, it is not necessary to divide the memory space according to the 1:1 ratio, but to divide the memory into a large Eden space and two small Survivor Spaces. Use Eden and one of the pieces Survivor each time.
When recycling is done, the surviving objects in Eden and Survivor are copied to another Survivor space once and for all, and Eden and the Survivor space that was just used are cleaned up. By default, the HotSpot VIRTUAL machine has an 8:1:1 ratio of Eden to Survivor, which means that each new generation has 90% (80%+10%) of available memory, and only 10% of memory is “wasted”. Of course, 98% of object collectables are only data in the general scenario, there is no way to ensure that no more than 10% of objects survive each collection, and when Survivor space is insufficient, other memory (here refers to the old years) is required for allocation guarantee (Handle Promotion).
Memory allocation of guarantee like we go to a bank loan, if we have a good reputation, in 98% of cases can repay on time, so Banks may default we unluckily to repay the loan in the next, only if I can’t need to have a guarantor can guarantee payment, can deduct money from his account, the bank is considered without risk. The same is true of the memory allocation guarantee. If there is not enough space in another Survivor space to hold the surviving objects collected by the previous generation, these objects will enter the old age directly through the allocation guarantee mechanism.
The allocation guarantee mechanism generally takes the average capacity of objects promoted to the old age each time before as the empirical value and compares it with the remaining space of the old age. If the remaining space is enough, only Minor GC will be attempted, otherwise Full GC will be attempted. If the guarantee fails, then a Full GC will follow the Minor GC. More detailed policies are described in the memory allocation and reclamation policies section.
1.3 Mark-Compact Algorithm
1.3.1 principle
Also known as mark-compression algorithm. The marking process is still the same as the mark-clean algorithm, but instead of cleaning up the recyclable objects directly, the next step is to move all surviving objects to one end and then clean up the memory directly beyond the end boundary. The mark-clean algorithm is shown in the following diagram.
1.3.2 the pros and cons
The advantages and disadvantages are the same as the mark clearing algorithm, which solves the problem of fragmentation of the mark clearing algorithm. At the same time, the mark compression algorithm has one more step, which is the step of object moving memory location, and its efficiency also has a certain impact.
2. Comprehensive Algorithm
2.1 Generational Collecting Algorithm
A variety of garbage collection algorithms have been introduced, and each algorithm has its own advantages and disadvantages. No one can replace anyone, so it is wise to choose according to the characteristics of garbage collection objects.
Current mainstream JVM garbage collection uses a comprehensive “generational collection algorithm”, which divides the memory into chunks based on the lifetime of the object, such as the young generation, the old generation, and the permanent generation in the JVM, and then adopts the most appropriate GC algorithm for each generation. In the JVM, the replication algorithm is suitable for the younger generation, and the tag clearing or tag collation algorithm is suitable for the older generation. The heap MEMORY GC model and detailed generation rules are detailed in this article: Overview of the Java Heap Memory GC Model.
2.2 Partition Collection Algorithm (Regin)
In G1 garbage collector of JDK1.7, the latest compound algorithm of “partition collection algorithm” is adopted to cancel the traditional generation mechanism, and the whole heap space with several large generations is divided into several consecutive different cells (Regin). Each cell can be used independently (as Eden, survivor, old, Humongous, etc.) and recycled independently (the basic algorithm used is still the replication algorithm, and has natural local compressibility).
While STW is also required for GC, the small partition approach allows you to control how many cells are recycled at a time, and reasonably recycle several cells (rather than the entire heap) at a time, depending on the target pause time, thereby reducing the pauses generated by a GC and achieving the minimum time delay. The ZGC garbage collector, added in JDK11, also uses the idea of partitioning (Regin). The algorithm and G1 implementation are explained later in the garbage Collector section: An in-depth look at the seven JVM garbage collectors in Java.
References:
- In-depth Understanding of the Java Virtual Machine
If you need to communicate, or the article is wrong, please leave a message directly. In addition, I hope to like, collect, pay attention to, I will continue to update a variety of Java learning blog!