In the previous article, which introduced Java GC: Fundamentals, this article looks at how specific collectors are implemented in the JVM.
The JVM provides multiple garbage collectors to collect both the new and the old, and they can be combined, but in practice (based on Java 8) it is common to see four combinations of garbage collectors
- SerialGC collects Cenozoic and old -XX:+UseSerialGC
- -xx :+UseParallelOldGC -XX:+UseParallelOldGC -XX:+UseParallelOldGC
- Concurrent Mark and Sweep (CMS) -xx :+UseConcMarkSweepGC (-xx :+UseParNewGC)
- G1 (which does not strictly distinguish between old and new generations)
PS: In Java 8, -xx :+UseParallelOldGC and -xx :+UseParNewGC cannot be used independently
Here we will understand how each collector works in principle, but not in detail the implementation.
Serial GC
Serial GC uses the Mark-copy algorithm for the new generation and the Mark-sweep-Compact algorithm for the old generation. As the name suggests, this is a single-threaded collector. The entire collection process triggers stop-the-world to suspend the application until the collection is complete. This collector works with heaps of only a few hundred megabytes and single-core cpus. On the server side, this combination is rarely used because computer resources are not properly used, but on the other hand, it can also be used in cases where system resource usage is limited.
Parallel GC
Parallel GC uses the same algorithm as Serial GC. It’s just multithreaded and is the default GC used by the JVM. As with the Serial GC, stop-the-world is triggered to suspend the application throughout the collection process until the collection is complete. In multi-core environments, the processing speed is faster than Serial GC, which improves throughput.
Concurrent Mark and Sweep (CMS)
The Serial and Parallel GC need to suspend the application thread during garbage collection, so the CMS collector is a better choice if application latency is required. It uses the Stop-the-world Mark-copy algorithm to collect the new generation (ParNew, which is similar but incompatible with Parallel GC and only works with CMS) and the old generation using the Mark-sweep algorithm, which is almost concurrent with the application threads.
Parallel GC is not compatible with ParNew. Parallel GC is compatible with ParNew.
CMS is mainly used to avoid long pauses in old collection. First, it uses the Mark-sweep algorithm without compression (note that due to memory fragmentation, free-lists are used for memory allocation instead of pointer collisions, so allocation is slower). Second, during the mark-Sweep algorithm collection, it is almost concurrent with the application, with little impact on the execution of the application (pausing time is short).
PS: Why didn’t CMS adopt mark-collation algorithm to achieve it?
But every coin has its disadvantages
- Because most of the time CMS uses at least some CPU resources without executing the application’s code, CMS throughput is generally worse than Parallel GC.
- Since Sweep does not compact after it, the problem of memory fragmentation is bound to occur (the JVM provides parameters that can be set to defragment memory after one or more Full GC).
- Because there is a phase of concurrency with the application, unlike other collectors, they cannot wait until the old age outpaces them before collecting, and need to set a threshold. A “Concurrent Mode Failure” error will be raised if the Concurrent process fails and a stop-the-world Full GC will be executed before resuming the application thread until the collection is complete.
- There is floating garbage, that is, objects marked alive and dead that need to be cleaned up in the next GC.
The concurrent collection process of the CMS is as follows:
- Initial marking is a stop-the-world process, but it only takes a very short time to Mark all the objects that GC Roots can reach directly in the old days.
- Concurrent Mark A Concurrent token alive object that does not suspend the user thread. Because the application thread is still running at this point, some new objects may be passed into the old age, or some reference changes may be made to the previous object after it is marked. As mentioned earlier in the basic algorithm, the JVM uses Card Marking to mark references from older generations to younger ones; The same technique is used here, and reference changes generated during concurrency also use a structure similar to CardTable (ModUnionTable) to record reference changes.
- Concurrent Preclean this stage needs to deal with the problems left by the previous stage and re-mark the surviving objects through the objects in Ditry Card. However, garbage objects that were marked alive but are now dead need to be saved for the next cleanup (floating garbage). This stage can be turned off.
- A Concurrent Abortable Preclean phase repeats the operations of the previous phase until certain conditions are met, such as the number of cycles reaching a threshold or the cycle processing time reaching a threshold, etc. The purpose of this phase is to process as many old objects as possible that are updated by the application thread during the concurrent phase to reduce the pause time during the re-marking phase.
- Final remarking. As long as concurrency continues, there will always be new references that are not processed, so a second and Final stop-the-world phase of the process is needed to ensure that all surviving objects are marked.
- Since compact is not required and stop-the-world is not necessary, Concurrent Sweep is just fine. CMS maintains its own free list and does not eliminate new objects promoted at this stage.
- Concurrent Reset Resets internal data structures for future use
In addition to passively triggering Old GC by Mionr GC, there are active, periodic Old GC in practice, but there are trigger conditions that can be consulted by interested parties.
While the CMS garbage collector has been abolished in JDK 9, CMS is still a good alternative to JDK 8 and its predecessors in pursuit of low latency.
G1 – Garbage First
HotSpot provides an alternative garbage collection strategy to address a number of problems with the CMS algorithm. Starting with JDK 9, G1 replaced Parallel GC as the default GC in the JVM. One of the core goals of the G1 is to make stop-the-world times predictable and configurable. For example, you could require stop-the-world pauses of no more than 5 milliseconds per minute, and the G1 will do its best to do that, but not necessarily.
Unlike previous generations of heap memory, G1 divides the heap into a number of small regions (typically 2048), each of which may be Eden, Survior, or Old. Logically, Eden and Survivor regions constitute the new generation, and Old regions constitute the Old age. This design allows the GC to collect only some of these areas each rather than the entire heap. But the new generation participates every time. Another novelty is that G1 can estimate the number of live objects per region, and the regions with the highest number of live objects are collected first, hence the name of the collector, Garbage First.
G1 can be roughly divided into Young GC and Mixed GC. Young GC deals with all Young regions, while Mixed GC deals with all Young regions and some Old regions. As to which Old regions to select, concurrent tags are required to gather the necessary information. The normal workflow of generational G1 is to switch between young GC and Mixed GC as appropriate, followed by periodic concurrent markers to collect data.
Young GC — Recovery of Evacuation stages of Young generation only (Fully Young Pause)
At the beginning of the application, G1 has no additional information to run the concurrent phase. So at this stage, G1 behaves much like other generation-generation collectors in that it needs to stop-the-world, use a replication algorithm, and copy the surviving objects into a Survivor zone, or into a free zone (which later becomes a Survivor zone). This phase can be understood as Young GC (Minor GC). The so-called evacuation is actually copied to other areas.
Concurrent Marking
Many of the concepts of the G1 collector are based on the CMS, so the flow will be somewhat similar. However, there are a number of differences, such as CMS’s incremental update approach, where additional mark changes are referenced and processed, while G1 uses SATB (snapshot-at-the-beginning) to maintain concurrent GC correctness. Snapshot-at-the-beginning is a logical Snapshot of all live objects At The Beginning of The GC. The live objects in this Snapshot, plus The newly allocated objects later in The GC, are considered final live objects. Other unreachable objects are dead. For details on how SATB works, see G1 with R capitalized
Like CMS, this stage can be specifically divided into the following stages. The purpose of each stage is similar to that of CMS, but the specific implementation will be different:
- Initial Mark
- Concurrent marking
- Final Marking (Remark)
- Cleanup
The Initial Mark is completed in the Evacuation Pause. Note the cleanup here, which doesn’t actually cleanup the actual objects on the heap, but rather counts how many live regions each region has and sorts them according to the expected GC efficiency in preparation for the mixed GC. However, if no live object is found in the region, the entire region is reclaimed to the list of regions that can be allocated.
Mixed GC — Mixed Evacuation Pause: Mixed
This stage does not necessarily follow the concurrent mark, but it is not necessary if, for example, a large portion of the old age can be freed at the same time. Therefore, it may be easy to have some recovery only young evacuation phases between the end of concurrent marking and the suspension of mixed evacuation.
In this stage, all regions in the new generation are selected, and several old regions with high returns are collected according to the statistics of concurrent markers (old Gen regions with high returns are selected as far as possible within the cost target specified by users) for collection.