I would be satisfied if all the people I meet and all the things I do turned out just a little bit better.
G1 (Garbage-First) is a garbage collector that became officially available in JDK 1.7, and it has its own distinctive collection strategy. From the generational perspective, G1 is still a generational collector: it distinguishes the young generation from the old generation and still has Eden and survivor regions. However, it does not require Eden, the young generation, or the old generation to be contiguous. Instead, it uses a region-based partitioning scheme.
Its characteristics are as follows:
-
Parallelism: G1 can be run by multiple GC threads simultaneously during collection, effectively leveraging multi-core computing power.
-
Concurrency: G1 can run interleaved with the application, so in general the application is not blocked for the entire collection cycle.
-
Generational GC: G1 is still a generational collector. Earlier collectors work either in the young generation or in the old generation, whereas G1 handles both.
-
Space compaction: G1 moves objects as needed during collection. CMS simply marks and sweeps objects and must defragment after several GC cycles, whereas G1 copies objects during each collection, which reduces fragmentation.
-
Predictability: Because the heap is partitioned into regions, G1 can reclaim only a subset of regions at a time, which narrows the scope of each collection and gives better control over global pauses.
I. G1 memory layout and main collection process
The G1 collector partitions the heap into regions and collects only a few regions at a time, which keeps garbage collection pause times under control.
The G1 collection process may have four stages:
-
Young-generation GC
-
Concurrent marking cycle
-
Mixed collection
-
Full GC (if required)
II. Young-generation GC in G1
The main job of young-generation GC is to reclaim the Eden and survivor regions.
Once Eden fills up, a young-generation GC starts. The heap before and after a young-generation GC is shown in the figure below, where E stands for an Eden region, S for a survivor region, and O for an old-generation region.
As you can see, young-generation GC only deals with Eden and survivor regions. After collection, all Eden regions are cleared and some survivor regions are reclaimed, but at least one survivor region remains, which is not much different from other young-generation collectors. The other important change is that the number of old-generation regions grows, because some objects in the survivor or Eden regions may be promoted to the old generation.
III. G1 concurrent marking cycle
G1's concurrent phase is similar to that of CMS: it reduces pause time by splitting off the parts of the work that can run concurrently with the application.
Concurrent marking cycle for the old generation
The concurrent marking cycle can be divided into the following steps:
-
Initial mark: Marks objects directly reachable from the roots. This phase piggybacks on a young-generation GC, which causes a global pause, so the application must stop executing during this phase.
-
Root region scan: Because the initial mark is always accompanied by a young-generation GC, Eden is empty after the initial mark and the surviving objects have been moved to survivor regions. In this phase, G1 scans the survivor regions for references into the old generation and marks the objects that are directly reachable from them. This phase runs concurrently with the application, but it cannot overlap with a young-generation GC, because root region scanning depends on the survivor regions and a young-generation GC would modify them. If a young-generation GC is needed at this point, it must wait until root region scanning finishes, which makes that young-generation GC take longer.
-
Concurrent marking: Similar to CMS, concurrent marking finds and marks live objects across the entire heap. This phase is concurrent, and it can be interrupted by a young-generation GC.
-
Remark: As with CMS, the remark phase pauses the application. Because the application keeps running during concurrent marking, the marking result may need to be corrected, so this phase completes the earlier marking. In G1, this is done with the snapshot-at-the-beginning (SATB) algorithm: G1 takes a snapshot of the live objects at the start of marking, which helps speed up the remark phase.
-
Cleanup (stop-the-world): This phase causes a pause. It computes the proportion of live objects and the reclaimable share of each region and sorts the regions to identify those available for mixed collection. The remembered sets are also updated in this phase. The result is a set of regions marked as candidates for mixed collection, and this information is needed in the mixed collection phase.
-
Concurrent cleanup: Identifies and reclaims completely free regions. This cleanup is concurrent and does not cause a pause.
SATB stands for snapshot-at-the-beginning: a snapshot of the objects that are alive at the beginning of GC. The snapshot is obtained through root tracing, and its purpose is to keep concurrent GC correct. How does it do that? Under the tri-color marking algorithm, an object is in one of three states:
-
White: The object has not been marked. Objects that are still white when marking ends are reclaimed as garbage.
-
Gray: The object has been marked, but its fields have not all been scanned yet.
-
Black: The object has been marked and all of its fields have been scanned.
SATB uses a write barrier to record the old value of every reference that is about to be overwritten or deleted, and then, during the stop-the-world remark, rescans these old references as roots so that no live object is missed. This is the essential difference between the stop-the-world pause in the G1 remark phase and the one in the CMS remark phase: G1 only needs to scan from the references recorded by the write barrier, whereas CMS has to rescan the entire root set, so the CMS remark can be very slow.
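To make the SATB idea concrete, here is a toy model of tri-color marking with a pre-write barrier. It is only a sketch and not HotSpot code: the class `SatbSketch`, its `Node` type, and the methods `startMarking`, `markStep`, `writeField`, and `remark` are all hypothetical names invented for this illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy model of SATB marking. All names are hypothetical, not HotSpot APIs.
final class SatbSketch {
    enum Color { WHITE, GRAY, BLACK }

    static final class Node {
        final String name;
        Color color = Color.WHITE;
        final List<Node> fields = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    private final Deque<Node> grayStack = new ArrayDeque<>();
    private final Deque<Node> satbQueue = new ArrayDeque<>();
    private boolean marking;

    // Initial mark: shade the roots gray.
    void startMarking(List<Node> roots) {
        marking = true;
        for (Node r : roots) shade(r);
    }

    private void shade(Node n) {
        if (n != null && n.color == Color.WHITE) {
            n.color = Color.GRAY;
            grayStack.push(n);
        }
    }

    // One concurrent marking step: scan a gray object, then blacken it.
    boolean markStep() {
        Node n = grayStack.poll();
        if (n == null) return false;
        for (Node f : n.fields) shade(f);
        n.color = Color.BLACK;
        return true;
    }

    // Pre-write barrier: while marking is active, log the reference being overwritten.
    void writeField(Node holder, int index, Node newValue) {
        if (marking) {
            Node old = holder.fields.get(index);
            if (old != null) satbQueue.add(old);
        }
        holder.fields.set(index, newValue);
    }

    // Remark (stop-the-world): treat the logged old references as extra roots.
    void remark() {
        Node old;
        while ((old = satbQueue.poll()) != null) shade(old);
        while (markStep()) { /* drain remaining gray objects */ }
        marking = false;
    }

    public static void main(String[] args) {
        SatbSketch gc = new SatbSketch();
        Node a = new Node("A"), b = new Node("B"), c = new Node("C");
        a.fields.add(null);            // A has one empty field
        b.fields.add(c);               // B -> C is the only path to C
        gc.startMarking(Arrays.asList(b, a));
        gc.markStep();                 // A is scanned first and turns black
        gc.writeField(a, 0, c);        // the mutator stores C into black A
        gc.writeField(b, 0, null);     // and deletes B -> C; the old value C is logged
        gc.remark();
        System.out.println("C is " + c.color);
    }
}
```

In this scenario C would stay white without the barrier, because A is already black when it receives the reference and B no longer points to C when it is scanned. With the logged old value, the remark step shades C and it ends up black.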
IV. Mixed collection
During the concurrent marking cycle some objects are reclaimed, but the proportion is quite low. After the cycle, however, G1 knows exactly which regions contain the most garbage, and it can target those regions during the mixed collection phase. G1 gives priority to regions with a high proportion of garbage, where collection pays off the most, and this is where the name comes from: Garbage-First means that the regions with the highest proportion of garbage are collected first.
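The garbage-first idea can be sketched as a greedy selection: rank candidate old regions by how much garbage they free per unit of predicted cost, then take regions until a pause budget is used up. This is an illustration only; `CollectionSetSketch`, `Region`, `predictedMs`, and `choose` are hypothetical names, and the real G1 policy relies on much richer pause predictions.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a greedy "most garbage per millisecond first" selection.
final class CollectionSetSketch {
    static final class Region {
        final int id;
        final long garbageBytes;   // reclaimable space found by concurrent marking
        final double predictedMs;  // hypothetical predicted cost of evacuating the region
        Region(int id, long garbageBytes, double predictedMs) {
            this.id = id;
            this.garbageBytes = garbageBytes;
            this.predictedMs = predictedMs;
        }
        double efficiency() { return garbageBytes / predictedMs; }
    }

    // Pick the most efficient regions first, stopping when the budget would be exceeded.
    static List<Region> choose(List<Region> candidates, double budgetMs) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort((x, y) -> Double.compare(y.efficiency(), x.efficiency()));
        List<Region> cset = new ArrayList<>();
        double spent = 0;
        for (Region r : sorted) {
            if (spent + r.predictedMs > budgetMs) break;
            cset.add(r);
            spent += r.predictedMs;
        }
        return cset;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        List<Region> candidates = new ArrayList<>();
        candidates.add(new Region(1, 8 * mb, 10));
        candidates.add(new Region(2, 30 * mb, 20));
        candidates.add(new Region(3, 2 * mb, 5));
        for (Region r : choose(candidates, 30)) {
            System.out.println("collect region " + r.id);
        }
    }
}
```

With a 30 ms budget, region 2 (most garbage per millisecond) is picked first, then region 1, and region 3 is left for a later mixed collection.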
This phase is called mixed collection because it performs a normal young-generation GC and also collects some of the marked old-generation regions, so the young and old generations are handled together.
Mixed collection runs several times until enough memory has been reclaimed, after which young-generation GCs resume. After a young-generation GC, another concurrent marking cycle may start and eventually lead to mixed collection again, so the whole process may look as follows:
V. Full GC if necessary
As with CMS, G1's concurrent collection alternates between application threads and GC threads, so in particularly busy situations it is inevitable that memory will run out while a collection is in progress. When that happens, G1 falls back to a Full GC.
There are four situations that trigger this kind of Full GC:
1. Concurrent mode failure
G1 starts a marking cycle, but the old generation fills up before the mixed GC begins, at which point G1 abandons the marking cycle. In this case, either increase the heap size or tune the cycle (for example, increase the number of marking threads with -XX:ConcGCThreads).
The following is an example of GC logs:
Workaround: This failure means that the heap size should be increased, that G1's background processing should start earlier, or that the cycle should be tuned to run faster (for example, by increasing the number of background threads).
2. Promotion failure
(to-space exhausted or to-space overflow)
G1 completes the marking phase and starts mixed collections to clean up old-generation regions, but the old generation runs out of space before the collections free enough memory. In other words, G1 does not have enough memory for surviving or promoted objects at GC time, which triggers a Full GC.
In the following log, which contains (to-space exhausted) or (to-space overflow), this shows up as a mixed GC immediately followed by a Full GC.
This failure usually means that mixed collections need to reclaim garbage faster: each young-generation collection needs to process more old-generation regions.
The solution to this problem is:
-
Increase the value of -XX:G1ReservePercent (and increase the total heap size accordingly) to enlarge the memory reserved as "to-space".
-
Lower -XX:InitiatingHeapOccupancyPercent to start the marking cycle earlier.
-
Increase the value of -XX:ConcGCThreads to raise the number of concurrent marking threads.
3. Evacuation failure
(to-space exhausted or to-space overflow)
During a young-generation collection, the survivor space and the old generation do not have enough room to hold all surviving objects. This typically appears in the GC log as follows:
This log indicates that the heap is almost completely exhausted or fragmented. G1 tries to recover from the failure, but you should expect things to get worse: G1 will fall back to a Full GC.
The solution to this problem is:
-
Increase the value of -XX:G1ReservePercent (and increase the total heap size accordingly) to enlarge the memory reserved as "to-space".
-
Lower -XX:InitiatingHeapOccupancyPercent to start the marking cycle earlier.
-
Increase the value of -XX:ConcGCThreads to raise the number of concurrent marking threads.
4. Humongous object allocation failure
When a humongous object cannot find a suitable space to be allocated in, a Full GC is started to free up space. In this case, avoid allocating large numbers of humongous objects, increase memory, or increase -XX:G1HeapRegionSize so that the objects no longer count as humongous.
Another way to deal with humongous objects is to switch the GC algorithm to ZGC, where humongous objects are not handled specially (for example, their collection is not delayed).
VI. Humongous objects
Humongous objects and humongous regions
For G1, any object larger than half a region is considered a humongous object. Humongous objects are allocated directly in humongous regions in the old generation. A humongous region is a contiguous set of regions: StartsHumongous marks the first region of the set, and ContinuesHumongous marks each region that continues it.
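The "more than half a region" rule and the contiguous-region layout can be worked through with a little arithmetic. The sketch below is only an illustration of the rule as stated above; `HumongousSketch` and its method names are hypothetical, and object header or alignment details are ignored.

```java
// Arithmetic sketch of the humongous-object rule. Names are hypothetical.
final class HumongousSketch {
    // An object is humongous when it is larger than half a region.
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    // Contiguous regions needed: one StartsHumongous plus zero or more ContinuesHumongous.
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    // Space left unused at the end of the last region.
    static long unusedTailBytes(long objectBytes, long regionBytes) {
        return regionsNeeded(objectBytes, regionBytes) * regionBytes - objectBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long region = 4 * mb;
        long object = 9 * mb;
        System.out.println("humongous: " + isHumongous(object, region));
        System.out.println("regions needed: " + regionsNeeded(object, region));
        System.out.println("unused tail (MB): " + unusedTailBytes(object, region) / mb);
    }
}
```

For a 9 MB object and 4 MB regions this gives three regions with 3 MB left unused, while the same 5 MB object that is humongous with 4 MB regions is an ordinary object with 16 MB regions.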
Before allocating a humongous object, G1 checks whether the initiating heap occupancy percent (the marking threshold) has been exceeded. If so, it starts global concurrent marking so that reclamation can begin early, which helps prevent evacuation failures and Full GC.
For humongous objects, there are a few things to note:
-
Humongous objects that are no longer referenced are freed during the cleanup phase of the marking cycle or during a Full GC.
-
To reduce copying overhead, humongous regions are compacted only during a Full GC.
-
Each humongous region holds only one humongous object. The remainder of the region is left unused, which fragments the heap.
-
If humongous allocations cause frequent concurrent cycles and the humongous objects need to become ordinary objects, increase the region size (or switch to ZGC).
One downside of increasing the region size is that it reduces the number of regions available. In this case, test accordingly to confirm that the change actually improves your application's throughput or latency.
VII. Common tuning parameters
1. -XX:MaxGCPauseMillis=N
Default: 200 ms
The most basic parameters for using G1, described earlier, are:
-XX:+UseG1GC -Xmx32g -XX:MaxGCPauseMillis=200
MaxGCPauseMillis is, literally, the maximum pause time allowed for GC. G1 tries to keep every GC pause within the configured MaxGCPauseMillis. How does G1 meet this target? That brings in another concept, the CSet (collection set): the set of regions reclaimed in one garbage collection.
-
Young GC: Selects all young-generation regions. The cost of a young GC is controlled by controlling the number of young-generation regions.
-
Mixed GC: Selects all young-generation regions plus some high-yield old-generation regions, chosen according to the statistics from global concurrent marking. G1 picks as many high-yield old regions as fit within the user-specified cost target.
With that in mind, we have a sense of direction for setting the maximum pause time. First, there is a limit to the pause time we can tolerate, and the value must stay within that limit. But what value should it be? We need to balance throughput against MaxGCPauseMillis. If MaxGCPauseMillis is too small, GC runs frequently and throughput drops. If it is too large, application pauses get longer. G1's default pause target is 200 milliseconds, so you can start from there and adjust as appropriate.
2. -XX:G1HeapRegionSize=n
Sets the size of a G1 region. The value is a power of 2 and ranges from 1 MB to 32 MB. The goal is to have about 2048 regions based on the minimum Java heap size.
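The sizing rule above (power of 2, between 1 MB and 32 MB, aiming for about 2048 regions) can be approximated in a few lines. This is a simplified sketch rather than the JVM's actual ergonomics code: `RegionSizeSketch` and `regionSizeFor` are hypothetical names, and which heap size the JVM really feeds into the calculation is treated here as an assumption.

```java
// Simplified approximation of the region-size rule. Names are hypothetical.
final class RegionSizeSketch {
    static final long MB = 1024L * 1024L;
    static final long MIN_REGION = 1 * MB;
    static final long MAX_REGION = 32 * MB;
    static final long TARGET_REGIONS = 2048;

    static long regionSizeFor(long heapBytes) {
        long raw = Math.max(heapBytes / TARGET_REGIONS, 1);
        long powerOfTwo = Long.highestOneBit(raw);  // round down to a power of 2
        return Math.min(MAX_REGION, Math.max(MIN_REGION, powerOfTwo));
    }

    public static void main(String[] args) {
        long gb = 1024 * MB;
        for (long heapGb : new long[] {1, 6, 32, 128}) {
            System.out.println(heapGb + " GB heap -> "
                    + regionSizeFor(heapGb * gb) / MB + " MB regions");
        }
    }
}
```

Under these assumptions a 32 GB heap gets 16 MB regions, a 6 GB heap gets 2 MB regions, and very small or very large heaps are clamped to 1 MB and 32 MB respectively.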
-XX:ParallelGCThreads=n (sets the number of stop-the-world worker threads)
Sets the number of STW worker threads. On machines with up to 8 logical processors, n equals the number of logical processors.
If there are more than eight logical processors, n is set to about 5/8 of the number of logical processors. This applies in most cases, except on larger SPARC systems, where n can be about 5/16 of the number of logical processors.
-XX:ConcGCThreads=n (adjusts the number of background threads for G1 garbage collection)
Sets the number of concurrent marking threads. n is set to about 1/4 of the number of parallel garbage collection threads (ParallelGCThreads).
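The default thread counts described above can be estimated with a small helper. The exact formulas here are an assumption: this sketch counts the first eight processors fully and 5/8 of the processors beyond eight, and rounds the 1/4 ratio for concurrent threads to at least one. `GcThreadSketch` and its methods are hypothetical names, and a given JVM build may compute its defaults differently.

```java
// Estimate of default GC thread counts. The formulas are assumptions, not a JVM API.
final class GcThreadSketch {
    // All processors up to 8, plus 5/8 of the processors beyond 8.
    static int parallelGcThreads(int logicalProcessors) {
        return logicalProcessors <= 8
                ? logicalProcessors
                : 8 + (logicalProcessors - 8) * 5 / 8;
    }

    // Roughly one quarter of the parallel threads, never fewer than one.
    static int concGcThreads(int parallelThreads) {
        return Math.max((parallelThreads + 2) / 4, 1);
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        int parallel = parallelGcThreads(cpus);
        System.out.println("logical processors: " + cpus);
        System.out.println("estimated ParallelGCThreads: " + parallel);
        System.out.println("estimated ConcGCThreads: " + concGcThreads(parallel));
    }
}
```

Under these assumptions a 16-processor machine would get 13 parallel threads and 3 concurrent threads. The values a running JVM actually chose can be checked with -XX:+PrintFlagsFinal.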
3. -XX:InitiatingHeapOccupancyPercent=45 (adjusts how often G1 garbage collection runs)
Sets the Java heap occupancy threshold that triggers a marking cycle. The default is 45% of the entire Java heap.
If this value is set too high, you end up mired in Full GCs, because the concurrent phase does not have enough time to finish before the remaining heap space fills up.
If this value is set too low, the application does far more background GC work than it needs.
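The threshold check itself is simple to picture. The sketch below only illustrates the comparison described above, occupancy as a percentage of the whole heap against the configured value; `IhopSketch` and `shouldStartMarking` are hypothetical names, and the collector's real accounting is more involved.

```java
// Illustration of an occupancy-threshold check. Names are hypothetical.
final class IhopSketch {
    // True when heap occupancy has reached the configured percentage.
    static boolean shouldStartMarking(long usedBytes, long heapBytes, int ihopPercent) {
        return usedBytes * 100.0 / heapBytes >= ihopPercent;
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024L * 1024L;
        long heap = 32 * gb;
        System.out.println("14 GB used: " + shouldStartMarking(14 * gb, heap, 45));
        System.out.println("15 GB used: " + shouldStartMarking(15 * gb, heap, 45));
    }
}
```

On a 32 GB heap with the default of 45, 14 GB used (43.75%) does not trigger a cycle, while 15 GB used (about 46.9%) does. Lowering the percentage moves that crossover earlier, which is the trade-off described above.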
Avoid using the following parameters: do not explicitly set the young-generation size with -Xmn or related options such as -XX:NewRatio. Fixing the size of the young generation overrides the pause-time target.
VIII. Details
1. When G1 triggers mixed GC
Mixed GC is governed by the threshold parameter -XX:InitiatingHeapOccupancyPercent: when the occupied share of the total heap reaches this threshold, a mixed GC is triggered (by way of a concurrent marking cycle).
Before allocating a humongous object, G1 checks whether heap occupancy has exceeded this threshold. If so, it starts global concurrent marking so that reclamation can begin early, which helps prevent evacuation failures and Full GC.
To reduce the impact of continuous humongous object allocation on GC, humongous objects need to be turned into ordinary objects, so increasing the region size is recommended.
The region size can be set with -XX:G1HeapRegionSize. The value ranges from 1 MB to 32 MB and must be a power of 2.
What is the default value of -XX:G1HeapRegionSize?
By default, the heap is divided evenly into about 2048 regions, which yields a reasonable region size.
3. Direct memory configuration
Q: When do you use direct memory?
A: When reads and writes are frequent, direct memory can be used for performance reasons.
Direct memory is also an important part of Java programs, especially as NIO becomes more widely used. Direct memory bypasses the Java heap and gives Java programs direct access to native memory, which can speed up memory access to some extent. Its size can be set with -XX:MaxDirectMemorySize, which defaults to the maximum heap size, that is, -Xmx. When direct memory usage reaches the maximum, a garbage collection is triggered. If the collection fails to free enough space, direct memory exhaustion still causes an OOM.
In general, direct memory is faster for reads and writes, but it has no advantage when it comes to allocating memory.
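For reference, this is how heap and direct buffers are obtained through NIO. It is a minimal sketch using the standard `java.nio.ByteBuffer` API; the class name `DirectBufferSketch` and the helper `roundTrip` are made up for the example, and it makes no performance claims of its own.

```java
import java.nio.ByteBuffer;

// Minimal use of heap and direct buffers. The class and helper names are hypothetical.
final class DirectBufferSketch {
    // Write a value into off-heap memory and read it back.
    static long roundTrip(long value) {
        // Allocated outside the Java heap; counts against -XX:MaxDirectMemorySize.
        ByteBuffer direct = ByteBuffer.allocateDirect(1024);
        direct.putLong(0, value);
        return direct.getLong(0);
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(1024);          // backed by a byte[] on the heap
        ByteBuffer direct = ByteBuffer.allocateDirect(1024);  // backed by native memory
        System.out.println("heap buffer is direct: " + heap.isDirect());
        System.out.println("direct buffer is direct: " + direct.isDirect());
        System.out.println("round trip: " + roundTrip(42L));
    }
}
```

Because allocating a direct buffer is comparatively expensive, a common design choice is to allocate such buffers once and reuse them rather than creating one per operation.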
4. RSet
The remembered set (RSet) is a structure that assists the GC process. It is a typical space-for-time trade-off, similar to a card table. G1's RSet is built on top of the card table: each region records the pointers from other regions into it and notes which cards those pointers fall in. The RSet is effectively a hash table: the key is the start address of another region, and the value is a set whose elements are card table indexes.
How exactly does the RSet assist GC?
For a young GC, only the RSets of the young-generation regions need to be used as roots. These RSets record the old->young cross-generation references, which avoids scanning the whole old generation. For a mixed GC, the old-generation RSets record the old->old references, and the young->old references are found by scanning all young-generation regions, so again there is no need to scan every old-generation region. The introduction of RSets therefore greatly reduces the amount of GC work.
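The hash-table shape of the RSet can be modeled in a few lines. This is a toy model under stated assumptions: it keys by region index instead of region start address, uses 1 MB regions and 512-byte cards, and the names `RememberedSetSketch`, `recordReference`, and `cardsPointingInto` are hypothetical rather than HotSpot APIs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy per-region remembered sets. Sizes and names are assumptions for illustration.
final class RememberedSetSketch {
    static final long REGION_BYTES = 1L << 20; // assumed 1 MB regions
    static final long CARD_BYTES = 512;        // assumed card size

    static int regionOf(long address) { return (int) (address / REGION_BYTES); }
    static int cardOf(long address) { return (int) (address / CARD_BYTES); }

    // target region -> (source region -> cards in the source that point into the target)
    private final Map<Integer, Map<Integer, Set<Integer>>> rsets = new HashMap<>();

    // Conceptually called by a write barrier when the field at fieldAddress
    // is updated to point to targetAddress.
    void recordReference(long fieldAddress, long targetAddress) {
        int from = regionOf(fieldAddress);
        int to = regionOf(targetAddress);
        if (from == to) return; // references within one region need no entry
        rsets.computeIfAbsent(to, k -> new HashMap<>())
             .computeIfAbsent(from, k -> new TreeSet<>())
             .add(cardOf(fieldAddress));
    }

    // Cards that must be scanned as roots when collecting the given region.
    Map<Integer, Set<Integer>> cardsPointingInto(int region) {
        return rsets.getOrDefault(region, new HashMap<>());
    }

    public static void main(String[] args) {
        RememberedSetSketch rs = new RememberedSetSketch();
        long fieldInRegion3 = 3 * REGION_BYTES + 1024;
        long objectInRegion0 = 100;
        rs.recordReference(fieldInRegion3, objectInRegion0);
        System.out.println("RSet of region 0: " + rs.cardsPointingInto(0));
    }
}
```

When region 0 is collected, only the recorded cards of region 3 have to be scanned for roots, instead of every other region in the heap.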
IX. New G1 features in JDK 12
1. Abortable mixed collections
Make G1 mixed collections abortable if they might exceed the pause target.
2. Promptly return unused committed memory
Enhance the G1 garbage collector to automatically return Java heap memory to the operating system when idle.
X. GC development trends
You can see the trend in Java garbage collectors: reduce the impact on the application as much as possible on large heaps. CMS introduced phased, incremental marking; G1 used the SATB algorithm to limit the stop-the-world cost of the remark phase; and ZGC and C4 do not even stop the world during the marking phase.
XI. Conclusion
A few recommended ways to learn about this GC:
-
Read the JEPs (JDK Enhancement Proposals) for the background and context.
-
Read the Shenandoah GC paper. I got a great deal out of it: Shenandoah's approach sits between G1 and ZGC, so reading the paper gave me a deeper understanding of both.
At the end of the article, I list the official JEP website and the GitHub address of some GC materials I have collected (including some papers).
Here is a GC diagram that I put together myself:
The various GC algorithms all revolve around what is in the diagram, but each one handles it in its own way.
Recommended materials:
1. GC algorithms and papers
Github.com/jiankunking…
2. Recommended Books on Java
Github.com/jiankunking…
References
1. Practical Java Virtual Machine: JVM Fault Diagnosis and Performance Optimization
2. JEPs
3. Other
www.oracle.com/technetwork…
Plumbr. IO/faced/gc…