An overview of G1
G1 (Garbage-First) first appeared in JDK 6u14 and was officially supported from JDK 7u4. It became the default garbage collector in JDK 9, replacing the CMS collector (which has been deprecated since JDK 9).
G1 is a generational, incremental, parallel, evacuating, soft real-time garbage collector. Its main feature is that the pause time is configurable: given a maximum pause time target, G1 tries to keep pauses within that limit while collecting, and it performs particularly well on large heaps.
Before diving into the details of G1, here is a brief review of some garbage collection basics.
Some basics of garbage collection
Mutator
In garbage collection terminology, the mutator is the application itself. Why the strange name? "Mutator" is a term coined by Edsger Dijkstra meaning "something that changes things". What does it change? The reference relationships between the objects the GC manages. That may still sound abstract, but in short the mutator is simply the application, and it is alongside this mutator that the GC does its work.
Incremental garbage collection
Incremental GC is an approach that controls the mutator's maximum pause time by advancing garbage collection a little at a time.
Some early young generation collectors, such as Serial GC, pause the application completely. Incremental garbage collection instead lets the GC run alternately with the mutator, so garbage is collected little by little, hence the name "incremental".
G1 is an incremental garbage collector: it runs alternately with the mutator to reduce the time the application is paused for GC.
Parallel GC and concurrent GC
Parallel GC and concurrent GC are basic concepts in garbage collection. The two words are vague and easy to confuse in everyday usage, where any multi-threaded GC tends to be called "parallel" or "concurrent", but in the GC world the two terms have quite different meanings.
A parallel GC pauses the mutator and then uses multiple threads to perform the GC work in parallel, as shown in the figure below:
A concurrent GC runs its GC threads alongside the mutator without pausing it, as shown in the figure below:
The purpose of parallel GC is to make the GC itself more efficient and shorten pauses, while the purpose of concurrent GC is to eliminate pauses altogether.
G1 execution flow
The heap layout in G1 differs from that of other collectors. The heap is divided into N equal-sized regions, each occupying a contiguous address range; collection is performed region by region, and the region size is configurable. During allocation, if the current region is full, the next free region is found automatically and allocation continues there.
G1 is still a generational collector: the heap is divided into a young generation and an old generation, and each region belongs to one of them. Unlike other generational collectors, however, the regions making up a generation in G1 need not be contiguous.
The reason G1 can use non-contiguous regions for its generations is that regions are only classified when they are handed out: at allocation time G1 simply picks a region from the free list, which makes region allocation more flexible.
When a GC occurs, the Eden regions are emptied and become free regions, which can later serve as old generation or Survivor regions. During allocation, G1 checks whether the total size of the young regions would exceed the young generation limit; if so, a GC is triggered.
Although each region has a fixed size, large (humongous) objects may not fit in a single region; such objects are allocated across multiple contiguous regions.
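As a rough illustration (not G1 source code, and the sizes below are made-up example values), the number of contiguous regions such an object occupies is just a ceiling division:

```java
public class HumongousRegions {
    public static void main(String[] args) {
        long regionSize = 1 * 1024 * 1024;          // example: 1 MB regions (region size is configurable)
        long objectSize = 2 * 1024 * 1024 + 512;    // example humongous object, just over 2 MB

        // ceiling division: how many contiguous regions the object needs
        long regionsNeeded = (objectSize + regionSize - 1) / regionSize;
        System.out.println("regions needed: " + regionsNeeded);  // prints 3
    }
}
```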
The regions in G1 are mainly divided into two types:
- Young generation regions
  - Eden regions – newly allocated objects
  - Survivor regions – objects that survived a young GC but have not yet been promoted
- Old generation regions
  - Objects promoted from the young generation
  - Humongous objects allocated directly in the old generation, occupying multiple regions
The heap structure in G1 is shown below:
Unlike some other collectors, G1 uses the same algorithm for both young and old generation collection: a moving (evacuating) algorithm. Copying collection, for example, is a moving algorithm; its advantages are that it leaves no fragmentation and that the fewer objects survive, the more efficient the collection is.
Remembered Set
When collecting a region, the live objects in that region would normally be found by tracing from the GC roots. But objects may have been moved to other regions by promotion or evacuation while still holding references into the original region. The objects they reference are not reachable "directly" from the GC roots; they are reachable only through objects in other regions, which themselves are reachable from the roots. To correctly mark objects that are not reachable from the GC roots but are referenced from other regions, you would have to traverse every region, which is far too expensive.
As shown in the figure below, if heap region A is being collected, all surviving objects in region A must be marked. Two objects in region A are referenced from other regions; these two grey question-mark objects are not reachable from the GC roots, yet the objects that reference them are, so the two objects are still alive. If they were not marked at this point, they would be missed and collected by mistake.
The Remembered Set (RSet or RS for short) solves this problem: an RSet records such cross-region (and cross-generation) references. When marking, in addition to traversing from the GC roots, the collector also traverses from the RSet, making sure all surviving objects in the region are marked (this is not unique to G1; other generational collectors, such as CMS, use similar structures).
As shown in the figure below, G1 uses RSets to record cross-region references: each region has an RSet recording the references into it from other regions. During marking, the RSet can then be used as an additional set of roots to traverse from.
So when an object is promoted (or otherwise creates a cross-region reference), that reference is recorded. The container that stores these cross-region references is the RSet, which G1 implements on top of the Card Table.
Note that saying the Card Table "implements" the RSet does not mean the Card Table is the data structure behind the RSet; rather, the RSet stores Card Table indices.
Card Table
The G1 heap has a Card Table, implemented as an array of 1-byte elements; each element is called a card (or card page). The Card Table maps the entire heap, with each card corresponding to 512 bytes of heap space.
As shown in the figure below, with a 1 GB heap the Card Table has 2,097,152 entries (1 GB / 512 B); with 1 MB regions, each region corresponds to 2048 card pages.
To find the card page of an object, only a simple calculation is needed: with 512-byte cards, the card index is the object's offset from the heap base divided by 512.
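A minimal sketch of that calculation, using illustrative names and a simulated address rather than a real pointer:

```java
public class CardIndex {
    static final int CARD_SIZE = 512;                 // bytes of heap covered by one card
    static final long HEAP_BASE = 0x7f0000000000L;    // assumed heap start address (example value)

    // index into the card table of the card covering the given (simulated) object address
    static long cardIndexOf(long objectAddress) {
        return (objectAddress - HEAP_BASE) / CARD_SIZE;   // equivalent to (offset >>> 9)
    }

    public static void main(String[] args) {
        long obj = HEAP_BASE + 1024 * 1024 + 100;     // an object 1 MB + 100 B into the heap
        System.out.println(cardIndexOf(obj));          // 2048: the first card of the second 1 MB region
    }
}
```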
Now that we’ve covered card tables, let’s talk about how RSet and CardTable work together in G1.
Each region has an RSet, implemented as a hash table: the key is the address of another region that references this one, and the value is an array whose elements are the Card Table indices of the card pages containing the referencing objects.
As shown in the figure below, object b in region B references object a in region A, a reference that spans two regions. Object b's card page is number 122, so region A's RSet stores region B's address as the key and the index of object b's card page as the value, thereby recording the cross-region reference.
The granularity of the Card Table is fairly coarse, however: a card page covers 512 bytes and may therefore contain several objects, so when scanning during marking, the whole card page recorded in the RSet has to be scanned.
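Putting the pieces together, a toy model of a per-region RSet as described above might look like the sketch below (illustrative only; G1's real remembered sets use more compact, multi-level structures). In the example from the figure, region A's RSet would map region B's address to card 122.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.BiConsumer;

// Toy remembered set for one region: which card pages in *other* regions
// contain objects that reference this region.
class RememberedSetSketch {
    // key: base address of the referencing region; value: card table indices in that region
    private final Map<Long, Set<Integer>> entries = new HashMap<>();

    // record "an object on card `cardIndex` of region `fromRegionBase` references this region"
    void addReference(long fromRegionBase, int cardIndex) {
        entries.computeIfAbsent(fromRegionBase, k -> new HashSet<>()).add(cardIndex);
    }

    // during a collection, every recorded card page is scanned as an extra root
    void forEachCard(BiConsumer<Long, Integer> scanner) {
        entries.forEach((region, cards) -> cards.forEach(card -> scanner.accept(region, card)));
    }
}
```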
Write barrier
Write barriers are another key GC technique (not to be confused with memory barriers in Linux). When a reference field is updated, a write barrier (here an evacuation write barrier) records the change; it is simply a small function like the following (pseudocode):
    def evacuation_write_barrier(obj, field, newobj):
        # only cross-region references need to be recorded
        if not check_cross_ref(obj, newobj):
            return
        # nothing to do if obj's card is already dirty
        if is_dirty_card(obj):
            return
        to_dirty(obj)
        # add obj's card to the RSet of the region containing newobj
        add_to_rs(obj, newobj)
A few details have been omitted from the pseudocode above so that the core idea stands out.
Note that G1 uses more than one kind of write barrier; there is also the SATB write barrier described later, which I will not go into here.
Generational collection
There are two collection modes in G1:
- Fully young collection (fully young GC)
- Partially young collection (partially young GC), also called mixed collection (Mixed GC)
A fully young GC selects only young generation regions (Eden/Survivor) into the Collection Set (CSet). The process resembles other generational collectors: newly created objects are allocated in Eden regions; during collection, surviving objects are moved to Survivor regions, objects that have reached the promotion age are promoted to old generation regions, and the collected regions are then emptied (although, unlike the classic young generation copying algorithm, there is no swapping between two Survivor spaces).
A young GC selects all young regions into the collection set, but the maximum number of young regions is adjusted after each GC to meet the user's pause time target, so the number of regions collected can vary from one GC to the next.
Here is a simple schematic of a fully young GC: all surviving objects in the selected young regions are moved to Survivor regions, and the selected regions are then emptied.
The above is only a simplified schematic; the young generation collection process is described in more detail below.
Young generation garbage collection (fully young GC)
A young generation collection is triggered when the JVM cannot allocate a new object in an Eden region. It is fully stop-the-world (parts of the work are done in parallel, but pausing and parallelism do not conflict) and is also called an "evacuation pause".
Step 1: choose the collection set. G1 selects as many young regions as it can while respecting the user's GC pause time limit, and all of these young regions form the collection set.
As shown in the figure below, the three young regions A/B/C have all been put into the collection set. Object a in region A and object e in region B are referenced directly from the roots (for simplicity the figure shows references pointing directly at objects; in reality the RSet records the card page containing the object).
Step 2: root scanning. The GC roots are traversed to find objects in the collection set that are directly reachable from the roots; these are moved to Survivor regions, and the objects they reference are pushed onto the mark stack.
As shown in the figure below, the two objects a/e referenced directly by the GC roots are copied to Survivor region M, and the objects on the reference paths from a/e (such as c and f) are added to the mark stack, so object f will be processed as well.
Step 3: scan the RSets. The RSets are traversed as roots to find objects in the collection set that are referenced from other regions; these are moved to Survivor regions and the objects they reference are pushed onto the mark stack.
Before the RSets are scanned there is an Update RS step. RSet maintenance is asynchronous: write barriers record updates into logs, and refinement threads process those logs to keep the RSets up to date. Update RS makes sure any outstanding logs have been processed, so the RSets are complete before they are scanned.
As shown in the figure below, the references from old generation region C into the young regions are recorded in the young regions' RSets. Traversing the RSets finds object d in old region C referencing objects in young regions A and B, and those referenced objects are added to the mark stack.
Step 4: evacuation (object copy). The mark stack is processed and the surviving objects it records are copied to Survivor regions.
As shown in the figure below, the objects c/f/b recorded on the mark stack are moved to the Survivor region.
When an object's age exceeds the promotion threshold, it is moved directly to an old generation region instead of a Survivor region.
After objects are moved, the references pointing to them need to be updated. See my article "Garbage collection algorithm implementation – copying (complete runnable C code)" for details.
All that remains is the finishing work: Redirty (related to concurrent marking, covered below), Clear CT (cleaning the Card Table), Free CSet (freeing the collection set: cleaning up the evacuated regions and returning them to the free list), and so on. These operations are usually very short.
Mixed collection (partial young generation GC)
Mixed collection, also known as partially young GC, selects all young regions (Eden/Survivor, up to the maximum number of young regions) plus some old generation regions into the collection set. Young region objects are moved to Survivor regions, and old region objects are moved to other old regions. Because G1 collects the old generation with the same "moving" approach as the young generation, the collected regions are completely emptied after evacuation, so G1 does not suffer from the fragmentation problem of collectors that use a sweeping algorithm (such as CMS).
Here is a simple schematic of a partial young generation GC process:
The execution process of mixed collection mainly includes two steps:
- Concurrent marking – live objects are marked incrementally and concurrently; reference updates made by the mutator during marking are also recorded
- Evacuation (moving/transfer) – reuses the same evacuation code as the young collection; the main difference is that the results of concurrent marking are also used
Concurrent marking
The purpose of concurrent marking is to mark live objects in preparation for evacuation. During concurrent marking, the marking of live objects runs concurrently with the mutator, so handling references that change while marking is in progress is the most complex part of the process.
G1's concurrent marking design is based on the CMS collector, so the overall process is very similar to concurrent marking in CMS. Concurrent marking marks all live objects in its range, so any unmarked object is garbage to be collected (doing this "concurrently" is what keeps mutator pauses short).
When the memory used by the old generation plus the memory about to be allocated reaches 45% of the total heap (the default threshold), a concurrent marking cycle is started in preparation for mixed collections.
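In other words, the trigger can be thought of as a simple occupancy check. The sketch below uses illustrative names (the real check lives inside the JVM, and the 45% default corresponds to -XX:InitiatingHeapOccupancyPercent):

```java
class IhopCheckSketch {
    // Illustrative only: start a concurrent marking cycle once old-generation usage
    // plus the pending allocation crosses the IHOP threshold (default 45% of the heap).
    static boolean shouldStartConcurrentMark(long oldGenUsedBytes,
                                             long aboutToAllocateBytes,
                                             long totalHeapBytes,
                                             int ihopPercent) {
        long thresholdBytes = totalHeapBytes / 100 * ihopPercent;
        return oldGenUsedBytes + aboutToAllocateBytes >= thresholdBytes;
    }
}
```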
Concurrent marking does not mark objects in place; instead it uses a separate data structure, the mark bitmap, to record marks for all objects in a region. During evacuation, the bitmap is consulted to determine whether an object is alive.
Mark bitmap
Each region has two mark bitmaps: next and prev. next is the bitmap for the marking cycle in progress, while prev is the bitmap of the previous cycle, preserving the last marking result.
An object's address is mapped to a bit in the mark bitmap; each bit represents the mark state of one object, as shown in the figure below:
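In code, that address-to-bit mapping might be sketched like this (illustrative only; it assumes 8-byte object alignment, whereas HotSpot's bitmaps work on heap words and use more careful address arithmetic):

```java
import java.util.BitSet;

// Toy mark bitmap for a single region: one bit per 8-byte-aligned heap address.
class MarkBitmapSketch {
    static final int ALIGNMENT = 8;      // assumed object alignment in bytes
    private final long regionBottom;     // the region's "bottom" address
    private final BitSet bits;

    MarkBitmapSketch(long regionBottom, int regionSizeBytes) {
        this.regionBottom = regionBottom;
        this.bits = new BitSet(regionSizeBytes / ALIGNMENT);
    }

    private int bitIndex(long objectAddress) {
        return (int) ((objectAddress - regionBottom) / ALIGNMENT);
    }

    void mark(long objectAddress)        { bits.set(bitIndex(objectAddress)); }
    boolean isMarked(long objectAddress) { return bits.get(bitIndex(objectAddress)); }
}
```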
Each region also has four pointers that record positions: bottom, top, prevTAMS, and nextTAMS. Because marking runs concurrently, the mutator keeps allocating objects and modifying references while marking is in progress, and the concurrent mark phase itself can be interrupted by young generation GCs and must resume from where it left off afterwards. These pointers can be understood as recorded "change points", a bit like save points in a game.
As shown in the figure below, before a region is marked: bottom is the bottom of the region, top is the top of the region's used memory, and the TAMS (top-at-mark-start) pointers prevTAMS and nextTAMS record where top was when the previous/current marking cycle started.
If more objects are allocated in the region while marking is running, the top pointer moves up; the objects between nextTAMS and top are the ones allocated during marking.
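Combining the bitmap with TAMS, the liveness test used when evacuating can be sketched as: an object is treated as live if it was marked, or if it lies at or above the TAMS pointer (i.e. it was allocated after marking started). A sketch building on the MarkBitmapSketch above (illustrative, not HotSpot's code):

```java
// Illustrative per-region liveness check using the MarkBitmapSketch defined earlier.
class RegionLivenessSketch {
    final MarkBitmapSketch prevBitmap;   // result of the completed marking cycle
    final long prevTams;                 // top-at-mark-start of that cycle

    RegionLivenessSketch(MarkBitmapSketch prevBitmap, long prevTams) {
        this.prevBitmap = prevBitmap;
        this.prevTams = prevTams;
    }

    // objects allocated after marking started (address >= TAMS) are implicitly live;
    // below TAMS, the mark bitmap decides
    boolean isLive(long objectAddress) {
        return objectAddress >= prevTams || prevBitmap.isMarked(objectAddress);
    }
}
```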
The process of concurrent markup is divided into the following steps:
Initial Mark
Marks the objects directly referenced by the roots (STW). This is piggybacked on a young generation GC, although not every young GC performs initial marking.
Concurrent Mark
Using the objects marked in step 1 as roots, reachable objects are traversed and marked. This runs concurrently with the mutator and can be interrupted by a young GC, after which marking resumes.
SATB
SATB (Snapshot At The Beginning) is a technique that treats the reference relationships between objects as they exist at the start of the concurrent marking phase as a logical snapshot.
That explanation is a little abstract. During concurrent marking, the reference relationships that existed at the start of marking are taken as the basis, ignoring the modifications the mutator makes while marking runs concurrently (hence the name "snapshot"): whatever was live at the start is treated as live, and an SATB write barrier records the reference changes made during marking. For a detailed explanation of SATB, see my other article "Some Understanding of SATB".
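Like the evacuation barrier shown earlier, the SATB barrier can be sketched as a pre-write barrier that records the value about to be overwritten, so objects reachable in the snapshot are not lost. The names and structure below are illustrative, not HotSpot's actual code:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal sketch of an SATB pre-write barrier (illustrative names and structure).
class SatbBarrierSketch {
    static class Node { Node next; }

    static volatile boolean concurrentMarkingActive = true;
    static final Deque<Node> satbQueue = new ArrayDeque<>();   // drained later by marking threads

    // used instead of a plain "obj.next = newValue" while marking may be running
    static void writeNext(Node obj, Node newValue) {
        if (concurrentMarkingActive) {
            Node oldValue = obj.next;        // the reference about to be overwritten
            if (oldValue != null) {
                satbQueue.push(oldValue);    // keep the snapshot-time reachable object for marking
            }
        }
        obj.next = newValue;                 // the actual store
    }
}
```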
Final mark (Remark)
Marks the objects that were missed during concurrent marking, mainly by processing the SATB records (STW).
Cleanup
Counts the live objects in each region, reclaims regions that contain no live objects (this is not the real collection phase; only completely empty regions are freed), sorts regions by collection efficiency, and so on (partially STW).
Mixed collection
The mixed collection here refers to the evacuation step of a mixed GC. Once concurrent marking has finished, this step proceeds just like a young generation GC, except that old regions are added to the collection set; live objects are found by traversing from the concurrent marking results, the roots and the RSets.
Full GC
When mixed collections cannot keep up with the allocation rate and the old generation fills up, a Full GC collects the entire heap. Full GC in G1 is single-threaded, fully stop-the-world, and uses a mark-compact algorithm, so it is very expensive.
Control of pause times
Although G1 also pauses the application completely while moving objects, its collection set is variable: only some regions are selected for each collection, and the time each collection takes is controlled by predicting the pause time of each region. In short, one complete GC is broken into several short GCs to reduce pause times, keeping each pause within the user-configured limit (-XX:MaxGCPauseMillis).
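Conceptually, the collection set is assembled by adding candidate regions until the predicted total would exceed the pause budget. The sketch below uses a deliberately naive model with illustrative names; G1's real cost model predicts per-region times from RSet sizes, live bytes and past collection history:

```java
import java.util.ArrayList;
import java.util.List;

class CollectionSetSketch {
    // pick regions (assumed already ordered by collection efficiency) until the
    // predicted pause would exceed the -XX:MaxGCPauseMillis budget
    static List<Integer> chooseCollectionSet(double[] predictedMsPerRegion, double pauseBudgetMs) {
        List<Integer> cset = new ArrayList<>();
        double predictedTotalMs = 0;
        for (int region = 0; region < predictedMsPerRegion.length; region++) {
            if (predictedTotalMs + predictedMsPerRegion[region] > pauseBudgetMs) {
                break;   // adding this region would exceed the pause target
            }
            predictedTotalMs += predictedMsPerRegion[region];
            cset.add(region);
        }
        return cset;
    }
}
```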
Young generation size configuration
To control pause times, the maximum number of young regions is adjusted dynamically. However, if the young generation size is fixed manually (for example via -Xmn, -XX:MaxNewSize or -XX:NewRatio) so that its maximum and minimum are the same, this adjustment is disabled, and pause time control may fail (the young GC selects all young regions, and too many regions make the pause longer).
So try not to set the young generation size explicitly when using G1; let G1 adjust it automatically.
Log interpretation
Young generation GC log (fully young generation)
    // A young-only evacuation pause
    [GC pause (G1 Evacuation Pause) (young), 0.0182341 secs]
       // The parallel phase took 16.7 ms with 8 GC worker threads
       [Parallel Time: 16.7 ms, GC Workers: 8]
          /* Start times of the 8 worker threads: Min is the earliest start,
             Avg the average, Max the latest start */
          [GC Worker Start (ms): Min: 184.2, Avg: 184.7, Max: 186.1, Diff: 1.9]
          /* Root processing time, covering all strong roots. These are split into Java
             roots (Thread, CLDG) and JVM roots (StringTable, Universe, JNI Handles,
             ObjectSynchronizer, FlatProfiler, Management, SystemDictionary, JVMTI) */
          [Ext Root Scanning (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.3]
             [Thread Roots (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
             [StringTable Roots (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.4]
             [Universe Roots (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
             [JNI Handles Roots (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
             [ObjectSynchronizer Roots (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
             [FlatProfiler Roots (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
             [Management Roots (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
             [SystemDictionary Roots (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
             [CLDG Roots (ms): Min: 0.0, Avg: 0.0, Max: 0.3, Diff: 0.3, Sum: 0.3]
             [JVMTI Roots (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
             // CodeCache Roots is an aggregate reported while processing the RSet;
             // it covers the Update RS, Scan RS and Code Root Scanning entries below
             [CodeCache Roots (ms): Min: 0.6, Avg: 2.7, Max: 5.0, Diff: 4.4, Sum: 21.6]
             [CM RefProcessor Roots (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
             [Wait For Strong CLD (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
             [Weak CLD Roots (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
             [SATB Filtering (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
          /* Time the GC worker threads spend updating the RSet. Note that this is not
             the time the concurrent refinement threads spend processing the RSet;
             those are different threads */
          [Update RS (ms): Min: 0.6, Avg: 2.7, Max: 5.0, Diff: 4.4, Sum: 23.15]
             // Number of dirty card queue (DCQ) buffers processed by the GC threads
             [Processed Buffers: Min: 2, Avg: 6.5, Max: 8, Diff: 6, Sum: 52]
          // Time spent scanning the RSets
          [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
          [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.1]
          /* Time to copy all surviving objects to their new regions (objects referenced
             directly by strong roots are copied earlier, during root processing) */
          [Object Copy (ms): Min: 11.3, Avg: 13.3, Max: 14.3, Diff: 3.0]
          // Termination: worker threads repeatedly offer to terminate once out of work
          [Termination (ms): ...]
             [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 8]
          // Other parallel work, usually small JVM bookkeeping
          [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
          // Total parallel time per worker
          [GC Worker Total (ms): Min: 14.7, Avg: 16.1, Max: 16.6, Diff: 1.9, Sum: 128.7]
          // End times of the worker threads
          [GC Worker End (ms): Min: 200.8, Avg: 200.8, Max: 200.8, Diff: 0.0]
       [Code Root Fixup: 0.0 ms]
       [Code Root Purge: 0.0 ms]
       // Clean the card table
       [Clear CT: 0.1 ms]
       [Other: 1.5 ms]
          // Time to choose the collection set (for a young GC this is essentially zero)
          [Choose CSet: 0.1 ms]
          // Reference processing and re-enqueueing
          [Ref Proc: 1.1 ms]
          [Ref Enq: 0.2 ms]
          // Rebuild the RSet entries for the dirtied cards
          [Redirty Cards: 0.1 ms]
             [Parallel Redirty: Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
             [Redirtied Cards: Min: 0, Avg: 3386.1, Max: 8118]
          // Humongous region bookkeeping and reclamation
          [Humongous Register: 0.0 ms]
          [Humongous Reclaim: 0.0 ms]
          // Free the regions of the collection set
          [Free CSet: 0.0 ms]
             [Young Free CSet: 0.0 ms]
             [Non-Young Free CSet: 0.0 ms]
       // Heap usage before -> after the pause
       [Eden: 15.0M(15.0M)->0.0B(21.0M) Survivors: 2048.0K->3072.0K Heap: 23.7M(256.0M)->20.0M(256.0M)]
Old generation garbage collection (partially young / mixed collection) logs
Concurrent mark log
Concurrent marking is global, and the collection happens in two phases, so the concurrent marking log can be read on its own.
    // Concurrent marking: the initial mark phase, piggybacked on a young generation GC
    100.070: [GC pause (G1 Evacuation Pause) (young) (initial-mark), 0.0751469 secs]
       [Parallel Time: ..., GC Workers: 8]
          [GC Worker Start (ms): Min: 100070.4, Avg: 100070.5, Max: 100070.6, Diff: 0.1]
          [Ext Root Scanning (ms): Min: 0.1, Avg: 0.2, Max: 0.3, Diff: 0.2, Sum: 1.6]
          [Update RS (ms): Min: 0.6, Avg: 1.1, Max: 1.5, Diff: 0.9, Sum: 8.1]
          [Scan RS (ms): Min: 1.0, Avg: 1.4, Max: 1.9, Diff: 0.9, Sum: 10.8]
          [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
          [Object Copy (ms): Min: 71.5, Avg: 71.5, Max: 71.6, Diff: 0.1, Sum: 572.1]
          [Termination (ms): Min: 0.3, Avg: 0.3, Max: 0.4, Diff: 0.1, Sum: 2.6]
             [Termination Attempts: Min: 1382, Avg: 1515.5, Max: 1609, Diff: 227, Sum: 12124]
          [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
          [GC Worker Total (ms): Min: 74.5, Avg: 74.5, Max: 74.6, Diff: 0.1, Sum: 596.3]
          [GC Worker End (ms): Min: 100145.1, Avg: 100145.1, Max: 100145.1, Diff: 0.0]
       [Code Root Fixup: 0.0 ms]
       [Code Root Purge: 0.0 ms]
       [Clear CT: 0.1 ms]
       [Other: 0.1 ms]
          [Choose CSet: 0.0 ms]
          [Ref Proc: 0.1 ms]
          [Ref Enq: 0.1 ms]
       [Eden: ... Survivors: 4096.0K->4096.0K Heap: 44.9M(128.0M)->46.9M(128.0M)]
       [Times: user=0.63 sys=0.00, real=0.08 secs]
    // Scan the root regions (the Survivor regions of the initial-mark pause)
    [GC concurrent-root-region-scan-start]
    // Root region scan finished, took 0.0196297 s; it runs concurrently with the mutator, on multiple threads
    100.165: [GC concurrent-root-region-scan-end, 0.0196297 secs]
    // Start the concurrent mark sub-phase: the whole heap is marked from all roots,
    // including the Survivor regions and strong roots such as thread stacks
    100.165: [GC concurrent-mark-start]
    // Marking finished, took 0.08848 s
    100.254: [GC concurrent-mark-end, 0.0884800 secs]
    // Remark sub-phase: finalize marking, reference processing, class unloading
    100.254: [GC remark 100.254: [Finalize Marking, 0.0002228 secs]
             100.254: [GC ref-proc, 0.0001515 secs] ...]
       [Times: user=0.00 sys=0.00, real=0.00 secs]
    // Cleanup sub-phase
    100.255: [GC cleanup 86M->86M(128M), 0.0005360 secs]
       [Times: user=0.00 sys=0.00, real=0.00 secs]
Mixed collection log
    // Evacuation pause (G1 Evacuation Pause) (mixed): young regions plus some old regions are collected
    122.132: [GC pause (G1 Evacuation Pause) (mixed), 0.0106092 secs]
       [Parallel Time: 9.8 ms, GC Workers: 8]
          [GC Worker Start (ms): Min: 122131.9, Avg: 122132.0, Max: 122132.0, Diff: 0.1]
          [Ext Root Scanning (ms): Min: 0.1, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.7]
          [Update RS (ms): Min: 0.5, Avg: 0.7, Max: 0.9, Diff: 0.4]
          [Scan RS (ms): Min: 1.0, Avg: 1.3, Max: 1.5, Diff: 0.5, Sum: 10.4]
          [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
          [Object Copy (ms): ... Sum: 60.9]
          [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
             [Termination Attempts: Min: 92, Avg: 105.1, Max: 121, Diff: 29, Sum: 841]
          [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
          [GC Worker Total (ms): Min: 9.7, Avg: 9.7, Max: 9.8, Diff: 0.1]
          [GC Worker End (ms): Min: 122141.7, Avg: 122141.7, Max: 122141.7, Diff: 0.0]
       [Code Root Fixup: 0.0 ms]
       [Code Root Purge: 0.0 ms]
       [Clear CT: 0.2 ms]
       [Other: 0.2 ms]
          [Choose CSet: 0.0 ms]
          [Ref Proc: 0.1 ms]
          [Ref Enq: 0.0 ms]
          [Humongous Reclaim: 0.0 ms]
          [Free CSet: 0.0 ms]
       [Eden: 3072.0K(3072.0K)->0.0B(5120.0K) Survivors: 3072.0K->1024.0K Heap: 105.5M(128.0M)->104.0M(128.0M)]
       [Times: user=... sys=0.00, real=0.01 secs]
Commonly used parameters
Only the most basic parameters are listed here. In most cases GC tuning only involves memory sizes, ratios and times; scenarios that require tuning the various thread counts are rare. Tuning is more about analysing the GC logs and adjusting the code accordingly, so the default parameters are generally sufficient.
    # enable G1
    -XX:+UseG1GC
    # minimum heap size
    -Xms8G
    # maximum heap size
    -Xmx8G
    # initial metaspace size
    -XX:MetaspaceSize=256M
    # target maximum pause time, default 200 ms
    -XX:MaxGCPauseMillis=200
    # "IHOP" for short, default 45: concurrent marking starts once the old generation
    # occupies 45% of the heap; if set too high, mixed GCs may not keep up with the
    # allocation rate, which can lead to a Full GC
    -XX:InitiatingHeapOccupancyPercent=45
    # let G1 adjust the IHOP value automatically (available from JDK 9)
    -XX:+G1UseAdaptiveIHOP
    # unload classes during the concurrent marking cycle (this work takes time)
    -XX:+ClassUnloadingWithConcurrentMark
    # process java.lang.ref.* references with multiple threads before objects are reclaimed
    -XX:+ParallelRefProcEnabled
Comparison with other collectors
- Compared with Parallel GC, G1 collects incrementally, so its parallel collection pauses are shorter
- Compared with CMS, G1's moving algorithm avoids fragmentation and uses memory more efficiently
Conclusion
G1's full name is Garbage First. What does "garbage first" mean?
During concurrent marking, regions are sorted by the number/size of live objects they contain. When it comes to evacuation, the regions that are cheapest to collect (more garbage, fewer live objects, less to move) are selected into the collection set first, hence the name Garbage First.
However, G1 is not an especially efficient collector in absolute terms. Using a moving (evacuating) algorithm for the old generation avoids fragmentation but is relatively inefficient, because most old generation objects are alive and a large number of objects must be moved in each collection. A sweeping algorithm only deals with dead objects, so purely in terms of throughput, sweeping suits the old generation better.
But because G1 collects incrementally with controlled pauses, every pause stays within the allowed budget, and for most applications pause time matters more than throughput. Combined with G1's many detailed optimizations, its overall efficiency is already very good.
Garbage collection series (full runnable C code)
- Garbage collection algorithm implementation – mark-sweep (complete runnable C code)
- Garbage collection algorithm implementation – reference counting (complete runnable C code)
- Garbage collection algorithm implementation – copying (complete runnable C code)
- Garbage collection algorithm implementation – mark-compact (complete runnable C code)
- Garbage collection algorithm implementation – generational collection (complete runnable C code)
References & Information
- “Deep in the Java Virtual Machine: Algorithms and Implementation of JVM G1GC” by Narayo Nakamura
- Garbage collection algorithm and implementation – Nakamura Narayo, Aikawa Mitsuko, Takeuchi Izuo (author) Ding Ling (translator)
- “JVM G1 source code analysis and tuning” by Peng Chenghan
- Java Performance Companion – Charlie Hunt, Monica Beckwith, Poonam Parhar, Bengt Rutisson
- Garbage-First Garbage Collector – Oracle
- Getting Started with the G1 Garbage Collector – Oracle
- Understanding the JDK’s New Superfast Garbage Collectors
- G1: One Garbage Collector To Rule Them All – InfoQ
- GC Algorithms: Implementations – Plumbr
- JVM Garbage Collectors Benchmarks Report 19.12
- https://www.redhat.com/en/blog/part-1-introduction-g1-garbage-collector
- [HotSpot VM] Consult the G1 algorithm – R大 – https://hllvm-group.iteye.com…
- [HotSpot VM] A Little Understanding on Incremental Update and SATB – R大 – https://hllvm-group.iteye.com…
- Directory of R大's VM posts – iteye
Original writing is not easy; when reprinting, please credit the source and author at the beginning of the article. If this article helped you, please support it with a like or a bookmark. Original link: https://segmentfault.com/a/1190000039411521