Interviewer: How about talking about the G1 garbage collector this time?

Candidate: Uh-huh, all right

Candidate: Last time I mentioned that the downsides of the CMS garbage collector are that it produces memory fragmentation and that it has to keep some heap space in reserve

Candidate: Both issues can lead to long pauses, which is why CMS pause times are “unpredictable”

Candidate: G1 can be thought of as an “upgrade” of the CMS garbage collector

Candidate: With G1 you can specify the Stop The World pause time you would like, and G1 will try its best to meet it
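
The pause-time goal described above is passed to the JVM as a start-up option. As a small sketch, assuming a HotSpot JVM, the program below lists the collectors the running JVM reports; when launched with `-XX:+UseG1GC -XX:MaxGCPauseMillis=200` the names would be expected to mention G1. The class name `GcInfo` and the 200 ms value are made up for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
// Example launch (HotSpot flags): java -XX:+UseG1GC -XX:MaxGCPauseMillis=200 GcInfo
final class GcInfo {
    // Returns the names of the garbage collectors the running JVM reports.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // MaxGCPauseMillis is a goal, not a guarantee: G1 only tries to meet it.
        System.out.println(collectorNames());
    }
}
```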

Candidate: Earlier, when I introduced the JVM heap, I drew a diagram in which the generations of the heap were separated “physically”

Candidate: In the world of the G1 garbage collector, the heap is no longer divided “physically” but “logically”

Candidate: However, the “generational” concept mentioned earlier still applies in G1

Candidate: For example, new objects are allocated in Eden, objects that survive 15 Minor GCs (the default threshold) are promoted to the old generation, and so on

Candidate: Let me draw how the “heap” is laid out under G1

Candidate: As you can see from the diagram, the heap is divided into many equally sized areas, each of which is called a Region in G1

Candidate: Old generation, young generation, Survivor: I don’t need to say more about those, do I? The rules are the same as in CMS

Candidate: G1 also has Humongous Regions, which store very large objects (objects larger than half of a Region)

Candidate: Once no reference points to a humongous object any more, it can be reclaimed directly during a young-generation Minor GC
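
The Region layout and the humongous rule above can be sketched as a toy model. This is only an illustration under stated assumptions: the names `RegionSketch`, `isHumongous` and `regionsNeeded` are invented here, the 1 MiB Region size is an example value, and the "larger than half a Region" test simply follows the wording of the interview.

```java
// Toy model of G1's logical heap layout; names are illustrative, not HotSpot's.
final class RegionSketch {
    // Each Region plays exactly one role at a time.
    enum RegionType { FREE, EDEN, SURVIVOR, OLD, HUMONGOUS }

    // An object counts as humongous when it is larger than half a Region
    // (the rule as stated in the interview).
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    // Number of Regions an object of this size would occupy, rounded up.
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    public static void main(String[] args) {
        long regionBytes = 1L << 20; // assume 1 MiB Regions for the example
        System.out.println(isHumongous(600 * 1024, regionBytes));
        System.out.println(regionsNeeded(2_500_000, regionBytes));
    }
}
```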

Interviewer: HMM…

Candidate: In fact, if you think about it a little, you can see why the heap is “subdivided” into many small areas

Candidate: Earlier garbage collectors partitioned the heap “physically”

Candidate: If the heap is large, each “garbage collection” has to process a large area, so the collection time is hard to control

Candidate: With many small Regions, it is much easier to control how long collecting a few of these “small areas” takes

Interviewer: HMM…

Interviewer: I get the idea. Why don’t you tell us about its GC process?

Candidate: Well, G1 has Minor GC (also called Young GC) and Mixed GC, and in some special cases a Full GC may also occur

Candidate: Shall I start with Minor GC?

Interviewer: Well, go ahead

Candidate: G1’s Minor GC is triggered under the same condition as in the garbage collectors mentioned earlier

Candidate: Minor GC is triggered once Eden is full, and Minor GC also Stops The World

Candidate: One caveat: in G1, the share of the heap taken by the young and old generations is not fixed (it is adjusted dynamically according to the “maximum pause time”)

Candidate: It is worth knowing that G1 gives us parameters to configure this

Candidate: So by dynamically changing the number of young-generation Regions, G1 can “control” the cost of Minor GC

Interviewer: Well, what about the Minor GC collection process? Can you elaborate a little?

Candidate: I think it can be broken down into three simple steps: scanning the roots, updating and processing the RSets, and copying objects

Candidate: The first step should be easy to understand, because it is similar to CMS and can be seen as the initial marking process

Candidate: The second step involves the concept of the “RSet”

Interviewer: HMM…

Candidate: Last time, when we went through the CMS collection process, we also talked about Minor GC, which uses a “card table” to avoid scanning every object in the old generation

Candidate: Minor GC collects young-generation objects, but if an old-generation object references a young-generation object, the referenced young object must not be collected

Candidate: G1 has the same problem (it is still a Minor GC, after all). CMS relies on the card table, while the structure G1 uses to solve “cross-generation references” is usually called an RSet (Remembered Set)

Candidate: Just remember that every Region has its own RSet, which records the references from other Regions into the current Region

Candidate: For a young-generation Region, its RSet only holds references from the old generation

Candidate: For an old-generation Region, its RSet likewise only holds references from the old generation. (In G1 the young generation is always collected whenever old-generation Regions are collected, so references coming from the young generation do not need to be recorded.)
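
The RSet rules just described can be sketched as a toy data structure. This is a simplified illustration, not how HotSpot stores remembered sets: the class `RSetSketch`, its nested `Region` type and the `recordReference` method are all invented for this example, and the set tracks whole Regions rather than finer-grained locations.

```java
import java.util.HashSet;
import java.util.Set;

// Toy remembered-set model; class and method names are made up for illustration.
final class RSetSketch {
    static final class Region {
        final String name;
        final boolean young;
        // RSet: which other Regions hold references into this Region.
        final Set<Region> rset = new HashSet<>();

        Region(String name, boolean young) {
            this.name = name;
            this.young = young;
        }
    }

    // Called when an object in `from` starts referencing an object in `to`.
    static void recordReference(Region from, Region to) {
        if (from == to) return;  // references inside one Region need no entry
        if (from.young) return;  // young Regions are always collected, so skip them
        to.rset.add(from);       // old -> young and old -> old are remembered
    }

    public static void main(String[] args) {
        Region eden = new Region("eden-1", true);
        Region oldA = new Region("old-A", false);
        Region oldB = new Region("old-B", false);
        recordReference(oldA, eden); // recorded in eden's RSet
        recordReference(eden, oldB); // ignored: the source Region is young
        recordReference(oldA, oldB); // recorded in oldB's RSet
        System.out.println(eden.rset.size() + " " + oldB.rset.size());
    }
}
```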

Interviewer: HMM…

Candidate: So the RSet part of the second step should be easy to understand now, right?

Candidate: It simply processes and scans the RSet information, adding the old-generation objects that reference young-generation objects to the GC Roots so that the young objects they reference are not collected

Candidate: The third step is also easy to understand: copy the surviving objects found by the scan to an “empty Survivor Region” or to the “old generation”, and then clear the Eden Regions

Candidate: I should mention here that G1 has another term, the CSet

Candidate: Its full name is Collection Set, and it holds the Regions that will be collected in a given GC. All live objects in the CSet are moved to other available Regions

Candidate: At the end of the Minor GC, soft references, weak references, JNI weak references, and so on are processed, which completes the collection
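
The copying step of the Minor GC can be sketched as follows. This is a minimal toy under stated assumptions: reachability is taken as already known from the first two steps, the threshold of 15 is the default quoted earlier in the interview, promotion happens when the age reaches that threshold, and Survivor overflow is ignored. `YoungCopySketch`, `Obj`, `Result` and `evacuate` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Toy version of the copying step; not how HotSpot implements evacuation.
final class YoungCopySketch {
    static final int TENURING_THRESHOLD = 15; // default quoted in the interview

    static final class Obj {
        final String name;
        int age;                 // number of Minor GCs survived so far
        final boolean reachable; // assumed known from the root and RSet scans

        Obj(String name, int age, boolean reachable) {
            this.name = name;
            this.age = age;
            this.reachable = reachable;
        }
    }

    static final class Result {
        final List<Obj> survivor = new ArrayList<>();
        final List<Obj> old = new ArrayList<>();
    }

    // Copies live objects out of the collected Regions, then frees those Regions.
    static Result evacuate(List<Obj> collectionSet) {
        Result result = new Result();
        for (Obj o : collectionSet) {
            if (!o.reachable) continue; // garbage is simply not copied
            o.age++;
            if (o.age >= TENURING_THRESHOLD) {
                result.old.add(o);      // promoted to the old generation
            } else {
                result.survivor.add(o); // copied to an empty Survivor Region
            }
        }
        collectionSet.clear();          // the source Regions become free
        return result;
    }

    public static void main(String[] args) {
        List<Obj> cset = new ArrayList<>();
        cset.add(new Obj("a", 0, true));
        cset.add(new Obj("b", 14, true));
        cset.add(new Obj("c", 3, false));
        Result r = evacuate(cset);
        System.out.println(r.survivor.size() + " " + r.old.size() + " " + cset.size());
    }
}
```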

Interviewer: Well, I see. It’s not difficult

Interviewer: I remember you mentioned Mixed GC before. Why don’t you talk about this process?

Candidate: Ok, no problem.

Candidate: Mixed GC is triggered when heap usage reaches a certain threshold (45% by default, configurable via a parameter)
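
The threshold check can be sketched in a few lines. On HotSpot the corresponding option is `-XX:InitiatingHeapOccupancyPercent`; the class and method names below are invented, and the real collector's occupancy accounting is more involved than this single comparison.

```java
// Toy threshold check; only the 45% default comes from the interview.
final class MixedGcTriggerSketch {
    static final int DEFAULT_THRESHOLD_PERCENT = 45;

    // True once heap occupancy has reached the given percentage of capacity.
    static boolean shouldStartMarking(long usedBytes, long capacityBytes, int thresholdPercent) {
        // Compare used * 100 with capacity * percent to stay in integer arithmetic.
        return usedBytes * 100 >= capacityBytes * thresholdPercent;
    }

    public static void main(String[] args) {
        long capacity = 4L << 30; // assume a 4 GiB heap for the example
        long used = 2L << 30;     // 2 GiB in use, i.e. 50%
        System.out.println(shouldStartMarking(used, capacity, DEFAULT_THRESHOLD_PERCENT));
    }
}
```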

Candidate: Mixed GC relies on the per-Region statistics gathered by “global concurrent marking”

Candidate: The “global concurrent marking” process is very similar to CMS. The steps are: initial marking (STW), concurrent marking, final marking (STW), and cleanup (STW)

Interviewer: It does. Why don’t you go on and talk about the specific process?

Candidate: Mm, Mixed GC is called “mixed” because it always collects the young generation and also collects part of the old generation

Candidate: First comes “initial marking”, which “shares” the Stop The World pause of a Minor GC (a Mixed GC always involves a Minor GC) and reuses its “scan GC Roots” operation

Candidate: In this step, both the young generation and the old generation are scanned

Candidate: Overall, initial marking is fairly fast, since it does not trace down through the object graph

Interviewer:…

Candidate: Next comes “concurrent marking”, which does not Stop The World

Candidate: The GC threads run alongside the user threads and collect information about the live objects in each Region

Candidate: Tracing down from the GC Roots to find the live objects across the whole heap takes a relatively long time

Interviewer: HMM…

Candidate: Next comes the “remark” phase which, as in CMS, marks the objects whose references changed during the “concurrent marking” phase

Candidate: Simple, right?

Interviewer: Wait a minute

Interviewer: In CMS, the “remark” phase has to rescan all thread stacks and the whole young generation as roots

Interviewer: As far as I know, G1 doesn’t work like that. Do you know about this?

Candidate: Right, G1 doesn’t work that way. G1 uses the SATB (snapshot-at-the-beginning) algorithm to handle references that change during the “concurrent marking” phase

Candidate: Simply put: at the start of marking, it takes a “snapshot” of the live objects

Candidate: During the concurrent phase, whenever a reference is changed, the old value of that reference is recorded

Candidate: Then in the “remark” phase only the recorded “changed” references are scanned to check whether the objects are still alive, and they are added to the “GC Roots”

Candidate: SATB does have a small drawback, however: if an object is considered alive at the start of marking, it will not be collected in this GC, even if it became garbage during the “concurrent phase”

Candidate: So G1 can have floating garbage too
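
The SATB idea described above can be sketched with a toy object graph. This is a minimal single-threaded illustration, not G1's implementation: `SatbSketch`, `Node`, `write`, `markFrom` and `remark` are invented names, each object has only one reference field, and the "concurrent" interleaving is simulated by calling the methods in a fixed order.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy snapshot-at-the-beginning marking; all names are hypothetical.
final class SatbSketch {
    static final class Node {
        final String name;
        Node ref;       // a single reference field keeps the toy small
        boolean marked;

        Node(String name) { this.name = name; }
    }

    private boolean markingActive;
    private final Deque<Node> satbLog = new ArrayDeque<>();

    void startMarking() { markingActive = true; }

    // Every reference store goes through here (a stand-in for a write barrier).
    void write(Node holder, Node newValue) {
        Node old = holder.ref;
        if (markingActive && old != null) {
            satbLog.add(old); // remember what the snapshot could still reach
        }
        holder.ref = newValue;
    }

    // Marks everything reachable from `start` along the single ref field.
    static void markFrom(Node start) {
        for (Node n = start; n != null && !n.marked; n = n.ref) {
            n.marked = true;
        }
    }

    // "Remark": only the logged old values are rescanned.
    void remark() {
        while (!satbLog.isEmpty()) {
            markFrom(satbLog.poll());
        }
        markingActive = false;
    }

    public static void main(String[] args) {
        Node root1 = new Node("root1"), root2 = new Node("root2");
        Node a = new Node("a"), b = new Node("b");
        root1.ref = a;
        a.ref = b;              // graph at snapshot time: root1 -> a -> b
        SatbSketch gc = new SatbSketch();
        gc.startMarking();
        markFrom(root2);        // marker has finished root2, which has no reference yet
        gc.write(root2, b);     // user thread: the already-marked root2 now points to b
        gc.write(a, null);      // user thread: the old path to b is cut, b is logged
        markFrom(root1);        // marker reaches root1 and a, but never sees b
        gc.remark();            // b is marked from the log, so it is not lost
        System.out.println(b.marked);
    }
}
```

The same log is also what produces floating garbage: an object whose last reference is overwritten during marking is still logged and marked, so it survives until a later cycle.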

Candidate: But overall this is not a problem for G1 (after all, it does not try to clean up all the garbage in one go; it focuses on the Stop The World pause time)

Interviewer: HMM…

Candidate: The last phase is “cleanup”, which also Stops The World. It mainly does accounting and resets the marking state

Candidate: G1 then decides how many Regions to reclaim in the GC based on its “pause prediction model” (which essentially follows the configured pause time)

Candidate: Generally, a Mixed GC selects all young-generation Regions plus some old-generation Regions with high collection value (high value simply means they contain a lot of garbage)

Candidate: The actual reclamation in a Mixed GC is done by “copying”

Candidate: So not all garbage is collected at once; G1 selects Regions according to the pause time
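
The Region selection just described can be sketched as a greedy choice under a time budget. This is an assumption-laden toy, not G1's actual prediction model: the per-Region cost estimates are given as inputs, old Regions are ranked purely by garbage ratio, and selection stops at the first Region that no longer fits. `CSetSketch`, `Region` and `chooseCollectionSet` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy collection-set selection for a Mixed GC.
final class CSetSketch {
    static final class Region {
        final String name;
        final boolean young;
        final double garbageRatio;    // fraction of the Region that is garbage
        final double predictedCostMs; // assumed estimate of the time to evacuate it

        Region(String name, boolean young, double garbageRatio, double predictedCostMs) {
            this.name = name;
            this.young = young;
            this.garbageRatio = garbageRatio;
            this.predictedCostMs = predictedCostMs;
        }
    }

    static List<Region> chooseCollectionSet(List<Region> regions, double pauseGoalMs) {
        List<Region> cset = new ArrayList<>();
        List<Region> oldCandidates = new ArrayList<>();
        double budget = pauseGoalMs;
        for (Region r : regions) {
            if (r.young) {
                cset.add(r);              // all young Regions are always collected
                budget -= r.predictedCostMs;
            } else {
                oldCandidates.add(r);
            }
        }
        // Old Regions with the most garbage come first.
        oldCandidates.sort(Comparator.comparingDouble((Region r) -> r.garbageRatio).reversed());
        for (Region r : oldCandidates) {
            if (r.predictedCostMs > budget) break; // pause goal would be exceeded
            cset.add(r);
            budget -= r.predictedCostMs;
        }
        return cset;
    }

    public static void main(String[] args) {
        List<Region> heap = new ArrayList<>();
        heap.add(new Region("young-1", true, 0.95, 20));
        heap.add(new Region("young-2", true, 0.90, 30));
        heap.add(new Region("old-1", false, 0.9, 40));
        heap.add(new Region("old-2", false, 0.5, 30));
        heap.add(new Region("old-3", false, 0.8, 50));
        for (Region r : chooseCollectionSet(heap, 150)) {
            System.out.println(r.name);
        }
    }
}
```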

Interviewer: Well, I have a general idea of the process

Interviewer: So when does G1 do a Full GC?

Candidate: If Mixed GC cannot keep up with the rate at which user threads allocate memory, so that the old generation fills up and the Mixed GC cannot continue, G1 falls back to a Serial Old GC (a Full GC) that collects the entire heap

Candidate: However, this happens less often than with CMS, because G1 does not suffer from CMS’s memory fragmentation problem

Summary of this article (G1 garbage collector features):

  • The heap moves from “physical” generations to “logical” generations: heap memory is “logically” divided into many Regions
  • A CSet holds the set of Regions that will be collected
  • RSets handle cross-generation references (note: RSets do not record references coming from the young generation)
  • G1 collections fall into Minor GC, Mixed GC and Full GC
  • The Minor GC collection process (STW) consists of: scanning GC Roots, updating and processing RSets, and copying and clearing
  • Mixed GC relies on “global concurrent marking” to work out the CSet (the Regions to collect) and then performs a “copying cleanup”
  • Viewed at a high level, G1 works in two parts: “global concurrent marking” and “copying live objects”
  • The SATB algorithm handles object references being modified during the “concurrent marking” phase
  • G1 provides a pause-time parameter for users to set (G1 tries to meet it by adjusting the number of Regions reclaimed in each GC)

Follow my WeChat official account [Java3y] for more on Java interviews; the online interview series is updated continuously!
