Garbage-First (G1) Collector

G1 differs in a few ways from the garbage collectors covered earlier.

  • First, the heap is divided differently.
    • Other collectors physically divide memory into fixed-size areas: the young generation (Eden and two Survivor spaces) and the old generation.
    • In G1 the generations are logical: the heap is split into regions, and a generation is not a physically contiguous block of memory.

Some of the concepts

Throughput

  • Throughput is about maximizing the amount of work an application completes in a given amount of time.
  • The throughput of a system can be measured as, for example:
    • The number of transactions (or tasks, requests) completed in a given time, such as transactions per second (TPS)
    • The number of queries a database can complete in an hour
  • Pauses (STW) are acceptable for throughput-oriented systems, because they care about executing a large number of tasks over a long period; the response time of a single request is not the main concern.

Responsiveness

  • Responsiveness refers to how promptly a program or system responds to a request, for example:
    • How fast a desktop UI responds to an event
    • How fast a web site returns a requested page
    • How fast a database returns query results
  • For such responsiveness-sensitive scenarios, long pauses are unacceptable.
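The throughput side of this trade-off can be estimated from the JVM's own counters. The following is a minimal sketch, assuming the standard `java.lang.management` beans; the class and method names (`GcThroughput`, `throughput`, `totalGcMillis`) are illustrative, and the collection time the beans report is only approximate.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcThroughput {
    /** Fraction of elapsed time NOT spent in GC (1.0 means no GC time at all). */
    public static double throughput(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 1.0; // nothing measured yet
        }
        return 1.0 - (double) gcMillis / uptimeMillis;
    }

    /** Sum of the approximate accumulated collection time over all collectors. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if undefined for this collector
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC throughput: %.4f%n", throughput(totalGcMillis(), uptime));
    }
}
```

A high value here says nothing about responsiveness: a system can spend very little total time in GC and still suffer from a few long individual pauses.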

Introduction of G1

Applicable scenario

  • The G1 collector is a server-style garbage collector, aimed at machines with multi-core processors and large memory.
  • It tries to meet short GC pause-time goals while still achieving high throughput.
    • High responsiveness and high throughput.
  • Available in JDK 7 and later.

Design goals for the G1 collector

  • Work concurrently with application threads, so there is little need to stop the world (similar to CMS).
  • Compact free space so that memory fragmentation does not build up (CMS can only compact during a stop-the-world Full GC).
  • Make GC pauses more predictable.
    • With CMS, the length of the STW pause during a Full GC cannot be estimated.
    • G1 lets you set a maximum pause (STW) time goal, which it will try to meet.
    • Thanks to G1's region-based memory layout, a collection does not have to cover the whole heap the way a Full GC does.
  • Do not sacrifice system throughput.
  • Do not require much additional heap space (CMS needs spare space to hold floating garbage).
    • Because CMS runs concurrently with user threads, objects that have already been marked as live can become unreachable while the cycle is still running. This floating garbage is not reclaimed until the next cycle.
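The pause-time goal is set when the JVM is launched. As a hedged illustration, the sketch below uses the standard `java.lang.management` API to list the collectors the running JVM actually uses; the class name `ActiveCollectors` is made up for this example, and the launch line in the comment uses the HotSpot flags `-XX:+UseG1GC` and `-XX:MaxGCPauseMillis` (200 ms is the documented default goal).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Example launch (HotSpot): java -XX:+UseG1GC -XX:MaxGCPauseMillis=200 ActiveCollectors
public class ActiveCollectors {
    /** Names of the garbage collectors registered in this JVM. */
    public static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : names()) {
            System.out.println(name);
        }
    }
}
```

When G1 is active, the reported names typically start with "G1". The pause-time value is a goal that the collector tries to meet, not a hard guarantee.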

G1 was designed to replace CMS

  • G1 makes up for some of CMS's weaknesses. For example, CMS uses the Mark-Sweep algorithm, which naturally produces memory fragmentation.
  • G1, on the other hand, is based on a copying algorithm that compacts the remaining memory efficiently, so it does not have to keep track of scattered free fragments.
  • In addition, G1 provides more ways to control GC pause times.
  • Depending on the workload, however, CMS can sometimes perform better than G1.

G1 collector heap structure

  • The heap is divided into equal-sized regions that need not be contiguous, and each region has a generational role: Eden, Survivor, or Old.
  • There is no fixed limit on the number of regions in each role, so the size of each generation can change dynamically. G1's most important feature is that it reclaims memory efficiently by prioritizing the regions with the most reclaimable space.
  • G1 uses a pause prediction model to meet the user-defined GC pause-time goal. Based on that goal, G1 automatically chooses which regions to collect and how many to collect at a time.
  • G1 copies live objects from several regions into a single region, compacting and freeing memory in the same step.
    • This is similar to copying the live contents of Eden into a Survivor space.
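The idea of choosing regions to fit a pause-time goal can be sketched with a toy model. This is not HotSpot's actual selection logic: the `Region` class, its `predictedMs` cost, and the greedy rule (most garbage per predicted millisecond first, stop when the next region no longer fits) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class CSetSketch {
    /** Toy region: how much space it would free and a hypothetical evacuation cost. */
    public static final class Region {
        final int id;
        final long garbageBytes;  // reclaimable space in this region
        final double predictedMs; // assumed predicted time to evacuate it

        public Region(int id, long garbageBytes, double predictedMs) {
            this.id = id;
            this.garbageBytes = garbageBytes;
            this.predictedMs = predictedMs;
        }

        double efficiency() {
            return garbageBytes / predictedMs;
        }
    }

    /** Picks region ids, most efficient first, until the next one would exceed the goal. */
    public static List<Integer> select(List<Region> candidates, double pauseGoalMs) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort((a, b) -> Double.compare(b.efficiency(), a.efficiency()));
        List<Integer> chosen = new ArrayList<>();
        double remainingMs = pauseGoalMs;
        for (Region r : sorted) {
            if (r.predictedMs > remainingMs) {
                break; // adding this region would overshoot the pause goal
            }
            chosen.add(r.id);
            remainingMs -= r.predictedMs;
        }
        return chosen;
    }
}
```

With four candidate regions costing 10, 2, 8 and 4 ms and a 15 ms goal, only the regions that free the most space per millisecond are collected in that pause; the rest wait for a later one.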

G1 vs CMS

  • In contrast to CMS, which uses Mark-Sweep, G1 uses a copying algorithm and does not cause memory fragmentation.
  • Compared with the Parallel Scavenge and Parallel Old collectors, which tend to compact the whole heap and therefore incur long GC pauses, G1 compacts only a selected few regions at a time.
  • G1 is not a real-time collector. As with Parallel Scavenge, the configured GC pause time is a goal rather than a strict limit, but G1 has a high probability of staying within it.
  • Unlike the earlier collectors, G1 evaluates which regions to collect based on the GC pause time specified by the user, so that it can meet the user's requirements.

Partition (Region)

  • To solve the fragmentation and uncontrolled pause times of the parallel, serial, and CMS collectors, G1 takes a different approach: it divides the entire heap into regions of the same size.

Generations in G1

  • Each region can belong to the young generation or the old generation, but only to one generation at a time.
  • The concepts of young generation, survivor space, and old generation still exist, but as logical concepts. This makes it easy to reuse the logic of the earlier generational framework.
  • A further benefit is that generations need not be physically contiguous. Some regions contain a lot of garbage and others very little, so G1 collects the regions with the most garbage first and spends less time reclaiming space. This is where the name Garbage First comes from.
  • When the young generation fills up, the whole young generation is collected: every object in it is either reclaimed or promoted. The young generation is also divided into regions so that it follows the same strategy as the old generation and its size is easy to adjust.
  • G1 is also a compacting collector: it reclaims old-generation regions by copying surviving objects from one region to another, which compacts memory locally.

CSet

  • Collection set (CSet): the set of regions to be reclaimed in a collection. Data that is still live in the CSet is moved to another available region during GC. Regions in a CSet can come from Eden, Survivor, or the old generation.
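Evacuating a collection set can be pictured with a small toy model. This is a sketch under simplifying assumptions, not how HotSpot represents regions: `Role`, `Obj`, `Region`, and `evacuate` are hypothetical names, and liveness is given as a flag instead of being computed by marking.

```java
import java.util.ArrayList;
import java.util.List;

public class EvacuationSketch {
    public enum Role { FREE, EDEN, SURVIVOR, OLD }

    public static final class Obj {
        final String name;
        final boolean live; // assumed to be known already
        public Obj(String name, boolean live) {
            this.name = name;
            this.live = live;
        }
    }

    public static final class Region {
        public Role role;
        public final List<Obj> objects = new ArrayList<>();
        public Region(Role role) {
            this.role = role;
        }
    }

    /** Copies live objects from every CSet region into target, then frees the CSet regions. */
    public static int evacuate(List<Region> cset, Region target, Role targetRole) {
        int copied = 0;
        target.role = targetRole; // a free region takes on whatever role is needed
        for (Region from : cset) {
            for (Obj o : from.objects) {
                if (o.live) {
                    target.objects.add(o);
                    copied++;
                }
            }
            from.objects.clear();  // garbage disappears with the region
            from.role = Role.FREE; // the whole region becomes reusable
        }
        return copied;
    }
}
```

Because survivors end up packed together in the target region and the source regions are released whole, no fragmented free space is left behind.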

RSet

  • Remembered set (RSet): each region's RSet records the references from objects in other regions to objects in this region. It is a points-into structure (who references my objects). The value of the RSet is that the garbage collector does not need to scan the entire heap to find out who references objects in the current region; it only needs to scan the RSet.
    • What is recorded is the relationship: who references my region.
    • It takes extra space.

  • G1 GC adds a layer of structure on top of the points-out card table to form the points-into RSet: each region records which other regions hold pointers into it, and which cards those pointers fall in.
  • The RSet is in effect a hash table. The key is the start address of another region, and the value is a collection whose elements are card-table indices.
    • For example, if region A's RSet has region B as a key and the value contains the card with index 1234, then a card of region B holds a reference into region A. For region A, therefore, the RSet records the points-into relationship, while the card table still records the points-out relationship.
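The hash-table shape described above can be sketched in a few lines. This is an illustrative model only: it keys the map by a region id instead of a region start address, and the 512-byte card size (`CARD_SHIFT = 9`) is an assumption based on the common HotSpot default. The names `RSetSketch`, `RememberedSet`, `recordReference`, and `cardsFrom` are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class RSetSketch {
    /** Assumed card size of 512 bytes, so a card index is the heap offset shifted by 9. */
    static final int CARD_SHIFT = 9;

    public static int cardIndex(long offsetInHeap) {
        return (int) (offsetInHeap >>> CARD_SHIFT);
    }

    /** Points-into remembered set of ONE region: referencing region -> card indices. */
    public static final class RememberedSet {
        private final Map<Integer, Set<Integer>> cardsByRegion = new HashMap<>();

        public RememberedSet() {
        }

        /** Records that a card in fromRegion holds a reference into the owning region. */
        public void recordReference(int fromRegion, int card) {
            cardsByRegion.computeIfAbsent(fromRegion, k -> new TreeSet<>()).add(card);
        }

        /** Cards of fromRegion that must be scanned when the owning region is collected. */
        public Set<Integer> cardsFrom(int fromRegion) {
            return cardsByRegion.getOrDefault(fromRegion, Collections.<Integer>emptySet());
        }
    }
}
```

To collect region A, the collector in this model scans only the cards listed in A's `RememberedSet` instead of walking every other region.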

SATB

  • Snapshot-at-the-beginning (SATB): an incremental marking algorithm used by G1 GC in the concurrent marking phase.
  • Concurrent marking is multithreaded, but each concurrent marking thread scans only one region at a time.
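The core idea of SATB is that anything reachable when marking starts is treated as live, even if the application rewires references while marking is in progress. A common way to achieve this is a pre-write barrier that logs the old value of a reference before it is overwritten. The sketch below is a single-threaded toy model of that idea with hypothetical names (`SatbSketch`, `Node`, `write`, `step`), not G1's implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

public class SatbSketch {
    public static final class Node {
        final String name;
        public Node ref; // a single reference field, for brevity
        public Node(String name) {
            this.name = name;
        }
    }

    private final Set<Node> marked = new HashSet<>();
    private final Deque<Node> markStack = new ArrayDeque<>();
    private final Deque<Node> satbQueue = new ArrayDeque<>();
    private boolean marking;

    public void startMarking(Node root) {
        marking = true;
        markAndPush(root);
    }

    /** Application store with a pre-write barrier: log the OLD value while marking. */
    public void write(Node holder, Node newValue) {
        if (marking && holder.ref != null) {
            satbQueue.add(holder.ref);
        }
        holder.ref = newValue;
    }

    /** One marking step: scan a single object. Returns false when the stack is empty. */
    public boolean step() {
        Node n = markStack.poll();
        if (n == null) {
            return false;
        }
        if (n.ref != null) {
            markAndPush(n.ref);
        }
        return true;
    }

    /** Drains the logged old values, then finishes tracing. */
    public void finishMarking() {
        Node n;
        while ((n = satbQueue.poll()) != null) {
            markAndPush(n);
        }
        while (step()) {
            // keep scanning until nothing is left
        }
        marking = false;
    }

    public boolean isMarked(Node n) {
        return marked.contains(n);
    }

    private void markAndPush(Node n) {
        if (marked.add(n)) {
            markStack.push(n);
        }
    }
}
```

Consider a chain a → b → c where `a` has already been scanned. If the application then stores `c` into `a` and clears `b.ref`, a marker without the barrier would never visit `c`. Here the old value `c` is logged when `b.ref` is overwritten, so it is still marked.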

Summary of concepts

  • Region
  • CSet
  • RSet
  • SATB
  • SATB

For more authoritative and accurate information on G1, read the official documentation

  • Getting Started with the G1 Garbage Collector (oracle.com)

Advantages of G1 over CMS

Summary of advantages

  • G1 has the advantage of compacting space.
  • G1 avoids memory fragmentation by dividing the memory space into regions.
  • The Eden, Survivor, and Old areas are no longer fixed in size, which makes memory use more flexible and efficient.
  • G1 controls garbage collection time through a configurable pause-time goal, which helps to avoid an application avalanche (cascading failure).
  • G1 compacts free memory as part of reclaiming it, whereas CMS by default compacts free memory only during a stop-the-world (STW) collection.
  • G1 also collects the young generation, whereas CMS can only be used for the old generation.
    • The young generation also uses the copying algorithm.

Summary of G1 application scenarios

  • The server has multi-core CPUs and the JVM uses a large amount of memory.
  • The application generates a lot of memory fragmentation while running, so space needs to be compacted frequently.
  • You want more controlled, predictable GC pauses.
  • You want to prevent avalanches in high-concurrency applications.

G1 garbage collection steps

  • Some Key Technologies for Java HotSpot G1 GC – Meituan.com