If mark-sweep, mark-compact, copying, and generational collection are the methodologies of garbage collection in the Java virtual machine, then the garbage collectors are the concrete implementations of those methodologies.

Interviewers rarely go this deep, but studying the collectors is a good way to understand the algorithms above. Being able to walk through all of them is a plus, although the interviewer usually does not have much time for it.

Concepts to prepare

Before learning about the Java garbage collectors, you need some background. First, if you have not read the earlier article in this series, Interviewer: Stop asking me about the “Java GC Garbage Collection mechanism”, read it to learn the basic algorithms.

Here are a few concepts to help you along: Stop The World, Safepoint, Safe Region.

To guarantee the accuracy of the reachability analysis, all Java threads must be suspended while the analysis runs. Sun calls this event “Stop The World” (STW).

So when can a thread pause? A thread cannot stop for GC at an arbitrary point in its execution; it can pause only when it reaches specific positions, known as safepoints.

There should not be too few safepoints, or the GC waits too long for threads to reach one, and not too many, or the frequent checks add overhead at run time.

Safepoints are therefore chosen mainly by whether the code at that position can keep the program running for a long time. Typical choices are the end of a loop, the point before a method returns or after a call instruction, and places where an exception may be thrown.

HotSpot uses voluntary suspension: at run time each executing thread polls a flag that the GC sets when it needs a pause, and the thread suspends itself at the next safepoint if the flag is set.
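
The polling idea can be illustrated with an ordinary Java program. The sketch below is a simulation only, not how HotSpot is implemented: the class `SafepointSketch` and its methods `poll` and `stopTheWorld` are hypothetical names, and the "safepoint" is simply the end of each loop iteration in the worker threads.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.LockSupport;

class SafepointSketch {
    private final Object lock = new Object();
    private volatile boolean pauseRequested = false; // set by the "GC" thread
    private int parked = 0;                          // guarded by lock
    private final int workers;

    SafepointSketch(int workers) { this.workers = workers; }

    /** Called by workers at their "safepoints". */
    void poll() throws InterruptedException {
        if (!pauseRequested) return;       // fast path: a single volatile read
        synchronized (lock) {
            parked++;
            lock.notifyAll();              // let the collector count this thread
            while (pauseRequested) lock.wait();
            parked--;
        }
    }

    /** Called by the "GC" thread: pause all workers, run the action, resume. */
    void stopTheWorld(Runnable action) throws InterruptedException {
        synchronized (lock) {
            pauseRequested = true;
            while (parked < workers) lock.wait(); // wait until every worker is parked
            action.run();                         // the world is stopped here
            pauseRequested = false;
            lock.notifyAll();
        }
    }

    /** Returns true if no worker made progress during the pause. */
    static boolean demo() {
        SafepointSketch vm = new SafepointSketch(2);
        AtomicLong progress = new AtomicLong();
        AtomicBoolean running = new AtomicBoolean(true);
        Runnable work = () -> {
            try {
                while (running.get()) {
                    progress.incrementAndGet(); // stands in for user code
                    vm.poll();                  // safepoint at the end of the loop body
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        boolean[] frozen = {false};
        try {
            vm.stopTheWorld(() -> {
                long before = progress.get();
                LockSupport.parkNanos(20_000_000L); // pretend to collect for about 20 ms
                frozen[0] = before == progress.get();
            });
            running.set(false);
            a.join();
            b.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return frozen[0];
    }

    public static void main(String[] args) {
        System.out.println("workers frozen during pause: " + demo());
    }
}
```

The fast path costs one volatile read, which hints at why polls should be neither too sparse (long waits for the collector) nor too dense (overhead on every iteration).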

A running thread can proceed to the next safepoint and pause there. But what about a thread that is sleeping or blocked? It cannot run to a safepoint on request, and the GC cannot wait for it to wake up.

Hence the concept of a safe region: a stretch of code in which references do not change, so starting a GC anywhere inside it is safe. When a thread enters a safe region, it marks itself as being inside one. When it wakes up and is about to leave, it checks whether the GC has completed: if so it leaves, and if not it waits inside the safe region.

Now that you understand the basic concepts above, let’s move on to the garbage collector.

Garbage collector classification

Let’s take a look at the eight garbage collectors for HotSpot and where each one applies.

In the diagram, a line between two collectors indicates that they can be used together, and the area a collector sits in indicates whether it is a young-generation or an old-generation collector. ZGC is the new garbage collector introduced in Java 11.

Default garbage collector

The default collector for different Java versions is as follows.
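
Because the default depends on the Java version and on how the JVM was started, it can be useful to ask the running JVM which collectors are active. The following is a minimal sketch using the standard `java.lang.management` API; the class name `ActiveCollectors` is made up for illustration, and the names it prints (for example a pair of young- and old-generation collector names) vary between JVMs and versions.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    /** Names of the collectors the current JVM reports as active. */
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : names()) {
            System.out.println(name);
        }
    }
}
```

On HotSpot, running `java -XX:+PrintCommandLineFlags -version` is another way to see which collector flag was selected by default.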

Serial collector

The Serial collector is the most basic and oldest collector, and it is single-threaded. While it collects garbage, all other worker threads must be suspended until the collection is complete, which is the “Stop The World” pause described above.

ParNew collector

The ParNew collector is essentially a multithreaded version of the Serial collector. Apart from using multiple threads for garbage collection, its behavior is identical to the Serial collector's: the available control parameters, the collection algorithm (copying), Stop The World pauses, object allocation rules, collection policies, and so on. The two collectors share a considerable amount of code.

Parallel Scavenge collector

The Parallel Scavenge collector is a young-generation collector that, like ParNew, uses a copying algorithm. Unlike the other collectors, however, its focus is on reaching a controllable throughput.
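
Throughput here is the share of total run time spent in user code rather than in garbage collection. As a small worked example, the sketch below computes it and also shows how HotSpot's `-XX:GCTimeRatio=N` flag (not named in this article, mentioned here as background) translates into a target GC share of 1 / (1 + N). The class and method names are illustrative only.

```java
class ThroughputSketch {
    /** Throughput = user time / (user time + GC time). */
    static double throughput(double userMillis, double gcMillis) {
        return userMillis / (userMillis + gcMillis);
    }

    /** Target share of time spent in GC for a given -XX:GCTimeRatio value. */
    static double maxGcShare(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        // 99 minutes of user code and 1 minute of GC out of 100 minutes in total.
        System.out.println(throughput(99, 1));
        System.out.println(maxGcShare(99));
    }
}
```

A companion flag, `-XX:MaxGCPauseMillis`, sets a pause-time goal instead; the two goals pull against each other, since shorter pauses usually mean more frequent collections.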

Serial Old collector

Serial Old is the old-generation version of the Serial collector. It is also single-threaded and uses a mark-compact algorithm. Its operation diagram is the same as the Serial collector's.

Parallel Old collector

Parallel Old is the old-generation version of the Parallel Scavenge collector. It uses multiple threads and a mark-compact algorithm, and it has been available only since JDK 1.6.

CMS collector

CMS (Concurrent Mark Sweep) is a collector whose goal is the shortest possible collection pause. It is concurrent, uses a mark-sweep algorithm, and collects only the old generation.

When the CMS collector works, the GC thread and the user thread execute concurrently as much as possible to reduce STW time.

The whole operation is divided into four steps: CMS initial mark, CMS concurrent mark, CMS remark, and CMS concurrent sweep.

Of these steps, both the initial mark and the remark trigger “Stop The World”.

The initial mark only marks objects that the GC Roots reference directly, so it is fast. It is single-threaded in Java 7 and multithreaded from Java 8 onward.

The concurrent mark phase is executed by GC threads running concurrently with application threads. Starting from the live objects marked in the initial mark, it recursively marks every object reachable from them.

The remark phase corrects the mark records of objects whose marks changed during concurrent marking because the user program kept running. Its pause is generally slightly longer than the initial mark's, but much shorter than the time spent in concurrent marking.
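
To make the split between the initial mark and the tracing that follows more concrete, here is a toy, single-threaded sketch over a hand-built object graph. It is not CMS itself: the `Obj` class, the `roots` list (standing in for the objects that GC Roots reference directly), and the method names are all assumptions for illustration, and the real concurrent phase runs alongside application threads.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class MarkSketch {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    /** Initial mark: mark only the objects referenced directly by the roots. */
    static Deque<Obj> initialMark(List<Obj> roots) {
        Deque<Obj> worklist = new ArrayDeque<>();
        for (Obj o : roots) {
            if (!o.marked) {
                o.marked = true;
                worklist.add(o);
            }
        }
        return worklist;
    }

    /** Tracing step: mark everything reachable from the initially marked objects. */
    static void traceFrom(Deque<Obj> worklist) {
        while (!worklist.isEmpty()) {
            Obj o = worklist.poll();
            for (Obj ref : o.refs) {
                if (!ref.marked) {
                    ref.marked = true;
                    worklist.add(ref);
                }
            }
        }
    }

    /** Sweep: whatever is still unmarked is garbage. */
    static List<Obj> sweep(List<Obj> heap) {
        List<Obj> garbage = new ArrayList<>();
        for (Obj o : heap) {
            if (!o.marked) garbage.add(o);
        }
        return garbage;
    }
}
```

The initial mark touches only as many objects as there are root references, which is why its pause is short; the tracing loop is proportional to the live object graph, which is why CMS runs that part concurrently.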

Advantages: Concurrent collection, low pauses.

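Disadvantages: it is very sensitive to CPU resources, it cannot handle floating garbage (garbage produced while the concurrent phases run, which has to wait for the next collection), and its mark-sweep algorithm leaves the space fragmented.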

G1 collector

G1 (Garbage-First) is a garbage collector for server-side applications. It collects both the young generation and the old generation.

The collector can make full use of multiple CPUs and the hardware to shorten STW time, and it offers features such as space compaction and predictable pauses. For example, it builds a predictable pause-time model that lets users explicitly specify that, within any time slice of M milliseconds, no more than N milliseconds may be spent on garbage collection.

When using the G1 collector, the memory layout of the Java heap is very different from that of the other collectors. It divides the entire Java heap into independent regions of equal size. The concepts of young generation and old generation are retained, but the two are no longer physically separated. Each is a set of regions that need not be contiguous.

G1 tracks the value of the garbage accumulated in each region (the amount of space a collection would reclaim and an empirical estimate of the time the collection would take), maintains a priority list in the background, and, within the allowed collection time, collects the most valuable regions first (hence the name Garbage-First).
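
The "most valuable region first, within the allowed time" idea can be sketched as a simple greedy selection. This is only an illustration under stated assumptions, not G1's actual policy: the `Region` class, the choice of "reclaimable space per estimated millisecond" as the value, and the method `chooseCollectionSet` are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class GarbageFirstSketch {
    static final class Region {
        final String id;
        final long reclaimableMb;     // space a collection is expected to free
        final double estimatedMillis; // empirical estimate of the collection time

        Region(String id, long reclaimableMb, double estimatedMillis) {
            this.id = id;
            this.reclaimableMb = reclaimableMb;
            this.estimatedMillis = estimatedMillis;
        }

        /** Assumed notion of "value": space reclaimed per millisecond of pause. */
        double value() { return reclaimableMb / estimatedMillis; }
    }

    /** Pick the most valuable regions that still fit into the pause budget. */
    static List<Region> chooseCollectionSet(List<Region> regions, double pauseBudgetMillis) {
        List<Region> byValue = new ArrayList<>(regions);
        byValue.sort((x, y) -> Double.compare(y.value(), x.value())); // highest value first
        List<Region> chosen = new ArrayList<>();
        double spent = 0;
        for (Region r : byValue) {
            if (spent + r.estimatedMillis <= pauseBudgetMillis) {
                chosen.add(r);
                spent += r.estimatedMillis;
            }
        }
        return chosen;
    }
}
```

For example, with regions A (80 MB, 10 ms), B (30 MB, 5 ms), C (90 MB, 30 ms) and D (10 MB, 4 ms) and a 20 ms budget, the sketch picks A, B and D: C frees the most space in absolute terms, but it does not fit into the remaining budget.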

The operation of the G1 collector can be roughly divided into the following steps: Initial Marking, Concurrent Marking, Final Marking, and Live Data Counting and Evacuation.

In the overall process, the first few steps are very similar to the CMS process. “Stop The World” is likewise triggered during the initial and final marking.

In principle, the Live Data Counting and Evacuation stage could run concurrently with the user program. However, because only some of the regions are reclaimed and the pause time is under the user's control, pausing the user threads greatly improves collection efficiency.

ZGC collector

The Z Garbage Collector (ZGC) is a scalable, low-latency, concurrent garbage collector. It was introduced in Java 11, where it supports 64-bit Linux.

It is designed to meet the following goals: pause times do not exceed 10 ms, pause times do not grow with the heap size or the size of the live object set, and heaps from a few hundred megabytes to several terabytes are supported.

ZGC divides memory into regions, also known as ZPages. ZPages can be created and destroyed dynamically. Unlike G1 regions, they can also be sized dynamically, always in multiples of 2 MB. The heap regions fall into three size groups: Small (2 MB), Medium (32 MB), and Large (N * 2 MB).
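
As a small worked example of the "N * 2 MB" rule, the sketch below rounds a requested size up to the next multiple of 2 MB. It assumes, for illustration, that a large page is sized to fit one big object; the class and method names are hypothetical and not part of any ZGC API.

```java
class ZPageSizeSketch {
    static final long MB = 1024L * 1024L;
    static final long SMALL_PAGE = 2 * MB;
    static final long MEDIUM_PAGE = 32 * MB;
    static final long GRANULE = 2 * MB; // large pages are N * 2 MB

    /** Smallest multiple of 2 MB that can hold objectBytes (assumed sizing rule). */
    static long largePageBytes(long objectBytes) {
        return ((objectBytes + GRANULE - 1) / GRANULE) * GRANULE;
    }

    public static void main(String[] args) {
        // A 5 MB object needs N = 3, that is a 6 MB large page.
        System.out.println(largePageBytes(5 * MB) / MB + " MB");
    }
}
```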

Regions of each group can occur multiple times in the ZGC heap. Medium and large regions are allocated contiguously, as shown below:

Unlike other GCs, ZGC can map its physical heap regions into a larger heap address space, which can include virtual memory.

A ZGC cycle consists of marking (initial marking, concurrent marking, edge case handling), relocation (finding the blocks to relocate, relocating and updating root references, then concurrently relocating the other objects and storing a map from old to new addresses), and remapping.

Within marking, the initial marking and the edge case handling trigger “Stop The World”, as does the relocation and updating of root references within relocation.

The remapping flow chart is as follows:

ZGC aims to support large heap sizes while keeping application pause times short. To achieve this, it uses techniques including colored 64-bit pointers, load barriers, relocation, and remapping.
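
How colored pointers and load barriers cooperate with relocation and remapping can be shown with a heavily simplified model. Everything in this sketch is an assumption made for illustration: pointers are plain `long` values, a single metadata bit (`REMAPPED`) stands in for the several color bits the real collector uses, and the `forwarding` map plays the role of the stored old-to-new address map.

```java
import java.util.HashMap;
import java.util.Map;

class ColoredPointerSketch {
    // Hypothetical layout: the low 42 bits are the address, one color bit sits above them.
    static final int ADDRESS_BITS = 42;
    static final long ADDRESS_MASK = (1L << ADDRESS_BITS) - 1;
    static final long REMAPPED = 1L << ADDRESS_BITS; // set when the pointer is known to be current

    // Old address -> new address for objects moved during relocation.
    final Map<Long, Long> forwarding = new HashMap<>();

    static long address(long pointer) { return pointer & ADDRESS_MASK; }

    static boolean isGood(long pointer) { return (pointer & REMAPPED) != 0; }

    /** Load barrier: conceptually runs each time a reference is loaded from the heap. */
    long loadBarrier(long pointer) {
        if (isGood(pointer)) {
            return pointer;                                 // fast path: nothing to do
        }
        long addr = address(pointer);
        long current = forwarding.getOrDefault(addr, addr); // slow path: follow the move, if any
        return current | REMAPPED;                          // "healed" pointer with the good color
    }
}
```

In this model, a stale pointer to a moved object is fixed the first time it is loaded, and later loads of the healed pointer take the fast path. That is the intuition behind doing relocation concurrently instead of updating every reference inside one long pause.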

Summary

This article introduced the garbage collectors, the scenarios they suit, and the related concepts. It is a deep topic that you can expand further, both in breadth and in depth.

A friend asked in the comments section, “What’s the use of learning all this?” Of course it is not only for interviews. Take the articles on the JVM memory structure and on the Java 8 JVM memory structure changes as an example.

Just yesterday, while deploying a large project, I hit a “java.lang.OutOfMemoryError: Metaspace” exception. Because I had studied the related material beforehand, it was easy to trace the problem to the JVM settings: a Metaspace limit parameter had been set, and its value was too small.

The Interviewer series:

  • JVM Memory Structure In Detail
  • Interviewer: Stop asking me about Java GC garbage Collection
  • Java8 JVM memory structure changed, permanent generation to meta space
  • Interviewer, Stop asking me about “Java Garbage Collector”

