This article assumes the HotSpot JVM. We have already looked at the common garbage collection algorithms; now let's look at the garbage collectors that the JVM builds on top of those algorithms.

The following figure shows the garbage collectors in the HotSpot virtual machine. Collectors joined by a line can be used together; collectors with no line between them cannot.

New generation garbage collectors

Serial

This is the oldest garbage collector. It collects the new generation using a copying algorithm. It is single-threaded: no matter how many CPUs your server has, it uses a single thread to do the garbage collection, and it stops all worker threads until the collection completes. In other words, it is STW (Stop The World) while collecting. The old generation collectors it can be paired with are CMS and Serial Old.

ParNew

This garbage collector is the multithreaded version of Serial. It differs from Serial only in that the copying is done by multiple threads.

Parallel Scavenge

Parallel Scavenge is also a parallel garbage collector for the new generation, and it also uses a copying algorithm. We already have ParNew, so why another parallel collector? It differs from ParNew in two main ways.

First, its focus is controllable throughput, where throughput = time running user code / (time running user code + GC time). The point is not to shorten each individual GC, but to control the total amount of time the virtual machine spends on GC over a period of time. For example, if the program runs for 100 minutes in total and 1 minute of that is spent on garbage collection, the throughput is 99%.

Second, it can adapt the parameters configured for the new generation, such as the Eden to Survivor ratio. This adaptivity is what lets it control throughput: it adjusts these parameters dynamically, according to the actual situation, to reach the required throughput.
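To make the throughput formula concrete, here is a minimal sketch of the calculation. The class and method names are made up for this example; they are not part of any JVM API.

```java
// Illustrative helper; GcThroughput and throughput() are hypothetical names.
final class GcThroughput {

    /** throughput = user code time / (user code time + GC time), as a fraction. */
    static double throughput(double userCodeMinutes, double gcMinutes) {
        return userCodeMinutes / (userCodeMinutes + gcMinutes);
    }

    public static void main(String[] args) {
        // 100 minutes of total run time, 1 minute of which is GC.
        double t = throughput(99.0, 1.0);
        System.out.printf("throughput = %.0f%%%n", t * 100);
    }
}
```

Note that the 100 minutes in the example is the total run time, so the user code itself accounts for 99 of them.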

This collector also provides “-XX:MaxGCPauseMillis” to control the maximum garbage collection pause time (the value must be greater than 0) and “-XX:GCTimeRatio” to set the throughput target (1-99). When you see “-XX:MaxGCPauseMillis”, don't assume that you can set it as low as you like; the collector only makes a best effort to meet it. Put bluntly, the way to make each new generation GC faster is to shrink the new generation: less memory means less garbage to process per collection, so each collection finishes sooner. However, a smaller space also fills up faster, so the number of GCs will certainly increase and the throughput will drop.

For example, suppose a program on the server runs a new generation GC every 10 seconds and each one takes 100 ms; that is 600 ms of GC per minute. If I want each GC to take less time, say 60 ms, the new generation has to shrink, but then a GC happens every 5 seconds, which is 720 ms per minute spent on GC.

Parallel Scavenge is the right fit if your service is compute-heavy, doing its work in the background with minimal user interaction.

If your program is interactive, on the other hand, you need the STW time to be as short as possible so that it can respond quickly to customer requests. Parallel Scavenge, however, cannot be used in conjunction with CMS, because Parallel Scavenge does not use the same GC framework that HotSpot uses for its other collectors. So the default new generation collector that goes with CMS is ParNew.
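As a rough guide to what a given “-XX:GCTimeRatio” value asks for: HotSpot documents the target share of time spent in GC as 1 / (1 + N). The sketch below only does that arithmetic; the class name is made up, and the launch command in the comment uses a hypothetical app.jar.

```java
// Illustrative only. Example launch (app.jar is a placeholder):
//   java -XX:+UseParallelGC -XX:MaxGCPauseMillis=100 -XX:GCTimeRatio=19 -jar app.jar
final class GcTimeRatio {

    /** Share of total time the collector may spend on GC for -XX:GCTimeRatio=n. */
    static double gcFraction(int n) {
        if (n < 1 || n > 99) {
            throw new IllegalArgumentException("GCTimeRatio must be in 1..99");
        }
        return 1.0 / (1.0 + n);
    }

    /** The matching throughput target, i.e. the share left for user code. */
    static double throughputTarget(int n) {
        return 1.0 - gcFraction(n);
    }

    public static void main(String[] args) {
        System.out.println("N=99 -> GC share " + gcFraction(99));
        System.out.println("N=19 -> GC share " + gcFraction(19));
    }
}
```

So N=99 asks for at most 1% of the time in GC (99% throughput), while N=19 allows 5%. Both flags are goals, not guarantees.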
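The arithmetic in this example can be checked with a few lines of code. This is only a sketch of the numbers above; the class and method names are invented for illustration.

```java
// Illustrative arithmetic only; the names are hypothetical.
final class GcPerMinute {

    /** Total GC time per minute for a given pause length and GC interval. */
    static long gcMillisPerMinute(long pauseMillis, long intervalSeconds) {
        long collectionsPerMinute = 60 / intervalSeconds;
        return collectionsPerMinute * pauseMillis;
    }

    public static void main(String[] args) {
        // Larger new generation: 100 ms pauses, one GC every 10 seconds.
        System.out.println(gcMillisPerMinute(100, 10) + " ms per minute");
        // Smaller new generation: 60 ms pauses, one GC every 5 seconds.
        System.out.println(gcMillisPerMinute(60, 5) + " ms per minute");
    }
}
```

Each pause gets shorter, but the total GC time goes from 600 ms to 720 ms per minute, which is exactly the throughput cost described above.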
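If you want to check which collectors a running JVM has actually paired up, the standard java.lang.management API can list them. A minimal sketch follows; the class name is made up, and the collector names it prints are JVM-specific, so treat them as informational.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Illustrative helper; ActiveCollectors is a hypothetical name.
final class ActiveCollectors {

    /** One line per collector: name, number of collections, accumulated time. */
    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        return lines;
    }

    public static void main(String[] args) {
        describe().forEach(System.out::println);
    }
}
```

Running it with different collector flags is a quick way to see which new generation and old generation collectors end up together.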

Old generation garbage collectors

Serial Old

It is the old generation version of the Serial collector: a single-threaded collector that uses a mark-compact algorithm. It is mainly used in client mode and as the backup collector for CMS. With the exception of G1, all of the new generation collectors mentioned above can be used with it. Refer to Serial above for the figure.

Parallel Old

It is the old generation version of Parallel Scavenge: a multithreaded collector that uses a mark-compact algorithm. It can only be used with Parallel Scavenge. Parallel Old was created to end the awkward situation in which Parallel Scavenge could only be paired with the single-threaded Serial Old. Refer to Parallel Scavenge above for the figure.

CMS

CMS (Concurrent Mark Sweep), as its name implies, uses a mark-sweep algorithm. It aims to reduce STW time by letting user threads run concurrently with garbage collection. It is currently the mainstream garbage collector on servers. A collection has four steps:

1. Initial mark (STW)

2. Concurrent marking

3. Remark (STW)

4. Concurrent sweep

The initial mark only marks the objects directly associated with the GC Roots and goes no further, which keeps the STW time short. Concurrent marking is the deep marking that traverses all the objects reachable from those. The remark is an STW phase that corrects the marks of objects that changed during the concurrent marking phase. Then comes the concurrent sweep of the garbage.

So CMS runs its two longest phases, deep marking and sweeping, concurrently with the user threads, which greatly reduces the STW time.
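The four phases can be sketched on a toy object graph. This is a single-threaded simulation under simplifying assumptions, not how HotSpot implements CMS: the "concurrent" phases simply run in sequence, the user thread's change is a single hard-coded line, and all names are invented for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Toy model of the CMS phases; every name here is hypothetical.
final class CmsPhasesSketch {

    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    // 1. Initial mark (STW): mark only what the GC Roots reference directly.
    static Deque<Obj> initialMark(List<Obj> rootTargets) {
        Deque<Obj> pending = new ArrayDeque<>();
        for (Obj o : rootTargets) {
            if (!o.marked) { o.marked = true; pending.push(o); }
        }
        return pending;
    }

    // 2. Concurrent marking: follow references from already marked objects.
    static void trace(Deque<Obj> pending) {
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            for (Obj ref : o.refs) {
                if (!ref.marked) { ref.marked = true; pending.push(ref); }
            }
        }
    }

    // 3. Remark (STW): re-scan marked objects whose references changed meanwhile.
    static void remark(List<Obj> dirtied) {
        Deque<Obj> pending = new ArrayDeque<>();
        for (Obj o : dirtied) {
            if (o.marked) pending.push(o);
        }
        trace(pending);
    }

    // 4. Concurrent sweep: free unmarked objects, reset marks on survivors.
    static List<String> sweep(List<Obj> heap) {
        List<String> freed = new ArrayList<>();
        Iterator<Obj> it = heap.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            if (o.marked) { o.marked = false; } else { freed.add(o.name); it.remove(); }
        }
        return freed;
    }

    static List<String> demo() {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c");
        Obj d = new Obj("d"), e = new Obj("e");
        a.refs.add(b);
        b.refs.add(c);
        List<Obj> heap = new ArrayList<>(Arrays.asList(a, b, c, d, e));

        Deque<Obj> pending = initialMark(Arrays.asList(a));
        trace(pending);
        a.refs.add(e);               // a user thread links e while marking runs
        remark(Arrays.asList(a));    // without this step, e would be freed by mistake
        return sweep(heap);          // only d is unreachable
    }

    public static void main(String[] args) {
        System.out.println("freed: " + demo());
    }
}
```

The remark step is what keeps the collector correct: object e becomes reachable only after marking has passed a, so it has to be picked up in a short STW re-scan before the sweep.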

But it has the following three disadvantages:

1. The concurrent phases compete with worker threads for CPU resources.

2. Fragmentation. Because the mark-sweep algorithm is used, memory fragmentation is produced. To address this, CMS provides “-XX:+UseCMSCompactAtFullCollection” (on by default), which compacts the fragmented space when CMS can no longer cope and has to run a Full GC. But user threads are stopped while the compaction runs, so the pause time gets longer.

3. Floating garbage. The concurrent sweep lets user threads keep running, and they may create new garbage in the old generation while the collection is in progress. So some space has to be set aside for this floating garbage. If that reserved space runs out while CMS is running, a Concurrent Mode Failure occurs. The JVM then falls back to Serial Old for the collection, so the pause time is longer.
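The fragmentation problem in point 2 can be illustrated with a toy heap of equal-sized slots. This is a sketch under simplifying assumptions (fixed-size slots, a boolean per slot), and all names are invented for the example.

```java
// Toy model of fragmentation after mark-sweep; the names are hypothetical.
final class FragmentationSketch {

    /** Number of free slots in the heap. */
    static int totalFree(boolean[] used) {
        int free = 0;
        for (boolean u : used) {
            if (!u) free++;
        }
        return free;
    }

    /** Length of the longest run of adjacent free slots. */
    static int largestFreeRun(boolean[] used) {
        int best = 0, run = 0;
        for (boolean u : used) {
            run = u ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    /** Compaction: slide all live slots to the front of the heap. */
    static boolean[] compact(boolean[] used) {
        boolean[] out = new boolean[used.length];
        int live = used.length - totalFree(used);
        for (int i = 0; i < live; i++) {
            out[i] = true;
        }
        return out;
    }

    public static void main(String[] args) {
        boolean[] afterSweep = {true, false, true, false, true, false};
        System.out.println("free slots: " + totalFree(afterSweep));
        System.out.println("largest free run: " + largestFreeRun(afterSweep));
        System.out.println("after compaction: " + largestFreeRun(compact(afterSweep)));
    }
}
```

After the sweep there are three free slots but no two of them are adjacent, so an object that needs two slots cannot be placed until the heap is compacted.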

G1

This garbage collector does not need to be paired with any other collector; it handles both the new generation and the old generation itself. G1 became the default garbage collector in server mode in JDK 9. It was designed to replace CMS.

G1 (Garbage-First) is a mark-compact collector when viewed as a whole and a copying collector when viewed locally. Like CMS, it can run concurrently with user threads. Its advantages over CMS are that it can model predictable pause times, so you can specify that garbage collection should take no more than a given number of milliseconds within a given period of time, and that it divides the Java heap into independent areas of equal size called regions. Although it retains the concept of generations, the new generation is no longer physically separated from the old. The scope of a collection is no longer the whole new generation or old generation but a set of regions, and collecting this way does not produce fragmentation.

G1 maintains a priority list. Based on how much space reclaiming a region would free and how long the reclamation would take, the regions with the highest priority are reclaimed first. This makes the target of each collection more precise and improves collection efficiency. The G1 collection steps are:

1. Initial mark

2. Concurrent marking

3. Final marking

4. Screening and recycling

The initial mark, as in CMS, marks only the objects directly associated with the GC Roots; concurrent marking then goes deeper and traverses the objects reachable from them. The final marking is the same idea as the remark in CMS. Screening and recycling ranks the regions to decide which ones are most worth reclaiming, and then reclaims them.
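The "most valuable region first" idea can be sketched as a greedy selection under a pause budget. This is a simplified illustration, not G1's real prediction model: the Region fields, the MB-per-millisecond priority, and the sample numbers are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model of region selection; every name and number here is hypothetical.
final class GarbageFirstSketch {

    static final class Region {
        final String name;
        final double reclaimableMb;    // space freed if this region is collected
        final double estimatedMillis;  // predicted time to collect it
        Region(String name, double reclaimableMb, double estimatedMillis) {
            this.name = name;
            this.reclaimableMb = reclaimableMb;
            this.estimatedMillis = estimatedMillis;
        }
        /** Priority: space reclaimed per millisecond of pause. */
        double efficiency() { return reclaimableMb / estimatedMillis; }
    }

    /** Pick regions in priority order while they still fit in the pause budget. */
    static List<String> chooseCollectionSet(List<Region> regions, double pauseBudgetMillis) {
        List<Region> byPriority = new ArrayList<>(regions);
        byPriority.sort(Comparator.comparingDouble(Region::efficiency).reversed());
        List<String> chosen = new ArrayList<>();
        double spent = 0;
        for (Region r : byPriority) {
            if (spent + r.estimatedMillis <= pauseBudgetMillis) {
                chosen.add(r.name);
                spent += r.estimatedMillis;
            }
        }
        return chosen;
    }

    static List<String> demo() {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region("A", 80, 10));  // 8 MB per ms
        regions.add(new Region("B", 30, 2));   // 15 MB per ms
        regions.add(new Region("C", 50, 20));  // 2.5 MB per ms
        regions.add(new Region("D", 10, 5));   // 2 MB per ms
        return chooseCollectionSet(regions, 15);
    }

    public static void main(String[] args) {
        System.out.println("collect: " + demo());
    }
}
```

With a 15 ms budget the sketch picks B and then A; C and D are left for a later collection because they would exceed the budget. This is the sense in which G1 trades completeness for a predictable pause.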


If you find a mistake, corrections are welcome! Personal public account: Yes training level guide