A deeper look at the JVM: the CMS collector


Preface

In the last article we looked at generational collection and the garbage collection algorithms. This time we look at one of the most important garbage collectors of the past, and still a very common one: the CMS collector. Again, there is a lot of content in this section.

Review

In the last article, we explained the basic theory of generational collection and the algorithms used by the young and old generations: the copying algorithm and the mark-compact algorithm, respectively. We then summarized the conditions under which objects move from the young generation into the old generation. Finally, we introduced the reference types and worked through some practice questions and their answers.

Overview

  1. Before discussing the CMS collector, a brief introduction to its golden partner, ParNew.
  2. The parameters of the CMS collector and its core execution steps.
  3. Details of how the CMS collector runs and what its parameters mean.
  4. A few common JVM questions.

Golden partner ParNew

ParNew is the most commonly used young-generation garbage collector. Together with the CMS collector it was the officially recommended combination before JDK 9, and it is still a very commonly used pairing.

The ParNew collector is essentially a multithreaded version of the Serial collector. The Serial and Serial Old collectors are too old to be covered in detail here, but that does not mean they are obsolete: the key role the Serial collector still plays is discussed later in this article.

Finally, note that ParNew is the only garbage collector other than Serial that can work with CMS.

Features:

  • It differs from Serial only in being multi-threaded rather than single-threaded
  • It is the only garbage collector other than Serial that works with CMS

Questions answered:

Which is better, a multi-threaded collector or a single-threaded collector?

In general, a multi-threaded collector is recommended for servers, while a single-threaded collector is preferred for clients. On a single-core machine, multiple threads introduce additional context switching, which degrades performance instead of improving it. Clients also rarely need multi-threaded collection, so a single-threaded collector is the better fit there.

Which collects better, Serial or ParNew?

As with the question above, it depends on whether the machine is multi-core or single-core. Most of the time a multi-threaded collector is used, because modern processors handle multithreading well.

Analysis:

The JVM startup option -client selects client mode, and -server selects server mode.

Server mode: usually suited to multi-core environments, where a collector such as ParNew can use multiple threads to collect garbage efficiently.

Client mode: suited to lower-performance machines, typically single-core ones. The Serial collector fits here because it is single-threaded and has no thread-switching overhead.
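As a rough illustration of the reasoning above, the sketch below picks a young-generation collector flag from the core count. The class and method names are hypothetical, and the one-core threshold is only this article's rule of thumb, not how the JVM chooses its defaults. -XX:+UseSerialGC and -XX:+UseParNewGC are real HotSpot flags, but -XX:+UseParNewGC only exists on JDK versions that still ship CMS.

```java
// Hypothetical helper: a rule of thumb, not the JVM's own selection logic.
final class YoungCollectorAdvisor {

    /** Suggests a young-generation collector flag for the given core count. */
    static String suggestFlag(int cores) {
        if (cores < 1) {
            throw new IllegalArgumentException("cores must be >= 1");
        }
        // One core: extra GC threads only add context switches, so stay single-threaded.
        // Several cores: a multi-threaded collector such as ParNew can use them.
        return cores == 1 ? "-XX:+UseSerialGC" : "-XX:+UseParNewGC";
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(cores + " core(s): " + suggestFlag(cores));
    }
}
```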

CMS collector

Before JDK 9, CMS was one of the most commonly used garbage collectors. It mainly uses the mark-sweep algorithm (though not exclusively). To keep pauses short, CMS runs its garbage collection threads concurrently with user threads, and it was the first garbage collector to support this.

As discussed in previous articles, the mark-sweep algorithm generates a lot of memory fragmentation, so why does CMS use it?

In fact, based on a JVM parameter, CMS decides after how many garbage collections a defragmentation pass should run. This pass must stop all user threads and start the single-threaded Serial collector (its old-generation variant, Serial Old) to clean up the fragmentation in the old generation, and the defragmentation itself uses the mark-compact algorithm.

Most of the time, however, CMS uses mark-sweep.

CMS collector features:

  • It cannot be used alone. It must be paired with another collector, and only Serial or ParNew will do
  • To keep pauses short, CMS runs its garbage collection threads concurrently with user threads. It was the first garbage collector to support this
  • Its algorithm is based on mark-sweep
  • It is a garbage collector that aims for the shortest pause time

CMS main parameters:

  • -XX:ParallelGCThreads: limits the number of garbage collection threads. By default CMS starts (CPU cores + 3) / 4 collection threads (integer division), for example 2 garbage collection threads on an 8-core machine
  • -XX:+UseCMSCompactAtFullCollection (enabled by default, deprecated since JDK 9): when enabled, memory is defragmented after a Full GC. Defragmentation requires stopping the user threads, so it lengthens the overall stop-the-world time
  • -XX:CMSFullGCsBeforeCompaction (used together with -XX:+UseCMSCompactAtFullCollection): controls how many Full GCs run before memory is defragmented. The default is 0, meaning memory is defragmented on every Full GC
  • -XX:CMSInitiatingOccupancyFraction: the old-generation occupancy, as a percentage, above which CMS starts a garbage collection. The default is 68% in JDK 5 and 92% from JDK 6 onward
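The thread-count rule quoted in the list can be checked with a few lines of Java. This is only the arithmetic as stated above, wrapped in a hypothetical class; it does not ask the JVM which value it actually chose.

```java
// Hypothetical helper that reproduces the (cores + 3) / 4 rule quoted above.
final class CmsThreadMath {

    /** Default number of collection threads for the given core count (integer division). */
    static int defaultGcThreads(int cores) {
        if (cores < 1) {
            throw new IllegalArgumentException("cores must be >= 1");
        }
        return (cores + 3) / 4;
    }

    public static void main(String[] args) {
        for (int cores : new int[] {1, 2, 4, 8, 16}) {
            System.out.println(cores + " cores -> " + defaultGcThreads(cores) + " GC thread(s)");
        }
    }
}
```

Because of the integer division, machines with up to 4 cores end up with a single collection thread, which is one reason CMS takes a noticeable share of CPU on small machines.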

CMS Operation Steps (key points)

The CMS collection process is easy to understand. It consists of four main steps:

  1. Initial mark: this step is fast but requires a stop-the-world pause. It marks only the objects directly referenced by the GC Roots.
  2. Concurrent mark: this step runs concurrently with user threads, so its impact on the application is small. While the application keeps running, the collector traverses the object graph starting from the GC Roots and marks which objects are alive. The marks produced here are not final, which is why the next step is needed.
  3. Remark: this step requires a stop-the-world pause. It completes the work of the previous phase by going over the objects marked in the previous step a second time and determining again which of them are alive.
  4. Concurrent sweep: runs concurrently with user threads and reclaims the garbage objects that are not referenced from any GC Root.

From the steps above you can see that CMS was a big step forward: the concurrent mark and concurrent sweep phases run concurrently with user threads (at the cost of more system resources) and do not interfere with object allocation by user threads. Note, however, that the initial mark and remark phases still require a pause.
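The four phases and their pause behaviour can be summarized in a small model. This is only a memory aid with hypothetical names, not a representation of HotSpot internals.

```java
// Hypothetical model of the four CMS phases described above.
final class CmsPhases {

    enum Phase {
        INITIAL_MARK(true),
        CONCURRENT_MARK(false),
        REMARK(true),
        CONCURRENT_SWEEP(false);

        private final boolean stopTheWorld;

        Phase(boolean stopTheWorld) {
            this.stopTheWorld = stopTheWorld;
        }

        /** True if user threads are paused for the whole phase. */
        boolean stopsTheWorld() {
            return stopTheWorld;
        }
    }

    public static void main(String[] args) {
        for (Phase phase : Phase.values()) {
            String mode = phase.stopsTheWorld()
                    ? "user threads paused"
                    : "runs alongside user threads";
            System.out.println(phase + ": " + mode);
        }
    }
}
```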

Initial mark

Initial mark phase: the user threads are paused and the garbage collection threads start, but they only mark the old-generation objects directly referenced by GC Roots. The whole process is so fast that the user barely notices it.

It is important to know what counts as a GC Root and what does not. Instance variables, for example, are not GC Roots, and objects that root enumeration finds no reference to are marked as garbage.

Which nodes can be GC Roots

  • Local variables can serve as GC Roots
  • Static variables can serve as GC Roots
  • The index of a for loop of type long will act as a GC Root

Conclusion: an object that is referenced by a method's local variable or by a static variable of a class is not collected by the garbage collection threads.
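The conclusion can be observed with weak references, which do not keep their target alive on their own. The sketch below is an illustration with hypothetical names. It assumes Java 9 or later for Reference.reachabilityFence, and because System.gc() is only a hint, the result for the unrooted object is typical rather than guaranteed.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

// Illustrative only: names are hypothetical and System.gc() is just a hint.
final class GcRootDemo {

    // A static field is a GC Root: whatever it references stays reachable.
    static Object staticRoot = new Object();

    /** Requests a GC, then reports whether the weakly referenced object is still there. */
    static boolean survivesGc(WeakReference<Object> ref) {
        System.gc();
        return ref.get() != null;
    }

    public static void main(String[] args) {
        WeakReference<Object> viaStatic = new WeakReference<>(staticRoot);

        Object local = new Object(); // referenced from a local variable
        WeakReference<Object> viaLocal = new WeakReference<>(local);

        // Nothing but the weak reference points at this object.
        WeakReference<Object> unrooted = new WeakReference<>(new Object());

        System.out.println("static-rooted survives: " + survivesGc(viaStatic));
        System.out.println("local-rooted survives:  " + survivesGc(viaLocal));
        System.out.println("unrooted survives:      " + survivesGc(unrooted));

        // Keeps 'local' strongly reachable up to this point.
        Reference.reachabilityFence(local);
    }
}
```

The first two lines always report true, because a strong reference from a root still exists. The third usually reports false, but the JVM is free to ignore the collection request.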

Concurrent mark

Concurrent mark phase: runs concurrently with user threads. The application keeps allocating objects in the virtual machine while the garbage collection threads trace old-generation objects from the GC Roots and mark them as live or garbage. This is the most time-consuming phase, but because it runs concurrently with user threads its impact is small.

Note that this phase cannot finish marking all objects for collection, because while it runs live objects may become garbage and objects considered garbage may become live again.

The difference between concurrency and parallelism in the JVM:

Parallelism: Refers to the relationship between multiple garbage collection threads

Concurrency: The relationship between the garbage collector and user threads

Remark

Remark phase: this phase also requires a stop-the-world pause and continues the work of the previous phase. It re-examines the objects marked in the second phase and determines again whether they are alive. This step is fast because it only completes the work of the previous step.
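Why a second marking pass is needed can be shown on a toy object graph. The sketch below is a deliberately simplified model with hypothetical names: it re-traces the whole graph in the remark step, whereas a real collector only revisits what changed during concurrent marking. It assumes Java 9 or later for List.of.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model: objects are names, references are edges in a map.
final class RemarkDemo {

    /** Marks every object reachable from the roots by following references. */
    static Set<String> mark(Map<String, List<String>> refs, Collection<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (marked.add(obj)) {
                pending.addAll(refs.getOrDefault(obj, List.of()));
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("A", List.of("B")); // root A references B
        List<String> roots = List.of("A");

        // "Concurrent mark": nothing references D yet, so D is not marked.
        Set<String> afterConcurrentMark = mark(refs, roots);

        // Meanwhile a user thread stores a reference from B to D.
        refs.put("B", List.of("D"));

        // "Remark": with user threads paused, the marks are brought up to date.
        Set<String> afterRemark = mark(refs, roots);

        System.out.println("D marked after concurrent mark: " + afterConcurrentMark.contains("D"));
        System.out.println("D marked after remark: " + afterRemark.contains("D"));
    }
}
```

Without the remark step, D would be swept even though the application now uses it, which is why this pause cannot be skipped.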

Concurrent sweep

Concurrent sweep phase: this phase also runs concurrently with user threads. User threads can keep allocating objects while the garbage collection threads reclaim garbage. This phase is also time-consuming, but the impact is small because it runs concurrently.

What problems the CMS collector causes

CPU resources taken by garbage collection threads (CPU usage problem)

The concurrent mark and concurrent sweep phases run alongside user threads, so part of the system's resources has to be set aside for the garbage collection threads.

The problem is most obvious on a single-core, single-threaded system. For that case CMS offered an incremental mode (i-CMS), which uses preemptive multitasking to interleave collector and user threads and so simulate multi-core parallelism. However, i-CMS did not work well; it was deprecated in JDK 7 and completely removed in JDK 9.

On single-core, single-threaded machines, think carefully before using CMS.

Concurrent Mode Failure

CMS is a diligent young man who usually goes about collecting garbage in an orderly way. But when there is more garbage than the young man can handle, the old man watching from behind shouts “Stop the world!”, does a quick garbage collection himself, and once everything is done steps back into the background and lets the young man get back to work.

The analogy above is not my own creation; it is a vivid metaphor I came across while studying. Of course it is not a professional explanation, so here is the technical one.

Steps two and four run concurrently with user threads, so user threads keep allocating objects while the collector works. If they allocate more than the old generation has left, the result is an OOM problem. CMS therefore starts collecting before the old generation is full: the -XX:CMSInitiatingOccupancyFraction parameter introduced above specifies the old-generation occupancy at which garbage collection begins.

The default value of -XX:CMSInitiatingOccupancyFraction is 68% in JDK 5 and was raised to 92% in JDK 6.

A Concurrent Mode Failure occurs when, while the garbage collection threads are working (for example during the concurrent sweep phase), user threads allocate more objects than fit in the remaining memory (such as the last 8% of the space). The JVM then immediately stops the world, pausing the user threads, and starts the Serial Old collector to perform the garbage collection. When that collection is complete, the user threads are restarted and the CMS collector resumes its work.

This ratio needs careful tuning in practice to prevent Concurrent Mode Failures.
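The trade-off behind this ratio is simple arithmetic, sketched below with hypothetical names and sizes. The model is deliberately pessimistic: it ignores the space the concurrent sweep frees while the application keeps allocating, so it only shows how the threshold determines the headroom.

```java
// Hypothetical back-of-the-envelope model; sizes are in megabytes.
final class CmsHeadroom {

    /** Old-generation usage at which CMS starts a collection cycle. */
    static long triggerMb(long oldGenMb, int occupancyFraction) {
        return oldGenMb * occupancyFraction / 100;
    }

    /** Space left for allocations while the concurrent phases run. */
    static long headroomMb(long oldGenMb, int occupancyFraction) {
        return oldGenMb - triggerMb(oldGenMb, occupancyFraction);
    }

    /** Simplified check: does allocation during the cycle outgrow the headroom? */
    static boolean wouldFail(long oldGenMb, int occupancyFraction, long allocatedDuringCycleMb) {
        return allocatedDuringCycleMb > headroomMb(oldGenMb, occupancyFraction);
    }

    public static void main(String[] args) {
        long oldGenMb = 1000;
        for (int fraction : new int[] {68, 92}) {
            System.out.println("fraction " + fraction + "%: cycle starts at "
                    + triggerMb(oldGenMb, fraction) + " MB, headroom "
                    + headroomMb(oldGenMb, fraction) + " MB, fails with 100 MB allocated: "
                    + wouldFail(oldGenMb, fraction, 100));
        }
    }
}
```

With a 1000 MB old generation, a threshold of 92 leaves 80 MB of headroom and a threshold of 68 leaves 320 MB. A lower threshold makes failures less likely but triggers collections more often.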

You can see that the Serial collector acts as the fallback. One may wonder why the fallback is a single-threaded collector like Serial and not some other garbage collector.

This question is actually easy to answer. Just as with Redis, single-threaded does not necessarily mean poor performance, and multi-threaded does not necessarily mean good performance. Serial is an old garbage collector with a simple implementation, and that simplicity is its advantage: it is efficient and performs well. This is why Serial is used as the fallback instead of other garbage collectors.

Memory fragmentation

This problem arises because CMS is implemented with the mark-sweep algorithm. The concurrent mark and concurrent sweep phases mark garbage objects and reclaim them in place, and the remark phase only re-checks objects already marked from the GC Roots. No step in the process moves objects, so free memory ends up scattered all over the old generation. If a large object then arrives from the young generation, there may be no contiguous space for it, which easily causes frequent Full GCs.

The official solution is to run a mark-compact pass over memory after a collection. This also requires a stop-the-world pause of the user threads: the live objects are moved to one place and the garbage objects are cleaned up.
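The effect of fragmentation, and what compaction buys, can be seen in a toy heap where each array slot is either live or free. This is only an illustrative model with hypothetical names, not how HotSpot tracks free space.

```java
// Toy heap: true = slot holds a live object, false = free slot.
final class FragmentationDemo {

    /** Length of the longest run of contiguous free slots. */
    static int largestFreeRun(boolean[] used) {
        int best = 0;
        int run = 0;
        for (boolean slot : used) {
            run = slot ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    /** Simplified mark-compact: slide all live slots to the front. */
    static boolean[] compact(boolean[] used) {
        boolean[] result = new boolean[used.length];
        int live = 0;
        for (boolean slot : used) {
            if (slot) {
                result[live++] = true;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // After a sweep, live and free slots alternate: 4 slots are free in total.
        boolean[] heap = {true, false, true, false, true, false, true, false};
        System.out.println("largest free run after sweep: " + largestFreeRun(heap));
        System.out.println("largest free run after compaction: " + largestFreeRun(compact(heap)));
    }
}
```

Half of this heap is free after the sweep, yet an object needing two adjacent slots cannot be placed until the live objects have been moved together.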

The JDK provides the -XX:CMSFullGCsBeforeCompaction parameter for this: it specifies after how many Full GCs memory should be compacted. The default value is 0, meaning memory is compacted on every Full GC.
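One way to read this parameter is as a simple schedule, sketched below with hypothetical names. It assumes that a value of n means n Full GCs run without compaction and the next one compacts; HotSpot's actual bookkeeping is more involved, so treat this as a model of the description above rather than of the implementation.

```java
// Simplified model of the parameter as described above, not HotSpot's real logic.
final class CompactionSchedule {

    /**
     * @param fullGcNumber            which Full GC this is, counting from 1
     * @param fullGcsBeforeCompaction the assumed meaning of the parameter value
     * @return true if this Full GC also compacts memory
     */
    static boolean compacts(int fullGcNumber, int fullGcsBeforeCompaction) {
        return fullGcNumber % (fullGcsBeforeCompaction + 1) == 0;
    }

    public static void main(String[] args) {
        for (int setting : new int[] {0, 2}) {
            StringBuilder line = new StringBuilder("setting " + setting + ":");
            for (int gc = 1; gc <= 6; gc++) {
                line.append(compacts(gc, setting) ? " compact" : " sweep-only");
            }
            System.out.println(line);
        }
    }
}
```

Under this reading, the default of 0 compacts on every Full GC, while a value of 2 compacts on every third one.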

Questions:

When is an old-generation collection triggered?

This has been mentioned many times, but it is worth repeating here, together with an additional Full GC trigger that applies when using the CMS collector.

  1. The contiguous free space in the old generation is smaller than the total size of all objects in the young generation
  2. The contiguous free space in the old generation is smaller than the average size of the objects promoted from the young generation into the old generation
  3. After a minor GC, the surviving objects do not fit in the Survivor region and the old generation does not have enough space for them
  4. With the CMS collector and -XX:CMSInitiatingOccupancyFraction, if the objects allocated during the concurrent sweep phase exceed the remaining space (the last 8% by default), a Concurrent Mode Failure is triggered

Question to consider: why is old-generation collection so much slower than young-generation collection?

  • First, the old generation holds many objects, so tracing from the GC Roots is slow and garbage collection takes longer.
  • After sweeping, the mark-compact algorithm has to move a large number of objects to one place and update cross-generational references and object addresses, which takes a long time. The copying algorithm in the young generation deals with far fewer surviving objects: it simply copies the survivors and then clears the Eden region, and few objects end up entering the old generation.
  • If the mark-sweep algorithm is used, memory fragmentation results. If there are too many fragments, user threads must be stopped so that objects can be moved and memory defragmented.

The answer comes down to the algorithm and the number of objects. The young generation's copying algorithm and the old generation's mark-compact algorithm have different time costs, and the old generation simply holds many more objects. Combined with the JVM's reliance on root enumeration, this inevitably makes user threads pause and wait. Even the latest collectors (ZGC and Shenandoah), which run almost entirely concurrently with user threads, still need to suspend user threads for the root enumeration step. As a result, old-generation collection is slow and we should do our best to avoid triggering it.

“Useless classes” (method area) collected by GC:

Here, once more, are the criteria for reclaiming a class from the method area:

1. All instances of this class have been reclaimed, that is, there are no instances of this class in the Java heap

2. The ClassLoader that loaded the class has been reclaimed

3. The java.lang.Class object corresponding to this class is not referenced anywhere, and the methods of this class cannot be accessed anywhere by reflection

From www.cnblogs.com/erma0-007/p…

Final words

The details of the garbage collector are extensive, so this is a long article, but the CMS garbage collector is an important and noteworthy collector.

As you can see from this section, old-generation collection can have significant side effects with CMS, so the next section will walk through some ideas for avoiding old-generation collection, based on a simulated case.