The Concurrent Mark Sweep (CMS) collector is an old-generation (tenured) garbage collector whose primary goal is the shortest possible garbage collection pause. Unlike the other old-generation collectors, which use mark-compact algorithms, it uses a multithreaded mark-sweep algorithm, and most of its work runs concurrently with the user threads. For the young generation it can be paired with the Serial collector or the ParNew collector. CMS sacrifices some system throughput in exchange for shorter pauses, so it suits servers where short collection pauses matter more than raw throughput. CMS is enabled with the JVM startup parameter -XX:+UseConcMarkSweepGC.
Short garbage collection pauses improve the user experience of highly interactive applications.
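To check from inside a program which collectors the JVM actually selected, the standard java.lang.management API can be queried. The sketch below is illustrative only: the class name ActiveCollectors is made up for this example, and the names mentioned in the comment are what a HotSpot JVM with CMS enabled is expected to report. CMS was deprecated in JDK 9 and removed in JDK 14, so the flag only has an effect on older JDKs.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the name is not part of any JDK API.
final class ActiveCollectors {

    // Returns the names of the garbage collectors registered in the running JVM.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // On a JDK that still ships CMS, started with -XX:+UseConcMarkSweepGC,
        // the expected names are "ParNew" and "ConcurrentMarkSweep".
        System.out.println(names());
    }
}
```

If the printed list shows other collectors, the flag was either not passed or not supported by that JDK.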
CMS garbage collection features
- The CMS collector collects only the old generation, trading throughput for shorter pauses.
- A CMS cycle is divided into the initial mark, concurrent mark, preclean, abortable preclean, remark, and concurrent sweep phases, of which initial mark and remark are stop-the-world (STW). Most of the pause time is spent in the remark phase, so the virtual machine can run a Young GC first to shorten that pause. CMS cannot solve the floating garbage problem.
- Because the CMS collector threads run concurrently with the user threads, a “Concurrent mode failure” may occur during a collection; the remedy is to start the CMS cycle as early as possible. CMS can also be told to compact memory after a certain number of Full GCs, which reduces fragmentation and prevents young objects from failing promotion to the old generation because of it.
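Starting CMS "as early as possible" is usually expressed as an occupancy threshold: on HotSpot the flags -XX:CMSInitiatingOccupancyFraction=<percent> and -XX:+UseCMSInitiatingOccupancyOnly ask the collector to begin a cycle once the old generation is that full. The following is a toy model of such a trigger, not HotSpot code; the class and method names are assumptions made for illustration.

```java
// Toy model of an occupancy-based trigger; hypothetical names, not HotSpot internals.
final class CmsTrigger {

    // 'percent' plays the role of -XX:CMSInitiatingOccupancyFraction.
    // Returns true once old-generation occupancy has reached the threshold.
    static boolean shouldStartCycle(long usedBytes, long capacityBytes, int percent) {
        if (capacityBytes <= 0 || usedBytes < 0 || percent < 0 || percent > 100) {
            throw new IllegalArgumentException("invalid occupancy arguments");
        }
        // Compare used/capacity >= percent/100 without floating point.
        return usedBytes * 100 >= capacityBytes * (long) percent;
    }

    public static void main(String[] args) {
        long capacity = 1024L * 1024 * 1024;        // 1 GiB old generation
        long used = 800L * 1024 * 1024;             // 800 MiB in use
        System.out.println(shouldStartCycle(used, capacity, 70));
        System.out.println(shouldStartCycle(used, capacity, 90));
    }
}
```

A lower threshold starts the cycle sooner and leaves more headroom for objects promoted while the concurrent phases run, at the cost of more frequent cycles.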
CMS Collection Process
- Initial mark (CMS-initial-mark): stop-the-world (STW). It marks only the objects that GC Roots reference directly, so it is fast, but all worker threads are still suspended;
- Concurrent mark (CMS-concurrent-mark): traces the object graph from the GC Roots while running concurrently with the user threads;
- Preclean (CMS-concurrent-preclean): runs concurrently with the user threads;
- Abortable preclean (CMS-concurrent-abortable-preclean): runs concurrently with the user threads;
- Remark (CMS-remark): STW. All worker threads are suspended so that the collector can correct the marks of objects whose references changed while the user program kept running during concurrent marking;
- Concurrent sweep (CMS-concurrent-sweep): reclaims objects that are unreachable from the GC Roots, running concurrently with the user threads without suspending them. Because the two longest phases, concurrent mark and concurrent sweep, run alongside the application, the collector as a whole can be regarded as working concurrently with the user threads;
- Concurrent reset (CMS-concurrent-reset): resets the collector state and waits for the next CMS trigger, running concurrently with the user threads.
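The phase list above can be summarized as a small lookup table. The enum below is only a reading aid: the log names are the ones quoted in the list, while the enum itself and its method names are invented for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical summary of the CMS phases listed above.
enum CmsPhase {
    INITIAL_MARK("CMS-initial-mark", true),
    CONCURRENT_MARK("CMS-concurrent-mark", false),
    CONCURRENT_PRECLEAN("CMS-concurrent-preclean", false),
    ABORTABLE_PRECLEAN("CMS-concurrent-abortable-preclean", false),
    REMARK("CMS-remark", true),
    CONCURRENT_SWEEP("CMS-concurrent-sweep", false),
    CONCURRENT_RESET("CMS-concurrent-reset", false);

    final String logName;        // name used for the phase in the list above
    final boolean stopTheWorld;  // true if application threads are suspended

    CmsPhase(String logName, boolean stopTheWorld) {
        this.logName = logName;
        this.stopTheWorld = stopTheWorld;
    }

    // Returns the phases that pause the application, in execution order.
    static List<CmsPhase> stopTheWorldPhases() {
        List<CmsPhase> result = new ArrayList<>();
        for (CmsPhase phase : values()) {
            if (phase.stopTheWorld) {
                result.add(phase);
            }
        }
        return result;
    }
}
```

Calling CmsPhase.stopTheWorldPhases() returns only INITIAL_MARK and REMARK, the two pauses of a cycle.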
The CMS operation flowchart is as follows:
Initial mark
This is one of the two stop-the-world events in CMS. This step marks live objects and has two parts:
- Mark the old-generation objects that GC Roots reference directly (node 1 in the figure below);
- Mark the old-generation objects that are referenced by live objects in the young generation (nodes 2 and 3 in the figure below).
In the Java language, GC Roots objects include the following:
- Objects referenced from the virtual machine stack (the local variable table of a stack frame);
- Objects referenced by static fields of classes in the method area;
- Objects referenced by constants in the method area;
- Objects referenced through JNI from the native method stack;
PS: To speed up this phase and shorten the pause, enable parallel initial marking with -XX:+CMSParallelInitialMarkEnabled and raise the number of parallel marking threads; the thread count should not exceed the number of CPU cores.
Concurrent mark
Starting from the objects marked in the “initial mark” phase, this phase traces all live objects. Because it runs concurrently with the application, objects may be promoted from the young generation to the old generation, allocated directly in the old generation, or have their references updated while marking is in progress. Such objects must be marked again, otherwise live objects would be missed. To make that re-marking efficient, the Card containing each such object is flagged as Dirty, so that later phases only scan the objects on Dirty cards instead of the whole old generation. The concurrent mark phase only records which cards had references changed; it does not process them. In the figure below, starting from nodes 1, 2, and 3, marking eventually reaches nodes 4 and 5. Since the application keeps changing references, not every live old-generation object is guaranteed to be marked by the end of this phase. Because the phase runs concurrently with the user threads, a concurrent mode failure can occur.
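The Dirty-card bookkeeping described above can be pictured with a toy card table. This is a simplified model and not HotSpot's implementation: the 512-byte card size, the byte values for clean and dirty, and all names are assumptions chosen for the sketch. In HotSpot the dirtying itself is done by a write barrier that application threads execute on reference stores.

```java
import java.util.ArrayList;
import java.util.List;

// Toy card table; hypothetical names and encodings, for illustration only.
final class CardTable {
    static final int CARD_SHIFT = 9;          // assumed card size: 2^9 = 512 bytes
    private static final byte CLEAN = 0;      // toy encoding
    private static final byte DIRTY = 1;      // toy encoding

    private final byte[] cards;

    // One card entry per 512-byte slice of the (modelled) old generation.
    CardTable(long heapBytes) {
        long cardCount = (heapBytes + (1L << CARD_SHIFT) - 1) >>> CARD_SHIFT;
        this.cards = new byte[(int) cardCount];
    }

    // Stand-in for the write barrier: a reference field at this offset changed.
    void recordWrite(long fieldOffset) {
        cards[(int) (fieldOffset >>> CARD_SHIFT)] = DIRTY;
    }

    // Returns the indices of all dirty cards and resets them to clean,
    // so a later pass only rescans what changed since the previous pass.
    List<Integer> drainDirtyCards() {
        List<Integer> dirty = new ArrayList<>();
        for (int i = 0; i < cards.length; i++) {
            if (cards[i] == DIRTY) {
                dirty.add(i);
                cards[i] = CLEAN;
            }
        }
        return dirty;
    }
}
```

Two writes that fall into the same 512-byte slice dirty a single card, which is why the rescan cost depends on how many cards changed rather than on how many stores happened.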
Preclean
As the previous phase showed, not every live old-generation object has been marked yet, because the application changes object references while marking is running. This phase handles the live objects that were missed for that reason by scanning all Cards marked as Dirty. In the figure below, the reference held by node 3 was changed to point to node 6 during the preceding concurrent phase, so the card containing node 3 is marked as Dirty.
Finally, node 6 is marked as live, as shown below:
Abortable preclean
This phase tries to take on as much work as it can on behalf of the next phase, Final Remark. Its duration depends on several factors, because it repeats the same work until one of the abort conditions is met (for example number of repetitions, amount of work done, or elapsed time). PS: The maximum duration of this phase is 5 seconds by default. One reason for allowing up to 5 seconds is the hope that a Young GC (YGC) occurs within that window and clears dead references out of the young generation, so that the following remark phase spends less time scanning young-generation references into the old generation.
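The "repeat until an abort condition is met" behaviour can be sketched as a bounded loop. The code below is a loose model under stated assumptions, not the HotSpot algorithm: the three limits stand in for the repetition, work, and duration conditions mentioned above, and every name and parameter is hypothetical. The clock is injected so the loop can be exercised without real waiting.

```java
import java.util.function.IntSupplier;
import java.util.function.LongSupplier;

// Toy model of an abortable loop; hypothetical names, not HotSpot internals.
final class AbortablePreclean {

    // Runs precleaning passes until one of three conditions stops the loop:
    //  - maxPasses passes have been executed (number of repetitions),
    //  - a pass cleaned fewer than minWorkPerPass cards (amount of work),
    //  - maxMillis have elapsed according to clockMillis (duration).
    // Returns the number of passes that were executed.
    static int run(IntSupplier pass, int minWorkPerPass, int maxPasses,
                   long maxMillis, LongSupplier clockMillis) {
        long deadline = clockMillis.getAsLong() + maxMillis;
        int passes = 0;
        while (passes < maxPasses && clockMillis.getAsLong() < deadline) {
            int cardsCleaned = pass.getAsInt();
            passes++;
            if (cardsCleaned < minWorkPerPass) {
                break; // too little left to do; hand over to remark
            }
        }
        return passes;
    }

    public static void main(String[] args) {
        // Each pass pretends to clean 10 cards; the 5000 ms budget mirrors
        // the 5-second limit described above.
        int passes = run(() -> 10, 1, 3, 5000, System::currentTimeMillis);
        System.out.println("passes executed: " + passes);
    }
}
```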
Remark
This is the second stop-the-world phase; its task is to finish marking every live object in the old generation. The memory scanned during remark is the entire heap, including both _young_gen and _old_gen. An old-generation object that is referenced from the young generation is treated as live, even when the referencing young object is itself unreachable: such unreachable objects still act as “GC Roots” for CMS when it scans the old generation. In other words, from the old generation's point of view, every young-generation object that references an old-generation object is a GC Root.
When this phase takes a long time, add the parameter -XX:+CMSScavengeBeforeRemark to run a YGC before remark. The YGC reclaims the dead young-generation objects and moves the survivors into a survivor space or promotes them to the old generation, so the young-generation scan only has to cover the survivor space, which is usually small, and the scanning time drops sharply.
Because the preceding preclean phases run concurrently with the user threads, many references from young objects to the old generation may have changed in the meantime. Processing those changes in the remark phase takes time and lengthens the stop-the-world pause, so CMS tries to run Final Remark when the young generation is as clean as possible. Parallel remark can also be enabled with -XX:+CMSParallelRemarkEnabled.
Concurrent sweep
After the five marking phases above, all live objects in the old generation have been marked, and the collector now sweeps the unused ones. This phase clears the unmarked objects and reclaims their space. Because the user threads are still running during the concurrent sweep, the program naturally keeps producing new garbage. That garbage appears after marking has finished, so CMS cannot dispose of it in the current collection and has to leave it for the next GC. This part of the garbage is called “floating garbage”.
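Floating garbage can be reproduced with a toy mark-sweep over a tiny object graph. This is a minimal sketch, assuming a simplified heap in which the "application" drops a reference between the mark and the sweep; it does not model CMS itself, and all class and method names are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Toy mark-sweep demo; hypothetical names, not a model of HotSpot internals.
final class FloatingGarbageDemo {

    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        boolean marked;
        Node(String name) { this.name = name; }
    }

    // Marks every node reachable from the roots (iterative depth-first traversal).
    static void mark(List<Node> roots) {
        Deque<Node> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            Node node = stack.pop();
            if (node.marked) continue;
            node.marked = true;
            for (Node ref : node.refs) stack.push(ref);
        }
    }

    // Removes unmarked nodes from the heap, clears the marks of the rest,
    // and returns the names of the nodes that survived this sweep.
    static List<String> sweep(List<Node> heap) {
        List<String> survivors = new ArrayList<>();
        for (Iterator<Node> it = heap.iterator(); it.hasNext(); ) {
            Node node = it.next();
            if (node.marked) {
                survivors.add(node.name);
                node.marked = false;
            } else {
                it.remove();
            }
        }
        return survivors;
    }

    // Runs two collections and returns the survivors of each.
    static List<List<String>> demo() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c");
        a.refs.add(b);                                   // a -> b, c is unreachable
        List<Node> heap = new ArrayList<>(Arrays.asList(a, b, c));
        List<Node> roots = Collections.singletonList(a);

        mark(roots);                                     // marks a and b
        a.refs.clear();                                  // application drops b after marking
        List<String> first = sweep(heap);                // b is dead but still marked

        mark(roots);                                     // next cycle marks only a
        List<String> second = sweep(heap);               // b is reclaimed now
        return Arrays.asList(first, second);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the first sweep b survives even though nothing references it any more, because it was marked before the reference was dropped; only the second cycle reclaims it.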