

The garbage collection algorithms introduced in the previous article are the methodology of memory reclamation; garbage collectors are the concrete implementations of it.

Garbage collectors can vary widely between vendors and between versions. The focus of the book is on the collectors of the HotSpot virtual machine (the G1 collector became officially available for commercial use in JDK 1.7).

(Image sourced from the internet; it will be removed if it infringes.)

The figure shows seven collectors that work on different generations. If there is a line between two collectors, they can be used together. The region a collector sits in indicates whether it is a new generation or an old generation collector.
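To check which pairing from the figure a particular JVM actually runs with, the standard `java.lang.management` API can be queried. The following is only a minimal sketch: the class name `ActiveCollectors` is made up for this article, and the names printed depend on the JVM version and on the GC options it was started with.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the JDK.
class ActiveCollectors {
    // Returns the names of the collectors registered by the running JVM,
    // typically one for the new generation and one for the old generation.
    static List<String> names() {
        List<String> result = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.add(bean.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        for (String name : names()) {
            System.out.println(name);
        }
    }
}
```

Running it under different GC options is a quick way to see how the new generation and old generation collectors are paired.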


Serial collector

The Serial collector is the most basic and oldest collector. It was once (prior to JDK 1.3.1) the only choice for new generation collection in the virtual machine.

  • Summary: This is a single-threaded collector, but “single-threaded” does not only mean that it uses one CPU or one collection thread to do its work. More importantly, it must suspend all other worker threads while it collects, until collection finishes. Hence the name “Stop The World” (not cool at all, since every other thread gets suspended =-=).
  • Application scenario: It is still the default new generation collector for virtual machines running in Client mode.
  • Advantages over other collectors: It is simple and efficient (compared with the single-threaded performance of other collectors). In an environment limited to a single CPU, the Serial collector has no thread-interaction overhead, so it naturally achieves the highest single-threaded collection efficiency.

ParNew collector

  • Brief introduction: The ParNew collector is the multithreaded version of the Serial collector. Apart from using multiple threads for garbage collection, everything else behaves exactly as in the Serial collector: all available control parameters, the collection algorithm, Stop The World, object allocation rules, collection policies, and so on.

  • Application scenario: It is the preferred new generation collector for virtual machines running in Server mode. One important reason is that, apart from the Serial collector, it is currently the only one that works with the CMS collector. Unfortunately CMS, as an old generation collector, cannot work with Parallel Scavenge, the new generation collector that already existed in JDK 1.4.0, so when CMS collects the old generation in JDK 1.5, the new generation collector has to be either ParNew or Serial. ParNew is also the default new generation collector once the -XX:+UseConcMarkSweepGC option is set, and it can be selected explicitly with the -XX:+UseParNewGC option.

  • ParNew vs. Serial: In a single-CPU environment ParNew will never outperform the Serial collector. Because of thread-interaction overhead, it is not even 100 percent guaranteed to beat Serial in a two-CPU environment implemented with hyper-threading. As the number of available CPUs grows, however, it helps use system resources efficiently during GC. By default it starts as many collection threads as there are CPUs; when there are very many CPUs, the -XX:ParallelGCThreads parameter can limit the number of garbage collection threads.
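Since the default thread count follows the number of CPUs, it can help to see how many CPUs the JVM thinks it has and which GC options it was started with. Here is a rough sketch using `Runtime.availableProcessors()` and `RuntimeMXBean.getInputArguments()`. The class `GcFlags` and its simple "contains GC" filter are assumptions made for illustration; the filter will miss GC-related options whose names do not contain "GC".

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the JDK.
class GcFlags {
    // Keeps only the -XX options whose text mentions "GC",
    // e.g. -XX:+UseParNewGC or -XX:ParallelGCThreads=4.
    static List<String> gcOptions(List<String> jvmArgs) {
        List<String> result = new ArrayList<>();
        for (String arg : jvmArgs) {
            if (arg.startsWith("-XX:") && arg.contains("GC")) {
                result.add(arg);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Number of processors the JVM reports as available.
        System.out.println("CPUs visible to the JVM: "
                + Runtime.getRuntime().availableProcessors());
        // Options passed to this JVM on the command line.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("GC-related options: " + gcOptions(jvmArgs));
    }
}
```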


Parallel Scavenge collector

  • The Parallel Scavenge collector is a new generation collector. It uses the copying algorithm and is a parallel, multithreaded collector.

  • Application scenarios: The shorter the pause time, the better suited a collector is to programs that interact with users. High throughput, by contrast, uses CPU time efficiently to finish the program’s computation as soon as possible, and mainly suits background computation that needs little interaction.

  • Comparative analysis:

    • Parallel Scavenge vs. CMS: Collectors such as CMS focus on shortening, as far as possible, the time user threads are paused during garbage collection. The goal of Parallel Scavenge is instead to reach a controllable throughput. Throughput is the ratio of CPU time spent running user code to total CPU time consumed, i.e. throughput = user code run time / (user code run time + garbage collection time). Because of this close relationship with throughput, Parallel Scavenge is often called the “throughput-first” collector.
    • Parallel Scavenge vs. ParNew: Parallel Scavenge has the switch parameter -XX:+UseAdaptiveSizePolicy. When it is turned on, there is no need to specify details manually, such as the new generation size (-Xmn), the Eden to Survivor ratio (-XX:SurvivorRatio), or the size of objects promoted directly to the old generation (-XX:PretenureSizeThreshold). The virtual machine then adjusts these parameters dynamically to provide the most suitable pause time or the maximum throughput. This is called the GC adaptive tuning strategy (GC Ergonomics).
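The throughput formula above can be written out as a tiny worked example. This is just an illustration of the arithmetic; the class `Throughput` and the numbers in `main` (99 minutes of user code, 1 minute of GC) are made up for the example.

```java
// Hypothetical helper class illustrating the throughput formula.
class Throughput {
    // throughput = user code time / (user code time + GC time)
    static double ratio(double userCodeMillis, double gcMillis) {
        double total = userCodeMillis + gcMillis;
        if (total <= 0) {
            throw new IllegalArgumentException("total time must be positive");
        }
        return userCodeMillis / total;
    }

    public static void main(String[] args) {
        // Assumed numbers: 99 minutes of user code and 1 minute of GC.
        double t = ratio(99 * 60_000.0, 1 * 60_000.0);
        System.out.printf("throughput = %.2f%%%n", t * 100);
    }
}
```

With these numbers the ratio comes out at 0.99, so one minute of collection in a hundred minutes of running means a throughput of 99 percent.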

Serial Old collector

  • Serial Old is the old generation version of the Serial collector. It is also a single-threaded collector and uses the mark-compact algorithm.

  • Application scenario: This collector is also mainly meant for virtual machines in Client mode. In Server mode it has two other uses:

    • Pairing with the Parallel Scavenge collector in JDK 1.5 and earlier.
    • Serving as the fallback for the CMS collector when a Concurrent Mode Failure occurs during concurrent collection.

Parallel Old collector

  • Parallel Old is the old generation version of the Parallel Scavenge collector. It uses multiple threads and the mark-compact algorithm. It only became available in JDK 1.6; until then the new generation Parallel Scavenge collector was in an awkward position, because once Parallel Scavenge was chosen for the new generation, the old generation had no choice but the Serial Old (PS MarkSweep) collector. Held back by the performance of Serial Old in server applications, Parallel Scavenge did not necessarily maximize throughput for the application as a whole: single-threaded old generation collection cannot exploit the multi-CPU processing power of a server. In environments with a large old generation and more advanced hardware, the throughput of this combination may not even match ParNew plus CMS.

  • Application scenarios: Parallel Old exists to pair with Parallel Scavenge. Where throughput matters and CPU resources are sensitive, the Parallel Scavenge plus Parallel Old combination deserves priority.


CMS collector

  • Brief introduction: The CMS (Concurrent Mark Sweep) collector aims at the shortest collection pause time. It is based on the “mark-sweep” algorithm, and the whole process has four steps:

    • CMS initial mark: only marks the objects that GC Roots can reach directly, which is fast.

    • CMS concurrent mark: the GC Roots tracing process.

    • CMS remark: corrects the mark records of objects whose marking changed because the user program kept running during concurrent marking. Its pause is generally slightly longer than the initial mark phase but much shorter than concurrent marking.

    • CMS concurrent sweep

The initial mark and remark steps still require “Stop The World”. The longest phases of the whole process, concurrent mark and concurrent sweep, run the collector thread alongside user threads, so overall the CMS collector’s memory reclamation executes concurrently with user threads.

  • Advantages: Concurrent collection, low pauses

  • Disadvantages:

    • The CMS collector is very sensitive to CPU resources. In the concurrent phases it does not pause user threads, but it slows the application down and lowers total throughput because it occupies some of the threads (or CPU resources). By default CMS starts (number of CPUs + 3) / 4 collection threads, so with four or more CPUs the collection threads take no less than 25% of CPU resources during concurrent collection, a share that falls as the number of CPUs rises. With fewer than four CPUs, the impact of CMS on the user program can become much larger.

    • The CMS collector cannot handle floating garbage, which may lead to a “Concurrent Mode Failure” and trigger another Full GC. User threads keep running during the CMS concurrent sweep phase, so the program naturally keeps producing new garbage. Garbage that appears after the marking process cannot be dealt with in the current collection and has to wait for the next GC; this is “floating garbage”. Also, because user threads still need to run during collection, enough memory must be reserved for them. Unlike other collectors, CMS therefore cannot wait until the old generation is almost full before collecting; it must keep part of the space free for the program to use while collection runs concurrently.

    • The “mark-sweep” algorithm leaves a large amount of space fragmentation at the end of a collection. Too much fragmentation causes great trouble when allocating large objects: the old generation often has plenty of free space overall, yet no contiguous block large enough for the current object can be found, so a Full GC has to be triggered early.
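The thread-count formula from the first disadvantage is easy to try out. The sketch below simply evaluates (number of CPUs + 3) / 4 with integer division, as stated in the text, and shows the share of CPUs those threads would occupy. The class `CmsThreads` is invented for this example and does not read any real JVM setting.

```java
// Hypothetical helper class evaluating the formula quoted in the text.
class CmsThreads {
    // Default number of CMS collection threads: (CPUs + 3) / 4, integer division.
    static int defaultThreads(int cpus) {
        if (cpus < 1) {
            throw new IllegalArgumentException("cpus must be at least 1");
        }
        return (cpus + 3) / 4;
    }

    // Fraction of the CPUs that those collection threads would occupy.
    static double cpuShare(int cpus) {
        return (double) defaultThreads(cpus) / cpus;
    }

    public static void main(String[] args) {
        int[] samples = {1, 2, 4, 8, 16};
        for (int cpus : samples) {
            System.out.println(cpus + " CPUs -> " + defaultThreads(cpus)
                    + " thread(s), share " + cpuShare(cpus));
        }
    }
}
```

With two CPUs a single collection thread already takes half of the processing power, which is why the impact on user programs grows when fewer than four CPUs are available.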


G1 collector

  • Summary: The G1 (Garbage-First) collector is one of the most advanced results of current collector technology. It is a garbage collector for server-side applications with the following features:

    • Parallelism and concurrency: G1 can use multiple CPUs to shorten stop-the-world pauses. For some GC actions where other collectors have to pause Java threads, G1 can still let the Java program keep running by working concurrently.

    • Generational collection: Although G1 can manage the entire GC heap on its own, without cooperating with other collectors, it still handles newly created objects differently from old objects that have been alive for a while and survived multiple GCs, in order to get better collection results.

    • Spatial integration: Unlike the “mark-sweep” algorithm of CMS, G1 as a whole is a collector based on the “mark-compact” algorithm, while locally (between two Regions) it is based on the “copying” algorithm. Either way, both algorithms mean that G1 produces no memory fragmentation while running and leaves neat, usable free memory after a collection. This helps programs run for a long time and allocate large objects without triggering the next GC early because no contiguous memory can be found.

    • Predictable pauses: This is another big advantage of G1 over CMS. Reducing pause time is a concern shared by G1 and CMS, but in addition to pursuing low pauses G1 can build a predictable pause time model. It lets the user specify that, within a time slice of M milliseconds, garbage collection may not take more than N milliseconds, which is almost a feature of a real-time Java (RTSJ) garbage collector.

Collectors before G1 collected the entire new generation or the entire old generation; G1 no longer does. With the G1 collector, the memory layout of the Java heap is very different from that of the other collectors. G1 divides the whole Java heap into independent Regions of equal size. The concepts of new generation and old generation are retained, but the two are no longer physically separated: each is a collection of Regions, which do not need to be contiguous.

The G1 collector can build a predictable pause time model because it can plan to avoid whole-region garbage collection across the entire Java heap. G1 tracks the value of the garbage accumulated in each Region (the amount of space a collection would reclaim and the empirical time the collection would need), maintains a priority list in the background, and each time, according to the allowed collection time, collects the most valuable Region first (hence the name Garbage-First). Dividing memory into Regions and collecting them by priority ensures that G1 achieves the highest possible collection efficiency in a limited amount of time.
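The idea of picking the most valuable Regions within an allowed collection time can be sketched with a simple greedy selection. This is a toy model under stated assumptions, not how HotSpot implements G1: the `Region` class, its fields, and the "bytes reclaimed per millisecond" score are all invented here for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of garbage-first selection; names and scoring are assumptions.
class GarbageFirstSketch {
    // Hypothetical Region: how much garbage it holds and the
    // estimated time needed to collect it.
    static final class Region {
        final String name;
        final long reclaimableBytes;
        final double estimatedMillis;

        Region(String name, long reclaimableBytes, double estimatedMillis) {
            if (estimatedMillis <= 0) {
                throw new IllegalArgumentException("estimatedMillis must be positive");
            }
            this.name = name;
            this.reclaimableBytes = reclaimableBytes;
            this.estimatedMillis = estimatedMillis;
        }

        // Assumed "value" of a Region: bytes reclaimed per millisecond spent.
        double value() {
            return reclaimableBytes / estimatedMillis;
        }
    }

    // Picks Regions in descending order of value until the pause budget is used up.
    static List<Region> chooseCollectionSet(List<Region> regions, double pauseBudgetMillis) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort((a, b) -> Double.compare(b.value(), a.value()));
        List<Region> chosen = new ArrayList<>();
        double remaining = pauseBudgetMillis;
        for (Region r : sorted) {
            if (r.estimatedMillis <= remaining) {
                chosen.add(r);
                remaining -= r.estimatedMillis;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region("A", 80, 10.0));
        regions.add(new Region("B", 30, 2.0));
        regions.add(new Region("C", 50, 20.0));
        regions.add(new Region("D", 10, 1.0));
        for (Region r : chooseCollectionSet(regions, 15.0)) {
            System.out.println(r.name);
        }
    }
}
```

In this made-up data set, Region C holds a lot of garbage but is expensive to collect, so it is left for a later cycle while cheaper, higher-value Regions fit into the budget.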

  • The operation of the G1 collector can be roughly divided into the following steps:
    • Initial marking: only marks the objects that GC Roots can reach directly and modifies the value of TAMS (Next Top at Mark Start), so that when the user program runs concurrently in the next phase, new objects are created in the correct available Regions. This phase requires pausing threads, but it takes very little time.

    • Concurrent marking: starts from GC Roots and performs reachability analysis on the objects in the heap to find the live ones. This phase takes longer but can run concurrently with the user program.

    • Final marking: corrects the part of the marking record that changed because the user program kept running during concurrent marking. The virtual machine records these changes in the Remembered Set Logs, and this phase merges the data from the Remembered Set Logs into the Remembered Set. This phase requires pausing threads, but it can be executed in parallel.

    • Live data counting and evacuation: first sorts the Regions by reclamation value and cost, then draws up a reclamation plan based on the GC pause time the user expects. This phase could actually run concurrently with the user program, but because only some Regions are reclaimed and the time is user-controllable, pausing user threads greatly improves collection efficiency.


Ending

The sections above list the features, pros, and cons of many garbage collectors without naming one as the best; which one to choose depends on the business requirements.

Now that you know about GC algorithms and garbage collectors, the next article takes a close look at common JVM commands.