GC algorithmGarbage collectiondevice

An overview of the

Garbage Collection, often referred to as “GC,” was created in 1960 in MIT’s Lisp language and has matured over half a century.

In the JVM, program counters, virtual machine stacks, and local method stacks are created and removed with the thread. Stack frames are loaded and unloaded as methods enter and exit, realizing automatic memory cleaning. Therefore, our memory garbage collection is mainly concentrated in the Java heap and method area. This portion of memory is allocated and used dynamically.

 

Object storageLive judgment

There are two ways to determine whether an object is alive or not:

Reference count: Each object has a reference count property. When a new reference is added, the count is increased by 1. When a reference is released, the count is decreased by 1. This method is simple and does not solve the problem of objects referring to each other circularly.

Reachability Analysis: Start with GC Roots and search down the path called the reference chain. When an object is not connected to GC Roots by any reference chain, the object is proved to be unavailable. Unreachable objects.

In the Java language, GC Roots includes:

Object referenced in the virtual machine stack.

The object referenced by the class static attribute entity in the method area.

The object referenced by the constant in the method area.

Objects referenced by JNI in the local method stack.

 

Garbage collection algorithm

Mark-clear algorithm

The Mark-sweep algorithm, as its name suggests, is divided into two phases: “Mark” and “Sweep.” It flags all objects that need to be reclaimed, and when it’s done, it dumps all of the flagged objects. The reason why it is the most basic collection algorithm is that subsequent collection algorithms are based on this idea and improve its shortcomings.

Its main disadvantages are two: one is the efficiency problem, the efficiency of the marking and cleaning process is not high; Another problem is the space problem. After the mark is cleared, a large number of discrete memory fragments will be generated. Too many space fragments may cause the program to be unable to find enough contiguous memory when it needs to allocate large objects in the future and have to trigger another garbage collection action in advance.

 

Replication algorithm

A collection algorithm for “Copying” that divides available memory into two equivalent pieces by capacity and uses only one piece at a time. When this area of memory is used up, the surviving objects are copied to the other area, and the used memory space is cleaned up again.

In this way, each time a piece of memory is reclaimed, memory allocation does not have to consider the complexity of memory fragmentation, as long as the heap top pointer is moved, in order to allocate memory, simple implementation, efficient operation. However, the cost of this algorithm is to reduce memory by half, and the efficiency of continuously copying long-lived objects is reduced.

 

Mark –The compressionalgorithm

The copy-collection algorithm will perform more replication operations when the object survival rate is high, and the efficiency will be low. More importantly, if you do not want to waste 50% of the space, you need to have extra space for allocation guarantee, in the extreme case that all objects in the memory used are 100% alive, so in the old days, this algorithm generally cannot be used directly.

According to the characteristics of the old s, someone put forward another “tag – finishing” (Mark – Compact) algorithm, the labeling still like “tag – clear” algorithm, but the subsequent steps are not directly to recyclable objects to clean up, but let all live objects move to the end, and then clear directly outside the boundary of the memory

 

Generational collection algorithm

The basic assumption of GC generation is that most objects have very short lifetimes.

A “Generational Collection” algorithm divides the Java heap into Generational and older generations so that the most appropriate Collection algorithm can be used for each generation. In the new generation, a large number of objects are found dead and only a few survive in garbage collection, so the replication algorithm is selected, and only a small amount of the replication cost of the surviving objects can be collected. In the old days, because the object has a high survival rate and there is no extra space to allocate it, it has to use the “mark-clean” or “mark-tidy” algorithm for recycling.

 

Garbage collector

If the collection algorithm is the methodology of memory collection, the garbage collector is the concrete implementation of memory collection

 

Serial collector

Serial collectors are the oldest, most stable, and efficient collectors, and may produce long pauses and only use one thread to collect. The new generation and the old use serial recycling; New generation replication algorithm, old age marker – compression; – Garbage collection stops The World

Parameter control: -xx :+UseSerialGC serial collector

 

 

 

ParNew collector

The ParNew collector is essentially a multithreaded version of the Serial collector. New generation parallel, old age serial; New generation replication algorithm, old age marking – compression

Parameter control: -xx :+UseParNewGC ParNew collector

-xx :ParallelGCThreads Limits the number of threads

 

The Parallel collector

The Parallel Collector is similar to the ParNew collector in that the Parallel collector focuses on the throughput of the system. You can set parameters to enable the adaptive adjustment policy. The vM collects performance monitoring information based on the current system running status and dynamically adjusts these parameters to provide the most appropriate pause time or maximum throughput. You can also use parameters to control how many milliseconds or percentages GC takes; New generation replication algorithm, old age marking – compression

Parameter control: -xx :+UseParallelGC using Parallel collector + old serial

         

Parallel Old The collector

The Parallel Old is an older version of the Parallel Avenge collector that uses multithreading and a “mark-and-collate” algorithm. This collector was only available in JDK 1.6

Parameter control: -xx :+UseParallelOldGC Using Parallel collector + old age parallelism

 

CMSThe collector

The CMS (Concurrent Mark Sweep) collector is a collector whose goal is to obtain the shortest collection pause time. At present, a large part of Java applications are concentrated on the server side of Internet sites or B/S systems. These applications pay special attention to the response speed of services and hope that the system pause time is the shortest, so as to bring users a better experience.

As the name (which includes “Mark Sweep”) suggests, the CMS collector is based on a “mark-sweep” algorithm, which is more complex than the previous collectors. The process is divided into four steps, including:

CMS Initial Mark

CMS Concurrent Mark

Re-marking (CMS Remark)

CMS Concurrent sweep

The initial marking and re-marking steps still need to “Stop The World”. Initial marking only marks the objects directly associated with GC Roots, which is very fast. Concurrent marking is the process of GC Roots Tracing, while re-marking is to revise the marking records of those objects whose marks are changed due to the continuous operation of user programs during concurrent marking. The pause time in this phase is generally slightly longer than in the initial tagging phase, but much shorter than in concurrent tagging. Since the collector thread can work with the user thread during the longest concurrent markup and concurrent cleanup, the CMS collector’s memory reclamation process is generally executed concurrently with the user thread. Old collector (New generation using ParNew)

Advantages: Concurrent collection, low pauses

Disadvantages: large amount of space debris, concurrent phases reduce throughput

Parameter controls: -xx :+UseConcMarkSweepGC uses the CMS collector

– XX: + UseCMSCompactAtFullCollection Full GC, after a defragmentation; The collation process is exclusive and causes pauses to become longer

– XX: + CMSFullGCsBeforeCompaction set for a few times after Full GC, a defragmentation

-xx :ParallelCMSThreads Sets the number of threads in your CMS (usually approximately equal to the number of available cpus).

         

G1The collector

G1 is one of the latest developments in technology and the HotSpot development team has assigned it the mission of replacing the CMS collector released in JDK1.5 in the future. Compared to the CMS collector, the G1 collector has the following features:

1. Space integration. G1 collector adopts the mark collation algorithm and will not generate memory space fragmentation. Allocating large objects does not trigger the next GC prematurely because contiguous space cannot be found.

2. Predictable pauses, this is another big advantage of G1, reduce the pause time is the common concern of G1 and CMS, but G1 in addition to the pursuit of low pause, also can establish predictable pauses model, can let the user specify in a length of N segment within milliseconds, time on garbage collection may not consume more than N milliseconds, This is almost already a feature of the real-time Java (RTSJ) garbage collector.

The garbage collector mentioned above collects the entire Cenozoic or old generation, which is no longer the case with G1. When using the G1 collector, the memory layout of the Java heap is very different from that of the other collectors. It divides the entire Java heap into independent regions of equal size. While the concept of new generation and old generation is retained, the new generation and old generation are no longer physically separated. They are all collections of partial (possibly discontinuous) regions.

 

G1’s Cenozoic collection is similar to ParNew’s. When the Cenozoic occupation reaches a certain proportion, the collection starts. Like CMS, the G1 collector pauses briefly to collect old objects.

 

Collection steps:

1. The marking phase, initial-mark, stops the World Event and triggers a normal Mintor GC. Corresponding GC log:GC pause (young) (Inital -mark)

2, Root Region Scanning, recovery of survivor zone (live to old age) during program running must be completed before young GC.

Concurrent Marking, Concurrent Marking (and application execution) throughout the heap, which may be interrupted by the Young GC. During the concurrent marking phase, if all objects in a region object are found to be garbage, the region is immediately reclaimed (X in figure). At the same time, the object activity (the percentage of living objects in the region) of each region is calculated during concurrent tagging.

4, Remark, then mark, there will be a short pause (STW). The re-mark phase is used to collect the concurrent mark phase and generate new garbage (the concurrent phase runs with the application); G1 uses a faster initial snapshot algorithm than CMS :snapshot-at-the-beginning (SATB).

5, Copy/Clean up, multithreading to Clean the inactivated object, there will be STW. G1 copies the live objects of the reclaimed region to the new region, clears the Remember Sets, and simultaneously clears the reclaimed region and returns it to the free region linked list.

6. After copying/cleaning process. The active objects in the recovery area have been concentrated in the dark blue and dark green areas.

 

The commonly usedThe collectorcombination

  New Generation GC strategy Generation GC policy instructions
Mix 1 Serial Serial Old Serial and Serial Old are single-thread GC and are characterized by suspending all application threads during GC.
Combination of two Serial CMS+Serial Old CMS (Concurrent Mark Sweep) is a Concurrent GC that enables GC threads and application threads to work concurrently without suspending all application threads. In addition, when the CMS fails to perform GC, the Serial Old policy is automatically used for GC.
Group 3 ParNew CMS Use the -xx :+UseParNewGC option to enable it. ParNew is a parallel version of Serial, and you can specify the number of GC threads, which by default is the number of cpus. You can specify the number of GC threads using the -xx :ParallelGCThreads option. If the -xx :+UseConcMarkSweepGC option is specified, the new generation uses the ParNew GC policy by default.
Combination of four ParNew Serial Old Use the -xx :+UseParNewGC option to enable it. The new generation uses the ParNew GC policy, and the Old generation uses the Serial Old GC policy by default.
Combination of five Parallel Scavenge Serial Old The Parallel Insane strategy is primarily focused on a controlled throughput: application run time/(application run time + GC time) so that CPU utilization is as high as possible and is appropriate for persistent applications in the background and not for applications with a lot of interaction.
Combination of 6 Parallel Scavenge Parallel Old Parallel Old is a Parallel version of Serial Old

 

Combination of 7 G1GC G1GC + UnlockExperimentalVMOptions – – XX: XX: + UseG1GC # open

-xx :MaxGCPauseMillis =50 # Pause time target

-xx :GCPauseIntervalMillis =200 # Pause interval target

-xx :+G1YoungGenSize= 512M # Young generation size

-xx :SurvivorRatio=6 # SurvivorRatio

 

reference

my.oschina.net/hosee/blog/644618

In-depth Understanding of the Java Virtual Machine: Advanced JVM Features and Best Practices PDF

Download address: download.csdn.net/detail/ityo…

The copyright of this article belongs to the author and blog park, welcome to reprint, but without the consent of the author must retain this paragraph of statement, and give the original link in the obvious position of the article page, otherwise reserve the right to pursue legal responsibility. If this article is helpful to you, please help more [recommended] under this article. If you like my article, please follow my official account