Garbage collection algorithm

Mark-sweep algorithm

The algorithm is divided into “mark” and “clear” two stages: first, mark all the objects to be recycled, after the completion of mark all the marked objects.

Disadvantages: 1. Low efficiency in marking and cleaning; 2. Large amount of discontinuous memory fragments will be generated after cleaning

Mark-collation algorithm

Tag finishing on the basis of the tag removal made improvement, the first phase of mark phase, with clear marking, mark all need recycled object, but the second phase the finishing stage is: after completion of the tag is not recycled directly on the object to clean up, but let all live objects move to the end, in the process of mobile clear recyclable objects.

The advantage over mark clearing is that memory defragmentation does not result in a large number of discrete memory fragments

The efficiency of mark-collation algorithm can be greatly improved when the object survival rate is high.

Replication algorithm

An algorithm was proposed to solve the memory fragmentation defect of Mark-Sweep algorithm. Memory is divided into equal sizes based on memory capacity

The two sides. Use only one block at a time, and when the block is full, copy the surviving objects to the other block, clearing the used memory

Advantages: Simple implementation, high memory efficiency, not easy to generate space debris

Disadvantages: Obviously, it is necessary to reserve half of the available memory for the replication object, resulting in many replication operations when the survival rate of the object is high and the efficiency becomes low

Generational collection algorithm

Generational collection is the approach used by most JVMS today, and the core idea is to allocate memory based on the different life cycles of objects

The GC heap is typically divided into Tenured/Old Generation and Young Generation. The characteristics of the old generation are that only a small number of objects need to be recycled in every garbage collection, while the characteristics of the new generation are that a large number of garbage needs to be recycled in every garbage collection, so different algorithms can be selected according to different regions.

New generation of replication algorithms

The default Eden/S0/S1 space ratio is 8:1:1 for Eden:S0:S1. The available memory (the memory that can be allocated to new objects [Eden and Survivor1]) is 90% of the total memory. If the ratio of Eden/S0/S1 is 8:1:1, the surviving objects in Eden+S0 may exceed 10% of the total space (S1 and S0 are both 10% of the total space). In this case, the Cenozoic GC will directly put the objects with long life cycle into the old generation. No need to reach the threshold we set (the survival times of the old generation, -xx :MaxTenuringThreshold)

The algorithm process

Eden+S0 Can allocate new objects; Garbage collection is performed for Eden+S0 and live objects are copied to S1. Clean up the Eden + S0. A Cenozoic GC ends. Eden+S1 can allocate new objects; Garbage collection is performed for Eden+S1 and live objects are copied to S0. Clean up the Eden + S1. The second Cenozoic GC is over. Loop 1.Copy the code

HotSpot implements the replication algorithm:

1. When Eden is full, the first Minor GC is triggered to copy surviving objects to Survivor From; 2. When the Eden area triggers the Minor GC again, the Eden and From areas will be scanned and garbage collected for the two areas. The surviving objects will be directly assigned To the To area (if the remaining space in the To area is insufficient To copy the original surviving objects, memory guarantee will be made through the old age. Copy these objects directly to the old age without waiting for MaxTenuringThreshold, and clear Eden and From areas. 3. When a Minor GC occurs in Eden, the Eden and To regions will be garbage collected, the surviving objects will be copied To the From region, and Eden and To regions will be emptied. 4. Some objects will be copied back and forth between the From and To sections, and this will be done 15 times (determined by the JVM parameter MaxTenuringThreshold, which defaults To 15).Copy the code

Stop-the-world

When stop-the-world occurs, all threads except those required for GC are in a waiting state until the GC task completes. GC optimization is often about reducing the amount of time stop-the-world takes place. MinGC\MajorGC is stop-the-world, so why MajorGC takes longer? Because OldGen contains a large inventory of objects.

MinorGC

Collecting memory from a Young generation space (mainly Eden) is called a Minor GC. Minor GC of the Cenozoic generation (mainly Eden) is common and supported by all collectors.

Trigger condition: When Eden field is full, MinorGC is triggered. That is, when applying for an object, the Eden area is found to be insufficient, then MinorGC will be triggered once. The Minor GC is triggered when the JVM is unable to allocate space for a new object, such as when the Eden space is full. So the higher the allocation rate, the more frequently Minor GC is performed. Survivor full does not trigger the Minor GC

The New generation is divided into three regions: Eden Space, from Space and to Space. The default ratio is 8:1:1. In MinorGC, surviving objects are copied to the to Space area, and if the to space area is insufficient, the old age area is accessed using a guarantee mechanism.

All Minor GCS trigger “stop-the-world,” stopping the application’s threads. For most applications, the delay caused by pauses is negligible. The truth is that most objects in Eden can be considered garbage and will never be copied into Survivor zones or old chronospaces. If the opposite were true and most of the new objects in Eden were not eligible for GC, the pauses in Minor GC would be much longer.

When Minor GC is performed, the permanent generation is not affected. References from the permanent generation to the young generation are treated as GC roots, and references from the young generation to the permanent generation are ignored directly in the marking phase.

MajorGC

There will usually be at least one Minor GC. Currently, only the CMS collector has the behavior of collecting ages separately.

FullGC

FullGC is not equal to Major GC, nor is it equal to Minor GC + Major GC. Depends on what combination of garbage collectors

Triggering conditions:

1. When the old generation is Full, Full GC will be triggered. Full GC will recycle the new generation and the old generation at the same time. Full GC is also raised when the generation is permanently Full, which causes the unloading of Class and Method meta information. When calling system. gc, it is recommended to execute Full GC. After passing the Minor GC, the Space entered in the old age is larger than the available memory in the old age. 5. If the size of an object is greater than the available memory of To Space, the object is migrated To the old age and the available memory of the old age is smaller than the size of the objectCopy the code

Note that most of the time, the Major GC is confused with the Full GC, and you need to be specific about whether it is an old collector or a whole heap collector.

Mixed GC

Garbage collection that aims to collect the entire new generation and part of the old generation. Currently, only the G1 collector has this behavior

Conditions for class unload

All instances of the class have been reclaimed. 2. The ClassLoder that loaded the class has been reclaimed. None of the java.lang.Class objects corresponding to this Class are referencedCopy the code

Garbage collector:

New generation garbage collector

Serial

Single-thread STW reclamation (suspends all user threads) Replication algorithm for the new generation of VMS in Client modeCopy the code

ParNew

Multithreaded STW recycle (suspend all user threads) Server mode of the new generation of virtual machine replication algorithmCopy the code

Parallel Scavenge

Multithreaded STW collection (pause all user threads) The goal of the new generation of virtual machine replication algorithm in Server mode is to achieve a controllable throughput (throughput = time to run user code /(time to run user code + time to collect garbage) Adaptive adjustment strategy (automatic collection of performance monitoring information based on system operation, dynamic adjustment (size of new generation, ratio of Eden to Survivor, age promotion of old age objects, etc.) to provide appropriate pause time or maximum throughput)Copy the code

Old age garbage collector

Serial Old

Single-thread STW collection (suspends all user threads) Collates the ages of virtual machines in Client modeCopy the code

Parallel Old

The Parallel Exploder is a version of the Parallel Exploder, the Multi-threaded STW recycle (suspend all user threads) Server mode virtual machine aging tag collationCopy the code

CMS

Concurrent Mark Sweep multithreading old Mark Sweep CUP sensitive Default start collector thread count = (CPU +3)/4 Unable to Sweep floating garbage 1, initial Mark STW only marks GC Roots can be directly associated with the object, fast 2, concurrent marking user thread and GC thread parallel execution 3, re-marking STW 4, concurrent cleaning user thread and GC thread parallel executionCopy the code

G1

Initial Marking simply marks objects to which GC Roots can be directly associated, and modifs the value of the TAMS pointer so that the next stage of user threads running concurrently allocates new objects in the available regions correctly. This phase requires the thread to pause, but it is very short, and it is done synchronously during the Minor GC, so the G1 collector actually has no additional pauses during this phase. The reachability analysis of objects in the heap is performed starting from GC Root, and the object graph in the heap is recursively scanned to find the objects to be reclaimed. This phase is time-consuming, but can be performed concurrently with user programs. When the object graph scan is complete, it is also necessary to reprocess objects recorded by SATB that have reference changes at the time of concurrency. · 3. Final Marking: Another short pause is made for the user thread to process the last few SATB records that remain after the end of the concurrent phase. 4. Live Data Counting and Evacuation is responsible for updating statistics of regions, sequencing the recovery value and cost of each Region, and developing recovery plans based on the expected downtime of users. You can select any number of regions to form a collection, copy the surviving objects of the Region to the empty Region, and clear all the space of the old Region. The operation here involves the movement of the living object, which must suspend the user thread, and is done in parallel by multiple collector threads.Copy the code