Preface

This article is a study note and does not cover the garbage collectors themselves in depth.

GC comes up constantly in Java development, and if you do not understand it you will feel helpless when problems appear, so this article sorts through it.

Garbage collector

A brief review of the garbage collector

Collector | Execution | Generation | Algorithm | Priority | Applicable scenario
Serial | Serial | Young generation | Copying | Response time | Client mode on a single-CPU machine
ParNew | Parallel | Young generation | Copying | Response time | Paired with CMS in server mode on multi-CPU machines
Parallel Scavenge | Parallel | Young generation | Copying | Throughput | Background computation that needs little interaction
Serial Old | Serial | Old generation | Mark-compact | Response time | Client mode on a single-CPU machine
Parallel Old | Parallel | Old generation | Mark-compact | Throughput | Background computation that needs little interaction
CMS | Concurrent | Old generation | Mark-sweep | Response time | Internet or B/S services
G1 | Concurrent and parallel | Young and old generation | Mark-compact, copying | Response time | Server-side applications

  • Apart from G1, collectors that are joined by a line in the collector diagram can be used together

    • Serial/Serial Old

    • Parallel Scavenge/Parallel Old (JDK8 default garbage collectors)

    • ParNew/CMS

    • Serial/CMS (deprecated in JDK8, removed in JDK9)

    • ParNew/Serial Old (deprecated in JDK8, removed in JDK9)

    • CMS/Serial Old: these are joined because the CMS concurrent sweep runs at the same time as the user threads. If the user threads keep allocating objects during the concurrent phase and there is not enough memory left, Serial Old steps in as the backup and performs a full GC
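
To check which pairing a running JVM actually uses, the standard java.lang.management API can list the active collectors. This is a minimal sketch; the class name ActiveCollectors is made up for illustration, and the names printed depend on the JVM and its flags.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original article.
class ActiveCollectors {

    /** Returns the names of the collectors the running JVM reports through JMX. */
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // With -XX:+UseParallelGC on JDK 8 this typically lists
        // "PS Scavenge" (young) and "PS MarkSweep" (old); other JVMs differ.
        for (String name : names()) {
            System.out.println(name);
        }
    }
}
```

Running it with different -XX:+Use...GC flags is a quick way to confirm which young/old pair is in effect.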

GC logs

PS + PO

PS + PO (Parallel Scavenge + Parallel Old) is the default garbage collector combination in JDK8, so we start with this pair

// JVM options:
// -Xms100m -Xmx100m -XX:+PrintGCDateStamps -XX:+PrintGCDetails
// -XX:+PrintCommandLineFlags -XX:+UseParallelGC
import java.util.ArrayList;
import java.util.List;

public class GcTest {

    public static void main(String[] args) {
        // Keep every array reachable so the heap fills up and GC has to run
        List<byte[]> list = new ArrayList<>();

        while (true) {
            // Allocate 100K per iteration
            byte[] bytes = new byte[1024 * 100];
            list.add(bytes);
        }
    }
}

Run the code and the GC log is printed almost immediately

PS + PO minor GC log analysis

Each log fragment and its meaning:

2021-05-09T15:38:45.580-0800: the time the log entry was printed. It is printed only when -XX:+PrintGCDateStamps is set.

GC (Allocation Failure): a garbage collection occurred, and this one is a Minor GC. The GC label itself does not say whether the young or the old generation was collected. The text in parentheses is the cause of the GC; Allocation Failure means the young generation did not have enough free space for the data being allocated.

[PSYoungGen: 2536K->496K(2560K)]: PSYoungGen is the region where the GC took place. The region name depends on the collector in use:

  • Serial collector: DefNew (Default New Generation)
  • ParNew collector: ParNew
  • Parallel Scavenge collector: PSYoungGen

2536K->496K(2560K): used space in the region before GC -> used space in the region after GC (total capacity of the region). The total shown is 9/10 of the young generation, that is Eden plus one survivor space, because only one of the from/to spaces holds data at a time.

4723K->4717K(9728K): after the region figures come the figures for the whole heap: heap used before GC -> heap used after GC (total heap capacity).

0.0013088 secs: the time the whole GC took, in seconds.

[Times: user=0.00 sys=0.00, real=0.00 secs]:

  • user: CPU time the process spent executing user-mode code (outside the kernel). This is the CPU time actually used by this process, excluding other processes and the time this process was blocked. For garbage collection it is the total CPU time used by the GC threads.
  • sys: CPU time the process spent in kernel mode, that is, executing system calls or waiting for system events inside the kernel.
  • real: wall-clock time from start to finish. It includes time slices used by other processes and time the process was blocked (for example waiting for I/O). For a parallel GC this number should be close to (user time + sys time) / number of GC threads.
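
The fragments above can also be read programmatically. The following is a minimal sketch, assuming the fragments appear on one line in the order described; the class name MinorGcLine and the sample line in main are assembled here for illustration and are not copied from a real log file. It also derives how much was promoted to the old generation, as young-generation shrinkage minus whole-heap shrinkage.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for one -XX:+PrintGCDetails minor GC line.
class MinorGcLine {
    // Matches e.g. "[PSYoungGen: 2536K->496K(2560K)] 4723K->4717K(9728K), 0.0013088 secs"
    private static final Pattern PATTERN = Pattern.compile(
            "\\[(\\w+): (\\d+)K->(\\d+)K\\((\\d+)K\\)\\] "
            + "(\\d+)K->(\\d+)K\\((\\d+)K\\), ([\\d.]+) secs");

    final String region;
    final long youngBeforeK, youngAfterK, youngTotalK;
    final long heapBeforeK, heapAfterK, heapTotalK;
    final double pauseSecs;

    private MinorGcLine(Matcher m) {
        region = m.group(1);
        youngBeforeK = Long.parseLong(m.group(2));
        youngAfterK = Long.parseLong(m.group(3));
        youngTotalK = Long.parseLong(m.group(4));
        heapBeforeK = Long.parseLong(m.group(5));
        heapAfterK = Long.parseLong(m.group(6));
        heapTotalK = Long.parseLong(m.group(7));
        pauseSecs = Double.parseDouble(m.group(8));
    }

    static MinorGcLine parse(String line) {
        Matcher m = PATTERN.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("not a minor GC line: " + line);
        }
        return new MinorGcLine(m);
    }

    /** Space freed in the young generation, in K. */
    long youngFreedK() { return youngBeforeK - youngAfterK; }

    /** Space freed in the whole heap, in K. */
    long heapFreedK() { return heapBeforeK - heapAfterK; }

    /** Young space that left the young generation but not the heap, i.e. was promoted. */
    long promotedK() { return youngFreedK() - heapFreedK(); }

    public static void main(String[] args) {
        // Sample line assembled from the fragments discussed in this article.
        MinorGcLine gc = parse("[GC (Allocation Failure) [PSYoungGen: 2536K->496K(2560K)] "
                + "4723K->4717K(9728K), 0.0013088 secs]");
        System.out.println(gc.region + " freed " + gc.youngFreedK()
                + "K, promoted " + gc.promotedK() + "K");
    }
}
```

For the sample values the young generation shrinks by 2040K while the heap shrinks by only 6K, so almost everything that survived was moved to the old generation. That matches the test program, which keeps every array reachable.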
PS + PO full GC log analysis

The full GC log is similar to the minor GC log

Full GC (Ergonomics): a garbage collection occurred, and this one is a Full GC. The label itself does not say whether the young or the old generation was collected.

The text in parentheses is the cause of the GC:

  • Ergonomics: the GC was triggered by the JVM's adaptive tuning
  • System: the System.gc() method was called (JDK8 prints this cause as System.gc())
  • Metadata GC Threshold: the Metaspace area is running out of space

import java.util.ArrayList;
import java.util.List;

public class GcTest {

    public static void main(String[] args) {
        List<byte[]> list = new ArrayList<>();
        while (true) {
            // Allocate 100K per iteration
            byte[] bytes = new byte[1024 * 100];
            list.add(bytes);
            // Manually trigger GC
            System.gc();
        }
    }
}

You can see the GC cause in the log change

Next, cglib dynamic proxies keep generating classes until Metaspace overflows

public static void main(String[] args) {

    while (true) {
        Enhancer enhancer = new Enhancer();
        enhancer.setCallback(new CglibInterceptor());
        // Without caching, a new proxy class is generated on every iteration
        enhancer.setUseCache(false);
        enhancer.setSuperclass(CglibBean.class);
        CglibBean cglibBean = (CglibBean) enhancer.create();
        cglibBean.eat();
    }
}

You can see the GC cause in the log change again

The cause reported for the OOM changes as well
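
To watch Metaspace fill up while the cglib loop runs, its usage can be read through the standard memory pool MXBeans. This is a sketch under an assumption: the pool name "Metaspace" is what HotSpot reports from JDK8 onward, and other JVMs may use a different name, in which case the method returns -1. The class name MetaspaceUsage is made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper class, not part of the original article.
class MetaspaceUsage {

    /** Returns the used bytes of the pool named "Metaspace", or -1 if there is no such pool. */
    static long usedBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("Metaspace".equals(pool.getName())) {
                MemoryUsage usage = pool.getUsage();
                return usage == null ? -1L : usage.getUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        System.out.println("Metaspace used bytes: " + usedBytes());
    }
}
```

Calling usedBytes() every few thousand iterations of the proxy loop should show the value growing until the Metadata GC Threshold collections start.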

CMS

The code used to trigger CMS GC is the same; only the garbage collector is switched to CMS

// -Xms100m -Xmx100m -XX:+PrintGCDateStamps -XX:+PrintGCDetails
// -XX:+PrintCommandLineFlags -XX:+UseConcMarkSweepGC
public static void main(String[] args) {
    List<byte[]> list = new ArrayList<>();

    while (true) {
        // Allocate 100K per iteration
        byte[] bytes = new byte[1024 * 100];
        list.add(bytes);
    }
}

The CMS process consists mainly of four stages:

  • Initial mark (STW)
  • Concurrent mark
  • Remark (STW)
  • Concurrent sweep
In the CMS GC log, the whole process is broken down into seven phases

  • CMS Initial Mark
  • CMS-concurrent-mark
  • CMS-concurrent-preclean
  • CMS-concurrent-abortable-preclean
  • CMS Final Remark
  • CMS-concurrent-sweep
  • CMS-concurrent-reset
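
When scanning a CMS log, the useful first question is which of the seven phases pause the application. The sketch below encodes the mapping implied by the four stages above: only the initial mark and the final remark are stop-the-world. The class name CmsPhases is made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical lookup table for the CMS phase names listed in this article.
class CmsPhases {
    static final List<String> ALL = Arrays.asList(
            "CMS Initial Mark",
            "CMS-concurrent-mark",
            "CMS-concurrent-preclean",
            "CMS-concurrent-abortable-preclean",
            "CMS Final Remark",
            "CMS-concurrent-sweep",
            "CMS-concurrent-reset");

    // The two phases that stop all application threads.
    private static final Set<String> STOP_THE_WORLD = new HashSet<>(
            Arrays.asList("CMS Initial Mark", "CMS Final Remark"));

    static boolean isStopTheWorld(String phase) {
        return STOP_THE_WORLD.contains(phase);
    }

    public static void main(String[] args) {
        for (String phase : ALL) {
            System.out.println(phase + " -> " + (isStopTheWorld(phase) ? "STW" : "concurrent"));
        }
    }
}
```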

For what each of these phases does, this article explains it well: www.jianshu.com/p/2a1b2f17d…

G1

Before looking at the GC log, here is a brief overview of the G1 garbage collector

The G1 garbage collector introduces the idea of region-based (partitioned) collection

In the G1 collector, the whole Java heap is divided into about 2048 independent regions of equal size. The region size depends on the actual heap size and is kept between 1MB and 32MB, always a power of two: 1MB, 2MB, 4MB, 8MB, 16MB or 32MB. It can be set with -XX:G1HeapRegionSize. All regions are the same size, and the size does not change during the lifetime of the JVM

As shown in the figure, a region can serve as Eden, Survivor, Old or Humongous. Humongous regions store objects whose size is 50% of a standard region or more, placed in a group of contiguous regions
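
The sizing rule and the humongous threshold can be sketched as plain arithmetic. This is an approximation of the rule as described above, not HotSpot's actual code: it assumes the region size is heap size / 2048 rounded down to a power of two and clamped to the 1MB to 32MB range, and it uses the "50% or more" wording from the text for the humongous check. All names are hypothetical.

```java
// Hypothetical sketch of G1 region sizing as described in this article.
class G1Regions {
    static final long MB = 1024L * 1024L;
    static final long MIN_REGION = 1 * MB;
    static final long MAX_REGION = 32 * MB;
    static final long TARGET_REGION_COUNT = 2048;

    /** Region size for a given heap: heap/2048, power of two, within [1MB, 32MB]. */
    static long regionSize(long heapBytes) {
        long size = Math.max(heapBytes / TARGET_REGION_COUNT, MIN_REGION);
        size = Long.highestOneBit(size); // round down to a power of two
        return Math.min(size, MAX_REGION);
    }

    /** An object counts as humongous when it takes half a region or more. */
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes >= regionBytes / 2;
    }

    /** Number of contiguous regions needed to hold an object (rounded up). */
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    public static void main(String[] args) {
        long region = regionSize(100 * MB); // the -Xmx100m heap used in this article
        System.out.println("region size: " + region / MB + "MB");
        System.out.println("100K array humongous? " + isHumongous(100 * 1024, region));
    }
}
```

Under these assumptions the 100m test heap gets 1MB regions, so the 100K arrays allocated by the test program are ordinary objects, while anything of 512K or more would be treated as humongous.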

RSet

Regions are not isolated from each other: objects in one region are often referenced by objects in other regions. To find out whether the objects in a region are referenced from other regions, you would otherwise have to scan the entire heap. So G1 uses remembered sets (RSets) to record references between regions

In the following figure, each region has an RSet. Objects in Region1 and Region3 reference objects in Region2, and these references are recorded in the RSet of Region2
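
The bookkeeping idea can be shown with a toy model. This is only an illustration of "who points into me": it tracks whole regions by number, whereas the real G1 remembered sets record finer-grained locations inside the referencing regions. The class and method names are made up.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model of per-region remembered sets, for illustration only.
class ToyRememberedSets {
    // key: region being pointed into; value: regions that hold references into it
    private final Map<Integer, Set<Integer>> rsets = new HashMap<>();

    /** Records that an object in fromRegion references an object in toRegion. */
    void recordReference(int fromRegion, int toRegion) {
        if (fromRegion == toRegion) {
            return; // references inside one region need no remembered set entry
        }
        rsets.computeIfAbsent(toRegion, k -> new TreeSet<>()).add(fromRegion);
    }

    /** Regions that must be consulted when collecting the given region. */
    Set<Integer> referencingRegions(int region) {
        return rsets.getOrDefault(region, Collections.emptySet());
    }

    public static void main(String[] args) {
        ToyRememberedSets heap = new ToyRememberedSets();
        heap.recordReference(1, 2); // Region1 -> Region2
        heap.recordReference(3, 2); // Region3 -> Region2
        System.out.println("RSet of Region2: " + heap.referencingRegions(2));
    }
}
```

When Region2 is collected, only the regions in its RSet need to be examined for incoming references, instead of the whole heap.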

CSet

The collection set (CSet) is the set of regions targeted for reclamation in each GC

GC collection process

  1. When the Eden area runs out of memory, a young generation GC is triggered. G1's young generation collection is parallel and stop-the-world: all application threads are paused and multiple GC threads perform the collection
  2. When heap occupancy reaches a certain threshold, the old generation concurrent marking phase begins
  3. Mixed collection starts as soon as marking is complete. G1 does not need to collect the whole old generation at once; each mixed collection only handles a small number of old regions
  4. If that is still not enough, G1 stops the world and falls back to a full GC that uses a single-threaded collection algorithm
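
The "certain threshold" in step 2 is controlled by -XX:InitiatingHeapOccupancyPercent, which defaults to 45. The check can be sketched as below. This is a simplified assumption that compares used heap against a fixed percentage of capacity; it ignores the adaptive adjustment newer JDKs apply. The class and method names are hypothetical.

```java
// Simplified sketch of the concurrent-marking trigger, for illustration only.
class MarkingTrigger {
    // Default value of -XX:InitiatingHeapOccupancyPercent
    static final int DEFAULT_IHOP_PERCENT = 45;

    /** True when used heap exceeds ihopPercent of the heap capacity. */
    static boolean shouldStartConcurrentMarking(long usedBytes, long capacityBytes, int ihopPercent) {
        return usedBytes * 100 > capacityBytes * (long) ihopPercent;
    }

    public static void main(String[] args) {
        long capacity = 100L * 1024 * 1024; // the 100m heap used in this article
        long used = 50L * 1024 * 1024;
        System.out.println(shouldStartConcurrentMarking(used, capacity, DEFAULT_IHOP_PERCENT));
    }
}
```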

The following figure shows the object replication process during garbage collection

G1 GC log

As for the meaning of these logs, this translation explains them well: www.jianshu.com/p/ab37844d0…

GCeasy

There are so many GC logs that analyzing them by hand is impractical. GCeasy is a good GC log analysis tool

You need enough GC logs to see anything useful, so I pulled a GC log file from the test environment. Here is the environment configuration of that container

Analyze GC logs using GCeasy

  • The memory table lists the Allocated and Peak space for Young Generation, Old Generation and Meta Space. Allocated is the space assigned to the area and Peak is its peak usage
  • -Xmx725m: with a 725m heap the young generation should get 1/3 of it, about 241.6m. The report shows 217.81m because only one of survivor 0 and survivor 1 holds data at a time, so the usable young generation is 725/3*0.9=217.5m
  • The old generation takes 2/3 of the JVM heap, and Meta Space is allocated 1.08 GB of native machine memory
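
The 725/3*0.9 arithmetic above can be written out explicitly. The sketch assumes the HotSpot defaults NewRatio=2 (old:young = 2:1) and SurvivorRatio=8 (Eden:S0:S1 = 8:1:1); the article does not state which values the container uses, and the class name is made up.

```java
// Hypothetical sketch of the young generation arithmetic, assuming default ratios.
class YoungGenSizing {

    /** Young generation size when old:young = newRatio:1. */
    static double youngGenMb(double heapMb, int newRatio) {
        return heapMb / (newRatio + 1);
    }

    /** Eden plus one survivor space when Eden:S0:S1 = survivorRatio:1:1. */
    static double usableYoungGenMb(double youngMb, int survivorRatio) {
        return youngMb * (survivorRatio + 1) / (survivorRatio + 2);
    }

    public static void main(String[] args) {
        double young = youngGenMb(725, 2);          // about 241.67
        double usable = usableYoungGenMb(young, 8); // about 217.5
        System.out.printf("young=%.2fm usable=%.2fm%n", young, usable);
    }
}
```

The small gap between the computed 217.5m and the reported 217.81m is expected, since the JVM aligns generation sizes rather than using the exact ratio.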

  • Throughput: the share of time not spent in GC pauses, here 99.981%
  • Latency: Avg Pause GC Time is the average GC pause, here 10.7ms, and Max Pause GC Time is the longest GC pause, here 530ms
  • GC Pause Duration Time Range: the distribution of GC pause durations. Pauses in the 0 to 100 ms range account for 99.97%
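
Throughput in this sense is just the fraction of elapsed time not spent in GC pauses. Here is a minimal sketch of the calculation; the input numbers in main are invented to reproduce the 99.981% figure and are not taken from the actual log.

```java
// Hypothetical sketch of the throughput calculation.
class GcThroughput {

    /** Percentage of elapsed time that was not spent in GC pauses. */
    static double throughputPercent(double totalPauseSecs, double elapsedSecs) {
        return 100.0 * (1.0 - totalPauseSecs / elapsedSecs);
    }

    public static void main(String[] args) {
        // Invented inputs: 19 seconds of pauses over 100000 seconds of run time.
        System.out.printf("%.3f%%%n", throughputPercent(19.0, 100000.0));
    }
}
```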

Interactive Graphs: the meaning of each graph

  • Heap after GC: Heap usage after GC
  • Heap before GC: Heap usage before GC
  • GC Duration: GC Duration
  • Reclaimed Bytes: The memory size of the Reclaimed garbage object
  • Young Gen: Young generation heap usage
  • Old Gen: The usage of the Old generation heap
  • Meta Space: Usage of the Meta Space
  • A & P: heap allocation and promotion at each GC. The red line shows how much memory (objects) is promoted from the young generation to the old generation at each GC

Note that GC duration here means only the time spent in pause phases. A single GC consists of several phases; CMS, for example, includes initial mark, concurrent mark and so on

From left to right, the statistics show the total memory reclaimed by Minor GC and Full GC, the total time they took, and their average time

References

blog.csdn.net/qq_40093255…

Atguigu (Shang Silicon Valley) JVM video course