Everything Java back-end developers need to know about JVM performance tuning

Preface

With the growth of the Internet, Java applications face strict requirements for high concurrency, high availability, and fast response, all of which ultimately depend on the JVM. As Internet companies keep raising their demands on concurrency and performance, JVM performance tuning becomes increasingly important for improving Java applications. Its goal is to achieve higher throughput with less memory.

JVM memory tuning

Here is a detailed mind map for JVM performance tuning.

System-level tuning of JVM memory is primarily aimed at reducing the frequency of GC and the number of Full GCs.

1. Full GC

A Full GC cleans the entire heap, including the Young, Tenured, and Perm generations. Because it has to collect the whole heap, a Full GC is slow, so the number of Full GCs should be kept to a minimum.

2. Causes of Full GC

1) The tenured generation is full

Try to have objects collected during young-generation GC, let objects survive longer in the young generation, and avoid creating very large objects and arrays, which may be allocated directly in the old generation.

2) Permanent generation (PermGen) space is insufficient

Increase the PermGen space, avoid too many static objects, and control the ratio between the young and old generations. (On Java 8 and later, PermGen has been replaced by Metaspace.)

3) System.gc() is called explicitly

Garbage collection should not be triggered manually; rely on the JVM's own mechanism as far as possible.

A large part of JVM tuning work is tuning Full GC. The following sections describe the corresponding tuning methods and steps in detail.

JVM performance tuning methods and steps

1. Monitor the GC status

Use the JVM tools to view the current logs and the current JVM parameter settings, then analyze the heap memory snapshot and the GC log. Decide whether to optimize based on the actual memory usage of each region and on the GC execution times.
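The article does not name a specific monitoring tool, so the following is only a minimal sketch of one way to read GC activity from inside the application, using the standard java.lang.management API. The class name GcMonitor is made up for illustration. The counts and times are cumulative since JVM start, so sampling them periodically and taking differences gives GC frequency and GC time per interval.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: reads cumulative GC counters of the current JVM. */
final class GcMonitor {

    /** Total number of collections over all collectors; "undefined" (-1) counts as 0. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount());
        }
        return total;
    }

    /** Total accumulated collection time in milliseconds; "undefined" (-1) counts as 0. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionTime());
        }
        return total;
    }

    public static void main(String[] args) {
        // One line per collector, e.g. one for the young and one for the old generation.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d, time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

The collector names printed depend on which garbage collector the JVM is running with.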

For example, a system showed the following symptoms before it crashed:

Each garbage collection takes longer and longer, growing from 10 ms to 50 ms, and Full GC time grows from 0.5 s to 4 or 5 s.

Full GCs become more and more frequent; at the worst, less than 1 minute passes between two Full GCs.

Old-generation memory usage keeps growing, and no memory is freed after each Full GC.

Eventually the system can no longer respond to new requests and approaches the point of OutOfMemoryError. At that stage a JVM heap dump needs to be analyzed.
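The last symptom in the list, an old generation that never shrinks after a Full GC, can be checked mechanically. The sketch below assumes you have already extracted the old-generation occupancy after each Full GC from the GC log. The class name, the method name, and the minimum of three samples are illustrative choices, not something the article prescribes.

```java
/** Hypothetical helper: flags an old generation that is never freed by Full GC. */
final class OldGenTrend {

    /**
     * Returns true when the occupancy measured after each Full GC never drops
     * and ends higher than it started.
     */
    static boolean keepsGrowing(long[] usedAfterFullGc) {
        if (usedAfterFullGc.length < 3) {
            return false; // assumption: fewer than three samples is not a trend
        }
        for (int i = 1; i < usedAfterFullGc.length; i++) {
            if (usedAfterFullGc[i] < usedAfterFullGc[i - 1]) {
                return false; // this Full GC actually freed memory
            }
        }
        return usedAfterFullGc[usedAfterFullGc.length - 1] > usedAfterFullGc[0];
    }

    public static void main(String[] args) {
        long[] suspicious = {1800, 2100, 2400, 2700}; // MB after each Full GC
        long[] healthy = {1800, 900, 1700};
        System.out.println("suspicious: " + keepsGrowing(suspicious));
        System.out.println("healthy:    " + keepsGrowing(healthy));
    }
}
```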

2. Generate a heap dump file

Use a JMX MBean to dump the current heap into an hprof file, 3 GB in this example (the size of the entire heap). If JMX is not enabled, the file can be generated with the JDK's jmap command.
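The article does not say which MBean was used. On HotSpot JVMs, one candidate is com.sun.management.HotSpotDiagnosticMXBean, and the following sketch shows how it can write a dump of the JVM it runs in. The class name HeapDumper and the default file name are made up for illustration. From the command line, the jmap equivalent is `jmap -dump:live,format=b,file=heap.hprof <pid>`.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: writes an hprof heap dump of the current JVM (HotSpot only). */
final class HeapDumper {

    /**
     * Dumps the heap to outputFile. The file must not exist yet, and recent JDKs
     * require the name to end in ".hprof". With liveOnly=true only reachable
     * objects are written.
     */
    static void dump(String outputFile, boolean liveOnly) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            bean.dumpHeap(outputFile, liveOnly);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String file = args.length > 0
                ? args[0]
                : "heap-" + System.currentTimeMillis() + ".hprof";
        dump(file, true);
        System.out.println("Heap dump written to " + file);
    }
}
```

The dump is roughly as large as the occupied heap, so make sure the target disk has enough free space before triggering it on a production machine.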

3. Analyze the dump file

Opening this 3 GB heap dump takes a lot of memory. An ordinary Windows machine usually does not have enough, so a high-spec Linux machine is needed. Several tools can open the file:

VisualVM

IBM HeapAnalyzer

The Hprof tool that ships with the JDK

MAT (Eclipse Memory Analyzer, Eclipse's dedicated static memory analysis tool), recommended

Note: Because the file is so large, opening and analyzing it with MAT is recommended.

4. Analyze the results and determine whether optimization is needed

If the parameters are set properly, the system has no timeout logs, GC is not frequent, and GC pauses are short, then GC optimization is not necessary. If a GC takes more than 1-3 seconds, or GC is frequent, optimization is required.

Note: GC tuning is generally not required if the following criteria are met:

Minor GC takes less than 50 ms;

Minor GC runs infrequently, about once every 10 seconds;

Full GC takes less than 1 s;

Full GC runs infrequently, at most once every 10 minutes.
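The four criteria above translate directly into a small check. This is only a sketch: the class and parameter names are made up, and "about once every 10 seconds" is read here as "an interval of at least 10 seconds", which is an interpretation rather than something the article states exactly.

```java
/** Hypothetical helper: applies the four rule-of-thumb criteria to measured GC figures. */
final class GcHealthCheck {

    /**
     * @param minorPauseMs     typical minor GC pause in milliseconds
     * @param minorIntervalSec typical time between two minor GCs in seconds
     * @param fullPauseSec     typical Full GC pause in seconds
     * @param fullIntervalMin  typical time between two Full GCs in minutes
     * @return true if at least one criterion is violated
     */
    static boolean needsTuning(double minorPauseMs, double minorIntervalSec,
                               double fullPauseSec, double fullIntervalMin) {
        boolean minorOk = minorPauseMs < 50 && minorIntervalSec >= 10;
        boolean fullOk = fullPauseSec < 1 && fullIntervalMin >= 10;
        return !(minorOk && fullOk);
    }

    public static void main(String[] args) {
        // A healthy system, then the crashing system described earlier in the article.
        System.out.println("healthy:  " + needsTuning(20, 15, 0.4, 30));
        System.out.println("crashing: " + needsTuning(50, 15, 4.5, 0.9));
    }
}
```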

5. Adjust GC type and memory allocation

If the memory allocation is too small or too large, or the chosen garbage collector is slow, adjust these parameters first. Trial the new settings on one or a few machines, compare their performance with that of the machines that were not changed, and make the final choice accordingly.

6. Constantly analyze and adjust

Keep analyzing and adjusting through trial and error until the most suitable parameters are found, then apply them to all servers.

CMS parameter optimization step process

Let me continue with key JVM parameter configurations (for reference only).

Reference for JVM tuning parameters

1. -Xms and -Xmx specify the minimum and maximum size of the JVM heap. To prevent the garbage collector from resizing the heap between the minimum and the maximum, the two are usually set to the same value.
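One way to confirm that a running JVM was really started with equal minimum and maximum heap sizes is to inspect its startup arguments. The sketch below is illustrative only: the class HeapFlags and its methods are made up, and the parser understands just the k, m, and g suffixes.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

/** Hypothetical helper: checks whether -Xms and -Xmx were given with the same value. */
final class HeapFlags {

    /** Parses a size such as "512m", "4g" or "1048576" into bytes. */
    static long parseSize(String s) {
        char unit = Character.toLowerCase(s.charAt(s.length() - 1));
        long factor = unit == 'k' ? 1L << 10
                    : unit == 'm' ? 1L << 20
                    : unit == 'g' ? 1L << 30
                    : 1;
        String digits = factor == 1 ? s : s.substring(0, s.length() - 1);
        return Long.parseLong(digits) * factor;
    }

    /** Returns the value of the last flag starting with prefix, or -1 if it is absent. */
    static long flagBytes(List<String> jvmArgs, String prefix) {
        long value = -1;
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                value = parseSize(arg.substring(prefix.length()));
            }
        }
        return value;
    }

    /** True when both flags are present and describe the same number of bytes. */
    static boolean heapIsFixed(List<String> jvmArgs) {
        long xms = flagBytes(jvmArgs, "-Xms");
        long xmx = flagBytes(jvmArgs, "-Xmx");
        return xms > 0 && xms == xmx;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("Heap size fixed: " + heapIsFixed(jvmArgs));
    }
}
```

Started with, for example, `java -Xms4g -Xmx4g HeapFlags.java`, the check reports a fixed heap; without the flags it reports false.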

2. The young generation and the old generation share the heap according to a default ratio of 1:2. You can change their relative sizes with the ratio parameter -XX:NewRatio, or set the size of a specific generation directly.

For example, for the young generation, use -XX:NewSize and -XX:MaxNewSize to set its absolute size. As with the heap as a whole, to keep the young generation from being resized, -XX:NewSize and -XX:MaxNewSize are usually set to the same value.
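The arithmetic behind the ratio is simple: -XX:NewRatio=N means the old generation is N times the size of the young generation, so the young generation gets 1/(N+1) of the heap. A small sketch, with a made-up class name, for working out the resulting sizes:

```java
/** Hypothetical helper: generation sizes implied by a heap size and -XX:NewRatio. */
final class GenerationSizes {

    /** Young generation size when old:young = newRatio:1. */
    static long youngBytes(long heapBytes, int newRatio) {
        return heapBytes / (newRatio + 1);
    }

    /** Old generation size: whatever the young generation does not take. */
    static long oldBytes(long heapBytes, int newRatio) {
        return heapBytes - youngBytes(heapBytes, newRatio);
    }

    public static void main(String[] args) {
        long heap = 3L << 30; // a 3 GB heap
        for (int ratio = 1; ratio <= 3; ratio++) {
            System.out.printf("NewRatio=%d: young=%d MB, old=%d MB%n",
                    ratio, youngBytes(heap, ratio) >> 20, oldBytes(heap, ratio) >> 20);
        }
    }
}
```

With the default ratio of 1:2 (NewRatio=2), a 3 GB heap is split into 1 GB for the young generation and 2 GB for the old generation.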

3. How large is a reasonable setting for the young generation and the old generation

1) A larger young generation inevitably means a smaller old generation. A larger young generation lengthens the interval between minor GCs but increases the time of each one; a smaller old generation leads to more frequent Full GCs.

2) A smaller young generation inevitably means a larger old generation. A smaller young generation leads to more frequent minor GCs, but each one is shorter; a larger old generation reduces the frequency of Full GC.

The choice should depend on the lifecycle distribution of the application's objects: if the application creates a large number of temporary objects, choose a larger young generation; if it has relatively many long-lived objects, enlarge the old generation accordingly. Many applications, however, do not show such clear characteristics.

The decision should be based on the following two points:

(1) Following the principle of keeping Full GC to a minimum, let the old generation hold frequently used objects as far as possible. This is also the reasoning behind the JVM's default ratio of 1:2.

(2) Observe the application for a period of time to see how much memory the old generation occupies at peak. As long as Full GC is not affected, the young generation can be enlarged according to the actual situation, for example to a ratio of 1:1. However, the old generation should keep at least one third of its space free for growth.
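Point (2) can be turned into a rough sizing calculation. The sketch below rests on two assumptions that go beyond the text: "one third for growth" is read as "the peak occupancy may fill at most two thirds of the old generation", and the 1:1 ratio is treated as an upper bound for the young generation. The class and method names are made up.

```java
/** Hypothetical helper: how far the young generation can be enlarged. */
final class YoungGenPlanner {

    /** Smallest old generation that leaves one third free at peak (rounded up). */
    static long minOldMb(long peakOldMb) {
        return (peakOldMb * 3 + 1) / 2;
    }

    /** Largest young generation: limited by the old generation's needs and by a 1:1 ratio. */
    static long maxYoungMb(long heapMb, long peakOldMb) {
        long leftForYoung = heapMb - minOldMb(peakOldMb);
        long oneToOneCap = heapMb / 2;
        return Math.max(0, Math.min(leftForYoung, oneToOneCap));
    }

    public static void main(String[] args) {
        long heapMb = 3000;
        for (long peakOldMb : new long[] {600, 1200}) {
            System.out.printf("peak old=%d MB: old >= %d MB, young <= %d MB%n",
                    peakOldMb, minOldMb(peakOldMb), maxYoungMb(heapMb, peakOldMb));
        }
    }
}
```

For a 3000 MB heap with a 600 MB old-generation peak, the 1:1 cap is the binding limit (young generation up to 1500 MB); with a 1200 MB peak, the old generation's headroom is (young generation up to 1200 MB).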

4. On a well-configured machine (for example, multi-core with large memory), you can choose the parallel collector for the old generation: -XX:+UseParallelOldGC.

5. Thread stack size: by default each thread gets a 1 MB stack, which holds stack frames, call parameters, local variables, and so on. For most applications this default is larger than necessary, and 256 KB is usually sufficient. The size is set with -Xss.

In theory, with the same amount of memory, a smaller per-thread stack allows more threads to be created, but in practice the number of threads is also limited by the operating system.
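Besides the JVM-wide setting, the stack size can also be requested for an individual thread through the four-argument Thread constructor. The following is a minimal sketch with made-up class and method names. Note that the JVM treats the value as a hint and may round it or ignore it on some platforms.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: runs a small task on a thread that asks for a reduced stack. */
final class SmallStackThread {

    /** Sums 1..n on a thread created with the requested stack size, then returns the sum. */
    static long sumOnSmallStack(int n, long stackBytes) {
        AtomicLong result = new AtomicLong();
        Runnable task = () -> {
            long sum = 0;
            for (int i = 1; i <= n; i++) {
                sum += i;
            }
            result.set(sum);
        };
        // Thread(group, target, name, stackSize): stackSize is only a suggestion to the JVM.
        Thread worker = new Thread(null, task, "small-stack-worker", stackBytes);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return result.get();
    }

    public static void main(String[] args) {
        long stackBytes = 256 * 1024; // the 256 KB suggested in the article
        System.out.println("sum = " + sumOnSmallStack(100, stackBytes));
    }
}
```

A task with shallow call depth like this one runs fine on a small stack; deeply recursive code would need a larger value.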

That’s the end of the article!
