1. Tuning principles

JVM tuning sounds fancy, but it should be the last resort in Java performance tuning.

I agree with Liao Xuefeng's view: JVM tuning is not a routine measure. For performance problems, optimizing the application code is the first choice; tuning the JVM should be the last.

The JVM's automatic memory management was designed to pull developers out of the memory-management mire. Even if you do have to tune the JVM, never do it by gut feeling; always monitor and analyze detailed performance data first.

2. Timing of JVM tuning

In what situations do you have to consider JVM tuning?

  • Heap memory keeps rising until it reaches the configured maximum;
  • Full GC occurs frequently;
  • GC pauses are too long (more than 1 second);
  • Memory exceptions such as OutOfMemoryError occur in the application;
  • Some applications use a local cache and take up a lot of memory;
  • System throughput and response performance are low or degrading.
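
One way to watch for several of these symptoms from inside a running application is to poll the JVM's GC MXBeans. Below is a minimal sketch; the bean names it prints depend on which collector is configured:

 import java.lang.management.GarbageCollectorMXBean;
 import java.lang.management.ManagementFactory;

 public class GcWatcher {
     public static void main(String[] args) throws InterruptedException {
         while (true) {
             // Each bean covers one collector (young or old/full); a rapidly growing count or
             // total time on the old/full collector hints at frequent Full GC.
             for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                 System.out.println(gc.getName()
                         + " count=" + gc.getCollectionCount()
                         + " totalTimeMs=" + gc.getCollectionTime());
             }
             Thread.sleep(5000);
         }
     }
 }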

3. JVM tuning objectives

Throughput, latency, and memory footprint form an impossible triangle, much like CAP: you can tune for at most two of them, not all three.

  • Low latency: short GC pauses and low GC frequency;
  • Low memory usage;
  • High throughput;

If you choose two of them, you have to sacrifice the third.

Here are some reference examples of quantified goals for JVM tuning:

  • Heap memory usage <= 70%;
  • Old generation memory usage <= 70%;
  • Average GC pause <= 1 second;
  • Full GC count of 0, or average Full GC interval >= 24 hours;

Note: The quantified goals for JVM tuning vary from application to application.
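
As a rough illustration of checking such targets at runtime, here is a minimal sketch using the standard memory MXBeans (the old generation pool name is an assumption that varies by collector, e.g. "PS Old Gen", "CMS Old Gen", or "G1 Old Gen"):

 import java.lang.management.ManagementFactory;
 import java.lang.management.MemoryPoolMXBean;
 import java.lang.management.MemoryUsage;

 public class UsageCheck {
     public static void main(String[] args) {
         // Overall heap usage against the "heap memory usage <= 70%" target.
         MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
         System.out.printf("heap usage: %.1f%%%n", 100.0 * heap.getUsed() / heap.getMax());

         // Old generation usage; the pool name depends on the collector in use.
         for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
             if (pool.getName().contains("Old Gen") || pool.getName().contains("Tenured")) {
                 MemoryUsage old = pool.getUsage();
                 System.out.printf("old generation usage: %.1f%%%n", 100.0 * old.getUsed() / old.getMax());
             }
         }
     }
 }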

4. Steps for JVM tuning

Typically, JVM tuning can be done through the following steps:

  • Analyze system running status: analyze GC logs and dump files to determine whether optimization is needed and identify bottlenecks;
  • Determine JVM tuning quantification goals;
  • Determine JVM tuning parameters (adjusted based on historical JVM parameters);
  • Optimize memory, latency, throughput and other indicators in turn;
  • Compare and observe the difference before and after tuning;
  • Continue analyzing and tuning until you find the right JVM parameter configuration;
  • Find the most appropriate parameters, apply them to all servers, and follow up.

Some of the above steps may require several iterations. In general, start by meeting the program's memory usage requirements, then its latency requirements, and finally its throughput requirements. Optimize continuously in that order; each step is the foundation for the next, and the order should not be reversed.

5. JVM parameters

Let’s look at the JVM parameters for the JDK.

5.1 Basic parameters

  • -Xms: Initial heap size. Default: 1/64 of physical memory. When free heap drops below 40% (adjustable via -XX:MinHeapFreeRatio), the JVM grows the heap up to the -Xmx limit.
  • -Xmx: Maximum heap size. Default: 1/4 of physical memory. When free heap exceeds 70% (adjustable via -XX:MaxHeapFreeRatio), the JVM shrinks the heap down to the -Xms limit.
  • -Xmn: Young generation size. Note that this is Eden plus the two Survivor spaces, which differs from the "New Gen" shown by jmap -heap. Total heap size = young generation + old generation + permanent generation, so enlarging the young generation shrinks the old generation. This value has a significant impact on system performance; Sun officially recommends about 3/8 of the whole heap.
  • -XX:NewSize: Initial young generation size.
  • -XX:MaxNewSize: Maximum young generation size.
  • -XX:PermSize: Initial permanent generation size (before JDK 1.8). Default: 1/64 of physical memory.
  • -XX:MaxPermSize: Maximum permanent generation size (before JDK 1.8). Default: 1/4 of physical memory.
  • -Xss: Stack size per thread. Since JDK 5.0 the default is 1 MB per thread (256 KB before that). Adjust it according to how much stack the application's threads need: a smaller value allows more threads in the same physical memory, but the operating system still limits the number of threads per process, in practice roughly 3000–5000. For small applications with shallow call stacks, 128 KB is usually enough. This option has a significant impact on performance and requires rigorous testing. Its meaning is very close to -XX:ThreadStackSize; the official documentation does not explain the difference clearly, and a forum post notes that "-Xss is translated into a VM flag named ThreadStackSize", so setting either one is generally sufficient.
  • -XX:ThreadStackSize: Thread stack size (0 means the platform default). Defaults: Sparc 512 KB; Solaris x86 320 KB (256 KB in 5.0 and earlier); Sparc 64-bit 1024 KB; Linux AMD64 1024 KB (0 in 5.0 and earlier); all others 0.
  • -XX:NewRatio: Ratio of the old generation to the young generation (Eden plus the two Survivor spaces; the permanent generation is not counted). -XX:NewRatio=4 means young : old = 1:4, i.e. the young generation takes 1/5 of the heap. If Xms = Xmx and -Xmn is set, this parameter does not need to be set.
  • -XX:SurvivorRatio: Size ratio of Eden to one Survivor space. A value of 8 means the two Survivor spaces relate to Eden as 2:8, so one Survivor space takes 1/10 of the whole young generation.
  • -XX:LargePageSizeInBytes: Size of a memory page, e.g. =128m. Do not set it too large, or it will affect the size of Perm.
  • -XX:+UseFastAccessorMethods: Fast optimization of primitive-type accessor methods.
  • -XX:+DisableExplicitGC: Disables explicit System.gc() calls. This parameter requires rigorous testing.
  • -XX:+ExplicitGCInvokesConcurrent: Makes a System.gc() request trigger a concurrent GC. Disabled by default; can only be enabled together with -XX:+UseConcMarkSweepGC.
  • -XX:+ExplicitGCInvokesConcurrentAndUnloadsClasses: Makes a System.gc() request trigger a concurrent GC that also unloads classes during the concurrent cycle. Disabled by default; can only be enabled together with -XX:+UseConcMarkSweepGC.
  • -XX:MaxTenuringThreshold: Maximum tenuring age of an object. If set to 0, young generation objects skip the Survivor spaces and go straight into the old generation, which can improve efficiency for applications with many long-lived objects. A larger value keeps objects being copied between the Survivor spaces longer, increasing their lifetime in the young generation and the probability that they are collected there. This parameter is only effective for the serial GC.
  • -XX:+AggressiveOpts: Enables aggressive optimizations (faster compilation).
  • -XX:+UseBiasedLocking: Enables biased locking to improve locking performance.
  • -Xnoclassgc: Disables garbage collection of classes.
  • -XX:SoftRefLRUPolicyMSPerMB: Lifetime of a SoftReference per MB of free heap. Default: 1 s; softly reachable objects remain alive for one second per free megabyte of heap after they were last referenced.
  • -XX:PretenureSizeThreshold: Objects larger than this size are allocated directly in the old generation. Default: 0 (no limit). Not honored when the young generation uses Parallel Scavenge; with that collector, the other case of direct old-generation allocation is a large array whose elements hold no references to external objects.
  • -XX:TLABWasteTargetPercent: Percentage of Eden used for TLABs. Default: 1%.
  • -XX:+CollectGen0First: Whether to run a young GC before a Full GC. Default: false.
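
To verify how -Xms and -Xmx actually took effect, here is a minimal sketch that prints the heap limits of the running process (run it with, for example, java -Xms512m -Xmx1g HeapLimits; the class name is just for illustration):

 public class HeapLimits {
     public static void main(String[] args) {
         Runtime rt = Runtime.getRuntime();
         long mb = 1024 * 1024;
         // maxMemory() roughly corresponds to -Xmx, totalMemory() to the currently committed heap
         // (initially about -Xms), and freeMemory() to the unused part of the committed heap.
         System.out.println("max:   " + rt.maxMemory() / mb + " MB");
         System.out.println("total: " + rt.totalMemory() / mb + " MB");
         System.out.println("free:  " + rt.freeMemory() / mb + " MB");
     }
 }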

Main JDK 7 parameters

  • -XX:PermSize: Initial permanent generation size (JDK 7 and earlier).
  • -XX:MaxPermSize: Maximum permanent generation size (JDK 7 and earlier).

Important JDK 8-specific parameters

  • -XX:MetaspaceSize: Initial metaspace size (JDK 8 and later).
  • -XX:MaxMetaspaceSize: Maximum metaspace size (JDK 8 and later).

5.2 Parameters related to the parallel collector

  • -XX:+UseParallelGC: Selects a parallel garbage collector; Full GC uses parallel MSC. This configuration is effective only for the young generation: the young generation is collected in parallel, while the old generation is still collected serially.
  • -XX:+UseParNewGC: Uses parallel (ParNew) collection for the young generation. From JDK 5.0 onward it can be combined with CMS; the JVM sets it automatically based on the system configuration, so you do not need to set this value yourself.
  • -XX:ParallelGCThreads: Number of parallel collector threads. Best set equal to the number of processors; also applies to CMS.
  • -XX:+UseParallelOldGC: Parallel compacting collection for the old generation (Parallel Old). This option appeared in Java 6.
  • -XX:MaxGCPauseMillis: Maximum pause-time goal for each young generation collection. If it cannot be met, the JVM automatically resizes the young generation to try to meet it.
  • -XX:+UseAdaptiveSizePolicy: Automatically selects the young generation size and the corresponding Survivor ratio. With this option the parallel collector sizes the young generation and the Survivor ratio to reach the target pause time or collection frequency. It is recommended to keep it enabled when using the parallel collector.
  • -XX:GCTimeRatio: Sets the target share of total run time spent in garbage collection, computed as 1 / (1 + n).
  • -XX:+ScavengeBeforeFullGC: Runs a young generation GC before a Full GC. Default: true.

5.3 CMS-related parameters

  • -XX:+UseConcMarkSweepGC: Uses the CMS collector. In testing, the -XX:NewRatio=4 setting stopped taking effect after this was configured, for unknown reasons, so it is best to set the young generation size with -Xmn instead.
  • -XX:+AggressiveHeap: Attempts aggressive use of large amounts of physical memory for long-running, memory-intensive workloads. It inspects the available resources (memory, number of processors), requires at least 256 MB of memory, and is intended for machines with plenty of CPU and memory (in 1.4.1, machines with 4 CPUs).
  • -XX:CMSFullGCsBeforeCompaction: Number of collections after which the memory is compacted. Because the concurrent collector does not compact or defragment the memory space, it becomes "fragmented" after running for a while and allocation becomes less efficient; this value sets how many GC runs occur before the memory space is compacted.
  • -XX:+CMSParallelRemarkEnabled: Reduces the remark pause.
  • -XX:+UseCMSCompactAtFullCollection: Compacts the tenured generation during Full GC. CMS does not move objects, so the heap fragments easily and can run out of usable memory; enabling compaction here is a good habit. Performance may be affected, but fragmentation is eliminated.
  • -XX:+UseCMSInitiatingOccupancyOnly: Starts CMS collections only at the manually defined occupancy threshold, preventing HotSpot from triggering CMS GC on its own heuristics.
  • -XX:CMSInitiatingOccupancyFraction=70: Starts a CMS collection when 70% of the old generation is in use. Default: 92. To ensure that "promotion failed" errors do not occur, this value needs to be chosen carefully based on the application's allocation and promotion behaviour.
  • -XX:CMSInitiatingPermOccupancyFraction: Perm Gen occupancy at which a CMS collection is triggered. Default: 92.
  • -XX:+CMSIncrementalMode: Enables incremental mode; intended for single-CPU machines.
  • -XX:+CMSClassUnloadingEnabled: Enables class unloading during CMS collections.

5.4 Auxiliary information

  • -XX:+PrintGC: Prints basic GC output, e.g. [GC 121376K->10414K, 0.0650971 secs] [Full GC 121376K->10414K, 0.0650971 secs]
  • -XX:+PrintGCDetails: Prints detailed GC output, e.g. [GC [DefNew: 8614K->781K(9088K), 0.0123035 secs] 118250K->113543K(130112K), 0.0124633 secs] and [GC [DefNew: 8614K->8614K(9088K), 0.0000665 secs] [Tenured: 112761K->10414K(121024K), 0.0433488 secs] 121376K->10414K(130112K), 0.0436268 secs]
  • -XX:+PrintGCTimeStamps: Prints a timestamp for each GC event.
  • -XX:+PrintGC with -XX:+PrintGCTimeStamps: Output such as 11.851: [GC 98328K->93620K(130112K), 0.0082960 secs]
  • -XX:+PrintGCApplicationStoppedTime: Prints how long the application was paused during garbage collection; can be combined with the flags above. Example: Total time for which application threads were stopped: 0.0468229 seconds
  • -XX:+PrintGCApplicationConcurrentTime: Prints how long the application ran uninterrupted before each garbage collection; can be combined with the flags above. Example: Application time: 0.5291524 seconds
  • -XX:+PrintHeapAtGC: Prints detailed heap information before and after each GC.
  • -Xloggc:filename: Writes the GC log to the given file for later analysis; used together with the flags above.
  • -XX:+PrintClassHistogram: Performs a garbage collection before printing the class histogram.
  • -XX:+PrintTLAB: Shows TLAB space usage.
  • -XX:+PrintTenuringDistribution: Shows the tenuring threshold after each minor GC, e.g. Desired survivor size 1048576 bytes, new threshold 7 (max 15), meaning the new tenuring threshold is 7.
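
Besides these print flags, GC activity can also be observed programmatically. Below is a minimal sketch using the com.sun.management GC notification API (available in HotSpot JDKs since JDK 7); it is an illustrative complement to the flags above, not a replacement:

 import com.sun.management.GarbageCollectionNotificationInfo;
 import java.lang.management.GarbageCollectorMXBean;
 import java.lang.management.ManagementFactory;
 import javax.management.NotificationEmitter;
 import javax.management.openmbean.CompositeData;

 public class GcListener {
     public static void main(String[] args) throws Exception {
         for (GarbageCollectorMXBean gcBean : ManagementFactory.getGarbageCollectorMXBeans()) {
             NotificationEmitter emitter = (NotificationEmitter) gcBean;
             emitter.addNotificationListener((notification, handback) -> {
                 if (GarbageCollectionNotificationInfo.GARBAGE_COLLECTION_NOTIFICATION
                         .equals(notification.getType())) {
                     GarbageCollectionNotificationInfo info =
                             GarbageCollectionNotificationInfo.from((CompositeData) notification.getUserData());
                     // Logs collector name, cause (e.g. "Allocation Failure", "System.gc()") and duration.
                     System.out.println(info.getGcName() + " / " + info.getGcCause()
                             + " took " + info.getGcInfo().getDuration() + " ms");
                 }
             }, null, null);
         }
         Thread.sleep(60_000);  // keep the demo process alive while GCs happen
     }
 }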

6. Main tools

6.1 JDK Tools

The JDK comes with a number of performance monitoring tools, such as jps, jstat, jinfo, jmap, jstack, and jconsole/VisualVM, that you can use to monitor systems and troubleshoot memory performance issues.

6.2 Linux command-line tools

Performance monitoring and troubleshooting are usually performed using the command line tools of the operating system.

  • top: Displays CPU usage, memory usage, and system load of running processes in real time.
  • vmstat: Monitors the operating system's virtual memory, processes, and CPU activity.
  • pidstat: Monitors context switches of a specified process.
  • iostat: Monitors disk I/O.

There are also third-party monitoring tools for performance analysis and troubleshooting, such as MAT, GChisto, JProfiler, and Arthas.

7. Common tuning strategies

Again, it's important to note: once you decide to tune the JVM, don't fall into the trap of reaching for JVM options when the performance problem could be solved by optimizing the code; code optimization should still be preferred.

7.1. Choose an appropriate garbage collector

If the machine has a single CPU core, the Serial collector is the only reasonable choice.

If the CPU is multi-core and throughput is the focus, choose the PS + PO (Parallel Scavenge + Parallel Old) combination.

If the CPU is multi-core, user-perceived pause time matters, and you are on JDK 1.6 or 1.7, choose CMS.

If the CPU is multi-core, pause time matters, you are on JDK 1.8 or later, and the JVM has more than 6 GB of available memory, choose G1.

Parameter configuration:

 // Serial collector
 -XX:+UseSerialGC
 // Parallel Scavenge + Parallel Old collectors
 -XX:+UseParallelOldGC
 // CMS collector (old generation)
 -XX:+UseConcMarkSweepGC
 // G1 collector
 -XX:+UseG1GC
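
A quick way to confirm which collector is actually active is to print the GC bean names from inside the application. A minimal sketch (the names in the comment are typical HotSpot names and can differ between JDK versions):

 import java.lang.management.GarbageCollectorMXBean;
 import java.lang.management.ManagementFactory;

 public class ActiveCollectors {
     public static void main(String[] args) {
         // Typical names: "PS Scavenge" / "PS MarkSweep" for Parallel Scavenge + Parallel Old,
         // "ParNew" / "ConcurrentMarkSweep" for CMS,
         // "G1 Young Generation" / "G1 Old Generation" for G1.
         for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
             System.out.println(gc.getName());
         }
     }
 }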

7.2. Adjust the memory size

Symptom: Garbage collection is very frequent.

Cause: If memory is too small, frequent garbage collection is needed to free enough space for new objects, so increasing the heap size has a very noticeable effect.

Note: If garbage collection is very frequent but very few objects are reclaimed each time, the problem is usually not that memory is too small but that a memory leak is keeping objects alive, which in turn causes frequent GC.

Parameter configuration:

 // Set the initial heap size
 Option 1: -Xms2g
 Option 2: -XX:InitialHeapSize=2048m
 // Set the maximum heap size
 Option 1: -Xmx2g
 Option 2: -XX:MaxHeapSize=2048m
 // Set the young generation size
 Option 1: -Xmn512m
 Option 2: -XX:MaxNewSize=512m

7.3. Set the expected pause time

Symptom: The program stutters intermittently.

Cause: Without a precise pause-time goal, a throughput-oriented garbage collector can produce erratic pause times.

Note: Do not set an unrealistically short pause time; a shorter goal means more GC cycles are needed to collect the same amount of garbage.

Parameter configuration:

 // The GC pause-time goal, which the garbage collector tries to achieve by various means
 -XX:MaxGCPauseMillis

7.4. Adjust the memory area size ratio

Symptom: GC is frequent in one particular area and normal in the others.

Cause: The corresponding area keeps running out of space, so frequent GC is needed to free it. If the overall JVM heap cannot be increased, you can adjust the size ratio of that area.

Note: It may not be a lack of space but a memory leak that prevents memory from being reclaimed, which also leads to frequent GC.

Parameter configuration:

 // Size ratio of Eden to one Survivor space
 -XX:SurvivorRatio=6   // Eden : Survivor = 6:1, so the two Survivor spaces together take 2/8 of the young generation
 // Ratio of the old generation to the young generation
 -XX:NewRatio=4        // young : old = 1:4, i.e. the old generation takes 4/5 of the whole heap; default = 2
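
To see how the regions end up sized at runtime, the heap memory pools can be listed from inside the application. A minimal sketch (the pool names in the comment are typical HotSpot names and differ per collector):

 import java.lang.management.ManagementFactory;
 import java.lang.management.MemoryPoolMXBean;
 import java.lang.management.MemoryType;

 public class ShowHeapRegions {
     public static void main(String[] args) {
         // Typical pool names: "PS Eden Space", "PS Survivor Space", "PS Old Gen" (Parallel),
         // or "Par Eden Space", "Par Survivor Space", "CMS Old Gen" (CMS).
         for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
             if (pool.getType() == MemoryType.HEAP) {
                 long usedMb = pool.getUsage().getUsed() / (1024 * 1024);
                 long committedMb = pool.getUsage().getCommitted() / (1024 * 1024);
                 System.out.println(pool.getName() + ": used " + usedMb + " MB of " + committedMb + " MB committed");
             }
         }
     }
 }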

7.5. Adjust the tenuring age for promotion to the old generation

Symptom: Old generation GC is frequent, and many objects are collected each time.

Cause: If the tenuring age is too small, young generation objects are promoted to the old generation too quickly, so the old generation fills with objects that would have become collectible shortly afterwards anyway. Raising the tenuring age makes it harder for such objects to enter the old generation, which relieves the lack of space there and the resulting frequent GC.

Note: Keeping these objects in the young generation longer may increase the frequency of young generation GC, and repeatedly copying them between Survivor spaces may also lengthen young generation GC times.

Configuration parameters:

 // Initial tenuring threshold: the minimum number of GCs a young generation object must survive
 // before being promoted to the old generation. The default value is 7.
 -XX:InitialTenuringThreshold=7

7.6. Adjust the threshold for large objects

Symptom: Old generation GC is frequent, many objects are reclaimed each time, and the individual objects are relatively large.

Cause: A large number of big objects are being allocated directly in the old generation, which fills up easily and triggers frequent GC. You can adjust the size threshold at which objects are allocated directly in the old generation.

Note: Letting these large objects go through the young generation instead may increase the frequency and duration of young generation GC.

Configuration parameters:

 // Objects larger than this size (in bytes) are allocated directly in the old generation; 0 means no limit
 -XX:PretenureSizeThreshold=1000000
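
As an illustration of the flag above (a minimal sketch; note that PretenureSizeThreshold is honored by the Serial and ParNew collectors but not by Parallel Scavenge):

 public class BigAllocation {
     public static void main(String[] args) {
         // Roughly 2 MB, well above the 1,000,000-byte threshold set above, so with
         // Serial/ParNew this array is allocated directly in the old generation.
         byte[] big = new byte[2 * 1024 * 1024];
         System.out.println("allocated " + big.length + " bytes");
     }
 }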

7.7. Adjust the GC trigger point

Symptom: CMS or G1 frequently falls back to Full GC, and the program stutters badly.

Cause: During the concurrent phases of G1 and CMS, business threads run alongside the garbage collection threads, so new objects keep being created while collection is in progress. The collector therefore has to reserve part of the memory for these new objects; if that reserved space cannot accommodate them, the JVM abandons the concurrent collection and pauses all business threads (STW) to finish the collection. In this case you can make GC trigger earlier (for example when the old generation is 60% full) so that enough space is reserved for the objects the business threads create.

Note: Triggering GC earlier increases the frequency of old generation GC.

Configuration parameters:

 // Old generation occupancy at which CMS starts a collection. The default is 68%;
 // if fallbacks to Serial Old (long stalls) occur frequently, lower this value.
 -XX:CMSInitiatingOccupancyFraction

 // Occupancy threshold for an old region to be included in a G1 mixed garbage collection cycle. Default: 65%.
 -XX:G1MixedGCLiveThresholdPercent=65

7.8. Adjust the JVM's native (off-heap) memory size

Symptom: GC counts, GC times, and the objects collected all look normal, and heap memory is sufficient, but an OOM is still reported.

Cause: Besides the heap, the JVM also uses off-heap memory, also called native or direct memory. Running out of it does not trigger a GC by itself; the associated native memory is only reclaimed when the heap objects that reference it are collected.

Note: In this situation the exception message is typically OutOfMemoryError: Direct buffer memory. Besides increasing the native memory limit, you can also catch this exception and trigger a GC manually (System.gc()).

Configuration parameters:

 // Maximum amount of direct (off-heap) memory the JVM may allocate
 -XX:MaxDirectMemorySize
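
Below is a minimal sketch of the pattern described in the note above, using a direct ByteBuffer; the catch-and-System.gc() fallback only illustrates that note and is not a general recommendation:

 import java.nio.ByteBuffer;

 public class DirectMemoryDemo {
     public static void main(String[] args) {
         ByteBuffer buffer;
         try {
             // Direct allocations are counted against -XX:MaxDirectMemorySize, not the Java heap.
             buffer = ByteBuffer.allocateDirect(64 * 1024 * 1024);
         } catch (OutOfMemoryError e) {
             // "Direct buffer memory": trigger a GC so that unreachable direct buffers release
             // their native memory, then retry once.
             System.gc();
             buffer = ByteBuffer.allocateDirect(64 * 1024 * 1024);
         }
         System.out.println("direct buffer capacity: " + buffer.capacity());
     }
 }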

8. JVM tuning examples

Here are some JVM tuning examples collected from around the web:

8.1 After a surge in website traffic, pages respond very slowly

1. Hypothesis: the system is fast in the test environment but slows down once it reaches production, so the suspicion is that business threads are being paused by garbage collection.

2. Diagnosis: to confirm the hypothesis, the jstat -gc command shows that the JVM performs GC very frequently and that each GC takes a long time, so the basic inference is that the high GC frequency causes frequent pauses of business threads and slow page responses.

3. Solution: because page traffic is high, objects are created very quickly, so the heap fills up easily and GC is frequent. The real problem is that the young generation is too small, so the JVM memory was increased, initially from 2 GB to 16 GB.

4. Second problem: after increasing the memory, ordinary requests are indeed faster, but a new problem appears: irregular, intermittent stutters, and each single stutter lasts much longer than before.

5. Hypothesis: since the previous change increased the memory, the suspicion is that the larger heap makes a single GC take longer, causing the intermittent stutters.

6. Diagnosis: jstat -gc shows that the Full GC count is indeed not very high, but the time spent on Full GC is; the GC log shows a single Full GC taking tens of seconds.

7. Solution: the JVM uses the PS + PO combination by default, and both its marking and collection phases are stop-the-world, so the larger the heap, the longer a collection takes. To avoid long single pauses, a concurrent collector was needed; since the JDK version was 1.7, CMS was chosen, and an expected pause time was set based on the earlier GC behaviour. After going live, the stuttering on the site disappeared.

8.2 OOM caused by Background Export Data

Description of ** problem: the background system of ** company occasionally causes OOM exception and the heap memory overflow.

1. Because it was accidental, it was simply considered to be caused by insufficient heap memory for the first time, so the heap memory was unilaterally increased from 4G to 8G.

2, but the problem is still not solved, can only from the heap memory information, through the opened – XX: + HeapDumpOnOutOfMemoryError parameters to obtain the heap memory dump file.

3. VisualVM analyzes heap dump files, using VisualVM to see that the object that occupies the largest memory is String objects, originally wanted to trace String objects to find its reference place, but the dump file is too large, always stuck when tracing into the file. And the String object occupation is also quite normal, at the beginning did not identify the problem here, so from the thread information to find a breakthrough point.

4. Through analysis of threads, I first found several running business threads, followed up the business threads one by one, looked at the code, and found a method that caught my attention: export order information.

5, because the order information export method may have tens of thousands of data, the first thing is to query the order information from the database, and then the order information into Excel, this process will generate a large number of String objects.

6, in order to verify their conjecture, so ready to login back to test, the results in the process of testing found order button front didn’t do gray interaction events after click on the button, the button can always points, because the export order data is inherently very slowly, long after using personnel may find click page after all have no reaction, the result has been point, As a result, a large number of requests went into the background, and the heap generated a large number of order objects and EXCEL objects, and the method execution was so slow that none of these objects could be collected for a while, so the memory eventually ran out.

7, know that would be easy to solve the problem, finally do not have any JVM parameter adjustment, just on the front end of export orders button with the grey state, such as the back-end response later button to click on, and then reduce the query order information of fields necessary to reduce the volume of the object is generated, then the problem solved.

8.3. An oversized single cache entry causes a system CPU surge

1. After the system was released, CPU usage was found to have soared to 600%. The first step after discovering this was to identify which application was using the most CPU.

2. When an application's CPU usage is that high, the cause is usually either lock contention or frequent GC.

3. The plan was to check GC first and, if it looked normal, move on to the threads. Printing GC information with jstat -gc <pid> showed clearly abnormal GC statistics, so it was evident that frequent GC was causing the CPU spike.

4. The next step was to find the cause of the frequent GC: either locate where objects were being created at a high rate, or determine whether there was a memory leak.

5. The heap memory was therefore dumped with jmap -dump for offline analysis.

6. The dump was analyzed offline with VisualVM, starting from the objects occupying the most memory. Ranked third was a business VO occupying about 10% of the heap space, which was clearly suspicious.

7. Tracing the business object back to its code showed that it was the object generated when viewing news information. To speed up queries, the news information was kept in a Redis cache, and every call to the news interface fetched it from that cache.

8. Caching the news in Redis is fine in itself; the problem was that more than 50,000 records were stored under a single key, so every call to the news query interface pulled all 50,000+ records out of Redis and then filtered them down to the 10 items returned to the front end. Over 50,000 records means over 50,000 objects; at roughly 280 bytes each, that is about 13.3 MB per query, so every single news query generated at least 13.3 MB of objects. Objects that large are allocated directly in the old generation, so a 2 GB old generation fills up within seconds and triggers GC.

9. Once the cause was known the fix was easy: the single cache entry just had to be made smaller. The cache was changed to page-level granularity, with each key holding the 10 records of one page returned to the front end, so every news query now takes only 10 records from the cache. That avoided the problem.

8.4. Locating a constant 100% CPU problem

Problem analysis: high CPU usage means some program is occupying CPU resources for a long time.

1. First, find out which process is using the most CPU.

 top    Lists the resource usage of each process on the system.

2. Then, within that process, find which thread is using the most CPU.

 top -Hp <pid>    Lists the resources used by the threads of the process.

3. Take the thread ID and print that thread's stack information.

 printf "%x\n" <tid>    Converts the thread ID to hexadecimal.
 jstack <pid>    Prints all thread stacks of the process; find the thread whose ID matches the hexadecimal value from the previous step.

4. Finally, locate the specific business method from the thread's stack trace and find the problem in the code logic.

 If the thread stays in the WAITING state for a long time, look for "waiting on <address>": it means the thread is waiting for a lock, and you can use the lock address to locate the thread that holds it.

8.5. Locating excessive memory consumption

Problem analysis: a Java process either creates objects faster than garbage collection can keep up with, or objects cannot be collected because of a memory leak.

1. First, observe the garbage collection behaviour.

 jstat -gc <pid> 1000    Prints GC counts and times every second.
 jmap -histo <pid> | head -20    Shows the 20 object types occupying the most heap memory, for a first look at what is using the memory.

If GC is frequent but each collection reclaims a normal amount of memory, objects are simply being created too fast and memory usage stays high; if very little memory is reclaimed each time, a memory leak is probably preventing objects from being collected.

2. Export a heap memory snapshot.

 jmap -dump:live,format=b,file=/home/myheapdump.hprof <pid>    Dumps the heap to a file.

3. Analyze the dump file offline with VisualVM, find the objects that occupy the most memory, locate the business code that creates them, and pinpoint the problem from the code and the business scenario.

8.6. A data analysis platform performs frequent Full GC

The platform mainly runs scheduled analysis and statistics on user behaviour in an app, supports report export, and uses the CMS GC algorithm.

Data analysts found that pages often lagged while using the system. The jstat command showed that after each Young GC about 10% of the surviving objects entered the old generation.

This was because the Survivor space was set too small: the objects surviving each Young GC did not fit into the Survivor space and were promoted to the old generation early.

The Survivor space was enlarged so that it could hold the objects surviving a Young GC; objects then pass through the Survivor space several times and only enter the old generation after reaching the age threshold.

After the adjustment, the live objects entering the old generation after each Young GC stabilized at only a few hundred KB, and the Full GC frequency dropped sharply.

8.7. OOM in a service integration gateway

The gateway mainly consumes data from Kafka, processes and computes over it, and forwards it to another Kafka queue. The system could only run for a few hours before it had to be restarted.

The heap was exported with jmap and analyzed with the Eclipse MAT tool, which revealed the cause: the code asynchronously logged the data of one service's Kafka topic, that service's data volume was large, and a huge number of objects piled up in memory waiting to be logged, leading to the OOM.

8.8 An authentication system performs frequent, long Full GCs

The system provides various account authentication services externally, but it was often found to be unavailable in use. The Zabbix monitoring platform showed that the system frequently performed long Full GCs, and the old generation was usually far from full when they were triggered. It turned out that System.gc() was being called in the business code.




