Earlier articles in this series covered the rest of the JVM, with the ultimate goal of not only understanding the basics of the JVM but also preparing for JVM performance tuning. This article walks you through JVM performance tuning.
Performance tuning
Performance tuning has multiple levels, such as: architecture tuning, code tuning, JVM tuning, database tuning, operating system tuning, and so on.
Architecture tuning and code tuning are the foundation of JVM tuning, with architecture tuning having the greatest impact on the system.
Performance tuning generally follows these steps: identify the optimization objectives, locate the performance bottlenecks, tune, collect data with monitoring and statistical tools, and confirm that the objectives have been met.
When to perform JVM tuning
JVM tuning needs to be considered when:
- Old generation heap usage keeps rising toward the configured maximum;
- Full GC occurs frequently;
- GC pauses are too long (more than 1 second);
- Memory errors such as OutOfMemoryError occur;
- The application uses a local cache that occupies a large amount of memory;
- System throughput and response performance are low or degrading.
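Several of the symptoms above can be observed from inside a running JVM. The following is a minimal sketch, assuming the standard java.lang.management API; the class name GcSnapshot and the helper usedPercent are illustrative and not part of the original article. It prints the usage of each heap memory pool and the cumulative collection count and time of each collector. Pool and collector names depend on the garbage collector in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper for illustration only.
class GcSnapshot {

    // Usage as a percentage of the maximum, or -1 if the maximum is undefined.
    static double usedPercent(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            return -1.0;
        }
        return 100.0 * usedBytes / maxBytes;
    }

    public static void main(String[] args) {
        // Heap pools (Eden, Survivor, old generation; names vary by collector).
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            System.out.printf("%-28s used=%d KB (%.1f%% of max)%n",
                    pool.getName(), usage.getUsed() / 1024,
                    usedPercent(usage.getUsed(), usage.getMax()));
        }
        // Cumulative GC counts and times since JVM start.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%-28s count=%d time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Sampling these values periodically gives a rough view of whether old generation usage keeps climbing or whether collections are becoming more frequent; a GC log remains the more precise source.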
Basic principles of JVM tuning
JVM tuning is a tool, but not all problems can be solved by JVM tuning. Therefore, there are a few principles to follow when performing JVM tuning:
- Most Java applications do not require JVM tuning;
- Most GC problems are caused at the code level;
- Before going live, first consider setting the machine's JVM parameters to the best values for it;
- Reduce the number of objects created (code level);
- Reduce the use of global variables and large objects (code level);
- Prioritize architecture and code tuning, with JVM tuning as a last resort (code and architecture level);
- Analyzing GC behavior and optimizing code is better than adjusting JVM parameters (code level).
It follows from these principles that the most effective optimizations are at the architecture and code level, while JVM tuning is the last resort, a final squeeze on the server configuration.
JVM tuning targets
The ultimate goal of tuning is to get the application to handle more throughput with minimal hardware consumption. JVM tuning focuses on collection performance optimization for the garbage collector, allowing applications running on a virtual machine to achieve greater throughput with less memory and latency.
- Latency: short GC pauses and low GC frequency;
- Low memory usage;
- High throughput.
Improving any one of these almost always comes at the cost of the others, so prioritize them according to their importance to the business.
Quantified goals for JVM tuning
Some reference examples of quantified targets for JVM tuning are shown below:
- Heap memory usage <= 70%;
- Old Generation memory usage <= 70%;
- Average GC pause <= 1 second;
- Full GC count of 0, or an average interval between Full GCs >= 24 hours;
Note: JVM tuning quantification goals vary from application to application.
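The reference goals above can be written down as a simple check. This is a sketch only: the class TuningGoals and its method are hypothetical, the thresholds are the reference values listed above, and the inputs are assumed to have been measured elsewhere (for example from GC logs).

```java
// Hypothetical checker; thresholds are the article's reference values, not universal limits.
class TuningGoals {
    static final double MAX_HEAP_USAGE = 0.70;             // heap usage <= 70%
    static final double MAX_OLD_GEN_USAGE = 0.70;          // old generation usage <= 70%
    static final double MAX_AVG_PAUSE_SECONDS = 1.0;       // average pause <= 1 second
    static final double MIN_FULL_GC_INTERVAL_HOURS = 24.0; // Full GC at most once a day

    static boolean meetsGoals(double heapUsage, double oldGenUsage, double avgPauseSeconds,
                              long fullGcCount, double avgFullGcIntervalHours) {
        // Either no Full GC at all, or Full GCs that are at least 24 hours apart.
        boolean fullGcOk = fullGcCount == 0
                || avgFullGcIntervalHours >= MIN_FULL_GC_INTERVAL_HOURS;
        return heapUsage <= MAX_HEAP_USAGE
                && oldGenUsage <= MAX_OLD_GEN_USAGE
                && avgPauseSeconds <= MAX_AVG_PAUSE_SECONDS
                && fullGcOk;
    }

    public static void main(String[] args) {
        // Made-up sample measurements for illustration.
        System.out.println(meetsGoals(0.65, 0.60, 0.2, 0, 0.0)); // prints true
        System.out.println(meetsGoals(0.65, 0.85, 0.2, 3, 6.0)); // prints false
    }
}
```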
Steps for JVM tuning
In general, JVM tuning can be done through the following steps:
- Analyze GC logs and dump files to decide whether tuning is required and to locate the bottleneck;
- Determine the quantified goals for JVM tuning;
- Determine the JVM parameters to tune (adjusting from the previously used JVM parameters);
- Optimize memory, latency, and throughput, in that order;
- Compare the results before and after tuning;
- Continue to analyze and adjust until you find a suitable JVM parameter configuration;
- Once the most suitable parameters are found, apply them to all servers and keep tracking the results.
Some of the steps above require several iterations. Generally, tuning starts by satisfying the program's memory usage requirements, then its latency requirements, and finally its throughput requirements. Each step is the basis for the next, so the order cannot be reversed.
JVM parameters
The most important tool for JVM tuning is JVM parameters. Let’s start by looking at JVM parameters.
The -XX parameters are called unstable parameters. Setting them carelessly can easily change JVM performance and make the JVM very unstable; set properly, they can greatly improve the performance and stability of the JVM.
The syntax rules for unstable parameters are as follows.
Boolean parameter values:
- -XX:+<option>: '+' enables the option
- -XX:-<option>: '-' disables the option
Numeric parameter values:
- -XX:<option>=<number>: sets the option to a numeric value, which can be followed by a unit: 'm' or 'M' for megabytes, 'k' or 'K' for kilobytes, 'g' or 'G' for gigabytes. 32K is the same size as 32768.
String parameter values:
- -XX:<option>=<string>: sets the option to a string value, usually used to specify a file, a path, or a list of commands. For example: -XX:HeapDumpPath=./dump.core
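To make the three syntax forms concrete, here is a small sketch of a parser for them. It is not how the JVM itself parses options; the class XxOption and its methods are hypothetical, and the size parser only handles the k, m, and g suffixes described above.

```java
import java.util.Locale;

// Hypothetical parser for the -XX syntax forms; for illustration only.
final class XxOption {
    final String name;
    final String value; // "true"/"false" for boolean flags, raw text otherwise

    private XxOption(String name, String value) {
        this.name = name;
        this.value = value;
    }

    static XxOption parse(String arg) {
        if (!arg.startsWith("-XX:")) {
            throw new IllegalArgumentException("not a -XX option: " + arg);
        }
        String body = arg.substring(4);
        if (body.startsWith("+")) {
            return new XxOption(body.substring(1), "true");  // -XX:+<option>
        }
        if (body.startsWith("-")) {
            return new XxOption(body.substring(1), "false"); // -XX:-<option>
        }
        int eq = body.indexOf('=');
        if (eq < 0) {
            throw new IllegalArgumentException("missing '=' in: " + arg);
        }
        return new XxOption(body.substring(0, eq), body.substring(eq + 1));
    }

    // Converts a size such as "32k", "256M" or "4g" to bytes.
    static long parseSize(String text) {
        String s = text.trim().toLowerCase(Locale.ROOT);
        if (s.isEmpty()) {
            throw new IllegalArgumentException("empty size");
        }
        long multiplier = 1L;
        char last = s.charAt(s.length() - 1);
        if (last == 'k') {
            multiplier = 1024L;
        } else if (last == 'm') {
            multiplier = 1024L * 1024L;
        } else if (last == 'g') {
            multiplier = 1024L * 1024L * 1024L;
        }
        if (multiplier != 1L) {
            s = s.substring(0, s.length() - 1);
        }
        return Long.parseLong(s) * multiplier;
    }

    public static void main(String[] args) {
        XxOption flag = parse("-XX:+DisableExplicitGC");
        System.out.println(flag.name + " = " + flag.value); // DisableExplicitGC = true
        System.out.println(parseSize("32k"));                // 32768
    }
}
```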
JVM parameter parsing and tuning
Consider the following example parameters:
-Xmx4g -Xms4g -Xmn1200m -Xss512k -XX:NewRatio=4 -XX:SurvivorRatio=8 -XX:PermSize=100m -XX:MaxPermSize=256m -XX:MaxTenuringThreshold=15
This example applies to Java 7 and earlier; the permanent generation parameters -XX:PermSize and -XX:MaxPermSize are no longer valid in Java 8. This was covered in the previous chapter.
Parameter analysis:
- -Xmx4g: sets the maximum heap size to 4GB.
- -Xms4g: sets the initial heap size to 4GB.
- -Xmn1200m: sets the young generation size to 1200MB. With a fixed heap size, enlarging the young generation shrinks the old generation. This value has a significant impact on system performance; Sun officially recommends 3/8 of the entire heap.
- -Xss512k: sets the stack size of each thread. From JDK 5.0 the stack size of each thread is 1MB; before that it was 256K. Adjust it according to the amount of memory the application's threads need. With the same physical memory, reducing this value allows more threads to be created. However, the operating system limits the number of threads in a process, so they cannot be created indefinitely; the empirical limit is about 3000-5000.
- -XX:NewRatio=4: sets the ratio of the young generation (Eden plus the two Survivor spaces) to the old generation (excluding the permanent generation). With a value of 4, the ratio of the young generation to the old generation is 1:4, so the young generation occupies 1/5 of the whole heap.
- -XX:SurvivorRatio=8: sets the ratio of Eden to the Survivor spaces in the young generation. With a value of 8, the ratio of the two Survivor spaces to Eden is 2:8, so one Survivor space accounts for 1/10 of the young generation.
- -XX:PermSize=100m: sets the initial permanent generation size to 100MB.
- -XX:MaxPermSize=256m: sets the maximum permanent generation size to 256MB.
- -XX:MaxTenuringThreshold=15: sets the maximum tenuring age of objects. If set to 0, young generation objects enter the old generation directly without passing through the Survivor spaces, which can improve efficiency for applications with many long-lived objects. With a larger value, young generation objects are copied between the Survivor spaces more times, which lengthens their stay in the young generation and increases the probability that they are reclaimed there.
If the sizes of the young generation, old generation, and permanent generation are not specified, the VM selects appropriate values automatically and adjusts them based on system overhead.
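The ratios described above reduce to simple arithmetic. The sketch below works through the example values; the class GenerationLayout is hypothetical, sizes are in MB, and a real JVM aligns and rounds region sizes, so the actual values may differ slightly. It assumes, as is commonly the case on HotSpot, that an explicit -Xmn takes precedence over -XX:NewRatio when both are given.

```java
// Hypothetical arithmetic helper; a real JVM aligns region sizes, so treat results as estimates.
class GenerationLayout {

    // Young generation size implied by NewRatio (old : young = newRatio : 1).
    static long youngFromNewRatio(long heapMb, int newRatio) {
        return heapMb / (newRatio + 1);
    }

    // Eden size implied by SurvivorRatio (Eden : one Survivor = survivorRatio : 1).
    static long edenMb(long youngMb, int survivorRatio) {
        return youngMb * survivorRatio / (survivorRatio + 2);
    }

    // Size of one of the two Survivor spaces.
    static long survivorMb(long youngMb, int survivorRatio) {
        return youngMb / (survivorRatio + 2);
    }

    public static void main(String[] args) {
        long heapMb = 4096;  // -Xmx4g
        long youngMb = 1200; // -Xmn1200m
        int survivorRatio = 8;

        System.out.println("Eden     = " + edenMb(youngMb, survivorRatio) + " MB");     // 960 MB
        System.out.println("Survivor = " + survivorMb(youngMb, survivorRatio) + " MB"); // 120 MB each
        System.out.println("Old      = " + (heapMb - youngMb) + " MB");                 // 2896 MB

        // Without -Xmn, NewRatio=4 alone would give the young generation 1/5 of the heap.
        System.out.println("Young (NewRatio=4) = " + youngFromNewRatio(heapMb, 4) + " MB"); // 819 MB
    }
}
```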
Adjustable parameters:
-Xms: specifies the initial heap size, which defaults to 1/64 of physical memory (less than 1GB).
-Xmx: specifies the maximum heap size. By default, when free heap memory exceeds 70% (adjustable with the MaxHeapFreeRatio parameter), the JVM shrinks the heap down to the minimum set by -Xms.
-Xmn: specifies the size of the young generation, including Eden and the two Survivor spaces.
-XX:SurvivorRatio=1: the ratio of Eden to a Survivor space is 1:1.
-XX:MaxDirectMemorySize=1G: direct memory. If java.lang.OutOfMemoryError: Direct buffer memory is reported, this value can be increased.
-XX:+DisableExplicitGC: disables explicit calls to System.gc() at run time, which would otherwise trigger a Full GC.
Note: the GC triggered by Java RMI can be controlled by configuring -Dsun.rmi.dgc.server.gcInterval=86400 to set the trigger interval.
-XX:CMSInitiatingOccupancyFraction=60: the old generation occupancy threshold at which collection starts; the default value is 68.
-XX:ConcGCThreads=4: the number of concurrent threads of the CMS garbage collector. The recommended value is the number of CPU cores.
-XX:ParallelGCThreads=8: the number of threads of the young generation parallel collector.
-XX:MaxTenuringThreshold=10: sets the maximum tenuring age of objects. If set to 0, young generation objects enter the old generation directly without passing through the Survivor spaces, which can improve efficiency for applications with many long-lived objects. With a larger value, young generation objects are copied between the Survivor spaces more times, which lengthens their stay in the young generation and increases the probability that they are reclaimed there.
-XX:CMSFullGCsBeforeCompaction=4: specifies after how many Full GCs the tenured space is compacted.
-XX:CMSMaxAbortablePrecleanTime=500: the abortable-preclean phase ends once it has run for this long.
If you are concerned about performance overhead, you should try to set the initial value of the permanent generation to the same value as the maximum value, because the permanent generation resizing requires FullGC.
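Before changing any of these parameters it helps to confirm what the running JVM was actually started with. The following is a minimal sketch using the standard RuntimeMXBean and Runtime APIs; the class name JvmArgs is illustrative.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper that reports the arguments the current JVM was started with.
class JvmArgs {

    // JVM options passed on the command line (for example -Xmx4g, -XX:+DisableExplicitGC).
    static List<String> inputArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    // Maximum heap the JVM will attempt to use, in MB.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        for (String arg : inputArguments()) {
            System.out.println(arg);
        }
        System.out.println("max heap: " + maxHeapMb() + " MB");
    }
}
```

Running it with, say, `java -Xmx4g JvmArgs.java` should list `-Xmx4g` among the arguments; the reported maximum heap can be slightly below the configured value depending on the collector.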
Memory optimization example
When the JVM is running stably and a Full GC is triggered, we usually get the following information:
The GC log above shows the heap usage and GC time of the whole application at the time of the Full GC. For more accurate figures, collect several samples and take the average, or estimate from the longest Full GC. In the figure above, the old generation occupies 93168KB (about 93MB) after the Full GC, which is the live data of the old generation. The rest of the heap is sized according to the following rules.
- Java heap: parameters -Xms and -Xmx; recommended size is 3-4 times the old generation footprint after a Full GC.
- Permanent generation: -XX:PermSize and -XX:MaxPermSize; recommended size is 1.2-1.5 times the permanent generation footprint after a Full GC.
- Young generation: -Xmn; recommended size is 1-1.5 times the old generation footprint after a Full GC.
- Old generation: 2-3 times the old generation footprint after a Full GC.
Based on the above rules, the parameters are defined as follows:
java -Xms373m -Xmx373m -Xmn140m -XX:PermSize=5m -XX:MaxPermSize=5m
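The heap and young generation values in that command line can be reproduced from the sizing rules. The sketch below assumes a factor of 4 for the heap and 1.5 for the young generation (the upper ends of the recommended ranges) and treats 93168KB as 93.168MB, which matches the article's rounding; the class HeapSizing is hypothetical. The permanent generation is left out because its footprint after the Full GC is not stated in the text.

```java
// Hypothetical sizing helper; the factors are the upper ends of the article's recommended ranges.
class HeapSizing {

    // Scales the live old generation size by a factor and rounds up to whole MB.
    static long scaleMb(double liveOldMb, double factor) {
        return (long) Math.ceil(liveOldMb * factor);
    }

    public static void main(String[] args) {
        double liveOldMb = 93.168;             // old generation footprint after Full GC

        long heapMb = scaleMb(liveOldMb, 4.0);  // 373
        long youngMb = scaleMb(liveOldMb, 1.5); // 140
        long oldMb = heapMb - youngMb;          // 233, about 2.5 times the live data

        System.out.println("-Xms" + heapMb + "m -Xmx" + heapMb + "m -Xmn" + youngMb + "m");
        System.out.println("old generation: " + oldMb + " MB");
    }
}
```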
Latency optimization example
For latency-oriented tuning, the first step is to understand the latency requirements and the indicators that can be tuned.
- Acceptable average pause time: compared with the measured Minor GC duration.
- Acceptable Minor GC frequency: compared with the measured Minor GC frequency.
- Maximum acceptable pause time: compared with the duration of the worst-case Full GC.
- Maximum acceptable pause frequency: essentially the Full GC frequency.
Among these, the average pause time and the maximum pause time matter most for user experience. The data to collect for these indicators includes the Minor GC duration, the number of Minor GCs, the worst-case Full GC duration, and the frequency of worst-case Full GCs.
As shown above, the average duration of a Minor GC is 0.069 seconds, and a Minor GC occurs about every 0.389 seconds.
The larger the young generation, the longer each Minor GC takes and the less often it occurs. To reduce the duration, reduce the size of the young generation; to reduce the frequency, increase it.
Here the pause time is reduced by shrinking the young generation by 10%. In this process, the sizes of the old generation and the permanent generation should be kept unchanged. After tuning, the parameters change as follows:
java -Xms359m -Xmx359m -Xmn126m -XX:PermSize=5m -XX:MaxPermSize=5m
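The adjusted values follow from the earlier configuration (373MB heap, 140MB young generation) by shrinking only the young generation and leaving the old generation at its previous size. A short sketch of that arithmetic, with a hypothetical class name:

```java
// Hypothetical helper reproducing the 10% young generation reduction.
class YoungGenShrink {

    // Reduces a size in MB by the given percentage (integer arithmetic).
    static long shrink(long mb, int percent) {
        return mb * (100 - percent) / 100;
    }

    public static void main(String[] args) {
        long heapMb = 373;
        long youngMb = 140;
        long oldMb = heapMb - youngMb;         // 233, kept unchanged

        long newYoungMb = shrink(youngMb, 10); // 126
        long newHeapMb = oldMb + newYoungMb;   // 359

        System.out.println("java -Xms" + newHeapMb + "m -Xmx" + newHeapMb
                + "m -Xmn" + newYoungMb + "m");
    }
}
```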
Throughput tuning
Throughput tuning is primarily based on the throughput requirements of the application, which should have a comprehensive throughput metric derived from the requirements and testing of the entire application.
Compare the measured throughput with the goal. If the gap is around 20%, you can modify the parameters, add more memory, and tune again from the beginning. If the gap is large, you need to look at the application as a whole, check whether the design and the goal are consistent, and reevaluate the throughput goal.
For the garbage collector, the goal of throughput tuning is to avoid or minimize Full GC and stop-the-world compacting garbage collection (CMS), since both reduce application throughput. Try to reclaim as many objects as possible during Minor GC so that objects are not promoted to the old generation too quickly.
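One rough way to see how much GC costs is to compute the share of JVM uptime not spent in garbage collection. Note that this is a proxy (often called GC throughput) and not the application-level throughput metric discussed above, which has to come from load testing. The sketch assumes the standard java.lang.management API; the class GcThroughput and the gapPercent helper are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper; GC throughput here is only a proxy for application throughput.
class GcThroughput {

    // Percentage of time not spent in GC.
    static double throughputPercent(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 100.0;
        }
        return 100.0 * (uptimeMillis - gcMillis) / uptimeMillis;
    }

    // Gap between a measured value and its target, as a percentage of the target.
    static double gapPercent(double measured, double target) {
        return 100.0 * (target - measured) / target;
    }

    // Accumulated collection time of all collectors since JVM start.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 if undefined for this collector
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC throughput: %.2f%%%n", throughputPercent(totalGcMillis(), uptime));
        // Made-up example: 800 requests/s measured against a goal of 1000 requests/s.
        System.out.printf("gap to goal: %.1f%%%n", gapPercent(800, 1000));
    }
}
```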
Tuning tool
With the GCViewer log analysis tool, you can see at a glance which areas need tuning. The analysis can be approached from the following aspects:
Memory: analyze the memory usage of the Total heap, Tenured heap, Young heap, and other indicators. In theory, the lower the memory usage, the better.
Pause: analyze the GC pause, Full GC pause, and Total pause indicators. In theory, the fewer the GCs and the shorter their duration, the better.