A systematic and comprehensive summary of JVM knowledge

Preface

As I understand it, the JVM is an advanced topic in Java. I first learned it from blogs, and then went over the key points again with the book Understanding the Java Virtual Machine in Depth. After half a month of study, here are my impressions.

I feel that learning through blogs is actually enough, because blog posts go straight to the core of the JVM and are mostly substantive. Their only drawback is that they give few real-world use cases.

As for learning the JVM from the book, it really is too thick, and I do not recommend it to beginners. The first five chapters are substantive, but much of the later material can be skipped or treated as background knowledge. Chapters such as “Virtual Machine Bytecode Execution Engine” and “Front-End Compilation and Optimization” are mostly rote interview material, and I feel they can wait until you actually need them. However, the book works well as a supplement to blogs, because many of its practical examples are very good and some key topics, such as the garbage collectors, are explained in detail.

Now let’s sum up the knowledge in this series.

Class loading process

Loading Process introduction

If the JVM wants to execute a .class file, the file must first be loaded by a class loader, which acts like a porter, moving .class files into the JVM.

Key knowledge:

  • Java source files are compiled into .class bytecode files.
  • Bytecode files are brought into the JVM by the class loader.
  • The JVM has five main runtime areas. The method area and the heap are shared between threads, so they have thread-safety issues. The virtual machine stack, the native method stack, and the program counter are private to each thread, so they have none. JVM tuning mainly revolves around the heap and the stack.

Class loading process

Class loading includes five phases: loading, verification, preparation, resolution, and initialization. Of these, loading, verification, preparation, and initialization start in a fixed order, while resolution does not: in some cases it can begin after initialization, in order to support runtime binding (also known as dynamic binding or late binding) in the Java language. Also note that the phases start in sequence; they do not necessarily proceed or complete in sequence, because they are usually interleaved, with one phase invoking or activating another while it is still executing.

  • Loading: finds and loads the binary data of the class, and creates a java.lang.Class object for it in the Java heap.
  • Linking: consists of three parts: verification, preparation, and resolution. 1) Verification: checks the file format, metadata, bytecode, and symbolic references; 2) Preparation: allocates memory for the class’s static variables and sets them to default values; 3) Resolution: converts symbolic references in the class to direct references.
  • Initialization: assigns the correct initial values to the class’s static variables.
  • Use: the program uses the class, for example through objects created with new.
  • Unloading: the class is garbage collected.
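The gap between loading and initialization can be observed from ordinary Java code. The following is a minimal sketch, not part of the original article: the class names InitDemo and Lazy are hypothetical. It loads a class with the three-argument Class.forName and initialize=false, then triggers initialization by reading a static field.

```java
import java.util.ArrayList;
import java.util.List;

class InitDemo {
    // Records events so the order of loading and initialization is visible.
    static final List<String> EVENTS = new ArrayList<>();

    static class Lazy {
        // Holds the default 0 after preparation; 42 is assigned during initialization.
        static int value = 42;

        static {
            EVENTS.add("Lazy initialized");
        }
    }

    static List<String> run() {
        EVENTS.clear();
        try {
            // initialize=false: the class is loaded, but its static initializer does not run.
            Class<?> c = Class.forName(InitDemo.class.getName() + "$Lazy", false,
                    InitDemo.class.getClassLoader());
            EVENTS.add("loaded " + c.getSimpleName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        // Reading a non-constant static field is an active use and triggers initialization
        // (only the first time; a class is initialized at most once per class loader).
        EVENTS.add("value = " + Lazy.value);
        return new ArrayList<>(EVENTS);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

On the first call, the "Lazy initialized" event appears between the load event and the field read, which matches the order of the phases listed above.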

Class loader

Classes are also loaded in a priority order. Listed from the top of the hierarchy down, the class loaders are:

  • Bootstrap ClassLoader: loads rt.jar
  • Extension ClassLoader: loads the extension JAR packages
  • App ClassLoader: loads the JAR packages on the specified classpath
  • Custom ClassLoader: a user-defined class loader
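The hierarchy can be inspected at runtime by following getParent() from any class’s loader. This is a small sketch with a hypothetical class name (LoaderChain). The loader class names it prints depend on the JDK version (for example, newer JDKs report a platform loader where older ones report an extension loader), so no fixed output is shown.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChain {
    // Walks from the loader of the given class up through getParent().
    // The Java API represents the bootstrap loader as null.
    static List<String> chain(Class<?> type) {
        List<String> names = new ArrayList<>();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            names.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        names.add("bootstrap (null)");
        return names;
    }

    public static void main(String[] args) {
        // Application classes start at the app loader and end at the bootstrap loader.
        chain(LoaderChain.class).forEach(System.out::println);
        // Core classes such as String are loaded by the bootstrap loader directly.
        System.out.println(chain(String.class));
    }
}
```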

Garbage collection

How to determine whether an object is dead

In general, there are two ways to determine whether an object is dead:

  • Reference counting algorithm: adds a reference counter to each object. The counter increases by 1 each time the object is referenced somewhere and decreases by 1 each time a reference becomes invalid. When the counter is 0, the object is no longer referenced.
  • Reachability analysis algorithm: searches along the reference chains starting from a set of root nodes called “GC Roots”. Objects on a reference chain are not collected.

As shown in the figure above, green objects that are on GC Roots’ reference chain are not collected by the garbage collector, and gray objects that are not on GC Roots’ reference chain are considered recyclable.
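Reachability analysis amounts to a graph traversal from the roots. The sketch below is a simplified model and not how a real JVM represents objects: object names are strings, references are map entries, and Reachability and sampleGraph are hypothetical names. The sample graph includes two objects that reference only each other, a case a reference counter would never bring down to 0 but which reachability analysis correctly treats as garbage.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class Reachability {
    // Returns every object reachable from the roots by following references.
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots) {
        Set<String> marked = new HashSet<>(roots);
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            for (String target : refs.getOrDefault(current, Collections.<String>emptyList())) {
                if (marked.add(target)) {   // add() returns false if already marked
                    pending.push(target);
                }
            }
        }
        return marked;
    }

    // root -> a -> b is live; x and y only reference each other and are unreachable.
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", Arrays.asList("a"));
        refs.put("a", Arrays.asList("b"));
        refs.put("x", Arrays.asList("y"));
        refs.put("y", Arrays.asList("x"));
        return refs;
    }

    public static void main(String[] args) {
        System.out.println(reachable(sampleGraph(), Arrays.asList("root")));
    }
}
```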

Garbage collection algorithm

Mark-sweep algorithm

As the name suggests, the mark-sweep algorithm marks invalid objects and then clears them.

Copying algorithm

The mark-copy algorithm splits the Java heap into two halves and uses only one of them at a time. During garbage collection, all surviving objects are moved to the other half.

Mark-compact algorithm

The mark-compact algorithm is a compromise: it marks objects in the same way as the previous two algorithms. After marking, however, the live objects are moved to one end of the heap, and everything outside the live objects is cleaned up directly. This avoids memory fragmentation and wastes no heap space. However, all user threads are suspended during every collection, and collecting old-generation objects in particular takes longer, which is very bad for the user experience.
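The “move live objects to one end” step described above can be sketched on a toy heap. This is only an illustration under simplifying assumptions: the heap is an array of slots, the marking result is given as a boolean array, and MarkCompact is a hypothetical name. A real collector also has to update every reference to a moved object, which is omitted here.

```java
import java.util.Arrays;

class MarkCompact {
    // heap[i] == null means a free slot; live[i] says whether slot i survived marking.
    // Packs the live objects at the start of the array and returns how many there are.
    static int compact(String[] heap, boolean[] live) {
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && live[i]) {
                heap[next++] = heap[i];   // slide the live object toward the front
            }
        }
        // Everything after the last live object is reclaimed in one step.
        Arrays.fill(heap, next, heap.length, null);
        return next;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D"};
        boolean[] live = {true, false, true, false};
        int count = compact(heap, live);
        System.out.println(count + " live: " + Arrays.toString(heap));
    }
}
```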

Garbage collector

Serial collector

The Serial collector is the most basic and oldest collector. It is a single-threaded collector. With Serial, all application threads are suspended when the heap space is cleared, whether for Minor or Full GC.

ParNew collector

The ParNew collector is essentially a multithreaded parallel version of the Serial collector. Apart from using multiple threads for garbage collection, the rest of its behavior, including all the control parameters available to the Serial collector, the collection algorithm, Stop The World, object allocation rules, and reclamation policies, is exactly the same as the Serial collector’s.

Parallel Scavenge collector

The Parallel Scavenge collector is a new-generation collector based on the mark-copy algorithm. It is a multithreaded collector that collects in parallel and is very similar to ParNew.

The goal of Parallel Scavenge is to achieve a controllable throughput. Throughput is the ratio of the time the processor spends running user code to the processor’s total elapsed time. If the virtual machine completes a task in 100 minutes of user code plus garbage collection, of which garbage collection takes 1 minute, the throughput is 99%.
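The throughput definition above is simple arithmetic, written out here with the numbers from the example. The class and method names are hypothetical.

```java
class Throughput {
    // Throughput = time running user code / (time running user code + GC time)
    static double throughput(double userMinutes, double gcMinutes) {
        return userMinutes / (userMinutes + gcMinutes);
    }

    public static void main(String[] args) {
        // 100 minutes in total, of which 1 minute is garbage collection.
        double value = throughput(99, 1);
        System.out.printf("throughput = %.0f%%%n", value * 100);
    }
}
```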

Serial Old collector

Serial Old is the old-generation version of the Serial collector. It is also a single-threaded collector and uses the mark-compact algorithm.

Parallel Old collector

Parallel Old is the old-generation version of the Parallel Scavenge collector. It collects in parallel with multiple threads and is based on the mark-compact algorithm.

CMS collector

The CMS collector was designed to eliminate the long pauses of the Full GC cycles in the Parallel and Serial collectors. During a Minor GC, the CMS collector pauses all application threads and performs garbage collection with multiple threads.

Garbage collector comparison

Runtime data area

What is a runtime data area?

When a Java program runs, a separate memory region is allocated for the JVM, and the JVM in turn partitions it into the runtime data area. The runtime data area can be roughly divided into five parts:

Java Heap

The stack handles execution; the heap handles storage. The virtual machine stack runs the code, and the heap stores the data.

The Java heap has the following characteristics:

  • It stores objects created with new, not primitive-type values or object references.
  • Because so many objects are created here, the garbage collector works mainly in this area.
  • It is shared between threads and is therefore not thread-safe.
  • An OutOfMemoryError can occur.

In fact, the Java heap can be divided into the new generation and the old generation, and the new generation can be further divided into the Eden region and the Survivor 1 and Survivor 2 regions.

JVM Stacks

The Java virtual machine stack is also an area of focus for developers. Again, straight to the essentials:

  • The Java virtual machine stack is thread-private: each thread owns a virtual machine stack with the same lifetime as the thread.
  • The virtual machine stack describes the memory model of Java method execution: each method execution creates a stack frame that stores the local variable table, operand stack, dynamic linking, method exit, and other information. The span from a method being called until it finishes executing corresponds to a stack frame being pushed onto and popped off the virtual machine stack.
  • It holds the primitive data types (boolean, byte, char, short, int, float, long, double), object references (the reference type, which is not the same as the object itself; depending on the virtual machine implementation, it may be a pointer to the object’s starting address, or it may point to a handle representing the object or to another location associated with the object), and the returnAddress type (which points to the address of a bytecode instruction).
  • Two exceptions are possible in this area: a StackOverflowError is thrown if a thread requests a stack depth greater than the virtual machine allows; if the virtual machine stack can be expanded dynamically, an OutOfMemoryError is thrown when enough memory cannot be allocated during expansion.
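The first of these exceptions is easy to reproduce with unbounded recursion. The sketch below uses a hypothetical class name (StackDepth). The depth it reports is not a fixed number: it depends on the thread stack size (the -Xss option), the frame size, and the JVM, so no expected value is shown.

```java
class StackDepth {
    private static int depth;

    private static void recurse() {
        depth++;
        recurse();   // no base case: every call pushes another stack frame
    }

    // Returns how many frames were pushed before the JVM threw StackOverflowError.
    static int measure() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // By the time control reaches here, the stack has unwound again.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("frames before StackOverflowError: " + measure());
    }
}
```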

Native Method Stacks

The role of the native method stack is very similar to that of the virtual machine stack. The difference is that the virtual machine stack serves the Java methods (that is, bytecode) executed by the virtual machine, while the native method stack serves the native methods used by the virtual machine.

The virtual machine specification does not mandate the language, usage, or data structure of methods in the native method stack, so each virtual machine is free to implement it. Some virtual machines (such as the Sun HotSpot virtual machine) simply merge the native method stack with the virtual machine stack. Like the virtual machine stack, the native method stack can throw StackOverflowError and OutOfMemoryError.

Method Area

The method area is also a focus area with the following characteristics:

  • It is shared between threads, so it is not thread-safe.
  • It is used to store information about classes that have been loaded by the virtual machine, constants, static variables, code compiled by the just-in-time compiler, and so on.
  • OutOfMemoryError is thrown when the method area cannot meet memory allocation requirements.

Program Counter Register

All it does is record where the current thread is executing. Thus, when the thread regains CPU execution, it executes directly from the recorded location, and branches, loops, jumps, and exception handling depend on the program counter.

JVM heap memory

Heap memory structure

The Java heap can be divided into the new generation and the old generation, and the new generation can be further divided into the Eden region and the Survivor 1 and Survivor 2 regions. For the specific size ratios, refer to the figure.

GC type

  • Minor GC/Young GC: garbage collection of the new generation.
  • Major GC/Old GC: garbage collection of the old generation.
  • Full GC: garbage collection of the entire Java heap and the method area.

How Minor GC works

Typically, newly created objects are stored in the Eden region of the new generation. When the first Minor GC is triggered, the surviving objects in Eden are moved to one of the Survivor regions. The next time a Minor GC is triggered, the surviving objects in Eden, together with those in that Survivor region, are moved to the other Survivor region. As you can see, only one of the two Survivor regions is in use at a time, so only one Survivor region’s worth of space is wasted.

How Full GC works

The old generation stores long-lived objects. When it fills up, a Full GC is triggered, which is the GC we hear about most often. During a Full GC, all threads are stopped and wait for the GC to complete. For applications with strict response requirements, Full GC should be minimized to avoid response timeouts.

GC logs

Enabling GC logs

To save effort, here is content taken directly from online sources:

Understanding GC Logs

Minor GC logs:

Full GC log:

JVM commands

The Sun JDK monitoring and troubleshooting commands are jps, jstat, jmap, jhat, jstack, and jinfo.

jps

jps (JVM Process Status Tool) displays all HotSpot virtual machine processes on the specified system.

jstat

jstat (JVM Statistics Monitoring Tool) is a command for monitoring the state of a running virtual machine. It shows runtime data such as class loading, memory, garbage collection, and JIT compilation.
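Some of the garbage collection figures that jstat reports can also be read from inside the program through the standard java.lang.management API. The sketch below only illustrates that API, with a hypothetical class name (GcStats); it is not how jstat itself is implemented, and the collector names it prints depend on which collector the JVM is running.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcStats {
    // One line per collector: its name, how many collections it has run,
    // and the accumulated collection time in milliseconds.
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        snapshot().forEach(System.out::println);
    }
}
```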

jmap

Dumps the heap to a file, which can then be analyzed.

jhat

jhat (JVM Heap Analysis Tool) is used together with jmap to analyze the dump files that jmap generates. jhat has a built-in miniature HTTP/HTML server, so once the analysis results are generated you can view them in a browser. Note that a dump file generated on a server is usually copied to a local or other machine for analysis, because running jhat is time-consuming and resource-intensive.

jstack

jstack generates a thread snapshot of the Java virtual machine at the current moment. A thread snapshot is the collection of method stacks being executed by each thread in the JVM. Its main purpose is to locate the cause of a long pause in a thread, such as deadlocks, infinite loops, and long waits on external resources. By looking at the call stack of each thread when it pauses, you can see what the unresponsive thread is doing in the background, or what resources it is waiting for.
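A rough in-process counterpart to a thread snapshot is available through standard Java APIs. The following sketch uses a hypothetical class name (ThreadSnapshot) and is only an approximation of what jstack prints: it shows each live thread’s state and top stack frame, and asks ThreadMXBean whether any threads are deadlocked.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.Map;

class ThreadSnapshot {
    // Number of threads currently involved in a deadlock (0 when none is detected).
    static int deadlockedThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads();   // null when there is no deadlock
        return ids == null ? 0 : ids.length;
    }

    // Prints each live thread with its state and top stack frame; returns the thread count.
    static int dump() {
        Map<Thread, StackTraceElement[]> traces = Thread.getAllStackTraces();
        for (Map.Entry<Thread, StackTraceElement[]> entry : traces.entrySet()) {
            StackTraceElement[] frames = entry.getValue();
            String top = frames.length > 0 ? frames[0].toString() : "(no frames)";
            Thread thread = entry.getKey();
            System.out.println(thread.getName() + " [" + thread.getState() + "] " + top);
        }
        return traces.size();
    }

    public static void main(String[] args) {
        dump();
        System.out.println("deadlocked threads: " + deadlockedThreadCount());
    }
}
```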

Performance monitoring tools

jconsole

JConsole (Java Monitoring and Management Console) is a monitoring and management console built into the JDK since Java 5. It is a GUI performance monitoring tool based on JMX (Java Management Extensions) and is used to monitor memory, threads, and classes in the JVM. JConsole uses the JVM’s extension mechanism to capture and present information about the performance and resource consumption of applications running in the virtual machine.

Overview: a graph of heap memory usage, threads, classes, and CPU usage.

Threads: the visual equivalent of the jstack command. You can also click “Detect deadlocks” to check whether there are deadlocks between threads.

VisualVM

VisualVM (All-in-One Java Troubleshooting Tool) is one of the most powerful runtime monitoring and troubleshooting programs, and for a long time it was the main JVM troubleshooting tool officially promoted by Oracle.

VisualVM has one big advantage over some third-party tools: it does not require the monitored program to run with a special agent, so it is versatile and has little impact on the application’s actual performance, which makes it directly usable in production environments.

Visual GC is a commonly used feature that has to be installed as a plug-in. It clearly shows the memory changes in the young and old generations, as well as GC frequency and GC time. This plug-in is really cool!

The Monitor home page is essentially a set of graphs for CPU, memory, classes, and threads, and a heap dump can be performed from there.

Finally, heap dump:

JVM tuning

Choose the appropriate garbage collector

  • Single CPU core: the Serial garbage collector is your only choice.
  • Multiple CPU cores with a focus on throughput: choose the PS+PO combination.
  • Multiple CPU cores with a focus on user pause times, on JDK 1.6 or 1.7: choose CMS.
  • Multiple CPU cores with a focus on user pause times, on JDK 1.8 or higher, with more than 6 GB of memory available to the JVM: choose G1.

Parameter configuration:

// Enable the Serial garbage collector
-XX:+UseSerialGC
// Enable PS+PO: Parallel Scavenge for the new generation, Parallel Old for the old generation
-XX:+UseParallelOldGC
// Enable the CMS garbage collector (old generation)
-XX:+UseConcMarkSweepGC
// Enable the G1 garbage collector
-XX:+UseG1GC

Adjusting memory Size

Symptom: Garbage collection is very frequent.

Reason: If memory is too small, frequent garbage collection is required to free up enough space to create new objects, so the effect of increasing heap memory size is very obvious.

Note: If the number of garbage collections is very high and the number of objects collected at a time is very small, then it is not that the memory is too small, but that memory leaks are causing the objects not to be collected, resulting in frequent GC.

Parameter configuration:

// Initial heap size
Option 1: -Xms2g
Option 2: -XX:InitialHeapSize=2048m
// Maximum heap size
Option 1: -Xmx2g
Option 2: -XX:MaxHeapSize=2048m
// New generation size
Option 1: -Xmn512m
Option 2: -XX:MaxNewSize=512m
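Whether heap size settings took effect can be checked from inside the program with the standard Runtime API. This is a minimal sketch with a hypothetical class name (HeapSizes). The values it prints depend on the machine and the options the JVM was started with, and the maximum reported by the runtime may be slightly lower than the configured value.

```java
class HeapSizes {
    // Upper bound the JVM will try to use for the heap (governed by the maximum heap setting).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap memory currently reserved by the JVM; it can grow up to the maximum.
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    // Part of the committed heap that is not currently occupied by objects.
    static long freeHeapBytes() {
        return Runtime.getRuntime().freeMemory();
    }

    public static void main(String[] args) {
        long mb = 1024 * 1024;
        System.out.println("max heap (MB):       " + maxHeapBytes() / mb);
        System.out.println("committed heap (MB): " + committedHeapBytes() / mb);
        System.out.println("free heap (MB):      " + freeHeapBytes() / mb);
    }
}
```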

Set a pause time that matches your expectations

Symptom: The program stutters intermittently.

Reason: If there is no precise pause time setting and garbage collector is throughput oriented, garbage collection times can be erratic.

Note: Do not set unrealistic pause times, as a shorter one means more GC cycles to collect the original amount of garbage.

Parameter configuration:

// Target GC pause time; the garbage collector tries by various means to achieve it
-XX:MaxGCPauseMillis

Adjust the memory area size ratio

Symptom: GC is frequent in one area, normal in others.

Reason: If the corresponding region is running out of space and frequent GC is required to free up space, you can adjust the size ratio of the corresponding region if the JVM heap memory cannot be increased.

Note: It may not be a lack of space, but a memory leak that causes memory not to be reclaimed. This leads to frequent GC.

Parameter configuration:

// Size ratio of the Survivor and Eden regions
-XX:SurvivorRatio=6   // each Survivor region to Eden is 1:6; the two Survivor regions together to Eden are 2:6
// Ratio of the new generation to the old generation
-XX:NewRatio=4        // new generation : old generation = 1:4, that is, the old generation takes 4/5 of the whole heap; default value = 2

Adjust the age at which objects are promoted to the old generation

Symptom: Old-generation GC is frequent, and many objects are collected each time.

Reason: If the promotion age is small, new-generation objects enter the old generation quickly, so the old generation accumulates more objects, many of which become collectable a short time later. In that case, you can raise the promotion age so that objects do not enter the old generation so easily, which relieves the frequent GC caused by insufficient old-generation space.

Note: Raising the promotion age keeps these objects in the new generation longer, which may increase the frequency of new-generation GC. Copying these objects repeatedly may also make new-generation GC take longer.

Configuration parameters:

// Minimum GC age for promotion: the minimum age at which a new-generation object enters the old generation; default value 7
-XX:InitialTenuringThreshold=7

Adjust the criteria for large objects

Symptom: Old-generation GC is frequent, many objects are reclaimed each time, and individual objects are relatively large.

Cause: If many large objects are allocated directly in the old generation, it fills up easily and GC becomes frequent. You can adjust the size criterion for objects that go directly into the old generation.

Note: Once these large objects are allocated in the new generation, the frequency and duration of new-generation GC may increase.

Configuration parameters:

// Largest object the new generation will accept; anything larger is allocated directly in the old generation. 0 means no limit.
-XX:PretenureSizeThreshold=1000000

Adjust the timing of GC triggering

Symptom: With CMS or G1, Full GC happens often and the program stutters badly.

Reason: In the concurrent phases of G1 and CMS, business threads and garbage collection threads run at the same time, which means business threads keep creating new objects while garbage is being collected. The GC therefore needs to reserve part of the memory for those new objects. If the remaining memory cannot hold them, the JVM stops the concurrent collection and suspends all business threads (STW) so that garbage collection can finish. You can adjust the point at which GC is triggered (for example, when the old generation is 60% full) so that enough space is reserved for the objects created by business threads.

Note: Triggering GC earlier increases the frequency of old-generation GC.

Configuration parameters:

// Old-generation occupancy at which CMS starts collecting; the default is 68%. Lower it if Serial Old pauses occur frequently
-XX:CMSInitiatingOccupancyFraction
// Occupancy threshold for an old region to be included in a G1 mixed garbage collection cycle; the default occupancy is 65%
-XX:G1MixedGCLiveThresholdPercent=65

Adjust the JVM local memory size

Symptom: The number of GCs, the GC time, and the objects collected are all normal, and heap memory is sufficient, but an OOM is still reported.

Cause: Besides the heap, the JVM also has off-heap memory, also called local memory. Running low on local memory does not by itself trigger a GC; local memory is only reclaimed as a side effect of a heap GC, and when a local memory allocation cannot be satisfied, an OOM is reported.

Note: In addition to the symptom above, the exception message may be OutOfMemoryError: Direct buffer memory. Besides adjusting the local memory size, you can also catch this exception and manually trigger a GC (System.gc()).

Configuration parameters:

-XX:MaxDirectMemorySize
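The memory limited by this option is the kind allocated by direct NIO buffers. The sketch below uses a hypothetical class name (DirectMemory) and allocates only a small buffer, so it does not reproduce the OOM. It simply shows which API call draws on local memory rather than the heap.

```java
import java.nio.ByteBuffer;

class DirectMemory {
    // Allocates a buffer outside the Java heap. Allocations like this count
    // against the direct memory limit rather than the heap size.
    static ByteBuffer allocate(int bytes) {
        return ByteBuffer.allocateDirect(bytes);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = allocate(1024);
        System.out.println("direct: " + buffer.isDirect() + ", capacity: " + buffer.capacity());
    }
}
```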

JVM Debugging

The site responded slowly to a surge in traffic

1. Hypothesis: Testing was fast in the test environment, but things slowed down in production, so the suspicion was that garbage collection was pausing the business threads.

2. Locating the problem: To confirm the hypothesis, we checked with the jstat -gc command and saw that the JVM was performing GC very frequently and that GC was taking a very long time. The basic conclusion was that the high GC frequency caused frequent pauses of business threads and therefore slow page responses.

3. Solution: Because of the high traffic, objects were created very quickly, so the heap filled up easily and GC was frequent. The problem was that the new generation was too small, so the JVM memory was increased, initially from 2 GB to 16 GB.

4. Second problem: After the memory was increased, ordinary requests did become faster, but a new problem appeared: irregular, intermittent stutters, each lasting much longer than before.

5. Hypothesis: The previous optimization had increased the memory, so the suspicion was that the larger memory made each single GC take longer, which caused the intermittent stutters.

6. Locating the problem: The jstat -gc command showed that the number of FGCs was indeed not high, but the time spent on FGC was very high. According to the GC log, a single FGC took tens of seconds.

7. Solution: The JVM uses the PS+PO combination by default, and both the marking and collection phases of PS+PO are STW, so the more memory there is, the longer garbage collection takes. To avoid long single GCs, a concurrent collector was needed. Because the JDK version at the time was 1.7, the CMS garbage collector was chosen, and an expected pause time was set based on the earlier garbage collection behavior. After the release, the website no longer stuttered.

The OOM is generated when the background data is exported

Problem description: The back-office system of ** company occasionally throws an OOM exception caused by heap memory overflow.

1. Because it happened only occasionally, the first assumption was simply that heap memory was insufficient, so the heap was increased from 4 GB to 8 GB.

2. The problem was still not solved, so the only option was to start from the heap information. The -XX:+HeapDumpOnOutOfMemoryError parameter was enabled to obtain a heap dump file.

3. VisualVM was used to analyze the heap dump file. It showed that String objects occupied the most memory. I wanted to trace the String objects to find where they were referenced, but the dump file was too large and the tool froze every time I traced into it. The share taken by String objects also looked fairly normal, so at first the problem was not identified there, and I looked for a breakthrough in the thread information instead.

4. Analyzing the threads, I first found several running business threads, followed them up one by one, read the code, and found one method that caught my attention: exporting order information.

5. The order export method may handle tens of thousands of records. It first queries the order information from the database and then writes it into Excel, a process that generates a large number of String objects.

6. To verify this conjecture, I logged in to the back office to test it. During testing, I found that the front end did not gray out the export button after it was clicked, so the button could be clicked again and again. Because exporting order data is inherently slow, users probably saw no response after clicking and kept clicking. As a result, a large number of requests reached the back end, the heap filled with order objects and Excel objects, and because the method ran so slowly none of these objects could be collected for a while, so memory eventually ran out.

7. Once the cause was known, the fix was easy. In the end no JVM parameters were adjusted: the front-end export button was grayed out after a click and re-enabled only once the back end responded, and unnecessary fields were dropped from the order query to reduce the size of the generated objects. That solved the problem.

If you found this useful, please give it a like. For more articles, follow the WeChat official account “Lou Zai advanced road” so you don’t miss them.