Overview: Knowledge summary
JVM knowledge falls into six general directions: the memory model, the class loading mechanism, garbage collection (GC), performance tuning, execution modes, and compiler optimization. The performance tuning part focuses on practical application and hands-on ability, while the execution mode and compiler optimization parts are more theoretical, so for those mainly grasp the knowledge points.
The sections break down as follows:
1 > Memory model part: the roles of the program counter, method area, heap, stack, and native method stack, and what data each stores;
2 > Class loading part: the parent delegation loading mechanism and which classes each of the common class loaders loads;
3 > GC part: the idea and rationale behind generational collection, as well as how the different garbage collection algorithms are implemented and which scenarios suit them;
4 > Performance tuning part: the role of common JVM tuning parameters and the basis for adjusting them; understand which kinds of problems the common JVM analysis tools can diagnose and how to use them;
5 > Execution mode part: understand the pros and cons of interpreted, compiled, and mixed execution modes, and learn about the tiered compilation technique provided since Java 7. You need to know JIT just-in-time compilation and OSR (on-stack replacement), and know the scenarios for the C1 and C2 compilers, where C2 optimizes more aggressively for server mode. On the new-technology front, look at the Java-implemented Graal compiler shipped with Java 10.
6 > Compiler optimization part: the compilation process of the front-end compiler javac, the AST (abstract syntax tree), and compile-time versus run-time optimization. Common compiler optimization techniques include common subexpression elimination, method inlining, escape analysis, stack allocation, and synchronization (lock) elimination. Knowing these helps you write compiler-friendly code.
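As a concrete illustration of escape-analysis-friendly code, here is a minimal sketch (class and method names are invented for this example): the Point allocated inside distance() never escapes the method, so HotSpot's escape analysis (enabled by default via -XX:+DoEscapeAnalysis) may scalar-replace it and avoid a heap allocation entirely.

```java
public class EscapeDemo {
    static final class Point {
        final double x, y;
        Point(double x, double y) { this.x = x; this.y = y; }
    }

    // The Point is created, used, and dropped entirely within this method:
    // it never escapes, so the JIT may allocate its fields in registers or
    // on the stack instead of the heap.
    static double distance(double x, double y) {
        Point p = new Point(x, y);
        return Math.sqrt(p.x * p.x + p.y * p.y);
    }

    public static void main(String[] args) {
        System.out.println(distance(3, 4)); // 5.0
    }
}
```

Returning the Point (or storing it in a field) would make it escape and force a heap allocation, which is why small, method-local helper objects are often effectively free.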
The content of JVM is relatively concentrated, but the mastery of knowledge depth is highly required. It is suggested to focus on strengthening before the interview.
JVM memory
1. Detail – JVM memory model
The JVM memory model refers primarily to the runtime data areas, which consist of five parts.
The stack, also known as the method stack, is thread-private. As a thread executes each method, it creates a stack frame that stores the local variables, operand stack, dynamic link, method exit, and so on. A frame is pushed when the method is called and popped when the method returns.
The native method stack is similar to the VM stack and stores information about the native methods a thread executes. The difference is that Java methods execute on the VM stack, while native methods use the native method stack.
The program counter holds the position of the bytecode being executed by the current thread, and each thread works with its own independent counter. The program counter serves the execution of Java methods; while a native method executes, its value is undefined (empty).
The stack, the native method stack, and the program counter are all thread-exclusive.
The heap is the largest chunk of memory managed by the JVM. It is shared by all threads and holds object instances; almost all objects are allocated here. An OutOfMemoryError (OOM) is raised when no heap memory is available. Based on object lifetimes, the JVM manages heap memory in generations, and the garbage collector handles the collection of objects.
The method area is also shared by all threads and is known as the non-heap area. It stores data such as class information loaded by the virtual machine, constants, static variables, and code compiled by the just-in-time compiler.
The permanent generation in JDK 1.7 and the metaspace in JDK 1.8 are both implementations of the method area.
When answering questions on this topic, cover two main points: the function of each area, and which areas are shared across threads versus thread-exclusive.
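To make the stack's role concrete, here is a minimal sketch (class name invented) in which unbounded recursion keeps pushing frames onto the current thread's stack until a StackOverflowError is raised; the maximum depth depends on the -Xss stack size.

```java
public class StackDemo {
    static int depth = 0;

    // Each call pushes one more stack frame onto the current thread's stack.
    static void recurse() {
        depth++;
        recurse();
    }

    public static void main(String[] args) {
        try {
            recurse();
        } catch (StackOverflowError e) {
            // Only this thread's stack overflowed; the heap is untouched.
            System.out.println("stack overflowed after " + depth + " frames");
        }
    }
}
```

The exact depth printed varies by JVM and -Xss setting, which is itself a useful talking point: stack size is per-thread and configurable.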
2. Details – JMM memory visibility
The JMM is the Java memory model, which is distinct from the JVM memory model. The main goal of the JMM is to define the access rules for variables in a program. As shown in the figure, all shared variables are stored in main memory. Each thread has its own working memory, which holds copies of the variables it uses from main memory; a thread must read and write variables in its own working memory and may not operate on main memory directly.
In multi-threaded interaction, suppose thread A assigns to a shared variable and thread B then reads it. A's modification happens in A's own working memory and is invisible to B; only after A writes the value back to main memory, and B re-reads it from main memory into its own working memory, can B see the change. Because of instruction reordering, this write-then-read order can also be disturbed.
So the JMM needs to provide guarantees of atomicity, visibility, and ordering.
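A minimal sketch of the visibility problem and its volatile fix (class name invented; without volatile on stop, the worker could spin forever on a stale value cached in its working memory):

```java
public class VisibilityDemo {
    // volatile forces reads from and writes to main memory, so the main
    // thread's update becomes visible to the worker thread.
    static volatile boolean stop = false;

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (!stop) {
                // busy-wait until the main thread's write becomes visible
            }
        });
        worker.start();
        Thread.sleep(100);   // let the worker enter its loop
        stop = true;         // volatile write: flushed back to main memory
        worker.join();       // returns because the worker sees stop == true
        System.out.println("worker observed the update and stopped");
    }
}
```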
3. Details – JMM guarantees
This section covers how the JMM guarantees atomicity, visibility, and ordering.
The JMM guarantees that reads and writes of primitive types other than long and double are atomic. The synchronized keyword also provides an atomicity guarantee, implemented through two bytecode instructions, monitorenter and monitorexit.
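A minimal sketch (class name invented) of the atomicity guarantee: counter++ is a read-modify-write sequence that is not atomic on its own, but wrapping it in a synchronized method (compiled down to monitorenter/monitorexit) makes the final total deterministic:

```java
public class AtomicityDemo {
    static int counter = 0;

    // synchronized brackets the body with monitorenter/monitorexit,
    // making the read-increment-write sequence atomic.
    static synchronized void increment() {
        counter++;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 10_000; j++) increment();
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        // Always 40000; without synchronized, lost updates could make
        // the total smaller.
        System.out.println(counter);
    }
}
```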
The JMM guarantees visibility in two ways: through synchronized and through volatile. Writes to a volatile variable are flushed back to main memory synchronously, and reads of a volatile variable are reloaded from main memory, ensuring that different threads always see the variable's latest value.
The JMM guarantees ordering mainly through volatile and the set of happens-before rules. Volatile's other role is to forbid instruction reordering, thereby ensuring that reads and writes of the variable stay ordered.
The happens-before principle includes a set of rules such as:
- The program-order rule: within a single thread, execution must preserve semantic serialization;
- The monitor lock rule: a lock must be unlocked before it can be locked again;
- The transitivity of happens-before; the thread start, interrupt, and termination rules are also included.
Class loading mechanism
1. Explain the class loading mechanism
Loading a class means reading the bytecode of a compiled .class file into memory, placing it in the method area, and creating the corresponding Class object.
Class loading is divided into loading, linking, and initialization, where linking includes three steps: verification, preparation, and resolution. Looking at the dark green part in the top half of the figure, let's break them down one by one:
- Loading is the process of bringing the file into memory: the bytecode file is located by its fully qualified name, and a Class object is created from it.
- Verification validates the class file contents, ensuring that the Class file meets the requirements of the current VM and does not compromise VM safety. It mainly includes four kinds of checks: file format verification, metadata verification, bytecode verification, and symbolic reference verification.
- Preparation is the memory allocation phase. Memory is allocated for class variables, that is, the static variables of the class, and their initial values are set. Note that these initial values are zero values such as 0 or null, not the values assigned in the code; those assignments happen during initialization. Static variables that are final constants are the exception: their values are determined at compile time and assigned directly during preparation.
- Resolution resolves fields, interfaces, and methods. It is essentially the process of replacing symbolic references in the constant pool with direct references, where a direct reference is a pointer to the target, a relative offset, and so on.
- Finally, initialization mainly completes the execution of static blocks and the assignment of static variables. This is the last stage of class loading. If the parent class of the loaded class has not been initialized, the parent class is initialized first.
Initialization occurs only when the class is actively used: when an instance of the class is created, when a static method or static variable of the class is accessed, when Class.forName() reflectively loads the class, or when a subclass is initialized.
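The active-use rules above can be sketched as follows (class names invented). Reading a compile-time constant does not count as active use, while reading an ordinary static variable does:

```java
class Lazy {
    static {
        // Runs exactly once, at class initialization time.
        System.out.println("Lazy initialized");
    }
    static final int CONST = 42;   // compile-time constant, inlined by javac
    static int value = 7;          // assigned during class initialization
}

public class InitDemo {
    public static void main(String[] args) {
        // NOT active use: javac copies the constant into InitDemo's own
        // constant pool, so Lazy stays uninitialized here.
        System.out.println(Lazy.CONST);
        // Active use: the static block runs now, before the read completes.
        System.out.println(Lazy.value);
    }
}
```

Running this prints 42 first, then "Lazy initialized", then 7, which makes the distinction between the two kinds of static access easy to demonstrate.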
The life cycle of a class runs from loading, through the creation and use of instances, to the unloading and collection of the class object when it is no longer used. Note that classes loaded by the Java virtual machine's three built-in class loaders are never unloaded during the lifetime of the VM; only classes loaded by user-defined class loaders can be unloaded.
2. Explain the class loader
Java provides three built-in class loaders: the bootstrap class loader, the extension class loader, and the application class loader, also known as the system loader. The orange text on the right of the figure shows each loader's corresponding directory: the bootstrap loader loads classes from JAVA_HOME/lib, the extension loader from JAVA_HOME/lib/ext, and the application loader from the classpath.
In addition, you can customize class loaders.
Java class loading uses the parent delegation model: when asked to load a class, a class loader first delegates the request to its parent loader. If that parent in turn has a parent, delegation continues upward until it reaches the top-level bootstrap class loader, as shown by the blue up arrows in the figure. If a parent loader can complete the load, it returns successfully; if it cannot, the child loader attempts the load itself.
One benefit of this parent delegation model is that it avoids loading the same class more than once, and it also prevents the Java core API from being tampered with.
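The delegation chain can be inspected directly (class name invented; note that on Java 9+ the middle loader is called the platform loader rather than the extension loader, and the bootstrap loader appears as null because it is implemented natively):

```java
public class LoaderDemo {
    public static void main(String[] args) {
        ClassLoader app = LoaderDemo.class.getClassLoader();
        // Application (system) loader: loads from the classpath.
        System.out.println("app:    " + app);
        // Its parent: extension loader on Java 8, platform loader on 9+.
        System.out.println("parent: " + app.getParent());
        // Core classes like String are loaded by the bootstrap loader,
        // which is reported as null.
        System.out.println("String: " + String.class.getClassLoader());
    }
}
```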
III. Other knowledge points
1. Explain generational recycling in detail
As mentioned earlier, Java's heap memory is managed in generations, mainly for the sake of garbage collection. This is based on two observations: first, most objects soon become unused; second, some objects do not become useless immediately, but do not last very long either.
The heap is divided into the young generation, the old generation, and the permanent generation.
1 > Young generation: mainly stores newly created objects. It is divided into an Eden area and two Survivor areas. Most objects are allocated in Eden; when Eden fills up, surviving objects are copied alternately between the two Survivor areas, and objects that survive a certain number of collections are promoted to the old generation.
2 > Old generation: stores long-lived objects promoted from the young generation.
3 > Permanent generation: mainly stores class information and similar content. "Permanent generation" here refers to this way of dividing objects, not specifically to the PermGen of JDK 1.7 or the metaspace of JDK 1.8.
The JVM provides different garbage collection algorithms based on the characteristics of the young and old generations. By type, these algorithms can be divided into reference counting, copying, and mark-sweep.
Reference counting determines whether an object is still in use by the number of references to it, but it cannot solve the problem of circular references.
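A toy sketch of the circular-reference problem (all names invented; the JVM does not actually use reference counting):

```java
public class CycleDemo {
    static class RcObject {
        int refCount = 0;
        RcObject ref;               // one outgoing reference

        void pointTo(RcObject target) {
            ref = target;
            target.refCount++;      // "retain" the target
        }
    }

    static RcObject a = new RcObject();
    static RcObject b = new RcObject();

    public static void main(String[] args) {
        a.pointTo(b);
        b.pointTo(a);               // cycle: a <-> b
        // Even if nothing else referenced a and b, each still counts one
        // incoming reference from the other, so a pure reference-counting
        // collector would never free either of them.
        System.out.println(a.refCount + " " + b.refCount);
    }
}
```

Tracing collectors avoid this because they start from GC roots: an unreachable cycle is simply never marked.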
The copying algorithm requires two memory spaces of equal size, FROM and TO. Objects are allocated only in the FROM space; when collection happens, the surviving objects are copied into the TO space and the FROM space is emptied in one go. Its drawback is low memory utilization.
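The FROM/TO mechanics can be sketched with a toy model (names invented; a real collector copies raw memory and discovers liveness by tracing GC roots rather than taking a live set as a parameter):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CopyingGcDemo {
    static List<Object> from = new ArrayList<>();   // allocation space
    static List<Object> to = new ArrayList<>();     // evacuation space

    static void collect(List<Object> liveSet) {
        for (Object o : from) {
            if (liveSet.contains(o)) {
                to.add(o);          // survivors are copied into TO
            }
        }
        from.clear();               // FROM is emptied wholesale
        List<Object> tmp = from;    // the two spaces swap roles
        from = to;
        to = tmp;
    }

    public static void main(String[] args) {
        Object live = new Object();
        from.add(live);
        from.add(new Object());     // garbage: nothing keeps it alive
        collect(Arrays.asList(live));
        System.out.println(from.size() + " survivor(s) after collection");
    }
}
```

Note how emptying FROM in one step is what makes the algorithm fast and fragmentation-free, at the cost of keeping half the space idle.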
The mark-sweep algorithm works in two phases: marking live objects, then sweeping away the objects no longer in use. Its drawback is that it produces memory fragmentation.
Among the collectors the JVM provides, Serial, ParNew, and Parallel Scavenge use the copying algorithm, while CMS, G1, and ZGC are mark-based: CMS uses mark-sweep, and G1 and ZGC mark live objects and copy them between regions.
These algorithms will not be expanded on further in this article.
Summary: Interview research points and bonus points
1. Jvm-related interview research points
First, you need to understand both the JVM memory model and the Java memory model.
Second, understand the class loading process and the parent delegation mechanism.
Third, understand memory visibility and how the Java memory model guarantees atomicity, visibility, and ordering.
Fourth, understand the characteristics, execution process, and application scenarios of the common GC algorithms. For example, G1 suits scenarios with a maximum pause-time target, and ZGC suits large-heap services on 64-bit systems.
Fifth, know the commonly used JVM parameters and understand how adjusting different parameters affects behavior and which scenarios they apply to, for example the number of concurrent GC threads, biased locking settings, and so on.
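As one hedged example, a launch command combining several commonly used HotSpot flags might look like this (the sizes and values below are placeholders, not recommendations; always measure before and after changing them):

```shell
# -Xms/-Xmx             fixed heap size, avoids resize pauses
# -Xmn                  young generation size
# -Xss                  per-thread stack size
# -XX:MaxGCPauseMillis  G1 pause-time goal
# -XX:ConcGCThreads     concurrent GC thread count
java -Xms4g -Xmx4g -Xmn1g -Xss512k \
     -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -XX:ConcGCThreads=4 \
     -XX:+HeapDumpOnOutOfMemoryError \
     -jar app.jar
```

Being able to explain why each flag is set, rather than just listing them, is what interviewers look for.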
2. Related bonus points
Pay attention to these bonus points if you want to make a better impression on your interviewer:
- First, an in-depth understanding of compiler optimization shows the interviewer your appetite for technical depth: for example, knowing how to exploit stack allocation to reduce GC pressure when programming, or how to write code suitable for inline optimization.
- Second, it is even better if you have experience or ideas for dealing with real problems, since interviewers like candidates with hands-on skills: for example, having solved a full-GC problem or tracked down a memory leak.
- Third, JVM tuning practices or ideas tailored to specific scenarios can have a surprising effect: for example, how to adjust GC parameters to minimize GC pause time in a high-concurrency, low-latency scenario, or how to maximize throughput for a queue processor.
- Fourth, awareness of the latest JVM technology trends will also impress your interviewer: for example, understanding the efficient implementation principles of ZGC, or the characteristics of GraalVM.
In short, only by mastering the JVM test points above can you handle the interview with ease. I hope that after reading this article you will be well prepared to get the offer you want.