03. What do you know about the JVM class loading mechanism?
The entire life cycle of a class, from the time it is loaded into the virtual machine's memory to the time it is unloaded, goes through seven stages: Loading, Verification, Preparation, Resolution, Initialization, Using, and Unloading. Verification, preparation and resolution are collectively referred to as Linking. The seven stages begin in the order listed here.
The order of five of these stages (loading, verification, preparation, initialization and unloading) is fixed, and the class loading process must start them step by step in this order. The resolution stage is different: in some cases it can start after the initialization stage, in order to support the runtime binding feature of the Java language (also called dynamic binding or late binding).
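To make the point that loading and initialization are separate stages more concrete, here is a minimal sketch. The class names LoadVsInitDemo and Lazy are hypothetical, and the sketch assumes a standard JDK where Class.forName(name, initialize, loader) is available. It loads a class without initializing it, then forces initialization:

```java
class LoadVsInitDemo {
    // Set to true by Lazy's static initializer, i.e. only when Lazy is initialized.
    static boolean initialized = false;

    // Hypothetical helper class used only for this illustration.
    static class Lazy {
        static {
            LoadVsInitDemo.initialized = true;
        }
    }

    // Returns {flag after loading only, flag after forced initialization}.
    static boolean[] run() throws ClassNotFoundException {
        String name = LoadVsInitDemo.class.getName() + "$Lazy";
        ClassLoader loader = LoadVsInitDemo.class.getClassLoader();

        Class.forName(name, false, loader); // load (and possibly link), but do not initialize
        boolean afterLoad = initialized;

        Class.forName(name, true, loader);  // now request initialization as well
        boolean afterInit = initialized;

        return new boolean[] {afterLoad, afterInit};
    }

    public static void main(String[] args) throws Exception {
        boolean[] result = run();
        System.out.println("initialized after loading only: " + result[0]);
        System.out.println("initialized after forced initialization: " + result[1]);
    }
}
```

On a first run in a fresh JVM the static initializer of Lazy should only execute at the second call, which is why the first flag stays false.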
1. Class loading time
Consider: Under what circumstances does the JVM load a class?
Loading is the first step. When we run a class from IDEA or directly from the command line (such as First.java), we actually start a JVM process; the JVM then loads the bytecode of the class (First.class) into memory through the class loader and calls the main method to start execution. Suppose the code in the main method is:
    public class First {
        public static void main(String[] args) {
            // Create an instance of the Second class
            Second second = new Second();
        }
    }
The JVM checks whether the Second class has already been loaded into memory. If it has not, the JVM triggers the class loader to load the Second.class bytecode from disk into memory.
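The behavior described above can be observed indirectly. The following is a minimal sketch (FirstUseDemo and its nested Second class are hypothetical stand-ins for First and Second) that counts how often the static initializer of a class runs. Note that it observes initialization rather than loading itself; the specification leaves the exact moment of loading to the implementation, but a class must be loaded before it can be initialized.

```java
import java.util.Arrays;

class FirstUseDemo {
    // Incremented each time Second's static initializer runs.
    static int initCount = 0;

    // Hypothetical stand-in for the Second class from the text.
    static class Second {
        static {
            FirstUseDemo.initCount++;
        }
    }

    // Returns the counter before any use, after the first `new`, and after the second `new`.
    static int[] run() {
        int before = initCount;       // Second has not been used yet
        new Second();                 // first active use: triggers initialization
        int afterFirst = initCount;
        new Second();                 // class is already initialized, nothing happens again
        int afterSecond = initCount;
        return new int[] {before, afterFirst, afterSecond};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run())); // prints [0, 1, 1]
    }
}
```

The counter only moves on the first `new Second()`, which matches the idea that a class is brought into the JVM once, on first use.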
2. Class loading stage
Loading is a phase of the Class Loading process. During the load phase, the virtual machine needs to do three things:
(PS: method area is understood as an area of JVM memory)
1) Get the binary byte stream that defines a class through its fully qualified name. The byte stream does not have to come from a .class file on disk; common sources include:
- Reading from a ZIP package (JAR, WAR, EAR, etc.)
- Fetching from the network, such as a Web Applet
- Generating it dynamically at runtime, as dynamic proxy technology does
- Generating it from other files, such as JSP files
- Reading from a database (rare)
- Reading from an encrypted file
2) Convert the static storage structure represented by the byte stream into the runtime data structure of the method area.
The bytecode of the class is loaded into the method area. Internally, HotSpot uses the C++ structure instanceKlass to describe the Java class, and its important fields are:
- _java_mirror is the Java class mirror. For String, for example, it is String.class; its role is to expose the klass to Java code
- _super is the parent class
- _fields is the member variables
- _methods is the methods
- _constants is the constant pool
- _class_loader is the class loader
- _vtable is the virtual method table
- _itable is the interface method table
If the class has a parent class that has not been loaded yet, the parent class is loaded first.
3) Generate a java.lang.Class object that represents the class in memory, and use it as the entry point for accessing all data about the class.
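One of the byte stream sources listed under step 1 is generation at runtime through dynamic proxies. A minimal sketch of that case, using the standard java.lang.reflect.Proxy API, is shown here. The Greeter interface and the class name RuntimeGeneratedClassDemo are hypothetical and exist only for this illustration:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

class RuntimeGeneratedClassDemo {
    // Hypothetical interface; the proxy class implementing it is generated at runtime.
    interface Greeter {
        String greet(String name);
    }

    static Greeter newGreeter() {
        // Every method call on the proxy is routed to this handler.
        InvocationHandler handler = (proxy, method, args) -> "Hello, " + args[0];
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] {Greeter.class},
                handler);
    }

    public static void main(String[] args) {
        Greeter greeter = newGreeter();
        System.out.println(greeter.greet("JVM"));
        // The class behind the proxy has no .class file on disk.
        System.out.println(Proxy.isProxyClass(greeter.getClass()));
        System.out.println(greeter.getClass().getName());
    }
}
```

The printed class name is chosen by the JDK and differs between versions, so it should not be relied on.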
After the loading phase, the binary byte stream from outside the virtual machine is stored in the method area in the format defined by the virtual machine (instanceKlass). The data storage format in the method area is entirely defined by the virtual machine implementation; the Java Virtual Machine Specification does not prescribe a specific data structure for this area. Once the type data is in place in the method area, a java.lang.Class object is instantiated in the Java heap, and this object acts as the external interface through which the program accesses the type data in the method area.
Part of the loading phase and part of the linking phase (such as some of the bytecode file format verification) are interleaved: linking may begin before loading has finished. However, the actions that take place in the middle of the loading phase still belong to the linking phase, and the start times of the two phases still keep their fixed order.
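As a small illustration of the java.lang.Class object serving as the program's entry point to type data, the following sketch queries a few properties through reflection. The class name ClassObjectDemo and its helper methods are hypothetical; the reflection calls themselves are standard JDK API.

```java
import java.util.AbstractList;
import java.util.ArrayList;

class ClassObjectDemo {
    // The parent class recorded in the type data, reached through the Class object.
    static Class<?> superclassOf(Class<?> type) {
        return type.getSuperclass();
    }

    // Every instance of a class loaded by the same loader shares one Class object.
    static boolean sameClassObject() {
        return new ArrayList<String>().getClass() == ArrayList.class;
    }

    public static void main(String[] args) throws Exception {
        Class<?> type = ArrayList.class;
        System.out.println(type.getName());                // java.util.ArrayList
        System.out.println(superclassOf(type).getName());  // java.util.AbstractList
        System.out.println(type.getMethod("size"));        // method metadata
        // Classes loaded by the bootstrap loader typically report null here.
        System.out.println(type.getClassLoader());
        System.out.println(sameClassObject());             // true
    }
}
```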
3. The linking stage of the class
The linking stage includes verification, preparation and resolution. It is not necessary to study these three stages in depth here; there are many details and they are very tedious.
3.1 Verification Phase
Verification is the first step of the linking phase. Its purpose is to ensure that the information contained in the byte stream of the Class file meets all the constraints of the Java Virtual Machine Specification, and that running this information as code will not compromise the security of the virtual machine itself.
There are four kinds of verification: file format verification (for example, checking that the file starts with the magic number 0xCAFEBABE), metadata verification, bytecode verification, and symbolic reference verification.
In short, verification checks whether our .class file complies with the JVM specification and whether it has been tampered with; if it does not pass, the JVM refuses to execute the bytecode.
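The file format check can be provoked deliberately. The following is a minimal sketch, assuming a small custom loader (the names VerifyDemo and RawLoader are hypothetical) that hands arbitrary bytes to the standard ClassLoader.defineClass method. Bytes that do not start with the 0xCAFEBABE magic number should be rejected with a ClassFormatError:

```java
class VerifyDemo {
    // Hypothetical loader that exposes the protected defineClass method.
    static class RawLoader extends ClassLoader {
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    // Returns true if the JVM refuses bytes that are not a valid class file.
    static boolean rejectsGarbage() {
        // Does not begin with CA FE BA BE, so it cannot be a class file.
        byte[] notAClass = {0x00, 0x01, 0x02, 0x03, 0x00, 0x00, 0x00, 0x34};
        try {
            new RawLoader().define("Bogus", notAClass);
            return false;
        } catch (ClassFormatError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("garbage rejected: " + rejectsGarbage());
    }
}
```

This only exercises the first of the four checks; the later checks need a structurally valid class file with bad contents, which is harder to produce in a short example.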
3.2 Preparation
The preparation phase is where memory is formally allocated for the variables defined in the class (that is, static variables, those modified by static) and where their initial values are set.
- Before JDK 7, static variables were stored at the end of instanceKlass; starting with JDK 7, they are stored at the end of _java_mirror (in other words: before JDK 7 they were stored in the method area, from JDK 7 on they are stored in heap memory)
- Allocating a static variable and assigning it are two separate steps. Allocation (of memory) is done in the preparation phase; assignment is done in the initialization phase
There are two other easily confused points to emphasize about the preparation phase. The first is that memory allocation at this point only covers class variables, not instance variables; instance variables are allocated in the Java heap along with the object when it is instantiated.
Second, the initial value here is “normally” the zero value of the data type. Suppose a class variable is defined as:
    public static int value = 123;
The initial value of the variable value after the preparation phase is 0, not 123, because no Java methods have been executed yet. The putstatic instruction that assigns 123 to value is placed in the class constructor <clinit>() method when the program is compiled, so the assignment of 123 to value is not performed until the class initialization phase.
- If the static variable is a final primitive type or a string constant, its value is determined at compile time and the assignment is done in the preparation phase
The “normal case” mentioned above implies that there are “special cases”: if a class field has a ConstantValue attribute in its field attribute table, the variable is initialized in the preparation phase to the value specified by the ConstantValue attribute. Suppose the definition of the class variable value is changed to:
    public static final int value = 123;
javac generates a ConstantValue attribute for value at compile time, and in the preparation phase the virtual machine assigns 123 to value according to the ConstantValue setting.
- If the static variable is final but of a reference type (other than a string constant), the assignment is still done in the initialization phase
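Test code:
    public class Second {
        static int a;                             // no initializer
        static int b = 10;                        // assigned in <clinit>()
        static final int c = 20;                  // compile-time constant
        static final String d = "hello";          // compile-time string constant
        static final User user = new User();      // final, but a reference type
    }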
In the decompiled bytecode, c and d carry a ConstantValue attribute, so they are assigned at the preparation stage.
b and user are assigned during initialization; a has no explicit initializer and keeps its default value 0.
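The difference between preparation and initialization can also be observed from plain Java code, without decompiling. The following is a minimal sketch; PreparationDemo and its field and method names are hypothetical. It relies on static initializers running top to bottom, so the first three fields record what the later fields hold before their own assignment has executed:

```java
class PreparationDemo {
    // These run first inside <clinit>() and capture the state of the fields below.
    static int seenValue = peekValue();
    static int seenConst = peekConst();
    static boolean refWasNull = (peekRef() == null);

    static int value = 123;                  // assigned in <clinit>(), after the peeks
    static final int CONST = 123;            // compile-time constant (ConstantValue)
    static final Object REF = new Object();  // final reference type, assigned in <clinit>()

    static int peekValue() { return value; }
    static int peekConst() { return CONST; }
    static Object peekRef() { return REF; }

    public static void main(String[] args) {
        System.out.println("value seen before assignment: " + seenValue);    // 0
        System.out.println("CONST seen before declaration: " + seenConst);   // 123
        System.out.println("REF was null before assignment: " + refWasNull); // true
        System.out.println("value now: " + value);                           // 123
    }
}
```

The ordinary static field is still at its zero value when it is peeked, the compile-time constant already reads 123, and the final reference field is still null until its initializer runs.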
3.3 Resolution Phase
Resolution is the process by which the virtual machine replaces symbolic references in the constant pool with direct references.
The constant pool concept (background needed first)
There are two types of constants in the constant pool: literals and symbolic references. Literals are close to the Java-language notion of a constant, such as text strings and constant values declared as final. Symbolic references are a concept from compiler theory and mainly include the following types of constants:
- Packages exported or opened by a module
- Fully qualified names of classes and interfaces
- Names and descriptors of fields
- Names and descriptors of methods
- Method handles and method types (Method Handle, Method Type, Invoke Dynamic)
- Dynamically-computed call sites and dynamically-computed constants (Dynamically-Computed Call Site, Dynamically-Computed Constant)
When Java code is compiled by javac, there is no separate "link" step as there is in C and C++; instead, linking is done dynamically when the virtual machine loads the Class file. The Class file does not store the final memory layout of the various methods and fields, so the symbolic references to these fields and methods cannot be used directly by the virtual machine until they are translated at runtime. When the virtual machine loads a class, it obtains the corresponding symbolic references from the constant pool, and these are resolved to concrete memory addresses either when the class is created or at run time.
- Symbolic References: A symbolic reference uses a set of symbols to describe the referenced target. A symbol can be a literal in any form, as long as it can be used to locate the target without ambiguity. Symbolic references are independent of the memory layout implemented by the virtual machine, and the target of the reference is not necessarily already loaded into the virtual machine's memory. The memory layouts of different virtual machine implementations can vary, but they must all accept the same symbolic references, because the literal form of symbolic references is clearly defined in the Class file format of the Java Virtual Machine Specification.
- Direct References: A direct reference is a pointer that points straight at the target, a relative offset, or a handle that can locate the target indirectly. Direct references are tied to the memory layout implemented by the virtual machine. The same symbolic reference generally translates into different direct references on different virtual machine instances. If a direct reference exists, the target of the reference must already exist in the virtual machine's memory.
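The JVM's internal resolution cannot be called directly from Java code, but the java.lang.invoke API offers a loose analogy: it turns a purely textual description (owner class name, member name, descriptor string) into an invocable handle. The sketch below is only that analogy, not the resolution mechanism itself; the class name SymbolicToDirectDemo and the resolve helper are hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class SymbolicToDirectDemo {
    // Turns three strings (the "symbolic" description) into a callable handle.
    static MethodHandle resolve(String owner, String name, String descriptor)
            throws ReflectiveOperationException {
        Class<?> ownerClass = Class.forName(owner);  // the owner must be loaded first
        MethodType type = MethodType.fromMethodDescriptorString(descriptor, null);
        return MethodHandles.lookup().findVirtual(ownerClass, name, type);
    }

    static int lengthOf(String text) throws Throwable {
        // "()I" is the descriptor of a method with no parameters that returns int.
        MethodHandle length = resolve("java.lang.String", "length", "()I");
        return (int) length.invokeExact(text);
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(lengthOf("hello")); // prints 5
    }
}
```

As with a symbolic reference, the three strings mean the same thing on any JVM, while the handle that comes back only makes sense inside the running virtual machine that produced it.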
3.4 Summary
Of these three stages, the one you should pay the most attention to is the preparation stage.
This stage allocates memory for the static variables of the loaded class and sets their initial values.