As we know, Java programs must be compiled into bytecode by javac before the virtual machine can execute them. Class loading is the process of reading compiled bytecode into the JVM’s memory (not only bytecode in .class files; any stream of bytecode can be read into the JVM). While loading a class, the VM verifies, converts, resolves, and initializes the class data. The result is a Java type that the virtual machine can use directly. This process is called the class loading mechanism of the virtual machine. Class loading is an important part of the virtual machine, and it comes up frequently in interviews, so as Java programmers we need to understand it.

To better understand the class loading process, let’s look at a class loading interview question:

// Person.java
public class Person {
    static {
        System.out.println("I'm a person");
    }
}

// Student.java
public class Student extends Person {
    public static String indentity = "Student";
    static {
        System.out.println("I'm a student");
    }
}

// Ryan.java
public class Ryan extends Student {
    static {
        System.out.println("I'm Ryan");
    }
}

Next we write a test class:

// Test.java
public class Test {
    public static void main(String[] args) {
        System.out.println("Ryan.indentity=" + Ryan.indentity);
    }
}

Before looking at the answer below, try to work out what the output will be.

Here is the answer. The code above prints:

I'm a person
I'm a student
Ryan.indentity=Student

Is it the same as your answer? If not, you don’t yet have a clear understanding of the JVM’s class loading mechanism. Let’s take a look at how the JVM loads classes.

First, the class loading process

From the moment a class is loaded into virtual machine memory until it is unloaded, its life cycle passes through Loading, Linking, Initialization, Using, and Unloading. The linking stage in turn consists of Verification, Preparation, and Resolution. The order of the loading, verification, preparation, initialization, and unloading phases is fixed, while the resolution phase is not necessarily so: in some cases it can begin after the initialization phase. Next, let’s take a closer look at the five stages of class loading in a Java virtual machine: loading, verification, preparation, resolution, and initialization.

1. Loading phase

The loading phase is the first phase of the class loading process. In this phase the JVM uses the fully qualified name of the class to obtain the binary byte stream that defines it (the stream may come from a .class file, a ZIP archive, the network, or even be generated at runtime). The JVM converts this byte stream into runtime data structures in the method area and generates a java.lang.Class object in memory to represent the class.

In simple terms, this stage involves reading the bytecode binary stream of a Class into the JVM and generating a Class object that represents that Class.

2. Linking

The linking phase consists of three steps: verification, preparation, and resolution.

(1) Verification

The purpose of this phase is to ensure that the byte stream of the Class file meets the requirements of the current virtual machine and that the data cannot compromise VM security. Verification consists of roughly four stages: file format verification, metadata verification, bytecode verification, and symbolic reference verification.

File format verification

This stage verifies that the byte stream complies with the Class file format specification and can be processed by the current virtual machine.

Metadata verification

This stage performs semantic analysis on the information described by the bytecode to ensure that it conforms to the requirements of the Java Language Specification.

Bytecode verification

This stage analyzes data flow and control flow to determine whether the program semantics are legal and logical. It checks and analyzes the method bodies of the class to ensure that the methods of the verified class do nothing at runtime that endangers virtual machine security.

Symbolic reference verification

This final stage of verification takes place when the virtual machine converts symbolic references into direct references, which happens in the third phase of linking, the resolution phase. Symbolic reference verification can be seen as checking that the information outside the class itself (the various symbolic references in the constant pool) matches.

(2) Preparation

The preparation phase is an important phase in the class loading mechanism. This is where memory is formally allocated and initial values are set for the static variables defined in a class (those declared with the static modifier). Conceptually, the memory for these variables is allocated in the method area. Up to JDK 6, the HotSpot virtual machine implemented the method area with the permanent generation. Starting with JDK 7, class variables are stored in the Java heap together with the Class object, and JDK 8 removed the permanent generation altogether, replacing it with Metaspace in native memory.

Also, two things to note about the preparation phase:

The memory allocated in the preparation phase covers only class variables (static variables), not instance variables. Instance variables are allocated on the Java heap along with the object when it is instantiated.

For example, during the preparation phase the following code allocates memory only for value, not for str.

public class Test {
    public static int value = 123;
    public String str = "123";
}

During the preparation phase, the JVM allocates memory for class variables and gives them an initial value. That initial value is not the value assigned in the code but the zero value of the data type. In the code above, value is 0 after the preparation phase, not 123; the assignment of 123 happens later, in the initialization phase. But if value is also declared final, it holds 123 after preparation, because value is then a constant: javac generates a ConstantValue attribute for it at compile time, and the VM sets value to 123 from that ConstantValue attribute.
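The zero value left by the preparation phase can be observed from inside the class itself. The following is a minimal sketch, not code from the original example: the class name PreparationDemo and the helper methods peekValue and peekConstant are made up for illustration. It relies on static initializers running in source order, so the first two fields capture what the later fields hold before their own initializers run.

```java
class PreparationDemo {
    // These two initializers run first in source order, so they observe the
    // later fields before those fields' own initializers have executed.
    static final int SEEN_VALUE = peekValue();
    static final int SEEN_CONSTANT = peekConstant();

    static int value = 123;           // assigned during the initialization phase
    static final int CONSTANT = 123;  // compile-time constant

    static int peekValue() {
        return value;
    }

    static int peekConstant() {
        return CONSTANT; // javac normally inlines this read as the literal 123
    }

    public static void main(String[] args) {
        System.out.println("value seen early:    " + SEEN_VALUE);    // 0
        System.out.println("constant seen early: " + SEEN_CONSTANT); // 123
        System.out.println("value now:           " + value);         // 123
    }
}
```

Because javac inlines reads of compile-time constants, the second line shows 123 regardless of when the VM applies the ConstantValue attribute. The first line is the informative one: it shows the zero value that value holds before its initializer runs.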

(3) Resolution

The resolution phase is the process by which the virtual machine replaces symbolic references in the constant pool with direct references. For the purposes of this article, it is enough to know that this phase exists.

3. Initialization

Initialization is the last and most important stage of class loading. This is where the Java code written by the user (as bytecode) actually begins to execute. What does that mean? As mentioned earlier, the JVM gives class variables their zero values during the preparation phase; during initialization, class variables are assigned the values declared in the code. The JVM runs static variable initializers and static blocks in the order in which they appear in the source.
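To make the source-order rule concrete, here is a small sketch (the class name ClinitOrderDemo and its fields are hypothetical). A static block may assign to a field declared further down, but that field's own initializer then runs afterwards and overwrites the assignment.

```java
class ClinitOrderDemo {
    static {
        counter = 10;          // legal: assignment to a field declared later
    }

    static int counter = 1;    // runs second and overwrites the 10

    static int doubled;

    static {
        doubled = counter * 2; // runs third and sees counter == 1
    }

    public static void main(String[] args) {
        System.out.println(counter + " " + doubled); // prints "1 2"
    }
}
```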

The Java Virtual Machine Specification does not dictate when the first phase of class loading, loading, must begin, but it strictly defines when initialization must happen. The JVM must initialize a class in the following six situations (loading, verification, and preparation naturally have to take place before initialization):

(1) When the JVM encounters one of the four bytecode instructions new, getstatic, putstatic, or invokestatic and the class has not been initialized, its initialization must be triggered first. The most common Java code that produces these instructions is: instantiating an object with the new keyword, reading or setting a static field of a class (except fields that are final and whose value the compiler has already placed in the constant pool), and calling a static method of a class.

(2) When a method in the java.lang.reflect package makes a reflective call on a class that has not been initialized, the class must be initialized first.

(3) When a class is initialized and its parent class has not been initialized yet, the initialization of the parent class must be triggered first.

(4) When the VM starts, the user specifies a main class to execute (the class containing the main() method), and the VM initializes this main class first.

(5) When using the dynamic language support added in JDK 7, if the final resolution result of a java.lang.invoke.MethodHandle instance is a method handle of kind REF_getStatic, REF_putStatic, or REF_invokeStatic, and the class corresponding to that method handle has not been initialized, it must be initialized first.

(6) When an interface declares a default method (added in JDK 8) and a class implementing that interface is initialized, the interface must be initialized before the class.
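The difference between references that trigger initialization and references that do not can be checked with a short experiment. The sketch below uses made-up names (TriggerLog, Widget, InitTriggerDemo) and records how many initialization events have occurred after each step: creating an array of the type, reading a compile-time constant, and finally calling a static method.

```java
import java.util.ArrayList;
import java.util.List;

class TriggerLog {
    static final List<String> EVENTS = new ArrayList<>();
}

class Widget {
    static final String NAME = "widget"; // compile-time constant

    static {
        TriggerLog.EVENTS.add("Widget initialized");
    }

    static void touch() {
    }
}

class InitTriggerDemo {
    // Computed once, so repeated reads always give the same answer.
    static final List<Integer> COUNTS = observe();

    private static List<Integer> observe() {
        List<Integer> counts = new ArrayList<>();

        Widget[] widgets = new Widget[2];      // array creation is not a trigger
        counts.add(TriggerLog.EVENTS.size());

        String name = Widget.NAME;             // constant read is not a trigger
        counts.add(TriggerLog.EVENTS.size());

        Widget.touch();                        // static method call: condition (1)
        counts.add(TriggerLog.EVENTS.size());

        return counts;
    }

    public static void main(String[] args) {
        System.out.println(COUNTS); // [0, 0, 1]
    }
}
```

Only the static method call initializes Widget. The array creation and the constant read leave the event count at zero.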

Second, class loading sample analysis

With the class loading process in hand, let’s analyze a few examples to understand it more deeply.

1. Sample 1

This is the interview question from the beginning of the article. Ryan.indentity is a class variable declared in Ryan’s parent class Student. According to point (1) of the initialization conditions:

Class initialization is triggered when static fields of a class (except those that are modified by final and have been put into the constant pool by the compiler) are read or set, and when static methods of a class are called.

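Reading Ryan.indentity therefore triggers the initialization of Student, the class that declares the field. Ryan itself is not initialized (whether it is loaded is left to the virtual machine implementation). Initialization condition (3) also applies: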

When initializing a class, if the parent class has not been initialized, the initialization of the parent class must be triggered first.

Therefore, the first class to be initialized is Person, and its static block prints I'm a person first. Student is initialized next, so the second line is I'm a student. Once both classes are initialized, main prints Ryan.indentity=Student. I'm Ryan never appears, because Ryan is never initialized.
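The same reasoning can be verified without reading console output. The sketch below mirrors the Person, Student, Ryan hierarchy with hypothetical classes Base, Mid, and Leaf, and records each static block in a shared list (InitLog is also a made-up helper).

```java
import java.util.ArrayList;
import java.util.List;

class InitLog {
    static final List<String> EVENTS = new ArrayList<>();
}

class Base {
    static {
        InitLog.EVENTS.add("Base");
    }
}

class Mid extends Base {
    static String label = "mid"; // non-final static field, declared here

    static {
        InitLog.EVENTS.add("Mid");
    }
}

class Leaf extends Mid {
    static {
        InitLog.EVENTS.add("Leaf");
    }
}

class PassiveReferenceDemo {
    static List<String> eventsAfterRead() {
        // The field is named through Leaf, but it is declared in Mid,
        // so only Mid (and its parent Base) must be initialized.
        String ignored = Leaf.label;
        return InitLog.EVENTS;
    }

    public static void main(String[] args) {
        System.out.println(eventsAfterRead()); // [Base, Mid]
    }
}
```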

2. Sample 2

The Singleton class is shown below. Try to work out the program’s output yourself before reading the answer.

public class Singleton {
    private static Singleton singleton = new Singleton();
    public static int x;
    public static int y = 0;

    private Singleton() {
        ++x;
        ++y;
        System.out.println("Singleton constructor executes, x = " + x + ", y = " + y);
    }

    public static void main(String[] args) {
        System.out.println("singleton.x = " + singleton.x);
        System.out.println("singleton.y = " + singleton.y);
    }
}

Output result:

Singleton constructor executes, x = 1, y = 1
singleton.x = 1
singleton.y = 0

This output looks odd if you don’t understand how class loading works. x and y both start at 0, and the constructor applies the same “++” operation to each, yet the final values differ. Let’s walk through it, starting with what triggers the initialization.

1) According to the fourth condition that triggers class initialization, the class containing the main method is initialized first when the VM starts. So Singleton goes through the class loading process first.

2) After loading and verification, the virtual machine enters the preparation phase, where it allocates memory for class variables and sets them to their default values. Note that only default (zero) values are assigned in this phase, not the values written in the code. After the preparation phase, the state is equivalent to:

public class Singleton {
    private static Singleton singleton = null;
    public static int x = 0;
    public static int y = 0;
}

3) The initialization phase assigns the values declared in the code to the class variables, in source order. So the Singleton instance is created first and assigned to the singleton field. At that point x and y have not yet been assigned, so both are still 0. The constructor’s “++” operations make both 1, which is what the constructor prints.

4) Next, the values declared in the code are assigned to x and y. x has no initializer, so it stays 1; y is declared with = 0, so it is reset to 0.

5) Once initialization is complete, main prints x and y: 1 and 0.
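As a contrast, here is a sketch of the same class with the field order changed (ReorderedSingleton is a made-up name, and the print in the constructor is left out). With the instance field declared last, the initializer y = 0 has already run by the time the constructor executes, so the increment to y is no longer overwritten.

```java
class ReorderedSingleton {
    public static int x;
    public static int y = 0;

    // Declared last, so the constructor runs after "y = 0" has been applied.
    private static ReorderedSingleton singleton = new ReorderedSingleton();

    private ReorderedSingleton() {
        ++x;
        ++y;
    }

    public static void main(String[] args) {
        System.out.println("x = " + x + ", y = " + y); // x = 1, y = 1
    }
}
```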

After working through these two examples, you should have a better understanding of the JVM class loading mechanism. Besides the class loading process, there is another important topic: the class loader. Let’s move on.

Third, class loaders

In the previous two chapters we looked at the class loading process, which is carried out by class loaders. The role class loaders play in Java programs goes well beyond the class loading phase. Every class we use in a program has to be loaded into the virtual machine by a class loader, and a class’s identity within the virtual machine is established by the class together with the class loader that loaded it. The key point here is that two classes can be equal only if they were loaded by the same class loader. If two classes come from the same Class file but are loaded by different class loaders in the same virtual machine, they are not equal.
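The claim that the same Class file yields unequal classes under different loaders can be tried out with a small custom loader. The following is only a sketch under stated assumptions: IsolatingLoader, Payload, and LoaderIdentityDemo are hypothetical names, and the code assumes the compiled Payload class file can be read as a resource through the loader that loaded the demo.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class Payload {
}

// Hypothetical loader that defines classes itself instead of delegating to
// the application class loader (its only parent is the bootstrap loader).
class IsolatingLoader extends ClassLoader {
    // Assumption: this loader can supply the .class files as resources.
    private final ClassLoader source = IsolatingLoader.class.getClassLoader();

    IsolatingLoader() {
        super(null);
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        String resource = name.replace('.', '/') + ".class";
        try (InputStream in = source.getResourceAsStream(resource)) {
            if (in == null) {
                throw new ClassNotFoundException(name);
            }
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int read;
            while ((read = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
            byte[] bytes = buffer.toByteArray();
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}

class LoaderIdentityDemo {
    private static Class<?> loadIsolated() {
        try {
            return new IsolatingLoader().loadClass(Payload.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // True when two loaders produce classes with the same name that are
    // nevertheless different types, also different from the normal Payload.
    static boolean sameNameDifferentClass() {
        Class<?> first = loadIsolated();
        Class<?> second = loadIsolated();
        return first.getName().equals(second.getName())
                && first != second
                && first != Payload.class
                && !first.isInstance(new Payload());
    }

    public static void main(String[] args) {
        System.out.println(sameNameDifferentClass());
    }
}
```

Passing null as the parent is what makes the two loaders independent: neither asks the application class loader for Payload, so each one defines its own copy from the same bytes.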

How does the virtual machine ensure that the same class is always loaded by the same class loader? To answer this, we first need to look at the different kinds of class loaders.

1. Classification of class loaders

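Class loaders in Java are divided into the Bootstrap Class Loader, the Extension Class Loader, the Application Class Loader, and custom (user-defined) class loaders. This is the layout in JDK 8 and earlier; JDK 9 replaced the extension class loader with the platform class loader. Let’s take a look at each of them.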

1) Bootstrap Class Loader

This class loader is part of the virtual machine and, in HotSpot, is implemented in C++. It is responsible only for loading libraries that are stored in <JAVA_HOME>\lib, or in the path specified by the -Xbootclasspath parameter, and that the Java virtual machine recognizes by file name (such as rt.jar and tools.jar). Libraries with unrecognized names are not loaded even if they are placed in the lib directory.

2) Extension Class Loader

This class loader is implemented in Java code, in the class sun.misc.Launcher$ExtClassLoader. It is responsible for loading all libraries in the <JAVA_HOME>\lib\ext directory, or in the path specified by the java.ext.dirs system property. Developers can use the extension class loader directly in their programs to load Class files.

3) Application Class Loader

This class loader is implemented in Java as well, in sun.misc.Launcher$AppClassLoader. It is responsible for loading all the libraries on the user’s class path (ClassPath). Developers can also use this class loader directly in their code. If the program defines no custom class loader, this is normally the default class loader.

4) Custom User Class Loader

In addition to the three class loaders provided by the Java platform, users often load classes through custom class loaders, for example to read Class files from a source other than the local disk, or to isolate and reload classes.
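The loaders described above can be inspected from a running program. The sketch below (LoaderChainDemo and chainFrom are made-up names) walks from the application class loader up through getParent(). The class names it prints depend on the JDK version, so no expected output is shown for them.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChainDemo {
    // Walks from the given loader up through getParent() until it returns null.
    static List<String> chainFrom(ClassLoader start) {
        List<String> names = new ArrayList<>();
        for (ClassLoader loader = start; loader != null; loader = loader.getParent()) {
            names.add(loader.getClass().getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : chainFrom(ClassLoader.getSystemClassLoader())) {
            System.out.println(name);
        }
        // Core classes come from the bootstrap loader, which Java code sees as null.
        System.out.println("String loader: " + String.class.getClassLoader());
    }
}
```

The bootstrap class loader never appears in the printed chain, because it is not represented by a Java object. That is why getClassLoader() returns null for classes such as java.lang.String.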

2. Parent delegation model

At the beginning of this chapter we mentioned that two classes can be equal only if they are loaded by the same class loader. Given that Java has so many class loaders, how does it ensure that the same class is always loaded by the same class loader? This is largely thanks to the “parent delegation model” of class loaders. Let’s look at what that model is.

The hierarchy of class loaders described above (the bootstrap class loader at the top, then the extension, application, and custom class loaders) is what this section calls the parent delegation model. The model requires every class loader except the top-level bootstrap class loader to have a parent class loader. The parent-child relationship between class loaders is usually implemented not through inheritance but through composition, so that a child reuses the parent loader’s code.

The parent delegation model works as follows:

If a class loader receives a class loading request, it does not try to load the class itself first. Instead it delegates the request to its parent class loader, and this happens at every level of the hierarchy. As a result, every class loading request is eventually passed up to the top-level bootstrap class loader, and a child loader attempts the load itself only when its parent reports that it cannot find the requested class.

The implementation of the parent delegation model is very simple. It lives in the loadClass() method of java.lang.ClassLoader:

protected synchronized Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
    // First check whether the class has already been loaded
    Class<?> c = findLoadedClass(name);
    if (c == null) { // Not loaded yet, so try to load it
        try {
            if (parent != null) { // Delegate to the parent class loader if there is one
                c = parent.loadClass(name, false);
            } else { // No parent class loader: try the bootstrap class loader
                c = findBootstrapClass0(name);
            }
        } catch (ClassNotFoundException e) {
            // The parent class loader could not find the class and threw
            // ClassNotFoundException, so call this loader's own findClass method
            c = findClass(name);
        }
    }
    if (resolve) {
        resolveClass(c);
    }
    return c;
}

The logic of this code is clear, and the comments explain each step. Because the parent delegation model gives class loaders a hierarchy with priority, every class loading request is first delegated up to the top-level bootstrap class loader. This guarantees that a class such as java.lang.Object is the same class in every class loader environment of the program.
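The delegation order can be observed with a loader that overrides only findClass and records when it is called. This is an illustrative sketch: RecordingLoader, DelegationDemo, and the class name hypothetical.MissingType are made up, and the loader deliberately never defines a class itself.

```java
import java.util.ArrayList;
import java.util.List;

class RecordingLoader extends ClassLoader {
    final List<String> findClassCalls = new ArrayList<>();

    RecordingLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        // Reached only after the parent chain has failed to find the class.
        findClassCalls.add(name);
        throw new ClassNotFoundException(name);
    }
}

class DelegationDemo {
    static final String MISSING = "hypothetical.MissingType";

    static boolean stringComesFromParent(RecordingLoader loader) {
        try {
            return loader.loadClass("java.lang.String") == String.class;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    static boolean missingTypeReachesFindClass(RecordingLoader loader) {
        try {
            loader.loadClass(MISSING);
            return false;
        } catch (ClassNotFoundException e) {
            return loader.findClassCalls.contains(MISSING);
        }
    }

    public static void main(String[] args) {
        RecordingLoader loader = new RecordingLoader(ClassLoader.getSystemClassLoader());
        System.out.println(stringComesFromParent(loader));       // true
        System.out.println(loader.findClassCalls.isEmpty());     // true
        System.out.println(missingTypeReachesFindClass(loader)); // true
    }
}
```

Loading java.lang.String through the custom loader returns the very same Class object as String.class and never reaches findClass, because the request is satisfied further up the hierarchy. Only the name that no parent can resolve falls through to the child.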

Fourth, summary

Class loading is a nightmare for many interview candidates. As this article shows, though, the class loading process is fairly simple, and remembering its main stages is enough to handle most questions. Class loaders matter just as much, and they are simple too: the parent delegation model implements its loading logic in just a few lines of code. We have to admire the sophistication of Java’s design.
