We know that a class file stores the description of a class together with its data. When a Java program runs, the virtual machine loads this class data into memory and, after verification, conversion, resolution, and initialization, produces a Java type that can be used directly.
From the moment a class is loaded into virtual machine memory to the moment it is unloaded, its life cycle includes seven stages: loading, verification, preparation, resolution, initialization, use, and unloading. Verification, preparation, and resolution are collectively called linking.
The class loading mechanism covers the first five stages of this life cycle: loading, verification, preparation, resolution, and initialization.
Loading
Loading is the first stage of the class loading process, during which the virtual machine needs to do three things:
- Get the binary byte stream that defines a class by its fully qualified name;
- Convert the static storage structure represented by this byte stream to the runtime data structure of the method area;
- Generate an in-memory java.lang.Class object that represents this class and acts as the access point to the class's data in the method area.
The binary byte stream for a fully qualified name can come from many sources: JAR, EAR, or WAR archives, the network, other files (JSP files generate corresponding Servlet classes), or even dynamic generation at runtime (Java dynamic proxies).
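To make the "generated dynamically at runtime" case concrete, here is a minimal sketch using java.lang.reflect.Proxy. The names ProxyDemo and newProxy are made up for this illustration; the point is only that the class behind the returned object has no class file on disk.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

class ProxyDemo {
    // Builds a Runnable whose implementing class is generated by the JDK at runtime.
    static Runnable newProxy(StringBuilder log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.append(method.getName()); // record which method was invoked
            return null;                  // Runnable.run() returns void
        };
        return (Runnable) Proxy.newProxyInstance(
                ProxyDemo.class.getClassLoader(),
                new Class<?>[] { Runnable.class },
                handler);
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        Runnable r = newProxy(log);
        r.run();
        // The generated class name is chosen by the JDK and differs between versions.
        System.out.println(r.getClass().getName());
        System.out.println(Proxy.isProxyClass(r.getClass()));
        System.out.println(log);
    }
}
```

The byte stream for the proxy class is produced in memory and handed to the class loader passed as the first argument.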
Compared with the other phases of the class loading process, the loading phase is the one developers can control most. It can be carried out by the system-provided bootstrap class loader or by a user-defined class loader (overriding findClass() or loadClass() to control how the byte stream is obtained).
A more detailed description of class loaders will come at the end of the article.
After the loading phase is complete, the external binary byte stream is stored in the method area in the format required by the virtual machine. The virtual machine then instantiates a java.lang.Class object in memory, through which the program can access the class data in the method area.
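As a small illustration of the Class object acting as an access point, the sketch below reads class data through reflection. Sample, ClassObjectDemo, and readStaticInt are hypothetical names introduced only for this example.

```java
import java.lang.reflect.Field;

class ClassObjectDemo {
    // Hypothetical sample type, used only for this illustration.
    static class Sample {
        static int counter = 7;
        int id;
    }

    // Reads a static int field through the Class object instead of naming it in source.
    static int readStaticInt(Class<?> type, String fieldName) {
        try {
            Field f = type.getDeclaredField(fieldName);
            f.setAccessible(true);
            return f.getInt(null); // null receiver because the field is static
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> type = Sample.class;
        System.out.println(type.getName());
        System.out.println(type.getSuperclass());
        for (Field f : type.getDeclaredFields()) {
            System.out.println(f); // field metadata kept by the virtual machine
        }
        System.out.println(readStaticInt(type, "counter"));
    }
}
```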
Verification
Verification is the first step of linking. It ensures that the byte stream in the class file meets the requirements of the current virtual machine and does not compromise the security of the virtual machine itself. The verification phase performs roughly four checks: file format verification, metadata verification, bytecode verification, and symbolic reference verification.
- File format verification: Checks that the byte stream complies with the class file format specification and can be processed by the current version of the virtual machine. Its main purpose is to ensure that the input byte stream can be parsed correctly and stored in the method area in a format that describes a Java type. Only after passing this check does the byte stream enter the method area for storage. The following three checks all work on the storage structure in the method area and no longer operate on the byte stream directly.
- Metadata verification: Performs semantic analysis on the information described by the bytecode to ensure that the class's metadata contains nothing that violates the Java language specification.
- Bytecode verification: Checks and analyzes the method bodies of the class to ensure that its methods do nothing at run time that would harm the security of the virtual machine.
- Symbolic reference verification: Checks that information outside the class itself (the various symbolic references in the constant pool) matches. It takes place when symbolic references are converted to direct references (in the resolution phase) and ensures that resolution can proceed normally.
Preparation
The preparation phase is the phase where memory is formally allocated and initial values are set for class variables (static variables), and memory used by these class variables is allocated in the method area.
Two things to note here:
- Instance variables are not allocated here; they are allocated on the heap together with the object when the class is instantiated.
- The initial value set here is the zero value of the type (0, null, false, and so on), not the value assigned explicitly in the code.
For example:
    public class Test {
        public int number = 111;
        public static int sNumber = 111;
    }
The instance variable number is not allocated or initialized at this stage. The class variable sNumber is allocated in the method area and set to 0, the zero value of int, rather than 111; the assignment of 111 is executed only during initialization.
A class variable declared as a compile-time constant (static final with a constant initializer) is the exception. For example:
    public class Test {
        public static final int NUMBER = 111;
    }
In this case, the value of NUMBER is set to 111 in the preparation phase.
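The zero value from the preparation phase can be observed from ordinary code. In the sketch below (PreparationDemo and its members are illustrative names), a static method runs during initialization before the assignment to sNumber has executed. Note one assumption: javac inlines compile-time constants at the use site, so the NUMBER half shows the source-level effect rather than the exact moment the virtual machine stores the value.

```java
class PreparationDemo {
    // Static initializers run in textual order, so these two run first.
    static int seenVariable = peekVariable();
    static int seenConstant = peekConstant();

    static int sNumber = 111;       // assigned in <clinit>(), after the peeks above
    static final int NUMBER = 111;  // compile-time constant

    static int peekVariable() { return sNumber; } // still the zero value when first called
    static int peekConstant() { return NUMBER; }  // the constant is already 111

    public static void main(String[] args) {
        System.out.println(seenVariable); // 0
        System.out.println(seenConstant); // 111
        System.out.println(sNumber);      // 111
    }
}
```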
Resolution
The resolution phase is the process by which the virtual machine replaces symbolic references in the constant pool with direct references.
- Symbolic reference: A symbolic reference describes the referenced target with a set of symbols, which can be any form of literal as long as it locates the target unambiguously.
- Direct reference: A direct reference can be a pointer to the target, a relative offset, or a handle that locates the target indirectly.
Resolution mainly deals with seven kinds of symbolic references in the constant pool: class or interface, field, class method, interface method, method type, method handle, and call site specifier. Each is replaced with a direct reference. The following covers the resolution of classes or interfaces, fields, class methods, and interface methods:
- Class or interface resolution: Suppose the current class A references class B through the symbolic reference X. The virtual machine passes B's fully qualified name to A's class loader, which loads B. While B is loaded, verified, and prepared, resolution may trigger the loading of other classes that B references, so the process amounts to recursive loading along a class reference chain. As long as no exception occurs anywhere in the process, B is a successfully loaded class or interface, that is, a java.lang.Class object representing B is available. After verifying that A has access to B, the virtual machine replaces the symbolic reference X with a direct reference to B.
- Field resolution: Resolving an unresolved field begins by resolving the symbolic reference to the class or interface that the field belongs to. If that class itself contains a field whose simple name and field descriptor match the target, a direct reference to that field is returned. Otherwise, if the class implements interfaces, each interface and its parent interfaces are searched recursively from bottom to top along the inheritance relationship, and a matching field is returned if one is found. Otherwise, if the class inherits from another class, its parent classes are searched recursively from bottom to top, and a direct reference to a matching field is returned if one is found.
- Class method resolution: Class method resolution is similar to field resolution in that it searches upward along the inheritance and implementation relationships, except that the class and its parent classes are searched first and the interfaces afterwards. If a method whose simple name and method descriptor match the target is found, a direct reference to that method is returned.
- Interface method resolution: Similar to class method resolution, the interface and then its parent interfaces are searched (interfaces have no parent classes, only parent interfaces). If a method whose simple name and method descriptor match the target is found, a direct reference to that method is returned.
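The upward search in field resolution has a visible consequence: a static field referenced through a subclass resolves to the parent class that declares it. The sketch below uses made-up names (InitLog, Base, Derived, FieldResolutionDemo) and records which static blocks run; only the declaring class is initialized.

```java
import java.util.ArrayList;
import java.util.List;

class InitLog {
    static final List<String> EVENTS = new ArrayList<>();
}

class Base {
    static int value = 42; // not final, so not a compile-time constant
    static { InitLog.EVENTS.add("Base"); }
}

class Derived extends Base {
    static { InitLog.EVENTS.add("Derived"); }
}

class FieldResolutionDemo {
    // The reference names Derived, but resolution finds the field in Base.
    static int readThroughSubclass() {
        return Derived.value;
    }

    public static void main(String[] args) {
        System.out.println(readThroughSubclass()); // 42
        System.out.println(InitLog.EVENTS);        // [Base]
    }
}
```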
Initialization
Class initialization is the last step of the class loading process. In the preceding phases, apart from the loading phase, in which the developer can take part through a custom class loader, everything is controlled entirely by the virtual machine. Only in the initialization phase does the virtual machine actually start executing the Java code defined in the class.
In the preparation phase, class variables were set to the zero values required by the system; in the initialization phase, class variables and other resources are initialized according to the program the developer wrote. Put another way, the initialization phase is the process of executing the class constructor, the <clinit>() method.
The <clinit>() method is generated by the compiler, which automatically collects the assignments to class variables (static variables) and the statements in static blocks (static {}) of the class and merges them. The statements are collected in the order in which they appear in the source file. A static block can read only variables declared before it; it can assign to variables declared after it, but cannot read them.
    public class Test {
        static {
            number = 111;               // Assignment is allowed
            System.out.println(number); // Compile error: illegal forward reference
        }
        static int number;
    }
Unlike an instance constructor (the <init>() method), the <clinit>() method does not need to call the parent class's <clinit>() method explicitly; the virtual machine guarantees that the parent's <clinit>() method completes before the subclass's <clinit>() method runs. Therefore, static blocks defined in the parent class execute before the subclass's class variable assignments.
    class Parent {
        public static int A = 1;
        static {
            A = 2;
        }
    }

    class Sub extends Parent {
        public static int B = A;

        public static void main(String[] args) {
            System.out.println(Sub.B); // prints 2
        }
    }
The <clinit>() method is not mandatory for a class or interface. If a class has no static blocks and no assignments to class variables, the compiler need not generate a <clinit>() method for it.
Static blocks cannot be used in an interface, but an interface can still contain variable initialization assignments, so an interface gets a <clinit>() method just as a class does. Unlike a class, however, executing an interface's <clinit>() method does not require executing the parent interface's <clinit>() method first. The parent interface is initialized only when a variable defined in it is used. Likewise, an implementation class does not execute the interface's <clinit>() method when it is initialized.
The virtual machine ensures that a class's <clinit>() method is properly locked and synchronized in a multithreaded environment. If multiple threads try to initialize a class at the same time, only one thread executes the <clinit>() method; the others block and wait until that thread has finished. A long-running operation in a class's <clinit>() method can therefore leave multiple threads blocked.
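The interface behavior can be checked with a short sketch. All names here (IfaceLog, Marker, MarkerImpl, InterfaceInitDemo) are invented for the illustration. One caveat worth labeling: since Java 8, an interface that declares default methods is initialized together with its implementing class, so the example deliberately uses an interface without default methods.

```java
import java.util.ArrayList;
import java.util.List;

class IfaceLog {
    static final List<String> EVENTS = new ArrayList<>();

    static Object record(String who) {
        EVENTS.add(who);
        return new Object();
    }
}

interface Marker {
    // Not a compile-time constant, so setting it requires running Marker's <clinit>().
    Object TOKEN = IfaceLog.record("Marker");
}

class MarkerImpl implements Marker {
    static { IfaceLog.EVENTS.add("MarkerImpl"); }
}

class InterfaceInitDemo {
    // Initializing the implementation class does not initialize the interface.
    static List<String> afterInstantiation() {
        new MarkerImpl();
        return new ArrayList<>(IfaceLog.EVENTS);
    }

    // Using a field defined in the interface does.
    static List<String> afterFieldAccess() {
        Object token = Marker.TOKEN;
        return new ArrayList<>(IfaceLog.EVENTS);
    }

    public static void main(String[] args) {
        System.out.println(afterInstantiation()); // [MarkerImpl]
        System.out.println(afterFieldAccess());   // [MarkerImpl, Marker]
    }
}
```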
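A rough way to see this guarantee is to let two threads race to initialize the same class and count how often the static block runs. The sketch below uses hypothetical names (InitStats, SlowInit, ClinitLockDemo) and a sleep to stand in for a long operation.

```java
import java.util.concurrent.atomic.AtomicInteger;

class InitStats {
    static final AtomicInteger RUNS = new AtomicInteger();
}

class SlowInit {
    static {
        InitStats.RUNS.incrementAndGet();
        try {
            Thread.sleep(200); // simulate a long-running static initializer
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static void touch() { } // calling a static method triggers initialization
}

class ClinitLockDemo {
    // Starts two threads that both need SlowInit and reports how often <clinit>() ran.
    static int initializeFromTwoThreads() {
        Thread a = new Thread(() -> SlowInit.touch());
        Thread b = new Thread(() -> SlowInit.touch());
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return InitStats.RUNS.get();
    }

    public static void main(String[] args) {
        System.out.println(initializeFromTwoThreads()); // 1
    }
}
```

Whichever thread arrives second waits for the first to finish the static block and then proceeds without running it again.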
Class loader
The section on loading mentioned that a class loader obtains the binary byte stream describing a class by its fully qualified name, and that a developer-defined class loader can decide how this byte stream is obtained. So what is a class loader?
Before any Java class can be used, a class loader must load it into the method area and generate a java.lang.Class object for it, so we can think of a class loader as a tool that converts class files into java.lang.Class objects.
The identity of any class within the Java virtual machine is established jointly by the class itself and the class loader that loads it. Each class loader has its own class namespace. In other words, two classes are "equal" only if they come from the same class file and are loaded by the same class loader in the same virtual machine.
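The namespace rule can be sketched by loading the same class file through a second loader. Twin, TwinLoader, and ClassIdentityDemo are hypothetical names, and the example assumes the compiled Twin.class is reachable as a resource on the class path (true for a normal javac/java run).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sample class whose bytes are loaded a second time below.
class Twin { }

// Defines Twin itself instead of delegating; every other name goes to the parent.
class TwinLoader extends ClassLoader {
    private static final String TARGET = Twin.class.getName();

    TwinLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (!name.equals(TARGET)) {
            return super.loadClass(name, resolve);
        }
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                byte[] bytes = readClassBytes(name);
                c = defineClass(name, bytes, 0, bytes.length);
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    // Reads the class file bytes through the parent loader's resource lookup.
    private byte[] readClassBytes(String name) throws ClassNotFoundException {
        String resource = name.replace('.', '/') + ".class";
        try (InputStream in = getParent().getResourceAsStream(resource)) {
            if (in == null) {
                throw new ClassNotFoundException(name);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}

class ClassIdentityDemo {
    // Loads Twin a second time through a separate loader.
    static Class<?> loadSecondCopy() {
        try {
            return new TwinLoader(Twin.class.getClassLoader()).loadClass(Twin.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> copy = loadSecondCopy();
        System.out.println(copy.getName().equals(Twin.class.getName())); // true
        System.out.println(copy == Twin.class);                          // false
    }
}
```

Both Class objects come from the same class file and carry the same name, yet the virtual machine treats them as different types because their defining loaders differ.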
Java (JDK 8 and earlier) ships with three built-in class loaders: BootStrapClassLoader, ExtClassLoader, and AppClassLoader.
- BootStrapClassLoader: the bootstrap class loader, which is implemented in C++ and cannot be obtained explicitly in a Java program. It is responsible for loading the classes stored in JDK\jre\lib (JDK stands for the JDK installation directory, same below).
- ExtClassLoader: the extension class loader, implemented by sun.misc.Launcher$ExtClassLoader. It loads all libraries in the JDK\jre\lib\ext directory or in the path specified by the java.ext.dirs system property. Developers can use it directly.
- AppClassLoader: the application class loader, implemented by sun.misc.Launcher$AppClassLoader. It loads the classes on the user's ClassPath. Developers can use this class loader directly, and developer-defined classes are normally loaded by it.
ExtClassLoader is a class loader, but it is also an ordinary Java class, and that class is itself loaded by BootStrapClassLoader. The parent of ExtClassLoader is BootStrapClassLoader, but because BootStrapClassLoader is implemented in C++ and has no Java object, ExtClassLoader.getParent() returns null. The parent of AppClassLoader is ExtClassLoader. Note that the parent is assigned when a loader is constructed; it is not derived from which loader loaded the loader's own class.
    public class Test {
        public static void main(String[] args) {
            ClassLoader cl = Test.class.getClassLoader();
            while (cl != null) {
                System.out.println(cl);
                cl = cl.getParent();
            }
        }
    }
Print result:
    sun.misc.Launcher$AppClassLoader@232204a1
    sun.misc.Launcher$ExtClassLoader@74a14482
We can also define our own class loader, CustomClassLoader, whose parent is AppClassLoader by default. This hierarchy of class loaders is called the parent delegation model.
Parent delegation model
The parent delegation model requires every class loader except the top-level bootstrap class loader to have a parent class loader. The parent-child relationship is not implemented through inheritance; instead, each loader holds a reference to its parent and reuses the parent's code through composition.
The parent delegation model works as follows: when a class loader receives a class loading request, it does not try to load the class itself at first. Instead, it delegates the request to its parent class loader. This holds at every level, so every loading request is eventually passed up to the top-level bootstrap class loader. Only when the parent loader reports that it cannot complete the request (it did not find the class in its search scope) does the child loader try to load the class itself.
ClassLoader
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // First, check if the class has already been loaded
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                long t0 = System.nanoTime();
                try {
                    if (parent != null) {
                        c = parent.loadClass(name, false);
                    } else {
                        c = findBootstrapClassOrNull(name);
                    }
                } catch (ClassNotFoundException e) {
                    // ClassNotFoundException thrown if class not found
                    // from the non-null parent class loader
                }
                if (c == null) {
                    // If still not found, then invoke findClass in order
                    // to find the class.
                    long t1 = System.nanoTime();
                    c = findClass(name);
                    // this is the defining class loader; record the stats
                    sun.misc.PerfCounter.getParentDelegationTime().addTime(t1 - t0);
                    sun.misc.PerfCounter.getFindClassTime().addElapsedTimeFrom(t1);
                    sun.misc.PerfCounter.getFindClasses().increment();
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }
The method first checks whether the class has already been loaded. If not, it calls the parent class loader's loadClass() method, which recurses upward. If the parent class loader is null, the request goes to the bootstrap class loader. If every loader from the parent up to the bootstrap class loader fails to load the class, the loader calls its own findClass() method.
With the parent delegation model, Java classes acquire a priority hierarchy along with their class loaders. This ensures that the same class is loaded only once, avoids repeated loading, and prevents system classes from being maliciously replaced.
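The delegation order can be observed with a loader that only counts calls to findClass(). This is a sketch with invented names (CountingLoader, DelegationDemo, and the deliberately nonexistent class name); it relies only on the default loadClass() behavior described above.

```java
class CountingLoader extends ClassLoader {
    int findClassCalls = 0;

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        findClassCalls++; // reached only after every parent has failed
        throw new ClassNotFoundException(name);
    }
}

class DelegationDemo {
    // A system class is served by the parents, so the result is the usual String class.
    static boolean systemClassComesFromParent(CountingLoader loader) {
        try {
            return loader.loadClass("java.lang.String") == String.class;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    // A class no parent can find falls through to this loader's findClass().
    static boolean unknownClassReachesFindClass(CountingLoader loader) {
        try {
            loader.loadClass("no.such.pkg.Missing"); // hypothetical, nonexistent name
            return false;
        } catch (ClassNotFoundException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        CountingLoader loader = new CountingLoader();
        System.out.println(systemClassComesFromParent(loader));   // true
        System.out.println(loader.findClassCalls);                // 0
        System.out.println(unknownClassReachesFindClass(loader)); // true
        System.out.println(loader.findClassCalls);                // 1
    }
}
```

Because the request for java.lang.String never reaches findClass(), a custom loader that follows the model has no opportunity to substitute its own version of a system class.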
Custom class loaders
The parent delegation model is already implemented for the developer in the loadClass method of ClassLoader. When writing a custom class loader, you normally only need to override the findClass method.
    import java.io.ByteArrayOutputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;

    public class CustomClassLoader extends ClassLoader {
        private final String root; // directory that holds the class files

        public CustomClassLoader(String root) {
            this.root = root;
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            byte[] classData = loadClassData(name);
            if (classData == null) {
                throw new ClassNotFoundException(name);
            }
            return defineClass(name, classData, 0, classData.length);
        }

        // Reads <root>/<package path>/<ClassName>.class; returns null on failure.
        private byte[] loadClassData(String name) {
            String fileName = root + File.separatorChar
                    + name.replace('.', File.separatorChar)
                    + ".class";
            // try-with-resources closes the streams even if reading fails
            try (InputStream ins = new FileInputStream(fileName);
                 ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
                byte[] buffer = new byte[1024];
                int length;
                while ((length = ins.read(buffer)) != -1) {
                    baos.write(buffer, 0, length);
                }
                return baos.toByteArray();
            } catch (IOException e) {
                e.printStackTrace();
                return null;
            }
        }
    }
Create a new class, com.xiao.U, compile it into a class file, and place it under com\xiao on the desktop (the loader maps the package name to directories below its root) to test it:
    public class Test {
        public static void main(String[] args) {
            CustomClassLoader customClassLoader = new CustomClassLoader("C:\\Users\\PC\\Desktop");
            try {
                Class<?> clazz = customClassLoader.loadClass("com.xiao.U");
                // Class.newInstance() is deprecated in newer JDKs; go through the constructor instead
                Object o = clazz.getDeclaredConstructor().newInstance();
                System.out.println(o.getClass().getClassLoader());
            } catch (ReflectiveOperationException e) {
                e.printStackTrace();
            }
        }
    }
Print result:
    CustomClassLoader@1540e19d
Custom class loaders make it possible to implement hot deployment on servers and hot updates on mobile platforms such as Android.