In-depth understanding of the JVM (II): the class loader subsystem
Preface
The JVM turns the contents of a Class file into a form it can recognize and run by loading, linking, and initializing it.
Class loader subsystem
- The class loader subsystem is responsible for loading Class files from the file system or the network. A Class file carries a specific identifier (the magic number) at the beginning of the file.
- The ClassLoader is only responsible for loading the Class file; whether it can run is determined by the Execution Engine.
- The loaded class information is stored in a memory region called the method area. In addition to class information, the method area stores the runtime constant pool, which may include string literals and numeric constants (this constant information is the in-memory mapping of the constant pool section of the Class file).
The life cycle of class loading
Class loader subsystem stage diagram
1. Load – Loading
The JVM loads Class files on demand: a class is brought into memory only when it is needed, following the parent delegation model. Loading does three things:
- Obtains the binary byte stream that defines the class, using its fully qualified class name.
- Converts the static storage structure represented by this byte stream into runtime data in the method area (in HotSpot, an instanceKlass instance).
- Generates an in-memory java.lang.Class object that represents the class and serves as the access point to the class's data in the method area (in HotSpot, an instanceMirrorKlass object).
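To make the difference between loading a class and initializing it concrete, here is a minimal sketch. The class names LoadVersusInitDemo and Lazy are hypothetical and exist only for illustration. It relies on the three-argument Class.forName, whose second argument controls whether initialization is triggered.

```java
import java.util.ArrayList;
import java.util.List;

final class LoadVersusInitDemo {
    // Records side effects so we can observe when the static initializer runs.
    static final List<String> EVENTS = new ArrayList<>();

    // Hypothetical example class; its static block runs only at initialization.
    static class Lazy {
        static {
            EVENTS.add("Lazy initialized");
        }
    }

    // Obtains the Class object for Lazy by name; 'initialize' decides whether
    // the static initializer is allowed to run as part of the lookup.
    static Class<?> load(boolean initialize) {
        String name = LoadVersusInitDemo.class.getName() + "$Lazy";
        ClassLoader loader = LoadVersusInitDemo.class.getClassLoader();
        try {
            return Class.forName(name, initialize, loader);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> loadedOnly = load(false);
        System.out.println(loadedOnly.getName() + " loaded, events = " + EVENTS);
        load(true);
        System.out.println("after initialization, events = " + EVENTS);
    }
}
```

The java.lang.Class object is available as soon as the class is loaded, while the static block runs only once initialization is requested.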
Bootstrap class loader – BootstrapClassLoader
- Implemented in C++ and built into the JVM itself.
- Loads the core libraries (jre/lib/rt.jar, resources.jar, and the class libraries on the sun.boot.class.path path).
- Does not extend java.lang.ClassLoader and has no parent loader.
- For security reasons, loads only classes whose fully qualified names start with java, javax, sun, and so on.
- Cannot be obtained from Java code: getClassLoader() returns null for the classes it loaded.
Extension class loader – ExtClassLoader
- Implemented in Java.
- Loads the extension libraries from the jre/lib/ext directory, or from the directories named by the java.ext.dirs system property. JAR files that users place in these directories are loaded as well.
- Derived from ClassLoader; its parent class loader is the bootstrap class loader.
Application class loader (system class loader) – AppClassLoader
- Implemented in Java.
- Loads the class libraries on the path given by the classpath environment variable or the java.class.path system property.
- Derived from ClassLoader; its parent class loader is the extension class loader.
Custom class loaders
- Implemented in Java.
- A custom class loader is written by extending the abstract class java.lang.ClassLoader.
- Overriding loadClass is not recommended; override findClass instead, so that the parent delegation logic in loadClass stays intact.
- A custom class loader without complex logic can extend URLClassLoader, which avoids having to write findClass and the code that obtains the bytecode stream.
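The points above can be sketched as a loader that overrides only findClass. This is a minimal illustration, not a production loader: FindClassDemo, Greeter, and ResourceClassLoader are hypothetical names, and the sketch assumes the compiled class bytes can be read back as a resource from the loader that loaded the demo itself.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class FindClassDemo {
    // Hypothetical payload class whose bytes the custom loader will define again.
    static class Greeter { }

    // Overrides only findClass; loadClass keeps its default parent-first logic.
    static class ResourceClassLoader extends ClassLoader {
        ResourceClassLoader() {
            super(null); // null parent: only the bootstrap loader is asked first
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            String path = name.replace('.', '/') + ".class";
            // Assumption: the bytes are reachable as a resource of the demo's own loader.
            ClassLoader source = FindClassDemo.class.getClassLoader();
            try (InputStream in = source.getResourceAsStream(path)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                for (int n; (n = in.read(buf)) != -1; ) {
                    out.write(buf, 0, n);
                }
                byte[] bytes = out.toByteArray();
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    // Loads Greeter through a fresh custom loader.
    static Class<?> loadGreeter() {
        try {
            return new ResourceClassLoader().loadClass(Greeter.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> copy = loadGreeter();
        System.out.println(copy.getName() + " defined by " + copy.getClassLoader());
        System.out.println("same Class object as Greeter.class? " + (copy == Greeter.class));
    }
}
```

Because the bootstrap loader cannot find Greeter, the request falls through to findClass, and the resulting Class object is distinct from the one the application class loader defined, even though the name is identical.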
Parent delegation mechanism
Definition
When a class loader receives a request to load a class, it does not try to load the class itself first. Instead, it delegates the request to its parent class loader, and the same happens at every level of the hierarchy.
So every class loading request eventually reaches the bootstrap class loader at the top.
Only when the parent class loader cannot find the class in its own search scope, and reports this back to the child loader, does the child loader attempt to load the class itself.
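The delegation order described above can be sketched as a simplified restatement of the loadClass logic with a trace of each step. DelegationSketch and TracingLoader are hypothetical names, and the sketch assumes a non-null parent (the real java.lang.ClassLoader also handles the bootstrap case internally).

```java
import java.util.ArrayList;
import java.util.List;

final class DelegationSketch {
    // Parent-first lookup, written out step by step so it can be traced.
    static class TracingLoader extends ClassLoader {
        final List<String> trace = new ArrayList<>();

        TracingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);          // 1. already loaded by this loader?
                if (c == null) {
                    try {
                        trace.add("delegate " + name);
                        c = getParent().loadClass(name);     // 2. ask the parent first
                    } catch (ClassNotFoundException e) {
                        trace.add("parent failed " + name);
                    }
                }
                if (c == null) {
                    trace.add("findClass " + name);
                    c = findClass(name);                      // 3. only now try it ourselves
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }
    }

    // Returns the steps taken while trying to load the given class name.
    static List<String> traceFor(String className) {
        TracingLoader loader = new TracingLoader(ClassLoader.getSystemClassLoader());
        try {
            loader.loadClass(className);
        } catch (ClassNotFoundException e) {
            loader.trace.add("not found " + className);
        }
        return loader.trace;
    }

    public static void main(String[] args) {
        System.out.println(traceFor("java.lang.String"));
        System.out.println(traceFor("no.such.Example"));
    }
}
```

For java.lang.String the trace stops after the delegation step, because a parent finds the class. For the unknown name, the parent fails first and only then is findClass consulted.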
Role
- Avoids loading the same class more than once.
- Protects the program by preventing the core APIs from being tampered with (the sandbox security mechanism).
Scenarios that break parent delegation
SPI (JDBC application)
The JDBC Driver interface is defined in the JDK, and its implementations are provided by the database vendors, for example the MySQL driver package. DriverManager, which is also provided by the JDK, is used to load the classes that implement the Driver interface, but DriverManager lives in jre/lib/rt.jar under JAVA_HOME and is therefore loaded by the bootstrap class loader. The implementation classes of the Driver interface live in the JAR supplied by the vendor. According to the class loading mechanism, when a loaded class references another class, the virtual machine uses the class loader of the first class to load the referenced class. That would mean the bootstrap class loader also has to load the Driver implementation classes in the vendor's JAR.
But by default the bootstrap class loader is only responsible for the classes in jre/lib/rt.jar under JAVA_HOME, so the Driver implementations have to be loaded by a child class loader, which breaks the parent delegation mechanism.
This child class loader is the thread context class loader, obtained from Thread.currentThread().getContextClassLoader(). When sun.misc.Launcher is initialized, it obtains the AppClassLoader and sets it as the context class loader, so by default the thread context class loader is the application class loader.
In other words, the request has already reached the parent loader, but loading the third-party classes requires going back through the thread context to the application class loader.
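The lookup through the thread context class loader can be sketched as follows. This is only an illustration of the pattern, not the actual DriverManager code: ContextLoaderSketch and loadViaContextLoader are hypothetical names, and java.util.ArrayList stands in for a vendor-supplied implementation class.

```java
final class ContextLoaderSketch {
    // Looks up a class through the thread context class loader, falling back
    // to the system class loader when no context loader has been set.
    static Class<?> loadViaContextLoader(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        try {
            // 'false' means: load the class without initializing it yet.
            return Class.forName(className, false, loader);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(className + " is not visible to " + loader, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("context loader: " + Thread.currentThread().getContextClassLoader());
        // Stand-in for a provider class name read from a configuration file.
        System.out.println(loadViaContextLoader("java.util.ArrayList"));
    }
}
```

Code loaded by a parent loader can use this pattern to reach classes that only a child loader can see, which is the reverse of the usual delegation direction.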
Tomcat
Each Tomcat WebappClassLoader loads the class files in its own directory first instead of passing the request to its parent class loader. This serves three purposes:
- Isolation and sharing. The classes and libs of each WebApp must be isolated from one another so that a library loaded by one application does not affect another, while libraries used by many applications should be shared so that resources are not wasted.
- Security, for the same reasons as in the JVM. Tomcat's own libraries are loaded by a separate class loader to protect them from malicious or accidental damage.
- Hot deployment. Tomcat can reload an application's classes without being restarted.
The orange part is the parent delegation mechanism as before. The yellow part is the first group of Tomcat's custom class loaders, which mainly load the classes in Tomcat's own packages. This part still uses the parent delegation mechanism.
The green part is the second group of Tomcat's custom class loaders, which breaks the parent delegation mechanism. First, a closer look at Tomcat's custom class loaders:
1. The orange class loaders still use parent delegation, so only one copy of each of their classes exists. If a class is duplicated elsewhere, this copy prevails.
- CommonClassLoader: Tomcat's most basic class loader. Classes on its load path can be accessed by the Tomcat container itself and by every WebApp.
- CatalinaClassLoader: the class loader private to the Tomcat container. Classes on its load path are not visible to WebApps.
- SharedClassLoader: the class loader shared by the WebApps. Classes on its load path are visible to all WebApps, but not to the Tomcat container.
2. The green part is the class loader that Tomcat creates automatically for each Java project packaged as a WAR. In other words, Tomcat creates one class loader per deployed WAR, dedicated to loading that WAR. This class loader breaks the parent delegation mechanism. Imagine what would happen if the WebApp class loader did not break it:
it would delegate to its parent first, and once the parent had loaded a class, the child loader would never get a chance to load its own copy, so one application's libraries would pollute the environment of another. That is why this part breaks the parent delegation mechanism.
In this way, the WebApp class loader can load the class files inside its own WAR without asking its parent first. Classes that do not belong to the application are still delegated to the parent loaders.
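A child-first loader in the spirit of the WebApp class loader can be sketched as below. This is not Tomcat's implementation: ChildFirstSketch, ChildFirstLoader, and WebAppClass are hypothetical names, and the parent's resources stand in for an application's own class directory.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class ChildFirstSketch {
    // Hypothetical stand-in for a class that ships inside one web application.
    static class WebAppClass { }

    // Tries its own copy first for names under ownPrefix, then delegates.
    static class ChildFirstLoader extends ClassLoader {
        private final String ownPrefix;

        ChildFirstLoader(ClassLoader parent, String ownPrefix) {
            super(parent);
            this.ownPrefix = ownPrefix;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null && name.startsWith(ownPrefix)) {
                    try {
                        c = findClass(name);              // child first: try our own copy
                    } catch (ClassNotFoundException e) {
                        // not in our repository; fall back to the parent below
                    }
                }
                if (c == null) {
                    c = super.loadClass(name, false);     // everything else: parent first
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            // Assumption: the parent's resources play the role of the application's class directory.
            String path = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(path)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                for (int n; (n = in.read(buf)) != -1; ) {
                    out.write(buf, 0, n);
                }
                byte[] bytes = out.toByteArray();
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    // Creates one loader per "application".
    static ChildFirstLoader newWebAppLoader() {
        return new ChildFirstLoader(ChildFirstSketch.class.getClassLoader(),
                WebAppClass.class.getName());
    }

    static Class<?> loadWith(ClassLoader loader, String name) {
        try {
            return loader.loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String name = WebAppClass.class.getName();
        Class<?> appA = loadWith(newWebAppLoader(), name);
        Class<?> appB = loadWith(newWebAppLoader(), name);
        System.out.println("isolated copies differ: " + (appA != appB));
        System.out.println("shared class is the same: "
                + (loadWith(newWebAppLoader(), "java.lang.String") == String.class));
    }
}
```

Each loader defines its own copy of the application class, while classes outside the prefix still come from the parent chain and are shared.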
Conditions under which two Class objects represent the same class
- They have the same fully qualified name (package name + class name).
- They were loaded by the same class loader instance.
Sandbox security mechanism
Suppose you define your own java.lang.String class. When it is loaded, the request is delegated upward, and the bootstrap class loader loads the String class that ships with the JDK first (rt.jar, java\lang\String.class). Running the custom class reports that there is no main method, because the class that was actually loaded is the String from rt.jar. This protects the Java core classes, and it is called the sandbox security mechanism.
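A related safeguard can be observed directly: besides delegation, java.lang.ClassLoader refuses to define classes in java.* packages from a user-written loader. The sketch below is an illustration of that check rather than of the exact String scenario above. ProhibitedPackageSketch, NaiveLoader, and the class name java.lang.FakeString are hypothetical.

```java
final class ProhibitedPackageSketch {
    static class NaiveLoader extends ClassLoader {
        // Tries to define a class in the java.lang package from arbitrary bytes.
        Class<?> defineInJavaLang(byte[] bytes) {
            return defineClass("java.lang.FakeString", bytes, 0, bytes.length);
        }
    }

    // Returns the simple name of the exception the attempt produces,
    // or "defined" if the class was unexpectedly accepted.
    static String attempt() {
        try {
            new NaiveLoader().defineInJavaLang(new byte[0]);
            return "defined";
        } catch (SecurityException | LinkageError e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println("attempt to define java.lang.FakeString: " + attempt());
    }
}
```

The package name is rejected with a SecurityException before the bytes are even parsed, so a custom loader cannot smuggle its own classes into the core packages.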
2. Link – Linking
Linking consists of three phases: verification, preparation, and resolution.
Verification – verify
- Ensures that the byte stream in the Class file meets the requirements of the current VM, that the loaded class is well formed, and that it does not endanger the VM.
- It mainly comprises four kinds of verification: file format verification, metadata verification, bytecode verification, and symbolic reference verification.
File format verification:
- Magic number check: whether the file starts with the magic number 0xCAFEBABE.
- Version check: whether the major and minor version numbers are within the range the current VM can handle.
- Whether the constant pool contains constants of an unsupported type (checked via the constant tag).
- Whether any index value points to a constant that does not exist or to a constant of the wrong type.
- Whether any CONSTANT_Utf8_info constant contains data that is not valid UTF-8.
- Whether any part of the Class file, or the file itself, has had information deleted or extra information appended.
- Length checks.
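The first two checks in this list can be reproduced by hand. The following sketch reads the header of a class file and reports the magic number and version; ClassFileHeaderSketch and readHeader are hypothetical names, and java.lang.Object is used only because its class file is always available.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ClassFileHeaderSketch {
    // Reads magic, minor_version and major_version from the start of a class file.
    static int[] readHeader(Class<?> type) {
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream raw = type.getResourceAsStream(resource)) {
            if (raw == null) {
                throw new IllegalStateException("class bytes not found: " + resource);
            }
            DataInputStream in = new DataInputStream(raw);
            int magic = in.readInt();            // u4 magic
            int minor = in.readUnsignedShort();  // u2 minor_version
            int major = in.readUnsignedShort();  // u2 major_version
            return new int[] { magic, minor, major };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = readHeader(Object.class);
        System.out.printf("magic=0x%08X minor=%d major=%d%n", header[0], header[1], header[2]);
    }
}
```

The major version printed depends on the JDK in use; the magic number is the same for every valid class file.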
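Metadata verification:
- Whether the class has a parent class (every class except java.lang.Object must have one).
- Whether the class extends a class modified by final, which is not allowed.
- Whether a non-abstract class implements all the abstract methods it is required to implement.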
Bytecode verification:
Its main purpose is to determine, through data flow and control flow analysis, that the program semantics are legal and logical. In this phase the method bodies of the class are verified and analyzed to ensure that the methods do not do anything at run time that endangers the VM. For example:
- Ensure that the data types on the operand stack and the instruction sequence work together at all times: for example, an int placed on the operand stack must not later be stored into the local variable table as a long.
- Ensure that jump instructions do not jump to bytecode instructions outside the method body.
- Ensure that type conversions in the method body are valid: for example, assigning a subclass object to a superclass type is safe, but assigning a superclass object to a subclass type, or to a completely unrelated type with no inheritance relationship, is dangerous and illegal.
(Halting Problem: a program cannot determine with complete accuracy, by checking program logic, whether another program will finish running within a finite time.)
Symbolic reference verification:
Symbolic reference verification can be thought of as checking how well the class matches information outside itself (the various symbolic references in the constant pool). It usually checks the following:
- Whether a class can be found for the fully qualified name given as a string in the symbolic reference.
- Whether the specified class contains the fields and methods that match the field descriptor, the method descriptor, and the simple name.
- Whether the classes, fields, and methods in the symbolic reference are accessible to the current class (private, protected, public, default).
Symbolic (indirect) reference: a reference into the runtime constant pool, such as #32, which corresponds to a string holding the fully qualified name of a class.
Direct reference: the memory address of the class, which no longer points into the constant pool.
Preparation – prepare
- Allocates memory for class variables (variables modified by static) and sets them to their default initial value, the zero value (0, false, 0L, null, ...).
- Static constants that are also modified by final are the exception. For these, javac adds a ConstantValue attribute at compile time, and the field is initialized explicitly during preparation: instead of receiving the default zero value, it is set directly to the specified value.
- Class variables are allocated in the method area, whereas instance variables are allocated in the Java heap together with the object.
class MyObject {
    static int num1 = 100;
    static int num2 = 100;
    static MyObject myObject = new MyObject();

    public MyObject() {
        num1 = 200;
        num2 = 200;
    }

    @Override
    public String toString() {
        return num1 + "\t" + num2;
    }
}

class MyObject2 {
    static int num1 = 100;
    static MyObject2 myObject2 = new MyObject2();

    public MyObject2() {
        num1 = 200;
        num2 = 200;
    }

    // Declared after myObject2, so this assignment runs after the constructor.
    static int num2 = 100;

    @Override
    public String toString() {
        return num1 + "\t" + num2;
    }
}

public class ClassLoadingProcessStatic {
    public static void main(String[] args) {
        System.out.println(MyObject.myObject);
        System.out.println(MyObject2.myObject2);
    }
}
Output:
200 200
200 100
First output:
Preparation phase:
num1: 0
num2: 0
myObject: null
Initialization phase:
num1: 0 is set to 100
num2: 0 is set to 100
myObject: null becomes a reference to a new object; while the constructor (<init>()) runs, num1 is set from 100 to 200 and num2 is set from 100 to 200
So the final output is: 200, 200
Second output:
Preparation phase:
num1: 0
myObject2: null
num2: 0
Initialization phase:
num1: 0 is set to 100
myObject2: null becomes a reference to a new object; the constructor sets num1 from 100 to 200 and num2 from 0 to 200
num2: 200 is set to 100
So the final output is: 200, 100
Note on the initialization phase of the second output:
Both myObject2 and num2 are static variables, and their initializers are collected into the <clinit>() method in source order, i.e.
// pseudo code
<clinit>() {
    num1 = 100;
    myObject2 = new MyObject2();
    num2 = 100;
}
The statements in <clinit>() execute in sequence, so num2 = 100 runs after the constructor has already set num2 to 200.
Resolution – resolve
- The process of converting symbolic references in the constant pool into direct references. A direct reference is an in-memory pointer or offset to a class, field, or method.
- In practice, resolution is often performed after the JVM has carried out initialization.
- A symbolic reference is a set of symbols that describes the referenced target. The literal form of symbolic references is clearly defined in the Class file format of the Java Virtual Machine Specification. A direct reference is a pointer directly to the target, a relative offset, or a handle that locates the target indirectly.
- Resolution applies mainly to classes or interfaces, fields, class methods, interface methods, method types, and so on, which correspond to CONSTANT_Class_info, CONSTANT_Fieldref_info, CONSTANT_Methodref_info, and so on in the constant pool.
3. Initialize – Initialization
- The initialization phase is the execution of the class constructor method <clinit>(). This method does not need to be written by hand: the javac compiler generates it by collecting the assignments to all class variables (static variables, and constants whose values must be computed) and the statements in static code blocks.
- The instructions in <clinit>() execute in the order in which the statements appear in the source file. <clinit>() is different from the constructor of a class.
- If the class has a parent class, the JVM guarantees that the parent's <clinit>() has finished before the subclass's <clinit>() is executed.
- The virtual machine must ensure that a class's <clinit>() is locked and synchronized when several threads try to initialize the class, so class initialization is safe in a multithreaded environment.
Plug-ins
- Bytecode viewer
Views bytecode.
- jclasslib Bytecode Viewer
A tool that visualizes compiled Java class files and the bytecode they contain. It also provides a library that lets developers read and write Java class files and bytecode.
Notes on <clinit>()
Thread safety of <clinit>()
The virtual machine must ensure that a class's <clinit>() method is locked and synchronized when it is invoked from multiple threads.
public class DeadThreadTest {
    public static void main(String[] args) {
        Runnable r = () -> {
            System.out.println(Thread.currentThread().getName() + " start");
            DeadThread dead = new DeadThread();
            System.out.println(Thread.currentThread().getName() + " end");
        };
        Thread t1 = new Thread(r, "Thread 1");
        Thread t2 = new Thread(r, "Thread 2");
        t1.start();
        t2.start();
    }
}

class DeadThread {
    static {
        // if (true) keeps javac from rejecting an initializer that cannot complete normally
        if (true) {
            System.out.println(Thread.currentThread().getName() + " initializing the current class");
            while (true) {
            }
        }
    }
}
The program never terminates: one thread stays blocked, waiting for the other thread to finish initializing the class. The output looks like this (which thread wins the race may vary):
Thread 1 start
Thread 2 start
Thread 1 initializing the current class
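A terminating variant of the same experiment is sketched below: several threads touch a class at the same moment, and a counter shows how often its static initializer actually ran. InitOnceSketch, Shared, and raceToInitialize are hypothetical names introduced for this illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

final class InitOnceSketch {
    // Counts how many times the static initializer of Shared has run.
    static final AtomicInteger INIT_RUNS = new AtomicInteger();

    static class Shared {
        static final long LOADED_AT;
        static {
            INIT_RUNS.incrementAndGet();
            LOADED_AT = System.nanoTime();
        }
    }

    // Starts several threads that all touch Shared at the same moment and
    // returns the number of times its static initializer ran.
    static int raceToInitialize(int threads) {
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();
                    long ignored = Shared.LOADED_AT; // first active use triggers <clinit>()
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[i].start();
        }
        start.countDown();
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return INIT_RUNS.get();
    }

    public static void main(String[] args) {
        System.out.println("static initializer ran " + raceToInitialize(8) + " time(s)");
    }
}
```

However many threads race, the initializer runs once; the other threads wait until it has finished and then see the initialized class.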
Scenarios in which the Java compiler does not generate a <clinit>() method
- Scenario 1: for non-static fields, no <clinit>() method is generated, regardless of whether an explicit assignment is made.
- Scenario 2: for static fields without an explicit assignment, no <clinit>() method is generated.
- Scenario 3: for fields declared static final with a primitive data type and assigned a constant, no <clinit>() method is generated.
public class ClinitTest2 {
    // Scenarios in which the Java compiler does not generate a <clinit>() method

    // Scenario 1: non-static field, with or without an explicit assignment
    public int num = 1;
    // Scenario 2: static field without an explicit assignment
    public static int num1;
    // Scenario 3: static final field of a primitive type assigned a constant
    public static final int num2 = 1;

    // A <clinit>() method is generated as soon as there is an explicitly assigned
    // static variable or a static code block, for example:
    /* public static int num3 = 1;
       static { System.out.println("load me up"); } */

    public static void main(String[] args) {
        ClinitTest2 clinitTest2 = new ClinitTest2();
    }
}
Conclusion:
- Assignments made in the preparation part of the link phase:
  - For fields of primitive data types modified by static final, an explicit assignment (assigning a constant directly rather than calling a method) is usually performed in the preparation part of the link phase.
  - For String fields modified by static final, an explicit assignment is usually performed in the preparation part of the link phase if a literal is assigned.
- Assignments made in <clinit>() during the initialization phase:
  - Everything not covered by the preparation cases above.
- Final conclusion:
  - Explicit assignments to static final fields of primitive types or String that do not involve a method or constructor call are performed in the preparation part of the link phase.
  - Other variables modified by static are assigned by <clinit>() during initialization.
public class ClassInitTest {
    public int aa = 2;                           // instance field: assigned when an object is created
    public Integer aa2 = 2;                      // instance field: assigned when an object is created
    public Integer AA3 = Integer.valueOf(1000);  // instance field: assigned when an object is created
    private static int num = 1; // assigned in <clinit>() during initialization (static variable)
    public static final String s0 = "helloworld0"; // assigned in the preparation part of the link phase (constant)
    // That is, wherever such a final variable is used, the constant is accessed directly
    // and does not need to be determined at run time.
    public static final String FINAL_STR = "aaa"; // assigned in the preparation part of the link phase (constant)
    public static final int INT_CONSTANT = 10;    // assigned in the preparation part of the link phase (constant)
    public static final Integer INTEGER_CONSTANT1 = Integer.valueOf(100); // assigned in <clinit>() (constant + requires evaluation)
    public static Integer INTEGER_CONSTANT2 = Integer.valueOf(1000);      // assigned in <clinit>() (static variable)
    public static final String s1 = new String("helloworld1");            // assigned in <clinit>() (constant + requires evaluation)

    static {
        num = 2;
        number = 20;
        System.out.println(num);
        System.out.println("Load me up");
        // System.out.println(number); // Error: illegal forward reference
    }

    // As long as the static code block only assigns number and does not read it,
    // number may be declared after the block.
    // Preparation: number = 0 --> initialization: first 20 --> finally 10
    private static int number = 10;

    public static void main(String[] args) {
        System.out.println(ClassInitTest.num);       // 2
        System.out.println(ClassInitTest.number);    // 10
        System.out.println(ClassInitTest.FINAL_STR); // aaa
    }
}
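The difference between a compile-time constant and a static final field that needs evaluation can also be observed at run time. In this sketch, ConstantAccessSketch and Holder are hypothetical names; it records whether reading each kind of field triggers the static initializer of the declaring class.

```java
import java.util.ArrayList;
import java.util.List;

final class ConstantAccessSketch {
    // Records when Holder's static initializer runs.
    static final List<String> EVENTS = new ArrayList<>();

    static class Holder {
        static final int INT_CONSTANT = 10;                  // compile-time constant
        static final Integer BOXED = Integer.valueOf(100);   // needs evaluation in <clinit>()
        static {
            EVENTS.add("Holder initialized");
        }
    }

    // Reports whether reading the compile-time constant ran Holder's initializer.
    static boolean constantReadTriggersInit() {
        int before = EVENTS.size();
        int value = Holder.INT_CONSTANT; // javac inlines the value 10 here
        return value == 10 && EVENTS.size() != before;
    }

    // Reading a non-constant static final field is an active use of Holder.
    static int readBoxed() {
        return Holder.BOXED;
    }

    public static void main(String[] args) {
        System.out.println("constant read triggers init: " + constantReadTriggersInit());
        System.out.println("boxed value: " + readBoxed());
        System.out.println("events: " + EVENTS);
    }
}
```

Reading INT_CONSTANT leaves the event list untouched, while the first read of BOXED initializes Holder.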
Execution order (static code block + construction code block + constructor)
class Parent {
    // Runs once, in <clinit>(), when the class is initialized
    static {
        System.out.println("Parent-static code block");
    }

    // Runs on every instantiation, before the constructor body
    {
        System.out.println("Parent-construct code block");
    }

    // Runs on every instantiation
    public Parent() {
        System.out.println("Parent-constructor");
    }
}

public class Son extends Parent {
    static {
        System.out.println("Sub-static code block");
    }

    {
        System.out.println("Sub-construct code block");
    }

    public Son() {
        System.out.println("Sub-constructor");
    }

    public static void main(String[] args) {
        Son zi = new Son();
        Son zi2 = new Son();
    }
}
Output:
Parent-static code block
Sub-static code block
Parent-construct code block
Parent-constructor
Sub-construct code block
Sub-constructor
Parent-construct code block
Parent-constructor
Sub-construct code block
Sub-constructor
Conclusion:
- Parent static code block > child static code block > parent construction code block > parent constructor > child construction code block > child constructor.
- Static code block > main method > construction code block > constructor.
- The static code block is executed only once; the construction code block is executed each time an object is created.
Get the class loader
import java.io.File;
import java.net.URL;

// Written for JDK 8: sun.misc.Launcher and java.ext.dirs do not exist on JDK 9 and later.
public class ClassLoaderTest {
    public static void main(String[] args) throws ClassNotFoundException {
        // Get the application (system) class loader
        ClassLoader systemClassLoader = ClassLoader.getSystemClassLoader();
        System.out.println(systemClassLoader); // sun.misc.Launcher$AppClassLoader@18b4aac2
        // Get its parent: the extension class loader
        ClassLoader extClassLoader = systemClassLoader.getParent();
        System.out.println(extClassLoader); // sun.misc.Launcher$ExtClassLoader@1540e19d
        // The bootstrap class loader cannot be obtained
        ClassLoader bootstrapClassLoader = extClassLoader.getParent();
        System.out.println(bootstrapClassLoader); // null
        // User-defined classes are loaded by the system class loader by default
        ClassLoader classLoader = ClassLoaderTest.class.getClassLoader();
        System.out.println(classLoader); // sun.misc.Launcher$AppClassLoader@18b4aac2
        // String is loaded by the bootstrap class loader, as are all of Java's core class libraries
        ClassLoader stringLoader = String.class.getClassLoader();
        System.out.println(stringLoader); // null: the bootstrap class loader is represented as null

        // 1. Get the loader of a given class
        ClassLoader byName = Class.forName("java.lang.String").getClassLoader();
        ClassLoader byLiteral = String.class.getClassLoader();
        System.out.println(byName);
        System.out.println(byLiteral);
        // 2. Get the context class loader of the current thread
        ClassLoader contextLoader = Thread.currentThread().getContextClassLoader();
        System.out.println(contextLoader);
        // 3. Get the parent of the system class loader (the extension class loader)
        ClassLoader parentOfSystem = ClassLoader.getSystemClassLoader().getParent();
        System.out.println(parentOfSystem);

        // Print the paths the bootstrap class loader can load from
        URL[] urls = sun.misc.Launcher.getBootstrapClassPath().getURLs();
        for (URL element : urls) {
            System.out.println(element.toExternalForm());
        }
        System.out.println("*********** Extension class loader *************");
        // Print the paths the extension class loader can load from
        String extDirs = System.getProperty("java.ext.dirs");
        for (String path : extDirs.split(File.pathSeparator)) {
            System.out.println(path);
        }
    }
}
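The listing above depends on JDK 8 internals. A version-independent way to inspect the hierarchy is to walk getParent() until it returns null, as in this minimal sketch (LoaderChainSketch and chainOf are hypothetical names). The concrete loader class names it prints differ between JDK versions.

```java
import java.util.ArrayList;
import java.util.List;

final class LoaderChainSketch {
    // Walks getParent() from the given loader up to the bootstrap loader,
    // which Java code only ever sees as null.
    static List<String> chainOf(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        names.add("bootstrap (null)");
        return names;
    }

    public static void main(String[] args) {
        for (String name : chainOf(ClassLoader.getSystemClassLoader())) {
            System.out.println(name);
        }
        // Core classes report null because the bootstrap loader loaded them.
        System.out.println("String loader: " + String.class.getClassLoader());
    }
}
```

On JDK 8 the chain shows the AppClassLoader and ExtClassLoader from sun.misc.Launcher; on JDK 9 and later the extension loader's place is taken by the platform class loader.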