This chapter follows on from the earlier JVM chapter, a concise analysis of Class files, and summarizes Chapter 7 of Understanding the Java Virtual Machine, 3rd Edition. First, the Class file itself is "static". Unlike languages whose programs are linked at compile time, Java defers linking: the JVM reads these static Class files into memory through class loading at run time.

This approach has a few costs: the compiler emits symbolic references as placeholders, and the JVM replaces them with direct references at run time, which adds a little overhead at compile time and at the first class load. However, this is also what gives Java its flexibility: as long as code follows the rules, it can be assembled dynamically. Typical examples include applets and JSPs.

Having explored the general structure of a Class file, the next task is to explore how the JVM loads classes into memory (i.e., the full details of “Class loading”). There are two core issues to discuss:

  1. When is class loading triggered? (When)
  2. How does class loading proceed? (How)

Two conventions apply throughout:

  1. In this article, the term "Class file" does not refer only to a .class file stored on disk. It means any binary data that conforms to the Class file format. The source can be the network, a database, memory, or even another running program, so it may also be called a "binary stream".

  2. Unless stated otherwise, the term "class" in this article covers both ordinary classes and interfaces. Where the two need to be distinguished, I will say so explicitly.

1. When class loading happens (When)

A class's life cycle consists of seven stages: loading, verification, preparation, resolution, initialization, using, and unloading.

Note that the overall process of class loading includes a stage that is itself called loading; do not confuse the two. The loading stage is the one that involves the class loader.

Verification, preparation, and resolution are collectively referred to as the linking stage.

Of these seven stages, five begin in a strict order: loading, verification, preparation, initialization, and unloading. Because the stages usually overlap (for example, one stage may invoke or activate another while it is still running), there is no guarantee that they finish in that order.

1.1 Timing of Initialization

The Java Virtual Machine Specification does not specify when the first stage of class loading, loading, must take place, but it does specify that there are exactly six cases in which initialization must begin immediately (the specification's wording here is strict). In plain terms, these are the cases in which running code actively uses a class:

  1. One of four bytecode instructions refers to a class that has not yet been initialized:
  • new: creates an instance of the class;
  • getstatic, putstatic: read or write a static field of the class;
  • invokestatic: calls a static method of the class;

  2. A reflective call is made on the class through the java.lang.reflect package, and the class has not been initialized;

  3. Before a class is initialized, its parent class is initialized first if it has not been already. (This does not necessarily hold for interfaces: initialization of a parent interface can be deferred until its contents are actually used.)

  4. When the virtual machine starts, the class that contains the main program (the public static void main method), called the "main class", is initialized first.

  5. Since JDK 8, interfaces may carry default methods. If a class implementing such an interface is initialized, the interface is initialized before it.

  6. A java.lang.invoke.MethodHandle instance has a final resolution result that is a handle of kind REF_getStatic, REF_putStatic, REF_invokeStatic, or REF_newInvokeSpecial, and the class corresponding to that handle has not been initialized.

The first four cases are fairly straightforward, while the remaining two (especially the last) are more complicated; the last relates to the dynamic language support introduced in JDK 7. Any way of referring to a class other than these six is called a passive reference. A passive reference does not trigger initialization, as the examples below show.
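As a minimal sketch of the difference between merely loading a class and actively using it, the following contrasts the two forms of Class.forName. The names ActiveUseDemo, Target, and EVENTS are made up for illustration; only Class.forName and its initialize flag come from the standard library.

```java
import java.util.ArrayList;
import java.util.List;

class ActiveUseDemo {
    // Records, in order, what happened and when Target's static block ran.
    static final List<String> EVENTS = new ArrayList<>();

    static class Target {
        static {
            EVENTS.add("Target initialized");
        }

        static void touch() {
        }
    }

    static List<String> run() {
        try {
            ClassLoader loader = ActiveUseDemo.class.getClassLoader();
            // A class literal alone does not initialize Target.
            String name = Target.class.getName();
            EVENTS.add("before");
            // initialize = false: the class is loaded but not initialized.
            Class.forName(name, false, loader);
            EVENTS.add("loaded only");
            // initialize = true: a reflective active use, so <clinit>() runs now.
            Class.forName(name, true, loader);
            EVENTS.add("after forName");
            // invokestatic on an already initialized class: nothing more happens.
            Target.touch();
            return EVENTS;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // On a fresh JVM: [before, loaded only, Target initialized, after forName]
        System.out.println(run());
    }
}
```

The static block runs exactly once, at the first active use, no matter how many active uses follow.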

1.2 Examples of passive references

The first case involves SuperClass and SubClass, two classes related by inheritance. When static content (fields or methods) inherited from the parent class is accessed through the subclass, the virtual machine initializes only the parent class. For example:

public class SuperClass {
    static {
        System.out.println("SuperClass has been initialized!");
    }
    protected static int value = 123;
}

class SubClass extends SuperClass {
    static {
        System.out.println("SubClass has been initialized!");
    }
}

class MainTest {
    public static void main(String[] args) {
        // Prints:
        // SuperClass has been initialized!
        // 123
        System.out.println(SubClass.value);
    }
}

Compile and run the .java file. The console does not print "SubClass has been initialized!".

The second case is where only an array of a class is created (SuperClass and SubClass are reused here):

class MainTest {
    public static void main(String[] args) {
        // The virtual machine initializes the array class [LSuperClass;, not SuperClass itself.
        SuperClass[] superClasses = new SuperClass[10];
    }
}

The console prints none of the strings in the static blocks of SuperClass or SubClass. In other words, creating a SuperClass[] object does not initialize SuperClass or SubClass. This is because array objects are created by the newarray family of bytecodes (anewarray for an array of references like this one), not by new.

Java wraps array types so that we can read an array's length with .length and access its elements safely. Unlike raw pointer access in C/C++, an out-of-bounds access in Java throws an exception, which may end the program early (depending on whether user code catches it), instead of reading memory illegally.

The third case is a direct reference to a constant defined in another class. For example:

class ConstClass {
    static {
        System.out.println("ConstClass has been initialized!");
    }
    public static final double Pi = 3.14;
}

class MainTest {
    public static void main(String[] args) {
        // The program does not print "ConstClass has been initialized!"
        System.out.println(ConstClass.Pi);
    }
}

The reason is that javac, through constant propagation at compile time, stores the value 3.14 of ConstClass.Pi directly in the constant pool of the MainTest class. The compiled MainTest class therefore contains no symbolic reference to ConstClass at all, and running it never touches ConstClass. The virtual machine also saves the time it would have spent initializing ConstClass.
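The inlining applies only to compile-time constants. As a rough sketch (the class names ConstantDemo, Inlined, and NotInlined are invented for this illustration), a static final field whose initializer is a method call is not a constant expression, so reading it is an ordinary getstatic and does trigger initialization:

```java
import java.util.ArrayList;
import java.util.List;

class ConstantDemo {
    // Records which static blocks have run.
    static final List<String> LOG = new ArrayList<>();

    static class Inlined {
        static {
            LOG.add("Inlined initialized");
        }
        // Compile-time constant: javac copies 3.14 into every class that reads it.
        static final double PI = 3.14;
    }

    static class NotInlined {
        static {
            LOG.add("NotInlined initialized");
        }
        // Not a constant expression, so readers must execute getstatic.
        static final double PI = Double.parseDouble("3.14");
    }

    static List<String> run() {
        double a = Inlined.PI;     // passive reference: Inlined is not initialized
        double b = NotInlined.PI;  // active reference: NotInlined is initialized
        LOG.add("read both values");
        return LOG;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

After run() returns, the log contains "NotInlined initialized" but never "Inlined initialized".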

2. Class loading process

2.1 Loading

Loading here refers to the first phase of the class loading process. During it, the virtual machine does three things:

  1. Obtain the binary stream (or file) that defines the class, using the class's fully qualified name;

  2. Convert the contents of the binary stream, such as its constant pool, into run-time structures in the method area, such as the runtime constant pool;

  3. Create a java.lang.Class object in memory as the entry point for accessing the class's data.

The Java Virtual Machine Specification places no strong constraints on how these three steps are carried out. Take the first one, for example:

  1. The binary stream can be read from a .zip archive, the basis of the later .jar and .war formats.

  2. It can be fetched over the network, as with web applets (now rather dated).

  3. It can be generated dynamically at run time. The best-known case is dynamic proxy technology: JDK Proxy, for example, creates a large number of $Proxy classes.
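To make the third source concrete, here is a small sketch that uses java.lang.reflect.Proxy to generate a class at run time. ProxyDemo and Greeter are hypothetical names chosen for the example; the exact name of the generated class varies between JDK versions, so the code only prints it.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

class ProxyDemo {
    interface Greeter {
        String greet(String name);
    }

    static Greeter newGreeter() {
        // Every call on the proxy is routed to this handler.
        InvocationHandler handler =
                (proxy, method, methodArgs) -> "hello, " + methodArgs[0];
        // The JDK generates the bytes of a new class in memory and defines it
        // through the given class loader; no .class file exists on disk.
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }

    public static void main(String[] args) {
        Greeter greeter = newGreeter();
        System.out.println(greeter.greet("JVM"));
        // The generated class name contains "$Proxy".
        System.out.println(greeter.getClass().getName());
        System.out.println(Proxy.isProxyClass(greeter.getClass()));
    }
}
```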

Obtaining the class's binary stream during the loading phase is the part of class loading over which a Java developer has the most control. The work can be handed to the virtual machine's built-in bootstrap class loader, or to a class loader you write yourself (the class loader is a new concept here and is covered in detail later in this article). Controlling how the binary stream is obtained is what lets an application decide dynamically which code it runs.

Array types deserve another mention. An array class is not created by a class loader; the virtual machine creates it directly in memory. Its element type, however (the type that remains after all dimensions are stripped, such as X for [[LX;), is still loaded through a class loader.
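The relationship between an array class and its element type can be observed through reflection. The sketch below (ArrayClassDemo and describe are illustrative names) strips the dimensions from an array class and compares class loaders; Class.getClassLoader() reports the element type's loader for an array class, and null for arrays of primitives.

```java
class ArrayClassDemo {
    // Strips all dimensions and reports the element type and loader relationship.
    static String describe(Class<?> arrayClass) {
        Class<?> element = arrayClass;
        while (element.isArray()) {
            element = element.getComponentType();
        }
        boolean sameLoader = arrayClass.getClassLoader() == element.getClassLoader();
        return arrayClass.getName() + " -> element " + element.getName()
                + ", same loader as element: " + sameLoader;
    }

    public static void main(String[] args) {
        System.out.println(describe(String[][].class));        // name is [[Ljava.lang.String;
        System.out.println(describe(int[].class));             // name is [I
        System.out.println(describe(ArrayClassDemo[].class));
    }
}
```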

2.2 Verification/Linking (1/3)

The virtual machine must ensure that every binary stream loaded into memory strictly complies with the constraints of the Java Virtual Machine Specification. Java is safer than C/C++: at the source level, for example, you cannot access an array out of bounds or cast an object to an unrelated type. That does not make such things impossible at the bytecode level, however. Class files are not necessarily compiled from Java source code, and anyone familiar with the Class file structure and Java bytecode instructions can use a binary editor to write malicious code for the virtual machine to execute.

For this reason, later editions of the Java Virtual Machine Specification describe the verification phase in much greater detail. Overall, the verification phase consists of four steps: file format verification, metadata verification, bytecode verification, and symbolic reference verification.

2.2.1 File Format Verification

File format verification is the most basic check and the first line of defense, comparable to checking whether an essay follows the format the assignment asks for. Its checkpoints include:

  1. Whether the magic number is 0xCAFEBABE (as mentioned in the previous section, this is the file type marker embedded in the binary);
  2. Whether the major and minor versions recorded in the binary stream are accepted by this virtual machine;
  3. Whether the constant pool contains constant types that the Java Virtual Machine Specification does not define;
  4. Whether any index into the constant pool points to a constant that does not exist or has the wrong type;
  5. Whether any CONSTANT_Utf8_info constant contains data that is not valid UTF-8;
  6. ...

The binary stream is stored into the method area only after it passes file format verification. The later verification steps no longer read the binary stream directly; they work on the corresponding structures in the method area.
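The first two checkpoints can be imitated in a few lines. This is only a sketch of reading the header fields, not how any virtual machine implements verification; ClassFileHeader and its methods are names invented here, and the example reads java/lang/Object.class as a resource simply because it is present on every JVM.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassFileHeader {
    final int magic;
    final int minorVersion;
    final int majorVersion;

    ClassFileHeader(int magic, int minorVersion, int majorVersion) {
        this.magic = magic;
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
    }

    // A Class file starts with: u4 magic, u2 minor_version, u2 major_version.
    static ClassFileHeader read(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int magic = data.readInt();
        int minor = data.readUnsignedShort();
        int major = data.readUnsignedShort();
        return new ClassFileHeader(magic, minor, major);
    }

    boolean hasValidMagic() {
        return magic == 0xCAFEBABE;
    }

    // Reads the header of java.lang.Object's own Class file.
    static ClassFileHeader ofObjectClass() {
        try (InputStream in = Object.class.getResourceAsStream("/java/lang/Object.class")) {
            if (in == null) {
                throw new IllegalStateException("Object.class is not available as a resource");
            }
            return read(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassFileHeader header = ofObjectClass();
        System.out.printf("magic=0x%08X major=%d minor=%d%n",
                header.magic, header.majorVersion, header.minorVersion);
    }
}
```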

2.2.2 Metadata Verification

The next step is a semantic analysis of the information described by the bytecode, comparable to checking whether each sentence of an essay contains grammatical errors. For example:

  1. Whether the class has a parent class (every class except java.lang.Object must have one);
  2. Whether the class inherits from a class that does not allow inheritance (for example, one marked final);
  3. If the class is not abstract, whether it implements all the methods required by its parent class and interfaces;
  4. ...

2.2.3 Bytecode Verification

Bytecode verification is the most complex part of the verification phase, comparable to checking that an essay's paragraphs are logical, relevant, and sound. It analyzes data flow and control flow to make sure that the methods of the class cannot harm the JVM at run time. The main work is to verify the Code attribute of each method (found in the attribute table of the method table). For example:

  1. The data types on the operand stack and the instruction sequence agree at every point. For example, a value must not be written as an int with istore and then read back as a float with fload;
  2. Control transfer instructions do not jump to unexpected locations;
  3. Type conversions (casts) within a method body are always valid;
  4. ...

A method body that fails bytecode verification is certainly faulty. However, a method body that passes is not guaranteed to be free of faults at run time, because of the halting problem: no program can decide, for every possible program, whether that program will terminate.

The Java team chose to speed up bytecode verification by moving as much of the work as possible into the javac compiler (trading a little compile time for faster loading at run time). To that end, a StackMapTable attribute was added to the Code attribute of the method table (see the extension below). It records the state that the local variable table and operand stack should be in at the start of every basic block in the method body (as I understand it, this is similar to the OopMap markers used in garbage collection), which turns type inference over the bytecode into type checking and saves a lot of time.

2.2.4 Symbolic Reference Verification

Symbolic reference verification does not happen until the resolution phase that follows. As the virtual machine converts the symbolic references in the constant pool into direct references, it also checks their validity. In plain terms:

  1. Whether the class named by the fully qualified name in the string can be found;
  2. Whether the specified class contains a field or method that matches the given simple name and descriptor;
  3. Whether the classes, fields, and methods in the symbolic reference are accessible to the current class;
  4. ...

If symbolic reference verification fails, the Java virtual machine throws a subclass of java.lang.IncompatibleClassChangeError, such as java.lang.IllegalAccessError (no access), java.lang.NoSuchFieldError (no such field), or java.lang.NoSuchMethodError (no such method).

2.3 Preparation/Linking (2/3)

The preparation phase allocates memory for the static variables of the class (variables marked static) and assigns them zero values. Logically, this memory belongs to the method area rather than the heap. In practice, by JDK 8 the string constant pool and static variables live on the heap, while other class information, the runtime constant pool, and so on live in Metaspace.

The zero-value assignment deserves emphasis. Consider this code:

public static int value = 100;

After the preparation phase, value is 0, not 100. No Java method has run yet at this point; the putstatic instruction that assigns 100 to value is placed in the class's <clinit>() method when the program is compiled, and runs only during initialization. The following table shows the zero values of the Java data types.

Data type    Zero value     Data type    Zero value
int          0              boolean      false
long         0L             float        0.0f
short        (short) 0      double       0.0d
char         '\u0000'       reference    null
byte         (byte) 0

There is a special case for the zero-value assignment, such as declaring the value above as a constant:

public static final int value = 100;

In this case, the compiler generates a ConstantValue attribute in the field table, and value is set to 100 during the preparation phase.
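The zero value can be observed from ordinary Java code, because static initializers run in textual order. The following sketch (PreparationDemo and its members are illustrative names) reads a static variable before its initializer has run. Note the hedge on the constant case: javac also inlines compile-time constants at the use site, so the second half shows only the observable effect, not the ConstantValue mechanism itself.

```java
class PreparationDemo {
    // Runs first inside <clinit>(): peek() reads `value` while it still holds
    // the zero value assigned during preparation.
    static int seenEarly = peek();
    static int value = 100;

    // The same experiment with a compile-time constant.
    static int constSeenEarly = peekConst();
    static final int CONST = 100;

    static int peek() {
        return value;
    }

    static int peekConst() {
        return CONST;
    }

    public static void main(String[] args) {
        System.out.println(seenEarly);       // 0
        System.out.println(value);           // 100
        System.out.println(constSeenEarly);  // 100
    }
}
```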

2.4 Resolution/Linking (3/3)

The resolution phase is where the virtual machine replaces the symbolic references in the constant pool with direct references; it goes hand in hand with the symbolic reference verification described above. Symbolic references appear in Class files as CONSTANT_Class_info, CONSTANT_Fieldref_info, CONSTANT_Methodref_info, and similar entries (which ultimately point to UTF-8 encoded strings).

Here is a general description of symbolic references and direct references:

Symbolic references: a set of symbols that describes the referenced target. They can be viewed in the constant pool with the javap -v command.

Direct references: pointers, offsets, and the like that point directly at the target. They cannot be seen in the Class file, because they depend on the memory layout of the virtual machine. If a direct reference to a target exists, the target must already be loaded into the virtual machine's memory.

For example, if you write a line like this in your source code:

System.out.println("hello");

This line of code adds the following to the Class file’s constant pool at compile time:

Constant pool:
   #2 = Fieldref           #16.#17        // java/lang/System.out:Ljava/io/PrintStream;
   #3 = String             #18            // hello
   #4 = Methodref          #19.#20        // java/io/PrintStream.println:(Ljava/lang/String;)V
   ...
  #16 = Class              #23            // java/lang/System
  #17 = NameAndType        #24:#25        // out:Ljava/io/PrintStream;
  #18 = Utf8               hello
  #19 = Class              #26            // java/io/PrintStream
  #20 = NameAndType        #27:#28        // println:(Ljava/lang/String;)V
   ...
  #26 = Utf8               java/io/PrintStream
  #27 = Utf8               println
  #28 = Utf8               (Ljava/lang/String;)V

When javac generates the static Class file, it cannot know where java.lang.System and its members will end up in memory, so it can only describe them with UTF-8 strings. Only when these references are resolved does the virtual machine replace the symbols with direct references that can be addressed directly.

More precisely, the Java Virtual Machine Specification does not fix when resolution happens. It only requires that a symbolic reference be resolved before any of 17 bytecode instructions that operate on symbolic references (including getfield, getstatic, ldc, new, and others) is executed. The virtual machine may resolve the symbolic references in the constant pool as soon as the class is loaded, or wait until a symbolic reference is first used.

The same symbolic reference is often resolved more than once, so the virtual machine may cache the result of the first resolution to save time later. It must guarantee that if the first resolution of a reference succeeds, later resolutions of the same reference also succeed.

This does not hold for the invokedynamic instruction, which supports dynamic languages (JRuby, Scala, and Java itself from JDK 8 on). The reference it uses is called a "dynamically-computed call site specifier", where "dynamic" means that resolution can only happen once the program actually reaches that instruction.

Access checks during resolution are done by the symbolic reference verification described in the verification phase.

Resolution applies to seven kinds of symbolic reference (class or interface, field, class method, interface method, method type, method handle, and call site specifier), plus strings (whose conversion is very straightforward, as the constant pool above shows). They correspond to the following constant types:

  • CONSTANT_Class_info
  • CONSTANT_Fieldref_info
  • CONSTANT_Methodref_info
  • CONSTANT_InterfaceMethodref_info
  • CONSTANT_MethodType_info
  • CONSTANT_MethodHandle_info
  • CONSTANT_Dynamic_info
  • CONSTANT_InvokeDynamic_info
  • CONSTANT_String_info

The first four resolution procedures are described here. The next four (string resolution aside) are covered together with the invokedynamic instruction in the later material on dynamic language calls.

2.4.1 Class or interface resolution

Suppose the current code belongs to class A, and somewhere in it there is a symbolic reference S that has never been resolved and that stands for a class or interface B. Then:

  1. If B is not an array type, the virtual machine passes the fully qualified name described by S to the class loader of A, which loads B. Loading may set off a chain reaction, because B's parent class and interfaces must be loaded as well.

  2. If B is an array type whose element type is an object type (a descriptor such as [Ljava/lang/Integer;), the element type is loaded according to rule 1, and the virtual machine then generates an array object that represents the array's dimensions and element type.

  3. If the first two steps succeed, resolution still has to pass symbolic reference verification, which checks that A is allowed to access B; otherwise java.lang.IllegalAccessError is thrown. From JDK 9 on, access between modules must also be checked. A can access B if any of the following holds:

  • B is public and is in the same module as A.
  • B is public and is in a different module from A, but B's module allows A's module to access it.
  • B is not public, but is in the same package as A.

2.4.2 Field Resolution

Field resolution (CONSTANT_Fieldref_info) depends on class or interface resolution (CONSTANT_Class_info), because a field is always a field of some class. If class resolution fails, field resolution fails too. Once the class or interface has been resolved (call it C), the Java Virtual Machine Specification requires the search to continue in C as follows:

  • If C itself contains a field whose simple name and descriptor both match, return a direct reference to that field.
  • Otherwise, if C implements interfaces, search each interface and its parent interfaces recursively, following the inheritance relationships from the bottom up, and return a direct reference to a matching field if one is found.
  • Otherwise, if C is not java.lang.Object, search its parent classes recursively from the bottom up and return a direct reference to a matching field if one is found.
  • Otherwise, the search fails with java.lang.NoSuchFieldError.

If the lookup succeeds, the virtual machine then checks access permission on the field and may throw IllegalAccessError.

Steps 2 and 3 raise a question: if a class inherits a field from its parent class and also implements an interface that declares a field with the same name, which one should field resolution return?

public class SubClass extends SuperClass implements SuperImplement {
    public static void main(String[] args) {
        // Compile error: reference to 'value' is ambiguous,
        // both SuperClass.value and SuperImplement.value match.
        System.out.println(new SubClass().value);
    }
}

class SuperClass {
    protected String value = "...";
}

interface SuperImplement {
    String value = "?????";
}

In fact, javac rejects such ambiguous references at compile time, and an IDE flags the error before you even compile.
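When both fields are genuinely needed, the reference has to say which declaring type it means. A minimal sketch, using the made-up names FieldLookupDemo, Base, Marker, and Child:

```java
class FieldLookupDemo {
    static class Base {
        protected String value = "from Base";
    }

    interface Marker {
        String value = "from Marker";
    }

    static class Child extends Base implements Marker {
        // Plain `value` would be ambiguous here, so each access is qualified.
        String fromClass() {
            return super.value;    // the instance field declared in Base
        }

        String fromInterface() {
            return Marker.value;   // the constant declared in Marker
        }
    }

    public static void main(String[] args) {
        Child child = new Child();
        System.out.println(child.fromClass());      // from Base
        System.out.println(child.fromInterface());  // from Marker
    }
}
```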

2.4.3 Method Resolution

Note that method resolution here means resolution of class methods specifically, as distinct from interface method resolution. The structure of a CONSTANT_Methodref_info entry is as follows:

Type   Name                  Description
u1     tag                   Fixed value 10
u2     class_index           Index of the CONSTANT_Class_info entry for the class that declares this method
u2     name_and_type_index   Index of the CONSTANT_NameAndType_info entry for the method's name and descriptor

First, the virtual machine checks that class_index refers to a class C rather than an interface; if not, it throws IncompatibleClassChangeError immediately. The rest of the logic resembles field resolution, except that a successful lookup can never end in an implemented interface:

The virtual machine first looks in C for a method whose simple name and descriptor both match, then searches C's parent classes recursively. If neither yields a direct reference, it searches the interfaces that C implements and their parent interfaces. A match found there is an abstract method that C leaves unimplemented, so java.lang.AbstractMethodError is thrown.

Otherwise, NoSuchMethodError is thrown. And even when a direct reference is found, java.lang.IllegalAccessError is still thrown if the access check of symbolic reference verification fails.

2.4.4 Interface Method Resolution

Note that interface method resolution means resolution of the methods of an interface specifically, as distinct from method resolution. The structure of a CONSTANT_InterfaceMethodref_info entry is as follows:

Type   Name                  Description
u1     tag                   Fixed value 11
u2     class_index           Index of the CONSTANT_Class_info entry for the interface that declares this method
u2     name_and_type_index   Index of the CONSTANT_NameAndType_info entry for the method's name and descriptor

In contrast to method resolution, the virtual machine checks that class_index refers to an interface I; if it refers to a class, IncompatibleClassChangeError is thrown immediately. The lookup then proceeds as follows:

First, the virtual machine looks in I for a method whose simple name and descriptor both match, and returns a direct reference to it if found.

Otherwise, it searches the parent interfaces of I recursively, up to and including the java.lang.Object class, and returns a direct reference to a matching method if found. Note that because Java allows an interface to extend multiple interfaces, several matches may be found here. The Java Virtual Machine Specification places no further restriction on this case: a virtual machine may simply return the first match, and a compiler such as javac may impose stricter rules and reject such ambiguous references.

Otherwise, NoSuchMethodError is thrown. Before JDK 9, all interface methods were public and there were no module restrictions, so interface method resolution could not throw IllegalAccessError. From JDK 9 on, access to an interface method can be denied, so IllegalAccessError is possible.

2.5 Initialization

Class initialization is the last stage of class loading (after it, the class is ready for the user program to use). In the earlier stages, the Java virtual machine drives everything, except in the loading stage, where user code can take part through a custom class loader. Only in the initialization phase does the virtual machine begin executing the Java code written in the class, handing control to the user program.

During preparation, the virtual machine has already given the class's static variables their system-defined zero values. During initialization, class variables and other static resources are set up according to the code the programmer wrote. All of this is done by the class constructor <clinit>(), which the javac compiler generates.

2.5.1 About the <clinit>() method

The Java compiler generates this method automatically by collecting, from the source code, every assignment to a static variable and every static {} block (a "static statement block"). The statements are collected in the order in which they appear in the source. A static block can read only those static variables declared before it; a variable declared after the block can be assigned inside it but not read, which would be an illegal forward reference.

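public class Clazz {
    /*
     * This ordering compiles, because i is declared before the block:
     * static int i = 10;
     * static { i = 12; System.out.println(i); }
     */

    static {
        i = 12;                  // Assigning to a later-declared variable is allowed.
        System.out.println(i);   // Compile error: illegal forward reference.
    }

    static int i = 10;
}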
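A compilable variant makes the collection order visible. In this sketch (StaticOrderDemo is an illustrative name), the static block may assign to c before its declaration, but the later initializer then overwrites that value, because <clinit>() executes the statements in source order:

```java
class StaticOrderDemo {
    static int a = 1;

    static {
        a = 2;   // runs after `a = 1`
        c = 5;   // legal: assignment to a variable declared further down
    }

    static int b = a;    // runs after the block, so it sees 2
    static int c = 10;   // runs last and overwrites the 5 assigned above

    public static void main(String[] args) {
        System.out.println(a + " " + b + " " + c);   // 2 2 10
    }
}
```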

Unlike the instance constructor <init>(), <clinit>() does not need to call the parent class's constructor explicitly, because the Java virtual machine guarantees that the parent class's <clinit>() has finished before the subclass's <clinit>() runs. In other words, the first <clinit>() that the Java virtual machine executes must be that of java.lang.Object.

Because the parent class's <clinit>() always runs before the subclass's, static initialization in the parent takes effect first. The following example verifies this:

public class SuperClass {

    static int i = 1;

    static {
        i = 3;
    }
}

class SubClass extends SuperClass {

    static int j;

    static {
        j = i;
    }
}

class RunApp {
    public static void main(String[] args) {
        System.out.println(SubClass.j);
    }
}

The output of this program is 3. Also, not every class needs a <clinit>() class constructor: if a class has no static blocks and no assignments to static variables, the compiler does not generate one.

Interfaces cannot contain static blocks, but the compiler can still generate a <clinit>() method for an interface, because its fields may have initializers. Unlike with classes, running an interface's <clinit>() does not require the parent interface's <clinit>() to run first; a parent interface is initialized only when one of its fields is actually used.
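The interface rule can be checked with fields whose initializers have a visible side effect. In the sketch below, InterfaceInitDemo, mark, Parent, and Child are names invented for the illustration; the field initializers call a method so that they are not compile-time constants and really do run inside each interface's <clinit>().

```java
import java.util.ArrayList;
import java.util.List;

class InterfaceInitDemo {
    // Records the order in which the interfaces are initialized.
    static final List<String> LOG = new ArrayList<>();

    static int mark(String who) {
        LOG.add(who);
        return LOG.size();
    }

    interface Parent {
        int P = mark("Parent");
    }

    interface Child extends Parent {
        int C = mark("Child");
    }

    static List<String> run() {
        int c = Child.C;    // initializes Child, but not Parent
        LOG.add("read C");
        int p = Parent.P;   // Parent is initialized only now
        return LOG;
    }

    public static void main(String[] args) {
        // On a fresh JVM: [Child, read C, Parent]
        System.out.println(run());
    }
}
```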

The next issue is concurrency. The Java virtual machine must ensure that <clinit>() is properly locked and synchronized, so that only one thread executes a given class's <clinit>() while any other threads block and wait. Once that thread has finished <clinit>() and returned, the other threads wake up and do not execute the same <clinit>() again.

However, if <clinit>() takes a long time, every other thread that uses the class is blocked for just as long.

import java.util.Random;

public class SuperClass {

    static int i = 1;

    static {
        i = 3;

        // Initialization cannot finish until another thread clears the flag.
        while (Mutex.lock == 1) {
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}

class Mutex {
    // volatile so that the initializing thread sees the other thread's write.
    public static volatile int lock = 1;
}

class RunApp {
    public static void main(String[] args) {

        Runnable b = () -> {
            while (true) {
                try {
                    // Each second there is a 1 in 31 chance of clearing the flag,
                    // which lets SuperClass finish initializing.
                    Thread.sleep(1000);
                    int random = new Random().nextInt(31);
                    if (random == 16) {
                        Mutex.lock = 0;
                        return;
                    }
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        };
        new Thread(b).start();

        // The main thread blocks here until SuperClass is successfully initialized.
        System.out.println(new SuperClass());
    }
}
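The same locking guarantee is what makes the common "holder class" idiom for lazy singletons safe without any explicit synchronization. The following is a sketch of that idiom, with LazyRegistry, Holder, and the counter as illustrative names; it relies only on the guarantee described above that a class's <clinit>() runs once, in one thread.

```java
import java.util.concurrent.atomic.AtomicInteger;

class LazyRegistry {
    // Counts constructor calls so the example can show there is exactly one.
    static final AtomicInteger CONSTRUCTIONS = new AtomicInteger();

    private LazyRegistry() {
        CONSTRUCTIONS.incrementAndGet();
    }

    // Holder is initialized on the first call to get(), under the JVM's own lock.
    private static class Holder {
        static final LazyRegistry INSTANCE = new LazyRegistry();
    }

    static LazyRegistry get() {
        return Holder.INSTANCE;
    }

    // Starts several threads that all race to call get(), then reports
    // how many times the constructor ran.
    static int constructionsAfterConcurrentAccess(int threads) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(LazyRegistry::get);
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return CONSTRUCTIONS.get();
    }

    public static void main(String[] args) {
        System.out.println(constructionsAfterConcurrentAccess(8));   // 1
    }
}
```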

3. Class loaders

The Java design team deliberately placed the action at the heart of the loading phase, obtaining the binary stream that describes a class from its fully qualified name, outside the Java virtual machine, so that applications can decide for themselves how to obtain the classes they need. The code that implements this action is called a class loader.

3.1 Equality of classes

For any class, its uniqueness within the Java virtual machine is established jointly by the class itself and the class loader that loads it. To put it plainly: two desserts made from the same recipe by different cooks are not the same dessert, even if they look and taste identical.

The following code demonstrates this process:

package com.i.classloading;

import java.io.IOException;
import java.io.InputStream;

public class Clazz {

    public static void main(String[] args) throws ClassNotFoundException, IllegalAccessException, InstantiationException {

        ClassLoader myLoader = new ClassLoader() {
            @Override
            public Class<?> loadClass(String name) throws ClassNotFoundException {
                try {
                    // Strip the package from the class name and append ".class".
                    String fileName = name.substring(name.lastIndexOf(".") + 1) + ".class";
                    InputStream is = getClass().getResourceAsStream(fileName);

                    // If the file is not found, let the parent loader load the class.
                    if (is == null) {
                        return super.loadClass(name);
                    }

                    byte[] b = new byte[is.available()];
                    is.read(b);
                    return defineClass(name, b, 0, b.length);

                } catch (IOException e) {
                    throw new ClassNotFoundException(name);
                }
            }
        };

        Object o = myLoader.loadClass("com.i.classloading.Clazz").newInstance();
        Clazz clazz = new Clazz();

        // Both print "class com.i.classloading.Clazz".
        System.out.println(o.getClass());
        System.out.println(clazz.getClass());

        // The result is false.
        System.out.println(o instanceof Clazz);

        // The cast fails at run time with ClassCastException.
        Clazz a = (Clazz) o;
    }
}

As the output shows, there are now two Clazz classes with the same fully qualified name, and an instance of one cannot be cast to the other. One was defined by the custom class loader; the other was loaded by Java's application class loader. The two classes are therefore independent of each other.

Note that this loader inverts the parents delegation model: it tries to load the class itself first, and hands the request to the parent class loader through super.loadClass(name) only when it cannot find the class file.
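For contrast, here is a sketch of the conventional, parent-first shape of a custom loader: it leaves loadClass alone and overrides only findClass, which the inherited loadClass calls after the parent chain has failed. ParentFirstLoader and its bookkeeping list are made-up names, and findClass deliberately finds nothing, since the point is only to show when it gets asked.

```java
import java.util.ArrayList;
import java.util.List;

class ParentFirstLoader extends ClassLoader {
    // Names that reached findClass, i.e. that no parent could load.
    final List<String> findClassCalls = new ArrayList<>();

    ParentFirstLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        findClassCalls.add(name);
        // A real loader would obtain the bytes here and call
        // defineClass(name, bytes, 0, bytes.length).
        throw new ClassNotFoundException(name);
    }

    static List<String> demo() {
        ParentFirstLoader loader = new ParentFirstLoader(ClassLoader.getSystemClassLoader());
        List<String> report = new ArrayList<>();
        try {
            Class<?> stringClass = loader.loadClass("java.lang.String");
            report.add("String from parent chain: " + (stringClass == String.class));
            report.add("findClass calls so far: " + loader.findClassCalls.size());
            loader.loadClass("no.such.Type");
        } catch (ClassNotFoundException e) {
            report.add("findClass asked for: " + loader.findClassCalls);
        }
        return report;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Because java.lang.String comes from the parent chain, the Class object returned is the very same one the rest of the program uses, which avoids the duplicate-class situation shown earlier.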

3.2 Parental delegation model

For Java virtual machines, there are only two types of class loaders: one is the Bootstrap ClassLoader, which is implemented in C++ language and is itself a part of the Java virtual machine. The other classloaders are separate parts of the Java virtual machine and all inherit from java.lang.classloader.

For Java developers, the “other class loaders” are divided more finely. Up to and including JDK 8, most applications used the following three class loaders, listed from the top of the hierarchy down:

  1. Bootstrap ClassLoader: This loader is responsible for loading the class libraries in <JAVA_HOME>\lib, as well as those on the path specified by the -Xbootclasspath option. The bootstrap class loader is implemented in C++ as part of the Java virtual machine, and therefore cannot be referenced directly from Java code. If your Java code needs to delegate to the bootstrap class loader, use null to represent it.

To see which paths the bootstrap class loader loads from:

// JDK 8 and earlier only; this property no longer exists from JDK 9 onward.
System.out.println(System.getProperty("sun.boot.class.path").replace(";", "\n"));
  2. Extension ClassLoader: This loader is implemented by sun.misc.Launcher$ExtClassLoader. It is responsible for loading into memory the class libraries in <JAVA_HOME>\lib\ext, or in the directories specified by the java.ext.dirs system property (-Djava.ext.dirs). Developers can use the extension class loader directly.

To see which paths the extension class loader is responsible for loading, use the following code:

// JDK 8 and earlier; the extension mechanism was removed in JDK 9.
System.out.println(System.getProperty("java.ext.dirs").replace(";", "\n"));
  3. System ClassLoader (also called the Application ClassLoader or user program class loader): This loader is implemented by sun.misc.Launcher$AppClassLoader. It is responsible for loading into memory the class libraries on the user classpath, as specified by java -classpath or the java.class.path system property (-Djava.class.path). In most cases, user code is loaded by the system class loader.

To see which paths the system class loader is responsible for loading, use the following code:

System.out.println(System.getProperty("java.class.path").replace(";", "\n"));

Below these three sit user-defined class loaders, which by default have the AppClassLoader as their parent. We set them aside for now; an example that loads classes from the network appears at the end of this article.

Before JDK 9, these three class loaders cooperated under the Parents Delegation Model. Note that “parent” here means that every class loader except the Bootstrap ClassLoader must have a parent loader, which establishes a “parent and child” relationship. This relationship is usually realized through composition rather than class inheritance.

The following code walks the parent chain of the class loaders:

// Get the default class loader for the user class
ClassLoader c1 = RunApp.class.getClassLoader();
System.out.println(c1.getClass().getName());

// Get c1's parent class loader
ClassLoader c2 = c1.getParent();
System.out.println(c2.getClass().getName());

// Get c2's parent classloader
ClassLoader c3 = c2.getParent();
System.out.println(c3);

The output confirms the parent chain: AppClassLoader → ExtClassLoader → Bootstrap ClassLoader. As mentioned earlier, Java code cannot obtain a reference to the bootstrap class loader, so c3 is null.

Put plainly, parent delegation means that when a class needs to be loaded, a child class loader always asks its parent loader to do it first. This holds at every level of the hierarchy; a child loader attempts the load itself only when its parent cannot.
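The delegation order can be sketched as follows. This is a simplified rendering of what java.lang.ClassLoader.loadClass does by default, not the JDK source: locking is omitted, and a non-null parent is assumed. The names DelegationSketch and definingLoaderOf are made up for illustration.

```java
import java.util.Objects;

class DelegationSketch extends ClassLoader {

    DelegationSketch(ClassLoader parent) {
        // Assumption of this sketch: the parent is never the bootstrap loader (null).
        super(Objects.requireNonNull(parent));
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        // 1. Reuse the class if this loader has already defined it.
        Class<?> c = findLoadedClass(name);
        if (c == null) {
            try {
                // 2. Always ask the parent first.
                c = getParent().loadClass(name);
            } catch (ClassNotFoundException e) {
                // 3. Only when the whole parent chain fails does this loader try itself.
                //    The inherited findClass simply throws ClassNotFoundException.
                c = findClass(name);
            }
        }
        if (resolve) {
            resolveClass(c);
        }
        return c;
    }

    // Hypothetical helper: reports which loader ended up defining the class.
    static String definingLoaderOf(String className) {
        try {
            ClassLoader sketch = new DelegationSketch(ClassLoader.getSystemClassLoader());
            ClassLoader definer = sketch.loadClass(className).getClassLoader();
            return definer == null ? "bootstrap" : definer.getClass().getName();
        } catch (ClassNotFoundException e) {
            return "not found";
        }
    }

    public static void main(String[] args) {
        System.out.println(definingLoaderOf("java.lang.String"));
        System.out.println(definingLoaderOf(DelegationSketch.class.getName()));
        System.out.println(definingLoaderOf("no.such.Clazz"));
    }
}
```

Even though the request for java.lang.String enters through the custom loader, the class is defined by the bootstrap class loader, which is why getClassLoader() returns null for it.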

The biggest advantage of the parent delegation model is that it preserves the security and uniqueness of Java’s core components. As described above, every class loading request is first passed up to the bootstrap class loader. It follows that classes “close to the system” or “close to the bottom layer” always end up being loaded by the bootstrap class loader, while classes “close to the business” are usually handled by the system class loader or even a user-defined class loader.

In other words, Java developers can never load their own java.lang.Object (or anything else under java.lang.*) through a custom class loader. If they could, the foundation of the Java type system would be shaken. The Java virtual machine is well aware of this and enforces internal protections against it.
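This protection can be observed directly. The sketch below assumes a HotSpot-style JDK, where defineClass rejects any class name starting with “java.” for a custom loader before the bytes are even parsed. The names ProhibitedPackageDemo, NaiveLoader, and tryDefine are hypothetical.

```java
class ProhibitedPackageDemo {

    // Hypothetical loader that exposes the protected defineClass for this demo.
    static class NaiveLoader extends ClassLoader {
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    // Reports what happens when a custom loader tries to define the named class.
    static String tryDefine(String className) {
        try {
            // The byte array is deliberately empty: for a java.* name the
            // package check rejects the request before the bytes are read.
            new NaiveLoader().define(className, new byte[0]);
            return "defined";
        } catch (SecurityException e) {
            return "SecurityException";
        } catch (LinkageError e) {
            // For example ClassFormatError when the bytes are not a valid class.
            return "LinkageError";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryDefine("java.lang.MyObject"));
    }
}
```

So even a loader that skips delegation entirely, like the inverted example earlier, cannot slip a class into the java.lang package.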

The parent delegation model is not an “iron law”: it can be “broken” under certain circumstances, such as in the JDBC components we use all the time.

3.3 A counterexample to the parent delegation model — JDBC

We first need to introduce the SPI (Service Provider Interface) mechanism, sometimes described as a “service discovery interface.” It follows the idea of programming to an interface: an interface fixes the inputs and outputs, and the concrete implementing class is chosen later, which makes “hot-pluggable” deployment possible.

JDBC’s use of SPI is an example of the parent delegation model being broken, because the “contract comes before the implementation.” The Java class library defines the driver-related types java.sql.DriverManager, java.sql.Driver, and so on (all in rt.jar, which shows that the JDK treats them as core components), but these are at most “reserved interfaces.” Concrete implementations for the various databases are not provided, since that is the job of each database vendor.

Because these SPI interfaces are loaded by the bootstrap class loader, which cannot see vendor classes and so cannot handle this kind of “reverse injection,” the Thread Context ClassLoader was introduced. Its job is to let a parent class loader ask a child class loader to load classes on its behalf. The context loader can be set through the setContextClassLoader() method of java.lang.Thread; if it is never set, a thread inherits it from its parent thread, and by default it is the system class loader.
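A minimal sketch of setting and reading the context class loader follows. The names ContextLoaderDemo, MarkerLoader, and observedContextLoader are made up for illustration; only Thread.setContextClassLoader and Thread.getContextClassLoader are the real API.

```java
class ContextLoaderDemo {

    // Hypothetical marker loader so we can recognise it when we read it back.
    static class MarkerLoader extends ClassLoader {
        MarkerLoader(ClassLoader parent) {
            super(parent);
        }
    }

    // Starts a thread whose context class loader is `loader` and returns
    // the loader that code running on that thread observes.
    static ClassLoader observedContextLoader(ClassLoader loader) {
        ClassLoader[] seen = new ClassLoader[1];
        Thread worker = new Thread(() -> {
            // Library code (for example an SPI lookup) would make this call to
            // find a loader that can see application or vendor classes.
            seen[0] = Thread.currentThread().getContextClassLoader();
        });
        worker.setContextClassLoader(loader);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        ClassLoader marker = new MarkerLoader(ClassLoader.getSystemClassLoader());
        System.out.println(observedContextLoader(marker) == marker);
        System.out.println(Thread.currentThread().getContextClassLoader());
    }
}
```

The point of the design is that the loader travels with the thread, so code loaded by a parent loader can reach a child loader without holding a direct reference to it.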

To put it more bluntly: the implementations of an SPI that is specified and “governed” by the Java class library are loaded not by the bootstrap class loader but by the thread context class loader (which may be the system class loader or even a custom class loader). Vendors can therefore ship their own implementations of the library’s interfaces, and the Java virtual machine still loads those implementation classes correctly. In the past, JDBC code usually contained the following line:

Class.forName("com.mysql.jdbc.Driver");

This registers the MySQL driver with java.sql.DriverManager in the Java class library, after which we call the interface methods provided by the class library to interact with the chosen database.

Since JDK 6, the java.util.ServiceLoader helper has been available. We only need to list com.mysql.jdbc.Driver in the file META-INF/services/java.sql.Driver (the file name must match the fully qualified name of the SPI interface being implemented). Then, when DriverManager.getConnection(…) is called to obtain a database connection, the MySQL driver is discovered automatically, with no hard-coded class name in the source. This style of configuration is already very close to IoC in the Spring framework.
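The discovery step can be sketched with ServiceLoader itself. This is an illustration of the lookup, not DriverManager’s actual code; DriverDiscovery and discoverDriverNames are hypothetical names, and the result depends entirely on which driver JARs are visible to the given loader (it may well be empty).

```java
import java.sql.Driver;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

class DriverDiscovery {

    // Lists the class names of every java.sql.Driver provider declared in a
    // META-INF/services/java.sql.Driver file that the given loader can see.
    static List<String> discoverDriverNames(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (Driver driver : ServiceLoader.load(Driver.class, loader)) {
            names.add(driver.getClass().getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Use the thread context class loader, as discussed above.
        ClassLoader contextLoader = Thread.currentThread().getContextClassLoader();
        System.out.println(discoverDriverNames(contextLoader));
    }
}
```

Passing the context class loader explicitly mirrors the idea in this section: the interface java.sql.Driver belongs to the class library, while the providers are found through a loader that can see vendor classes.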


[extension] StackMapTable attribute

The StackMapTable attribute lives in the attribute table of the Code attribute in the method table, and is used by the new type checker. Instead of deriving verification types through data-flow analysis at run time, the type checker reads them directly from the Class file, where the compiler has recorded them, which greatly improves the performance of bytecode verification.

The structure of the StackMapTable attribute is shown in the following table:

| Type | Name | Count |
| --- | --- | --- |
| u2 | attribute_name_index | 1 |
| u4 | attribute_length | 1 |
| u2 | number_of_entries | 1 |
| stack_map_frame | stack_map_frame_entries | number_of_entries |

A StackMapTable consists of stack map frames. Each frame corresponds to a bytecode offset and records the state that the local variable table and the operand stack should have when execution reaches that offset, so that the type checker can verify it.

Custom class loaders fetch binary streams from the network

This example is adapted from: Understanding the Java Class Loader (1): An Analysis of Java Class Loading Principles

Create a class named NetworkClassLoader as a custom class loader. It extends ClassLoader and overrides only the findClass(String name) method, in which we fetch the bytecode remotely over the network and define the class (obtaining its Class<?> object). We then create an instance locally and test it.

The class loader is expected to work as follows. First, set the root URL for the bytecode (here http://hadoop101/, where hadoop101:80 points to an Nginx server deployed on the author’s local VM). Given a class name (say com.i.classloading.TargetImpl), the class loader automatically looks up the bytecode file http://hadoop101/com/i/classloading/TargetImpl.class, loads the binary into memory, and returns a Class<?> instance.

For testing, create two files: an interface named Target and its implementation class TargetImpl, which implements a simple greet() method. Their compiled .class files are stored in the {root}/com/i/classloading/net folder on the remote server.

// Target.java (in package com.i.classloading.net)
public interface Target {
    void greet();
}

// TargetImpl.java (in package com.i.classloading.net)
public class TargetImpl implements Target {
    public void greet() {
        System.out.println("Yes");
    }
}

The logic of NetworkClassLoader is as follows:

package com.i.classloading.net;

import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.net.URL;

public class NetworkClassLoader extends ClassLoader {

    private String root;

    public String getRoot() {
        return root;
    }

    public void setRoot(String root) {
        this.root = root;
    }

    public NetworkClassLoader(String root) {
        this.root = root;
    }


    private String toUrl(String root, String classPath){
        String replace = classPath.replace('.', '/') + ".class";
        return  root + replace;
    }

    @Override
    protected Class<?> findClass(String path) throws ClassNotFoundException {

        String realUrl = toUrl(root, path);

        byte[] classByte = findClassByte(realUrl);

        if (classByte == null) {
            // Include the class name so the failure is easier to diagnose.
            throw new ClassNotFoundException(path);
        }
        return defineClass(path, classByte, 0, classByte.length);
    }

    private byte[] findClassByte(String path) {

        // try-with-resources closes the network stream even if reading fails.
        try (InputStream is = new URL(path).openStream()) {

            ByteArrayOutputStream baos = new ByteArrayOutputStream();

            int bufferSize = 4096;
            byte[] buffer = new byte[bufferSize];
            int bytesNumRead;

            // Copy the stream into memory one buffer at a time.
            while ((bytesNumRead = is.read(buffer)) != -1) {
                baos.write(buffer, 0, bytesNumRead);
            }

            return baos.toByteArray();

        } catch (Exception e) {
            e.printStackTrace();
        }

        // Signals to findClass that the bytes could not be fetched.
        return null;
    }
}

To test it, the main method loads the class and invokes greet via reflection:

String root = "http://hadoop101/";
String classPath = "com.i.classloading.net.TargetImpl";

// Calling findClass directly skips parent delegation, so the bytes come from the network.
Class<?> aClass = new NetworkClassLoader(root).findClass(classPath);
Object o = aClass.newInstance();

// Requires: import java.lang.reflect.Method;
Method method = aClass.getMethod("greet");
method.invoke(o);

On successful execution, the console prints Yes. Alternatively, we can use the instance through the Target interface:

String root = "http://hadoop101/";
String classPath = "com.i.classloading.net.TargetImpl";

Class<?> aClass = new NetworkClassLoader(root).findClass(classPath);

Target target = (Target) aClass.newInstance();
target.greet();

  1. The string constant pool, the runtime constant pool, and static variables were relocated significantly between JDK 6 and JDK 8. See: Where are the string constant pool, runtime constant pool, and static variables in JDK 7 and 8, given that Metaspace implements the method area?