Note source: the Atguigu complete JVM tutorial, with millions of views (Song Hongkang explains the Java virtual machine in detail)

Update: gitee.com/vectorx/NOT…

Codechina.csdn.net/qq_35925558…

Github.com/uxiahnan/NO…


1. Overview

Data types in Java are divided into primitive data types and reference data types. Primitive data types are predefined by the virtual machine, while reference data types require class loading.

According to the Java Virtual Machine specification, the full life cycle of a class, from the class file being loaded into memory to the class being unloaded from memory, consists of the following seven phases: loading, verification, preparation, resolution, initialization, using, and unloading.

Among them, verification, preparation, and resolution are collectively referred to as linking.

[Figure: the class life cycle, viewed from how a program uses a class]

Interview questions from major tech companies

Ant Financial:

Describe how the JVM loads Class files.

First-round interview: the class loading process

Baidu:

The timing of class loading

Java class loading process?

What is the Java class loading mechanism?

Tencent:

JVM class loading mechanism, class loading process?

Didi:

JVM class loading mechanism

Meituan:

Java class loading process

Describe how the JVM loads class files

Jingdong:

What is class loading?

What circumstances trigger class loading?

Describe how the JVM loads a class.

What is the class loading mechanism of the JVM?


2. Process 1: Loading stage

2.1. Operations performed during loading

Understanding of loading

Loading, in short, means reading the bytecode file of a Java class into machine memory and building a prototype of the class in memory: the class template object. The class template object is effectively a snapshot of the Java class in JVM memory. The JVM stores the constant pool, class fields, class methods, and other information parsed from the bytecode file in the class template, so that at run time it can obtain any information about the class through the template, iterate over the class's member variables, and invoke its methods.

The reflection mechanism is built on this foundation. If the JVM did not store the declaration information of a Java class, it could not perform reflection at run time.

Operations performed during loading

In short, the loading phase finds and loads the binary data of the class and generates an instance of Class.

When loading a class, the Java virtual machine must do three things:

  • Obtain the binary data stream of the class from its fully qualified name.

  • Parse the binary data stream of the class into a data structure in the method area (the Java class model).

  • Create an instance of java.lang.Class to represent the type. It serves as the access entry point to the various data of this class in the method area.

2.2. Method of obtaining binary stream

Virtual machines can generate or obtain the binary data stream of a class in a variety of ways, as long as the bytecode that is read conforms to the JVM specification:

  • Read a file with the .class suffix from the file system (most common).
  • Read archives such as JAR and ZIP files and extract the class files.
  • Read class binary data previously stored in a database.
  • Load it over the network using a protocol such as HTTP.
  • Generate the class binary at run time.

Once the binary information of the class has been obtained, the Java virtual machine processes the data and eventually turns it into an instance of java.lang.Class.

If the input data is not structured as a ClassFile, a ClassFormatError is raised.
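
As a rough illustration of the points above, the sketch below hands raw bytes to a class loader. BinarySourceDemo, BytesClassLoader, and Greeter are hypothetical names introduced only for this example; it assumes Java 9 or later and that the compiled .class files are reachable as classpath resources. Only ClassLoader.defineClass and getResourceAsStream are standard API.

```java
import java.io.IOException;
import java.io.InputStream;

class BinarySourceDemo {
    /** Hypothetical loader that turns a raw byte array into a Class. */
    static class BytesClassLoader extends ClassLoader {
        Class<?> define(String name, byte[] bytes) {
            // defineClass parses the bytes as a ClassFile structure.
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    /** Hypothetical tiny class whose .class file is read back as bytes. */
    static class Greeter { }

    /** Reads the compiled bytes of a class from the classpath. */
    static byte[] readBytes(Class<?> type) throws IOException {
        String resource = type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getClassLoader().getResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("not found: " + resource);
            }
            return in.readAllBytes();
        }
    }

    /** Returns true if the JVM rejects the bytes with ClassFormatError. */
    static boolean isRejected(byte[] bytes) {
        try {
            new BytesClassLoader().define(null, bytes);
            return false;
        } catch (ClassFormatError e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] valid = readBytes(Greeter.class);
        Class<?> copy = new BytesClassLoader().define(Greeter.class.getName(), valid);
        System.out.println(copy.getName() + " defined by " + copy.getClassLoader());
        System.out.println("same Class object? " + (copy == Greeter.class));
        System.out.println("garbage rejected? " + isRejected(new byte[] {1, 2, 3, 4}));
    }
}
```

The byte array could equally come from a JAR, a database, or the network; the loader only cares that the bytes form a valid ClassFile.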

2.3. Location of Class models and Class instances

Location of the class model

The loaded class creates a corresponding class structure in the JVM, which is stored in the method area (before JDK 1.8: the permanent generation; JDK 1.8 and later: the metaspace).

Location of the Class instance

After the class loader loads the .class file into the metaspace, a java.lang.Class object is created in the heap to encapsulate the data structure of the class in the method area. The Class object is created during class loading, and each class has a corresponding object of type Class.

import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class LoadingTest {
    public static void main(String[] args) throws ClassNotFoundException {
        Class<?> clazz = Class.forName("java.lang.String");
        // Get all the methods declared by the current runtime class
        Method[] ms = clazz.getDeclaredMethods();
        for (Method m : ms) {
            // Get the modifiers of the method
            String mod = Modifier.toString(m.getModifiers());
            System.out.print(mod + " ");
            // Get the return type of the method
            String returnType = m.getReturnType().getSimpleName();
            System.out.print(returnType + " ");
            // Get the method name
            System.out.print(m.getName() + "(");
            // Get the parameter list of the method
            Class<?>[] ps = m.getParameterTypes();
            if (ps.length == 0) {
                System.out.print(")");
            }
            for (int i = 0; i < ps.length; i++) {
                String end = (i == ps.length - 1) ? ")" : ", ";
                // Get the type of each parameter
                System.out.print(ps[i].getSimpleName() + end);
            }
            System.out.println();
        }
    }
}

2.4. Loading of array classes

Creating an array class is a slightly special case: the array class itself is not created by a class loader but directly by the JVM at run time as needed, while the element type of the array still has to be created by a class loader. The procedure for creating an array class (hereinafter referred to as A) is:

  • If the element type of the array is a reference type, follow the defined loading procedure to recursively load and create the element type of array A;
  • The JVM creates the new array class using the specified element type and array dimension.

If the element type of the array is a reference type, the accessibility of the array class is determined by the accessibility of the element type. Otherwise, the accessibility of the array class defaults to public.
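
The rules above can be observed through reflection. The following is a minimal sketch; ArrayClassDemo and Element are hypothetical names made up for the example, and only Class.getClassLoader, Class.getModifiers, and Modifier.isPublic are standard API.

```java
import java.lang.reflect.Modifier;

class ArrayClassDemo {
    /** Hypothetical element type that is deliberately not public. */
    static class Element { }

    /** An array class reports the same class loader as its element type. */
    static boolean sharesLoaderWithElement() {
        return Element[].class.getClassLoader() == Element.class.getClassLoader();
    }

    /** Arrays of primitives report no class loader and are public. */
    static boolean primitiveArrayIsPublic() {
        return int[].class.getClassLoader() == null
                && Modifier.isPublic(int[].class.getModifiers());
    }

    /** Accessibility of a reference array follows its element type. */
    static boolean followsElementAccess() {
        return Modifier.isPublic(String[].class.getModifiers())
                && !Modifier.isPublic(Element[].class.getModifiers());
    }

    public static void main(String[] args) {
        System.out.println("shares loader with element: " + sharesLoaderWithElement());
        System.out.println("int[] is public:            " + primitiveArrayIsPublic());
        System.out.println("follows element access:     " + followsElementAccess());
    }
}
```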


3. Process 2: Linking stage

3.1. Step 1: Verification in the linking phase

When a class is loaded into the system, linking begins, and verification is the first step of linking.

Its purpose is to ensure that the loaded bytecode is legal, reasonable, and compliant with the specification.

The verification procedure is fairly complicated, and there are many items to verify. Broadly, the Java virtual machine needs to perform the following checks, as shown in the figure.

Overall description:

Verification covers format verification, semantic verification, bytecode verification, and symbolic reference verification of the class data.

  • Format verification is performed together with the loading phase. Once it passes, the class loader successfully loads the binary data of the class into the method area.
  • Verification other than format verification is performed on the data in the method area.

Verification in the linking phase slows down loading, but it avoids the need for various checks while the bytecode is running. (Sharpening the axe does not delay the woodcutting.)

Specific instructions:

  1. Format verification: whether the file starts with the magic number 0xCAFEBABE, whether the major and minor version numbers are within the range supported by the current Java virtual machine, and whether each item in the data has the correct length.

  2. Semantic verification: the Java virtual machine performs semantic checks on the bytecode. Any bytecode that does not conform to the specification semantically fails verification. For example:

    • Whether every class has a parent class (in Java, every class except Object must have a parent class)
    • Whether a method or class defined as final has been overridden or inherited
    • Whether a non-abstract class implements all abstract methods or interface methods
  3. Bytecode verification: the Java virtual machine also verifies the bytecode itself. This is the most complicated part of the verification process. It tries to determine, by analyzing the bytecode stream, whether the bytecode can be executed correctly. For example:

    • Whether execution of the bytecode jumps to a nonexistent instruction
    • Whether a method call passes arguments of the correct types
    • Whether a variable is assigned a value of the correct data type

    The StackMapTable attribute is used at this stage to check whether the local variable table and the operand stack hold the correct data types at particular bytecode positions. Unfortunately, it is impossible to determine with 100% accuracy whether a piece of bytecode can be executed safely, so this step only checks for as many obvious, predictable problems as it can. If the check fails at this stage, the virtual machine will not load the class. However, passing this stage does not mean that the class is completely problem-free.

    The three preceding checks rule out file format errors, semantic errors, and incorrect bytecode. But there is still no guarantee that the class is problem-free.

  4. Symbolic reference verification: the verifier also checks symbolic references. The constant pool of a class file records, as strings, the other classes and methods the class will use. During verification, the virtual machine therefore checks that these classes and methods actually exist and that the current class has permission to access them. If a required class cannot be found in the system, NoClassDefFoundError is thrown; if a method cannot be found, NoSuchMethodError is thrown. This check is only performed during the resolution step.
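
To make the format check in item 1 concrete, here is a small sketch that inspects the first eight bytes of a class file (magic number, minor version, major version). ClassHeaderDemo and its methods are hypothetical helpers written for this example, not the JVM's verifier, and the example assumes java/lang/Object.class can be read as a system resource.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

class ClassHeaderDemo {
    static final int MAGIC = 0xCAFEBABE;

    /** True if the bytes begin with the class-file magic number. */
    static boolean hasClassFileMagic(byte[] bytes) {
        // ByteBuffer reads big-endian by default, matching the class-file format.
        return bytes.length >= 4 && ByteBuffer.wrap(bytes).getInt() == MAGIC;
    }

    /** Major version stored in bytes 6-7 of the header. */
    static int majorVersion(byte[] header) {
        return ByteBuffer.wrap(header).getShort(6) & 0xFFFF;
    }

    public static void main(String[] args) throws IOException {
        byte[] header = new byte[8];
        try (InputStream raw = ClassLoader.getSystemResourceAsStream("java/lang/Object.class")) {
            if (raw == null) {
                throw new IOException("Object.class not found");
            }
            new DataInputStream(raw).readFully(header);
        }
        System.out.println("magic ok: " + hasClassFileMagic(header));
        System.out.println("major version: " + majorVersion(header));
    }
}
```

The printed major version depends on the JDK that runs the example, so no fixed value is shown here.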

3.2. Step 2: Preparation in the linking phase

The preparation phase, in short, allocates memory for the static variables of the class and initializes them to default values.

Once a class passes verification, the virtual machine enters the preparation phase. In this phase, the virtual machine allocates the appropriate memory space for the class and sets default initial values. The following table lists the default initial values the Java virtual machine assigns to variables of each type.

| Type | Default initial value |
| --- | --- |
| byte | (byte)0 |
| short | (short)0 |
| int | 0 |
| long | 0L |
| float | 0.0f |
| double | 0.0 |
| char | \u0000 |
| boolean | false |
| reference | null |

The Java virtual machine does not support the boolean type natively. Internally, boolean is implemented as int. Since the default value of int is 0, the corresponding default value of boolean is false.

Note:

  • This does not include primitive-typed fields modified by static final, because such final fields already have their value fixed at compile time and are explicitly assigned during preparation.

    // General case: static final primitives and string literals are explicitly assigned during preparation
    private static final String str = "Hello world";
    // Special case: a static final reference created with new is not assigned during preparation, but during initialization
    private static final String str2 = new String("Hello world");
  • Note that preparation does not allocate or initialize instance variables. Class variables are allocated in the method area, while instance variables are allocated in the Java heap along with the object.

  • Unlike the initialization phase, this phase does not run any initialization code.
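
The default value set during preparation can be made visible by reading a static field before its explicit assignment has run. The sketch below is only an illustration; PreparationDemo, early, count, and peek are hypothetical names.

```java
class PreparationDemo {
    // Runs inside <clinit>() before 'count' receives its explicit value,
    // so peek() observes the default 0 that the preparation phase set.
    static int early = peek();
    static int count = 5;

    static int peek() {
        return count;
    }

    public static void main(String[] args) {
        System.out.println("early = " + early); // prints "early = 0"
        System.out.println("count = " + count); // prints "count = 5"
    }
}
```

The explicit value 5 is only written when <clinit>() reaches that line in the initialization phase, which is why the earlier read still sees 0.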

3.3. Step 3: Resolution in the linking phase

After the preparation phase is complete, the resolution phase begins. Resolution, in short, converts symbolic references to classes, interfaces, fields, and methods into direct references.

Specific description:

Symbolic references are literal references that have nothing to do with the virtual machine's internal data structures or memory layout. This is easy to understand: in a class file, a large number of symbolic references are expressed through the constant pool. But symbolic references alone are not enough when the program actually runs. For example, when the println() method below is called, the system needs to know exactly where the method is located.

For example:

The bytecode corresponding to the System.out.println() call:

invokevirtual #24 <java/io/PrintStream.println>

Taking methods as an example, the Java virtual machine prepares a method table for each class and lists all of the class's methods in it. When a method of a class needs to be called, it can be called directly as long as its offset in the method table is known. Through resolution, a symbolic reference is converted into the position of the target method in the class's method table, so that the method can be invoked successfully.


4. Process 3: Initialization

4.1. static and final

For a static final field with an explicit assignment, in which phase is the value actually assigned?

  • Case 1: It is assigned in the preparation step of the linking phase

  • Case 2: It is assigned in <clinit>() during the initialization phase

Conclusion: the following are assigned in the preparation step of the linking phase:

  • For fields of primitive data types modified by static final, an explicit assignment (assigning a constant directly, not calling a method) is usually done in the preparation step of the linking phase

  • For String, if the field is assigned a literal and modified by static final, the explicit assignment is usually done in the preparation step of the linking phase

  • Assigned in <clinit>() during the initialization phase: every case other than the preparation-step assignments described above.

Final conclusion: a static final field whose explicit assignment is a primitive or String constant that involves no method or constructor call is assigned in the preparation step of the linking phase.

public static final int INT_CONSTANT = 10;                                // Assigned in the preparation step of the linking phase
public static final int NUM1 = new Random().nextInt(10);                  // Assigned in <clinit>() during the initialization phase
public static int a = 1;                                                  // Assigned in <clinit>() during the initialization phase

public static final Integer INTEGER_CONSTANT1 = Integer.valueOf(100);     // Assigned in <clinit>() during the initialization phase
public static Integer INTEGER_CONSTANT2 = Integer.valueOf(100);           // Assigned in <clinit>() during the initialization phase

public static final String s0 = "helloworld0";                            // Assigned in the preparation step of the linking phase
public static final String s1 = new String("helloworld1");                // Assigned in <clinit>() during the initialization phase
public static String s2 = "helloworld2";                                  // Assigned in <clinit>() during the initialization phase

4.2. Thread safety of <clinit>()

For the invocation of the <clinit>() method, that is, the initialization of a class, the virtual machine internally guarantees safety in a multithreaded environment.

The virtual machine ensures that a class's <clinit>() method is correctly locked and synchronized in a multithreaded environment. If multiple threads try to initialize a class at the same time, only one thread executes the class's <clinit>() method, and all other threads block until the active thread finishes executing <clinit>().

Because <clinit>() is thread-safe through an implicit lock, a long-running operation in a class's <clinit>() method can block multiple threads and may cause a deadlock. Such deadlocks are hard to detect because there appears to be no lock information available for them.

If the previous thread has successfully loaded the class, the threads waiting in the queue get no chance to execute <clinit>() again. When they need to use the class, the virtual machine simply returns the information that has already been prepared.
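
A small experiment can illustrate this guarantee. In the sketch below, several threads touch the same class at once while its static initializer is deliberately slow. ClinitThreadSafetyDemo, Slow, and INIT_RUNS are hypothetical names introduced for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ClinitThreadSafetyDemo {
    static final AtomicInteger INIT_RUNS = new AtomicInteger();

    /** Hypothetical class with a deliberately slow static initializer. */
    static class Slow {
        static int value;
        static {
            INIT_RUNS.incrementAndGet();
            try {
                Thread.sleep(200); // keep <clinit>() busy so other threads must wait
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            value = 42;
        }
    }

    /** Starts several threads that all read Slow.value; returns how often <clinit>() ran. */
    static int countInitializations(int threadCount) {
        Thread[] threads = new Thread[threadCount];
        int[] seen = new int[threadCount];
        for (int i = 0; i < threadCount; i++) {
            final int slot = i;
            threads[i] = new Thread(() -> seen[slot] = Slow.value);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        for (int v : seen) {
            if (v != 42) {
                throw new IllegalStateException("saw uninitialized value " + v);
            }
        }
        return INIT_RUNS.get();
    }

    public static void main(String[] args) {
        System.out.println("<clinit>() executions: " + countInitializations(4));
    }
}
```

Every thread blocks until the one running the initializer finishes, so each of them reads 42 and the initializer body runs only once.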

4.3. Class initialization: Active vs. passive use

Java programs can use classes in two ways: active and passive.

Active use

A class is loaded only when it must be used for the first time; the Java virtual machine does not load types unconditionally. The Java virtual machine specifies that a class or interface must be initialized before it is first used. "Use" here means active use, and only the following cases count as active use. (That is, if any of the following occurs, the class is initialized, and the loading, verification, and preparation that precede initialization have already been completed.)

  1. Instantiation: When creating an instance of a class, such as using the new keyword, or by reflection, cloning, or deserialization.

    import java.io.*;

    /** Deserialization */
    class Order implements Serializable {
        static {
            System.out.println("Initialization of the Order class");
        }
    }

    public class DeserializationTest {
        public void test() {
            // Serialize
            try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream("order.dat"))) {
                oos.writeObject(new Order());
            } catch (IOException e) {
                e.printStackTrace();
            }
            // Deserialize
            try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream("order.dat"))) {
                Order order = (Order) ois.readObject();
            } catch (IOException | ClassNotFoundException e) {
                e.printStackTrace();
            }
        }
    }
  2. Static methods: When a static method of a class is invoked, that is, when the invokestatic bytecode instruction is used.

  3. Static fields: When a static field of a class or interface is used (final fields are a special case), for example via the getstatic or putstatic instruction. (These correspond to reading and assigning the variable.)

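    import org.junit.Test;

    public class ActiveUse {
        @Test
        public void test() {
            // num is static final with a constant value, so this access is the
            // special case: it does not initialize User. Without final, it would.
            System.out.println(User.num);
        }
    }

    class User {
        static {
            System.out.println("Initialization of the User class");
        }
        public static final int num = 1;
    }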
  4. Reflection: When a class is accessed reflectively using the reflection API (the java.lang.reflect package). For example: Class.forName("com.atguigu.java.Test")

  5. Inheritance: When initializing a child class, if the parent class has not been initialized, the initialization of the parent class is triggered first.

    When the Java virtual machine initializes a class, it requires that all of its parent classes have already been initialized, but this rule does not apply to interfaces.

    • When a class is initialized, the interfaces it implements are not initialized first
    • When an interface is initialized, its parent interfaces are not initialized first
    • Therefore, a parent interface is not initialized merely because its child interface or an implementing class is initialized. An interface is initialized only when the program first uses a static field of that particular interface.
  6. Default method: If an interface defines a default method, then initializing a class that directly or indirectly implements the interface causes the interface to be initialized first.

    interface Compare {
        public static final Thread t = new Thread() {
            {
                System.out.println("Initialize the Compare interface");
            }
        };
    }
  7. Main method: When the virtual machine starts, the user specifies a main class (the one containing the main() method) to execute, and the virtual machine initializes this main class first.

    The virtual machine starts by loading an initial class through the bootstrap class loader. This class is linked and initialized before its public static void main(String[]) method is called. Executing this method in turn drives the loading, linking, and initialization of the other classes that are needed.

  8. MethodHandle: When a MethodHandle instance is invoked for the first time, the class containing the method that the MethodHandle points to is initialized. (This involves resolving the classes corresponding to method handles of kind REF_getStatic, REF_putStatic, and REF_invokeStatic.)
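
The inheritance and default-method rules (items 5 and 6) can be checked with a short experiment that records the order in which types are initialized. This is a sketch only; InitOrderDemo, Plain, WithDefault, Parent, and Child are hypothetical names, and the order shown is what the Java Language Specification prescribes for this hierarchy.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    /** Records that a type's initializer ran. */
    static int log(String name) {
        LOG.add(name);
        return 0;
    }

    /** Interface without default methods: not initialized by its implementor. */
    interface Plain {
        int MARK = log("Plain");
    }

    /** Interface with a default method: initialized before its implementor. */
    interface WithDefault {
        int MARK = log("WithDefault");
        default void hello() { }
    }

    static class Parent {
        static { log("Parent"); }
    }

    static class Child extends Parent implements Plain, WithDefault {
        static { log("Child"); }
    }

    /** Instantiating Child is an active use, so initialization is triggered. */
    static List<String> initOrder() {
        new Child();
        return LOG;
    }

    public static void main(String[] args) {
        System.out.println(initOrder()); // prints [Parent, WithDefault, Child]
    }
}
```

Plain never appears in the log because nothing reads one of its static fields.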

Passive use

All cases other than the active uses above are passive use. Passive use does not cause class initialization.

In other words, a class that appears in code is not necessarily loaded or initialized. If the conditions for active use are not met, the class is not initialized.

  1. Static fields: When referring to a static variable of a parent class through a subclass, this does not cause the subclass to be initialized. Only classes that actually declare this field are initialized.

    import org.junit.Test;

    public class PassiveUse {
        @Test
        public void test() {
            System.out.println(Child.num);
        }
    }

    class Child extends Parent {
        static {
            System.out.println("Initialization of the Child class");
        }
    }

    class Parent {
        static {
            System.out.println("Initialization of the Parent class");
        }

        public static int num = 1;
    }
  2. Array definition: Defining an array of a class type does not trigger initialization of that class

    Parent[] parents = new Parent[10];
    System.out.println(parents.getClass());
    // Creating an instance with new does trigger initialization
    parents[0] = new Parent();
  3. Referencing constants: Referencing a constant does not trigger initialization of the class or interface that declares it, because constants are already explicitly assigned in the linking phase.

    import java.util.Random;

    public class PassiveUse {
        public static void main(String[] args) {
            System.out.println(Serival.num);
            // num2 is not a compile-time constant, so accessing it still triggers initialization
            System.out.println(Serival.num2);
        }
    }

    interface Serival {
        public static final Thread t = new Thread() {
            {
                System.out.println("Serival initialization");
            }
        };

        public static int num = 10;
        public static final int num2 = new Random().nextInt(10);
    }
  4. loadClass method: Calling the loadClass() method of the ClassLoader class to load a class is not an active use of the class and does not cause class initialization.

    Class<?> clazz = ClassLoader.getSystemClassLoader().loadClass("com.test.java.Person");
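
The four passive-use cases above can be combined into one small experiment that records which static initializers actually run. The sketch is illustrative only; PassiveUseDemo and its nested classes are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class PassiveUseDemo {
    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        static int num = 1;
        static { LOG.add("Parent"); }
    }

    static class Child extends Parent {
        static { LOG.add("Child"); }
    }

    static class Constants {
        static final int MAX = 10; // compile-time constant
        static { LOG.add("Constants"); }
    }

    static class Lazy {
        static { LOG.add("Lazy"); }
    }

    /** Performs four passive uses and returns the classes that were initialized. */
    static List<String> run() {
        Child[] children = new Child[2];   // array definition: no initialization
        int max = Constants.MAX;           // constant reference: no initialization
        try {
            // loadClass() loads the class but does not initialize it
            PassiveUseDemo.class.getClassLoader()
                    .loadClass(PassiveUseDemo.class.getName() + "$Lazy");
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        int n = Child.num;                 // initializes Parent (declares num), not Child
        return LOG;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [Parent]
    }
}
```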

Extension

-XX:+TraceClassLoading: traces and prints class loading information


5. Process 4: Using the class

Any type must go through the three complete class loading steps of loading, linking, and initialization before it can be used. Once a type has successfully gone through these three steps, it is ready for developers to use.

A developer can access and invoke its static class member information (e.g., static fields, static methods) within a program, or create an object instance for it using the new keyword.


6. Process 5: Unloading the class

6.1. Reference relationships among classes, class loaders, and class instances

In the internal implementation of a class loader, a Java collection holds references to the classes it has loaded. Conversely, a Class object always references its class loader, which can be obtained by calling the Class object's getClassLoader() method. Thus there is a bidirectional association between the Class instance representing a class and its class loader.

An instance of a class always references the Class object that represents the class. The getClass() method, defined in the Object class, returns a reference to the Class object representing the class the object belongs to. In addition, every Java class has a static attribute class that refers to the Class object representing the class.
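
These reference relationships can be walked in code. The following is a minimal sketch with hypothetical names (ClassReferenceDemo, Sample); it only uses Object.getClass, the class literal, Class.getClassLoader, and ClassLoader.loadClass.

```java
class ClassReferenceDemo {
    /** Hypothetical class used only to follow the references. */
    static class Sample { }

    /** Returns true if every path leads to the same Class object. */
    static boolean allViewsAgree() {
        Sample obj = new Sample();
        Class<?> viaInstance = obj.getClass();             // instance -> Class object
        Class<?> viaLiteral = Sample.class;                // static 'class' attribute
        ClassLoader loader = viaInstance.getClassLoader(); // Class object -> class loader
        try {
            // class loader -> Class object: the loader returns the class it already loaded
            Class<?> viaLoader = loader.loadClass(Sample.class.getName());
            return viaInstance == viaLiteral && viaLiteral == viaLoader;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("all references agree: " + allViewsAgree());
    }
}
```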

6.2. Life cycle of a class

Once the Sample class has been loaded, linked, and initialized, its life cycle begins. When the Class object representing the Sample class is no longer referenced, that is, unreachable, the Class object ends its life cycle and the data of the Sample class in the method area is unloaded, which ends the life cycle of the Sample class.

When a class ends its life cycle depends on when the Class object representing it ends its life cycle.

6.3. Specific examples

The loader1 and obj variables indirectly reference the Class object representing the Sample class, while the objClass variable references it directly.

If, while the program is running, the three reference variables on the left of the figure are set to null, the Sample object ends its life cycle, the MyClassLoader object ends its life cycle, the Class object representing the Sample class also ends its life cycle, and the binary data of the Sample class in the method area is unloaded.

When the class is needed again, the system checks whether the Class object of the Sample class exists. If it does, it is used directly without reloading. If it does not, the Sample class is loaded again, and a new Class instance representing the Sample class is created in the Java virtual machine's heap (you can compare hash codes to check whether it is the same instance).
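
The scenario can be approximated with the sketch below. The names loader1, objClass, Sample, and MyClassLoader come from the text; the loader's implementation, UnloadingDemo, and loadSample are assumptions made for illustration, and the example assumes Java 9 or later with the compiled .class files readable from the classpath. System.gc() is only a hint, so whether the class is really unloaded in a given run is not guaranteed.

```java
import java.io.IOException;
import java.io.InputStream;
import java.lang.ref.WeakReference;

class UnloadingDemo {
    static class Sample { }

    /** Hypothetical loader that defines Sample itself instead of delegating. */
    static class MyClassLoader extends ClassLoader {
        Class<?> loadSample() {
            String name = Sample.class.getName();
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in = UnloadingDemo.class.getClassLoader().getResourceAsStream(resource)) {
                if (in == null) {
                    throw new IllegalStateException("not found: " + resource);
                }
                byte[] bytes = in.readAllBytes();
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    /** Loading the class again through a new loader yields a new Class object. */
    static boolean reloadGivesNewClassObject() {
        Class<?> first = new MyClassLoader().loadSample();
        Class<?> second = new MyClassLoader().loadSample();
        return first != second && first.getName().equals(second.getName());
    }

    public static void main(String[] args) {
        MyClassLoader loader1 = new MyClassLoader();
        Class<?> objClass = loader1.loadSample();
        WeakReference<Class<?>> tracker = new WeakReference<>(objClass);
        loader1 = null;   // drop every strong reference
        objClass = null;
        System.gc();      // a request only; collection is not guaranteed
        System.out.println("Class object collected? " + (tracker.get() == null));
        System.out.println("reload gives new Class object? " + reloadGivesNewClassObject());
    }
}
```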

6.4. Unloading of classes

(1) Types loaded by the bootstrap class loader cannot be unloaded at any point during the run (per the JVM and JLS specifications)

(2) Types loaded by the system class loader and the extension class loader are unlikely to be unloaded during the run, because the system class loader instance or extension class loader instance can almost always be reached directly or indirectly throughout the run, so the chance of it becoming unreachable is minimal.

(3) Types loaded by a developer-defined class loader instance can be unloaded only in very simple contexts, and usually only with the help of a forced garbage collection call on the virtual machine. In even slightly more complex scenarios (for example, developers often cache custom class loader instances to improve system performance), loaded types are almost never unloaded at run time (at least, when it happens is indeterminate).

Taking these three points together, the chance that a loaded type is unloaded at run time is very low, or at least the timing is indeterminate. Developers should therefore make no assumptions about the virtual machine unloading types when writing code that implements specific functionality in a system.

Review: Method area garbage collection

The method area garbage collection collects two main parts: obsolete constants in the constant pool and types that are no longer used.

The HotSpot virtual machine has a very clear policy for reclaiming constants from the constant pool: a constant can be reclaimed as long as it is not referenced anywhere.

Determining whether a constant is "obsolete" is relatively easy, while the conditions for deciding that a type is "no longer used" are much stricter. The following three conditions must all be met:

  • All instances of the class have been reclaimed. That is, the Java heap contains no instance of the class or of any of its derived subclasses.

  • The class loader that loaded the class has been reclaimed. This condition is usually hard to meet except in carefully designed scenarios with replaceable class loaders, such as OSGi and JSP reloading.

  • The java.lang.Class object corresponding to the class is not referenced anywhere, and the methods of the class cannot be accessed anywhere through reflection.

The Java virtual machine is allowed to reclaim useless classes that meet the above three conditions. This is only "allowed": unlike objects, such classes are not necessarily reclaimed once they are no longer referenced.

