Many developers shy away from the Java class loading mechanism because it seems hard to understand. But anyone who wants to be a strong Java engineer should study it: it is central to how Java works and will help you in your future work. This article is worth reading carefully.
Contents of Study:
Definition of Java class loading mechanism
The cycle and timing of class loading
Conditions that trigger class loading
The specific process of class loading
First, the definition of the Java class loading mechanism
The virtual machine loads the data that describes a class from the Class file into memory, then verifies, transforms, resolves, and initializes that data to form a Java type that the virtual machine can use directly. This is the class loading mechanism. In the Java language, types are loaded, linked, and initialized at run time. This strategy adds a slight performance overhead to class loading, but it gives Java applications a high degree of flexibility. Java’s dynamic extensibility relies on this run-time dynamic loading and dynamic linking.
Second, the cycle and timing of class loading
The life cycle of a class, from being loaded into virtual machine memory until it is unloaded, consists of seven phases: Loading, Verification, Preparation, Resolution, Initialization, Using, and Unloading, as shown in the following illustration.
Note: Verification, Preparation, and Resolution are collectively referred to as Linking.
The order of the loading, verification, preparation, initialization, and unloading phases is fixed, while that of the resolution phase is not: in some cases resolution can start after initialization, in order to support run-time binding in the Java language. Loading, verification, preparation, resolution, and initialization are the steps of the class loading mechanism. Note that “loading” here is a single phase and is not the same as class loading as a whole.
The Java Virtual Machine Specification does not dictate when Loading must happen. However, it strictly requires that a class that has not yet been initialized be initialized in the following five cases, and only in these five cases. Loading and linking naturally occur before initialization.
The actions in the following five scenarios are called active references to a class. All other ways of referring to a class do not trigger its initialization and are called passive references.
1. When the virtual machine encounters one of four bytecode instructions: new (instantiate an object), getstatic (read a static field other than a compile-time constant), putstatic (set a static field other than a compile-time constant), or invokestatic (invoke a static method of the class).
2. When a reflective call is made on the class using methods of the java.lang.reflect package.
3. When a class is initialized and its parent class has not been initialized, the parent class is initialized first (interfaces are the exception: a parent interface is initialized only when it is actually used).
4. When the VM starts, it initializes the main class specified by the user (the class containing the main method).
5. When JDK 1.7 dynamic language support is used: if the final resolution result of a java.lang.invoke.MethodHandle instance is a REF_getStatic, REF_putStatic, or REF_invokeStatic method handle, the class corresponding to that method handle needs to be initialized.
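The difference between active and passive references can be observed with a small experiment. The following is a minimal sketch, not taken from the original article: the class names (InitLog, Parent, Child, ConstHolder, ActivePassiveDemo) are made up for illustration, and each static initializer simply records that it ran.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that records which classes were initialized.
class InitLog {
    static final List<String> EVENTS = new ArrayList<>();
}

class Parent {
    static int value = 42; // ordinary static field, read with getstatic
    static { InitLog.EVENTS.add("Parent initialized"); }
}

class Child extends Parent {
    static { InitLog.EVENTS.add("Child initialized"); }
}

class ConstHolder {
    static final String CONSTANT = "hello"; // compile-time constant
    static { InitLog.EVENTS.add("ConstHolder initialized"); }
}

class ActivePassiveDemo {
    static List<String> run() {
        // Passive: the constant is copied into this class at compile time,
        // so ConstHolder is not initialized.
        String constant = ConstHolder.CONSTANT;
        // Passive: creating an array of a type does not initialize that type.
        ConstHolder[] array = new ConstHolder[2];
        // Active reference to Parent only: the field is declared in Parent,
        // so reading it through Child does not initialize Child.
        int value = Child.value;
        return new ArrayList<>(InitLog.EVENTS);
    }

    public static void main(String[] args) {
        run().forEach(System.out::println); // prints only "Parent initialized"
    }
}
```

Only Parent appears in the log: reading a static field initializes the class that declares the field, while the constant access and the array creation are passive references.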
Third, the conditions that trigger class loading
When must the first phase of the class loading process, loading, begin? The Java Virtual Machine Specification imposes no constraint here and leaves the decision to the virtual machine implementation. For the initialization phase, however, the specification states that there are only five cases in which a class must be initialized immediately, and loading, verification, and preparation naturally need to begin before that. The five cases are as follows:
1. When one of four bytecode instructions (new, getstatic, putstatic, or invokestatic) is encountered and the class has not been initialized, its initialization must be triggered first. The most common Java code that produces these four instructions is: instantiating an object with the new keyword, reading or setting a static field of a class (except static fields that are final and whose value was placed in the constant pool at compile time), and calling a static method of a class.
2. When a reflective call is made on a class through reflection methods in the Java API, such as those in Class or in the java.lang.reflect package, and the class has not been initialized, its initialization must be triggered first.
3. When a class is initialized and its parent class has not been initialized, the parent class is initialized first.
4. When the VM starts, the user needs to specify a main class to execute (the one containing the main() method). The VM initializes this main class first.
5. When JDK 1.7 dynamic language support is used: if the final resolution result of a java.lang.invoke.MethodHandle instance is a REF_getStatic, REF_putStatic, or REF_invokeStatic method handle, and the class corresponding to that method handle has not been initialized, it is initialized first.
The virtual machine specification uses a very strong qualifier for these five scenarios that trigger class initialization: “if and only if”. The actions in these five scenarios are called active references to a class. All other ways of referring to a class do not trigger initialization and are called passive references.
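The reflection case (case 2) can be sketched with Class.forName, whose three-argument overload takes an explicit initialize flag. This is an illustrative sketch only: the classes InitFlags, Lazy, and ReflectionInitDemo are hypothetical names introduced here, not part of the original article.

```java
// Hypothetical flag holder, kept separate so that reading the flag
// does not itself touch the class under test.
class InitFlags {
    static boolean lazyInitialized = false;
}

class Lazy {
    static { InitFlags.lazyInitialized = true; } // runs during initialization
}

class ReflectionInitDemo {
    static boolean[] run() {
        try {
            // A class literal gives access to the Class object without initializing Lazy.
            String name = Lazy.class.getName();
            ClassLoader loader = Lazy.class.getClassLoader();

            // initialize = false: the class is loaded, but its static initializer does not run.
            Class.forName(name, false, loader);
            boolean afterLoadOnly = InitFlags.lazyInitialized;

            // initialize = true: initialization is triggered if it has not happened yet.
            Class.forName(name, true, loader);
            boolean afterInit = InitFlags.lazyInitialized;

            return new boolean[] {afterLoadOnly, afterInit};
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = run();
        System.out.println("initialized after forName(name, false, loader): " + result[0]);
        System.out.println("initialized after forName(name, true, loader):  " + result[1]);
    }
}
```

On a first run the flag is false after the load-only call and true after the initializing call, which matches the rule that loading may happen at any time while initialization needs one of the listed triggers.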
Fourth, the specific process of class loading mechanism
Loading is one phase of the class loading process, and the two concepts must not be confused. During the loading phase, the virtual machine needs to do the following three things:
(1) Get the binary byte stream that defines a class by its fully qualified name.
(2) Convert the static storage structure represented by this byte stream into the runtime data structure of the method area.
(3) Generate a java.lang.Class object in memory that represents the class. In other words, when any class is used in a program, the system creates a java.lang.Class object for it, which serves as the access point to the class’s data in the method area.
Using different class loaders, you can load the binary data of a class from different sources, usually the following:
(1) Load the class file from the local file system;
(2) Extract a Java class file from a ZIP, JAR, CAB, or other archive file. For example, the database drivers used in JDBC programming are packaged in JAR files, and the JVM can load class files directly from a JAR;
(3) Load class files over the network; the most typical application of this scenario is the Applet;
(4) Generate the class at run time, for example by dynamically compiling a Java source file and then loading it. The most common use of run-time generation is the dynamic proxy: java.lang.reflect.Proxy uses ProxyGenerator.generateProxyClass to generate the binary byte stream of a proxy class named in the form “*$Proxy” for a particular interface.
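As a rough illustration of a class whose byte stream is generated at run time, the sketch below creates a dynamic proxy through the public java.lang.reflect.Proxy API and prints the name of the generated class. The Greeter interface and the ProxyDemo class are hypothetical names used only for this example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Hypothetical interface to be implemented by the generated proxy class.
interface Greeter {
    String greet(String name);
}

class ProxyDemo {
    static Greeter newGreeter() {
        // Every call on the proxy is routed to this handler.
        InvocationHandler handler = (proxy, method, args) -> "Hello, " + args[0];
        // The proxy class does not exist on disk; it is generated and defined at run time.
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] {Greeter.class},
                handler);
    }

    public static void main(String[] args) {
        Greeter greeter = newGreeter();
        System.out.println(greeter.greet("JVM"));
        // The exact name depends on the JDK version, but it contains "$Proxy".
        System.out.println(greeter.getClass().getName());
    }
}
```

The class printed on the second line has no source file and no Class file on the class path, which is what makes it a useful example of loading from a computed byte stream.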
Linking of a class
After a class is loaded, the system generates a corresponding Class object for it and then enters the linking phase, which is responsible for merging the binary data of the class into the JRE. Linking is divided into the following three stages:
Verification: The verification phase checks that the loaded class has a correct internal structure and is consistent with other classes.
Preparation: The preparation phase allocates memory for the static fields of the class and sets their default initial values;
Resolution: Replaces the symbolic references in the class’s binary data with direct references (a symbolic reference describes the referenced target with a set of symbols; a direct reference is a pointer to the target).
Verification
Verification is the first step of the linking phase. Its purpose is to ensure that the information contained in the byte stream of the Class file meets the requirements of the current virtual machine and does not compromise the security of the virtual machine.
The Java language itself is relatively safe, but as mentioned earlier, Class files do not have to be compiled from Java source. You can produce a Class file by any means, including writing it directly with a hex editor. At the bytecode level, things that Java code cannot do are possible, or at least semantically expressible. A virtual machine that does not check the input byte stream and trusts it completely could crash the system by loading a harmful byte stream, so verification is an important part of how the virtual machine protects itself. Overall, the verification phase consists of four stages: file format verification, metadata verification, bytecode verification, and symbolic reference verification.
1. File format verification
The first step is to verify that the byte stream complies with the Class file format specification and can be processed by the current version of the virtual machine. This phase may include the following verification points:
(1) Whether the file starts with the magic number 0xCAFEBABE.
(2) Whether the major and minor version numbers are within the range that the current VM can handle.
(3) Whether the constant pool contains constants of an unsupported type (checked via the constant tag).
(4) Whether any index into the constant pool points to a constant that does not exist or is not of the expected type.
(5) Whether any constant of type CONSTANT_Utf8_info contains data that is not valid UTF-8.
(6) Whether any information has been deleted from or added to the parts of the Class file or the file itself.
In fact, the first stage has many more verification points than these; the ones above are just a small excerpt from the HotSpot virtual machine source code. Only after passing this stage is the byte stream stored in the method area in memory, so the following three verification stages all operate on the storage structure in the method area and no longer operate directly on the byte stream.
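The first two checks, the magic number and the version range, can be sketched in a few lines. This is only an illustration of the idea, not how HotSpot implements it: the class ClassHeaderCheck, its describe method, and the maxSupportedMajor parameter are assumptions made for this example. It relies on the Class file layout, which starts with a 4-byte magic number followed by a 2-byte minor version and a 2-byte major version.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

class ClassHeaderCheck {
    static final int MAGIC = 0xCAFEBABE;

    // Returns "major.minor" if the header looks valid, otherwise throws.
    // maxSupportedMajor is a hypothetical parameter standing in for the VM's own limit.
    static String describe(byte[] classBytes, int maxSupportedMajor) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            int magic = in.readInt(); // first 4 bytes
            if (magic != MAGIC) {
                throw new ClassFormatError("bad magic number: 0x" + Integer.toHexString(magic));
            }
            int minor = in.readUnsignedShort(); // next 2 bytes: minor version
            int major = in.readUnsignedShort(); // next 2 bytes: major version
            if (major > maxSupportedMajor) {
                throw new UnsupportedClassVersionError("unsupported major version " + major);
            }
            return major + "." + minor;
        } catch (IOException e) {
            throw new ClassFormatError("truncated class file header");
        }
    }

    public static void main(String[] args) {
        // Hand-written header: magic number, minor version 0, major version 52 (Java 8).
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        System.out.println(describe(header, 52)); // prints 52.0
    }
}
```

A real verifier goes on to check the constant pool and the rest of the file; the sketch stops after the eight header bytes.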
2. Metadata validation
The second stage performs semantic analysis of the information described by the bytecode to ensure that it conforms to the requirements of the Java Language Specification. Verification points in this stage may include the following:
(1) Whether this class has a parent class (all classes except java.lang.Object should have one).
(2) Whether the parent of this class is a class that is not allowed to be inherited (a class modified by final).
(3) If the class is not abstract, whether it implements all the methods required by its parent class or interfaces.
(4) Whether the fields and methods of the class conflict with those of the parent class (for example, a final field of the parent class is overridden, or a method overload breaks the rules, such as methods with identical parameters but different return types).
The second stage also has far more verification points than these. Its main purpose is to verify the semantics of the class metadata and to ensure that no metadata violates the Java Language Specification.
3. Bytecode verification
The third stage is the most complex one in the whole verification process. Its main purpose is to determine, through data-flow and control-flow analysis, that the program semantics are legal and logical. After the second stage has verified the data types in the metadata, this stage verifies and analyzes the method bodies of the class to ensure that the methods of the verified class will not endanger VM security at run time. For example:
(1) Ensure that the data types on the operand stack and the instruction sequence work together at every point. For example, it must never happen that an int is placed on the operand stack but is then loaded into the local variable table as a long.
(2) Ensure that jump instructions do not jump to bytecode instructions outside the method body.
(3) Ensure that type conversions in the method body are valid. For example, assigning a subclass object to a variable of its parent class type is safe, but assigning a parent class object to a subclass type, or even assigning an object to a completely unrelated type with no inheritance relationship, is dangerous and illegal.
Even if a method body passes bytecode verification, that does not necessarily mean it is safe.
4. Symbolic reference verification
The final stage of verification occurs when the virtual machine converts symbolic references into direct references, which takes place during the resolution phase, the third stage of linking. Symbolic reference verification can be regarded as checking that information outside the class itself (the various symbolic references in the constant pool) matches, and usually requires verifying the following:
(1) Whether the class corresponding to the fully qualified name given as a string in the symbolic reference can be found.
(2) Whether the specified class contains a method or field that matches the given simple name and descriptor.
(3) Whether the accessibility (private, protected, public, default) of the classes, fields, and methods in the symbolic reference allows access from the current class.
The purpose of symbolic reference verification is to ensure that resolution can proceed normally. If symbolic reference verification fails, a subclass of java.lang.IncompatibleClassChangeError is thrown, such as java.lang.IllegalAccessError, java.lang.NoSuchFieldError, or java.lang.NoSuchMethodError.
For the virtual machine’s class loading mechanism, the verification phase is very important but not strictly necessary (because it has no effect on the program at run time). If all the code you run (both your own code and code in third-party packages) has already been used and verified repeatedly, you can consider using the -Xverify:none parameter at deployment time to turn off most class verification and shorten the virtual machine’s class loading time.
Preparation
The preparation phase is where memory is formally allocated for class variables and their initial values are set. The memory used by these variables is allocated in the method area. Two easily confused points need to be emphasized here. First, only class variables (variables modified by static) are allocated at this time, not instance variables, which are allocated in the Java heap along with the object when it is instantiated. Second, the initial value mentioned here is “normally” the zero value of the data type.
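The zero value set during preparation can be made visible by reading a static field before its own initializer has run. The sketch below is illustrative only: PreparationDemo and its members are hypothetical names. It also assumes the usual exception behind the word “normally”, namely that a static final field with a compile-time constant value is not subject to the zero-value rule.

```java
class PreparationDemo {
    // Static initializers run in textual order, so these two lines run
    // before 'value' has been assigned 123.
    static int observedBeforeAssignment = peekValue();
    static int constantSeenEarly = peekConstant();

    static int value = 123;          // zero after preparation, 123 only after initialization
    static final int CONSTANT = 456; // compile-time constant, never observed as zero

    static int peekValue() {
        return value;
    }

    static int peekConstant() {
        return CONSTANT;
    }

    public static void main(String[] args) {
        System.out.println(observedBeforeAssignment); // prints 0
        System.out.println(constantSeenEarly);        // prints 456
        System.out.println(value);                    // prints 123
    }
}
```

The assignment of 123 is compiled into the class initializer and therefore happens in the initialization phase, not in preparation.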
Resolution
The resolution phase is the process in which the virtual machine converts symbolic references in the constant pool into direct references. What are symbolic references and direct references?
(1) Symbolic references: a symbolic reference describes the referenced target with a set of symbols. The symbols can be any form of literal, as long as they can be used to locate the target unambiguously. Symbolic references are independent of the memory layout implemented by the virtual machine, and the referenced target does not need to have been loaded into memory.
(2) Direct references: a direct reference can be a pointer directly to the target, a relative offset, or a handle that can indirectly locate the target. A direct reference is tied to the memory layout implemented by the virtual machine, so the direct reference translated from the same symbolic reference on different virtual machine instances will generally not be the same. If a direct reference exists, the target of the reference must already be in memory.
Resolution is performed mainly on seven kinds of symbolic references: class or interface, field, class method, interface method, method type, method handle, and call site specifier. These correspond to the seven constant pool types CONSTANT_Class_info, CONSTANT_Fieldref_info, CONSTANT_Methodref_info, CONSTANT_InterfaceMethodref_info, CONSTANT_MethodType_info, CONSTANT_MethodHandle_info, and CONSTANT_InvokeDynamic_info.
Summary: That is the class loading mechanism. There is one more important concept, the class loader, which I will cover in a later update. In the next post I will share interview questions on the class loading mechanism, all taken from big tech companies!
Beyond that, I will keep updating with everything you want to learn about Java.