Java is well known for platform independence, security, and network mobility. The Java platform, which consists of the Java virtual machine and the Java core classes, provides a uniform programming interface for pure Java programs regardless of the underlying operating system. The Java virtual machine is what makes the “compile once, run anywhere” promise possible.

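1.1 Java Program Execution Process

The execution of a Java program depends on both the compilation environment and the runtime environment. Source code is turned into executable machine code in two stages: the compiler translates Java source files into bytecode stored in class files, and the Java virtual machine then loads those class files and executes the bytecode.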

The core of Java technology is the Java virtual machine, because all Java programs run on it. To run, a Java program needs a Java virtual machine, the Java API, and its class files. Each Java virtual machine instance is responsible for running one Java program: when a Java program is launched, a virtual machine instance is created, and when the program ends, that instance dies.

Java is cross-platform because a virtual machine implementation exists for each supported platform.

1.2 Java Virtual Machine

The Java virtual machine (JVM) loads class files and executes the bytecode they contain. As the following figure shows, the Java virtual machine contains a class loader that loads class files from both the program and the Java API. Only the classes that the program actually needs are loaded from the Java API, and the bytecode is executed by the execution engine.

When a Java virtual machine is implemented in software on top of a host operating system, Java programs interact with the host by calling native methods. Java methods are written in the Java language, compiled into bytecode, and stored in class files. Native methods are written in another language such as C, C++, or assembly, compiled into processor-specific machine code, and stored in a dynamically linked library in a platform-specific format. Native methods are therefore the connection between a Java program and the underlying host operating system.
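The distinction between Java methods and native methods is visible through reflection. The following is a minimal sketch; the class name NativeMethodCheck and its helper are illustrative names, not part of any API. Which library methods are declared native depends on the JDK implementation, so the program only prints what it finds for Object.hashCode rather than assuming an answer.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class NativeMethodCheck {
    /** Returns true if the named public no-argument method is declared native. */
    static boolean isNative(Class<?> type, String methodName) {
        try {
            Method m = type.getMethod(methodName);
            return Modifier.isNative(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Implementation-dependent: many JDKs implement hashCode in native code.
        System.out.println("Object.hashCode native? " + isNative(Object.class, "hashCode"));
        // toString is an ordinary Java method compiled to bytecode.
        System.out.println("Object.toString native? " + isNative(Object.class, "toString"));
    }
}
```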

Because the Java virtual machine cannot know how a class file was produced or whether it has been tampered with, it implements a class file verifier to ensure that the types defined in the class file can be used safely. The class file verifier ensures the robustness of the program through four separate passes:

- Class file structure check
- Type data semantic check
- Bytecode verification
- Symbolic reference verification

The Java virtual machine also performs other built-in safety checks while executing bytecode. These are features of the Java programming language that make Java programs robust, and they are also features of the Java virtual machine:

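- Type-safe reference casting
- Structured memory access
- Automatic garbage collection
- Array bounds checking
- Null reference checking

1.3 Java Virtual Machine Data Types

The Java virtual machine performs computations on a fixed set of data types. Data types fall into two groups, primitive types and reference types, as shown below:

- Primitive types: byte, short, int, long, char, float, double, boolean, and returnAddress
- Reference types: class types, interface types, and array types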

The boolean type is a special case. When the compiler translates Java source into bytecode, it represents boolean values as int or byte. In the Java virtual machine, false is represented by 0 and true by 1, and the conditional branch instructions treat any non-zero integer as true. As in the Java language, the value range of each Java virtual machine primitive type is the same everywhere, regardless of the host platform: a long is always a 64-bit signed two’s-complement integer in any virtual machine.
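The fixed width of the primitive types can be checked from Java code. The sketch below uses only constants from java.lang; the class name PrimitiveRanges is an illustrative choice.

```java
class PrimitiveRanges {
    /** Width in bits of the long type, identical on every platform. */
    static int longBits() {
        return Long.SIZE;
    }

    /** In two's complement, negating the minimum value overflows back to itself. */
    static boolean minValueNegatesToItself() {
        return -Long.MIN_VALUE == Long.MIN_VALUE;
    }

    public static void main(String[] args) {
        System.out.println("long bits: " + longBits());
        System.out.println("long range: " + Long.MIN_VALUE + " .. " + Long.MAX_VALUE);
        System.out.println("int bits: " + Integer.SIZE + ", char bits: " + Character.SIZE);
        System.out.println("-MIN_VALUE == MIN_VALUE: " + minValueNegatesToItself());
    }
}
```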

The returnAddress primitive type is used to implement finally clauses in Java programs. It is not available to Java programmers; its value is a pointer to the opcode of a virtual machine instruction.

2 Architecture

In the Java Virtual Machine specification, the behavior of a virtual machine instance is described in terms of subsystems, memory areas, data types, and instructions, which together represent the abstract internal architecture of a virtual machine.

2.1 Class Files

A Java class file contains all the information about one class or interface. The “basic types” used in class files are as follows:

- u1: 1 byte, unsigned
- u2: 2 bytes, unsigned
- u4: 4 bytes, unsigned
- u8: 8 bytes, unsigned

(The Java® Virtual Machine Specification)
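As an illustration of these unsigned types, the sketch below reads the first three fields of a class file: a u4 magic number followed by a u2 minor version and a u2 major version. It assumes the class file of the given type can be opened as a resource, which holds for JDK classes such as java.lang.String on common runtimes; the class name ClassFileHeader is illustrative.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassFileHeader {
    /** Reads {magic, minor_version, major_version} from the class file of the given type. */
    static long[] read(Class<?> type) {
        String resource = type.getSimpleName() + ".class";
        try (InputStream raw = type.getResourceAsStream(resource)) {
            if (raw == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            DataInputStream in = new DataInputStream(raw);
            long magic = in.readInt() & 0xFFFFFFFFL; // u4, kept unsigned in a long
            int minor = in.readUnsignedShort();      // u2
            int major = in.readUnsignedShort();      // u2
            return new long[] { magic, minor, major };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] header = read(String.class);
        System.out.println("magic: 0x" + Long.toHexString(header[0]).toUpperCase());
        System.out.println("version: " + header[2] + "." + header[1]);
    }
}
```

DataInputStream reads big-endian values, which matches the byte order of class files.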

The class file contains:

ClassFile {
    u4 magic;                                      // magic number 0xCAFEBABE, identifies a Java class file
    u2 minor_version;                              // minor version number
    u2 major_version;                              // major version number
    u2 constant_pool_count;                        // number of constant pool entries plus one
    cp_info constant_pool[constant_pool_count-1];  // constant pool
    u2 access_flags;                               // class- or interface-level access flags, combined with the | operator
    u2 this_class;                                 // class index (points to a class constant in the constant pool)
    u2 super_class;                                // superclass index (points to a class constant in the constant pool)
    u2 interfaces_count;                           // number of interfaces
    u2 interfaces[interfaces_count];               // interface table
    u2 fields_count;                               // number of fields
    field_info fields[fields_count];               // field table
    u2 methods_count;                              // number of methods
    method_info methods[methods_count];            // method table
    u2 attributes_count;                           // number of attributes
    attribute_info attributes[attributes_count];   // attribute table
}

2.2 Class Loader Subsystem

The class loader subsystem is responsible for finding and loading type information. The Java virtual machine has two kinds of class loaders: system class loaders and user-defined class loaders. The former are part of the Java virtual machine implementation, while the latter are part of the Java program.

- Bootstrap class loader: loads the Java core libraries. It is implemented in native code and does not inherit from java.lang.ClassLoader.
- Extension class loader: loads the Java extension libraries. The Java virtual machine implementation provides an extension library directory, and this class loader finds and loads Java classes from that directory.
- Application class loader: loads Java classes from the CLASSPATH of the Java application. In general, the classes of a Java application are loaded by it. It can be obtained through ClassLoader.getSystemClassLoader().

In addition to the class loaders provided by the system, developers can implement their own class loaders by extending java.lang.ClassLoader to meet special needs.
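The loader hierarchy can be inspected at run time. The following sketch walks from a type's class loader up through its parents; the class name LoaderChain is illustrative. Because the bootstrap class loader is not a java.lang.ClassLoader instance, it shows up as null and is reported here under the label "bootstrap". The names of the intermediate loaders depend on the Java version (from Java 9 onward a platform class loader takes the place of the extension class loader), so the program prints them rather than assuming them.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChain {
    /** Lists the loader of a type followed by its parents, ending with the bootstrap loader. */
    static List<String> chainOf(Class<?> type) {
        List<String> chain = new ArrayList<>();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            chain.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        chain.add("bootstrap"); // represented by null in the Java API
        return chain;
    }

    public static void main(String[] args) {
        System.out.println("LoaderChain: " + chainOf(LoaderChain.class));
        System.out.println("String:      " + chainOf(String.class));
    }
}
```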

The class loader subsystem involves several other components of the Java virtual machine as well as classes from the java.lang library. The methods defined by ClassLoader give programs an interface to the class loading mechanism. In addition, for each loaded type the Java virtual machine creates an instance of java.lang.Class to represent that type. Like other objects, user-defined class loaders and instances of Class are placed in the heap, while the loaded type information is located in the method area.
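That each loaded type is represented by a single java.lang.Class instance can be observed directly. This is a small sketch with illustrative names (ClassObjectDemo and its helpers); it only uses getClass(), getName(), and getSuperclass().

```java
class ClassObjectDemo {
    /** True when both objects are represented by the very same Class instance. */
    static boolean sameClassObject(Object a, Object b) {
        return a.getClass() == b.getClass();
    }

    /** Describes an object's type using information held by its Class instance. */
    static String describe(Object o) {
        Class<?> type = o.getClass();
        return type.getName() + " extends " + type.getSuperclass().getName();
    }

    public static void main(String[] args) {
        Integer x = 1;
        Integer y = 2;
        System.out.println(describe(x));
        System.out.println("same Class instance: " + sameClassObject(x, y));
    }
}
```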

In addition to locating and importing binary class files, the class loader subsystem is responsible for verifying the correctness of imported classes, allocating and initializing memory for class variables, and resolving symbolic references. These actions must be carried out in the following order:

- Loading: find and load the binary data of the type
- Linking: perform verification, preparation, and resolution
  - Verification: ensure the correctness of the imported type
  - Preparation: allocate memory for class variables and initialize them to default values
  - Resolution: convert symbolic references in the type into direct references
- Initialization: initialize class variables to their proper starting values

2.3 Method Area

In the Java virtual machine, information about loaded types is stored in a logical memory area called the method area. When the virtual machine loads a type, it uses a class loader to locate the corresponding class file, then reads the class file and passes its binary data to the virtual machine, which extracts the type information and stores it in the method area. The method area can also be garbage collected: because the virtual machine allows Java programs to be extended dynamically through user-defined class loaders, a loaded class can later become unreferenced.
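Loading a type is a separate step from initializing it, and the difference can be observed from Java code. The sketch below is a minimal illustration with hypothetical names (InitOrderDemo, Lazy): it loads a nested class with Class.forName and the initialize flag set to false, then forces initialization by reading a static field.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderDemo {
    static final List<String> events = new ArrayList<>();

    /** Hypothetical type whose static initializer records when initialization happens. */
    static class Lazy {
        static int value = 42;
        static { events.add("Lazy initialized"); }
    }

    /** Loads Lazy without initializing it, then reports whether its static initializer has run. */
    static boolean initializedAfterLoadOnly() {
        try {
            Class.forName(InitOrderDemo.class.getName() + "$Lazy", false,
                          InitOrderDemo.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return !events.isEmpty();
    }

    /** Reads a static field, which forces initialization, then reports the same flag. */
    static boolean initializedAfterFirstUse() {
        int ignored = Lazy.value;
        return !events.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println("after load only: " + initializedAfterLoadOnly());
        System.out.println("after first use: " + initializedAfterFirstUse());
    }
}
```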

The following information is stored in the method area:

- The fully qualified name of the type (for example, java.lang.Object)
- The fully qualified name of the type’s direct superclass
- Whether the type is a class or an interface
- The access modifiers of the type (public, abstract, and so on)
- An ordered list of the fully qualified names of any direct superinterfaces
- The constant pool of the type (an ordered set that includes literal constants [string, integer, and floating-point constants] and symbolic references to other types, fields, and methods)
- Field information (field name, type, modifiers)
- Method information (method name, return type, number and types of parameters, modifiers)
- All class (static) variables except constants
- A reference to the ClassLoader (as each type is loaded, the virtual machine must keep track of whether it was loaded by the bootstrap class loader or by a user-defined class loader)
- A reference to the Class instance (for each loaded type, the virtual machine creates a corresponding instance of java.lang.Class; for example, if you have a reference to an object of the java.lang.Integer class, calling getClass() on that reference returns the Class instance that represents its type)

2.4 Heap

All class instances and arrays (an array is a true object in the Java virtual machine) that a Java program creates at run time are placed in the same heap. Because a Java virtual machine instance has only one heap, all threads share it. Note that the Java virtual machine has an instruction that allocates objects on the heap but no instruction that frees memory, because the virtual machine delegates that task to the garbage collector. The Java Virtual Machine specification does not mandate a particular garbage collection technique; it requires only that virtual machine implementations manage their heap space “somehow.” For example, an implementation might have a fixed-size heap and simply throw an OutOfMemoryError when the heap is full, without ever collecting garbage objects, and still conform to the specification.

The Java Virtual Machine specification does not specify how Java objects are represented in the heap, leaving it up to the implementer of the virtual machine to decide how to design them. A possible heap design is as follows:

One design divides the heap into a handle pool and an object pool. An object reference is a native pointer to an entry in the handle pool, and that entry in turn points to the object’s data in the object pool. The advantage of this design is that it makes heap defragmentation easier: when an object is moved within the object pool, only the pointer in its handle has to be updated to the new address. The disadvantage is that every access to an instance variable of an object requires two pointer dereferences.
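The handle indirection can be modeled in a few lines. The following is a toy simulation under stated assumptions, not how any real virtual machine lays out its heap: the class HandlePoolSketch, its arrays, and its methods are all hypothetical, handles are plain int indexes, and handle slots are never recycled. It shows that compaction moves objects while the handles held by the program stay valid.

```java
import java.util.Arrays;

class HandlePoolSketch {
    private final Object[] objectPool; // simulated object storage
    private final int[] handles;       // handle -> index into objectPool, -1 if freed or unused
    private int handleCount = 0;
    private int nextFree = 0;          // next free slot at the end of the object pool

    HandlePoolSketch(int capacity) {
        objectPool = new Object[capacity];
        handles = new int[capacity];
        Arrays.fill(handles, -1);
    }

    /** Stores an object and returns its handle (the "reference" a program would hold). */
    int allocate(Object value) {
        if (nextFree == objectPool.length || handleCount == handles.length) {
            throw new IllegalStateException("pool exhausted");
        }
        objectPool[nextFree] = value;
        handles[handleCount] = nextFree++;
        return handleCount++;
    }

    /** Two dereferences: handle -> slot, then slot -> object. */
    Object get(int handle) {
        return objectPool[handles[handle]];
    }

    void free(int handle) {
        objectPool[handles[handle]] = null;
        handles[handle] = -1;
    }

    /** Slides live objects to the front; only the handle table is updated. */
    void compact() {
        int[] slotToHandle = new int[objectPool.length];
        Arrays.fill(slotToHandle, -1);
        for (int h = 0; h < handleCount; h++) {
            if (handles[h] >= 0) slotToHandle[handles[h]] = h;
        }
        int dest = 0;
        for (int src = 0; src < nextFree; src++) {
            int h = slotToHandle[src];
            if (h < 0) continue;               // freed slot, nothing to move
            objectPool[dest] = objectPool[src];
            if (dest != src) objectPool[src] = null;
            handles[h] = dest++;
        }
        nextFree = dest;
    }

    int slotOf(int handle) {
        return handles[handle];
    }
}
```

After allocating three objects, freeing the second, and calling compact(), the third object's handle still resolves to the same object even though its slot has changed.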

2.5 Java Stack

Each time a thread is started, the Java virtual machine assigns it a Java stack. A Java stack consists of stack frames, each of which holds the state of one Java method invocation. When a thread calls a Java method, the virtual machine pushes a new stack frame onto that thread’s Java stack. When the method returns, the stack frame is popped off the Java stack. The Java stack stores the state of the thread’s Java method invocations, including local variables, parameters, return values, and intermediate results of operations. The Java virtual machine has no general-purpose registers; its instruction set uses the Java stack to hold intermediate data. This design keeps the instruction set as compact as possible and makes the virtual machine easier to implement on platforms with few general-purpose registers. The stack-based architecture also facilitates the code optimization performed by the dynamic and just-in-time compilers that some virtual machines use at run time.
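The growth of a thread's Java stack with each method call can be observed through Thread.getStackTrace(). The sketch below uses illustrative names (StackDepthDemo and its methods) and assumes the virtual machine reports every frame in the stack trace, which the API does not strictly guarantee.

```java
class StackDepthDemo {
    /** Number of frames on the calling thread's Java stack, as reported by getStackTrace. */
    static int currentDepth() {
        return Thread.currentThread().getStackTrace().length;
    }

    /** Recurses n times, then measures the stack depth at the bottom of the recursion. */
    static int depthAfter(int n) {
        return n == 0 ? currentDepth() : depthAfter(n - 1);
    }

    /** Extra frames pushed by n nested calls. */
    static int framesAdded(int n) {
        return depthAfter(n) - depthAfter(0);
    }

    public static void main(String[] args) {
        System.out.println("frames added by 5 nested calls: " + framesAdded(5));
    }
}
```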

2.5.1 Stack frame

A stack frame consists of a local variable area, an operand stack, and a frame data area. When the virtual machine calls a Java method, it obtains the sizes of the method’s local variable area and operand stack from the type information of the corresponding class, allocates the stack frame accordingly, and pushes it onto the Java stack.

2.5.1.1 Local variable area

The local variable area is organized as a zero-based array of words. Bytecode instructions access its values through an index starting at 0. Values of type int, float, reference, and returnAddress occupy one entry in the array. Values of type byte, short, and char are converted to int before being stored and also occupy one entry. Values of type long and double occupy two consecutive entries.
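The entry rules above determine where each parameter of a method lands in the local variable area. The sketch below computes those indexes with reflection; LocalSlots, sample, and instanceSample are hypothetical names used only for illustration. It also relies on one fact from the Java Virtual Machine specification that is not stated above: for instance methods, entry 0 holds the reference to this.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class LocalSlots {
    // Hypothetical sample methods whose parameter layouts are inspected.
    static void sample(int a, long b, double c, Object d) { }
    void instanceSample(char a, long b) { }

    /** Local variable index of each parameter; long and double take two entries. */
    static int[] parameterSlots(Method m) {
        Class<?>[] types = m.getParameterTypes();
        int[] slots = new int[types.length];
        // Entry 0 holds "this" for instance methods, so their parameters start at 1.
        int next = Modifier.isStatic(m.getModifiers()) ? 0 : 1;
        for (int i = 0; i < types.length; i++) {
            slots[i] = next;
            next += (types[i] == long.class || types[i] == double.class) ? 2 : 1;
        }
        return slots;
    }

    /** Convenience lookup for the sample methods declared in this class. */
    static int[] slotsOf(String name, Class<?>... params) {
        try {
            return parameterSlots(LocalSlots.class.getDeclaredMethod(name, params));
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(
            slotsOf("sample", int.class, long.class, double.class, Object.class)));
        System.out.println(java.util.Arrays.toString(
            slotsOf("instanceSample", char.class, long.class)));
    }
}
```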

2.5.1.2 Operand stack

Like the local variable area, the operand stack is organized as an array of words. It is not accessed by index, however, but through the standard stack operations push and pop. Apart from the program counter, which program instructions cannot access directly, the Java virtual machine has no registers: its instructions take their operands from the operand stack, so it is stack-based rather than register-based. The virtual machine uses the operand stack as its workspace: most instructions pop data from it, operate on that data, and push the result back onto it.
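The interplay between the local variable area and the operand stack can be illustrated with a toy model. The sketch below is not a real interpreter: OperandStackDemo is a hypothetical class that recognizes only a handful of instruction mnemonics, written as strings, and applies them to an int array standing in for the local variables.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class OperandStackDemo {
    /** Runs a tiny mnemonic program against a local variable array and an operand stack. */
    static int[] run(int[] locals, String... program) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String insn : program) {
            switch (insn) {
                case "iload_0": stack.push(locals[0]); break;  // push local variable 0
                case "iload_1": stack.push(locals[1]); break;  // push local variable 1
                case "iadd": {                                  // pop two ints, push their sum
                    int b = stack.pop();
                    int a = stack.pop();
                    stack.push(a + b);
                    break;
                }
                case "istore_2": locals[2] = stack.pop(); break; // pop into local variable 2
                default:
                    throw new IllegalArgumentException("unsupported instruction: " + insn);
            }
        }
        return locals;
    }

    public static void main(String[] args) {
        int[] locals = run(new int[] {100, 98, 0}, "iload_0", "iload_1", "iadd", "istore_2");
        System.out.println("local variable 2 = " + locals[2]);
    }
}
```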

2.5.1.3 Frame data area

In addition to the local variable area and the operand stack, a Java stack frame needs a frame data area to support constant pool resolution, normal method return, and exception dispatch. Whenever the virtual machine executes an instruction that needs constant pool data, it reaches the constant pool through a pointer held in the frame data area. Beyond constant pool resolution, the frame data area helps the virtual machine handle the normal or abrupt completion of a Java method. If the method completes normally through a return, the virtual machine must restore the stack frame of the calling method, which includes setting the program counter to the instruction that follows the call in the calling method. If the method returns a value, the virtual machine pushes it onto the operand stack of the calling method. To handle abrupt completion caused by an exception, the frame data area also holds a reference to the method’s exception table.
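How an exception table is consulted can be sketched as a simple lookup. The classes below are hypothetical illustrations, not virtual machine data structures: each Entry covers a range of bytecode offsets, inclusive at the start and exclusive at the end as in the class file format, and names the handler to jump to for a given exception type. A result of -1 stands for "no handler in this method", in which case a real virtual machine would pop the frame and continue the search in the caller.

```java
class ExceptionTableSketch {
    /** One exception table entry: the protected range [startPc, endPc) and its handler. */
    static final class Entry {
        final int startPc;
        final int endPc;
        final int handlerPc;
        final Class<? extends Throwable> catchType;

        Entry(int startPc, int endPc, int handlerPc, Class<? extends Throwable> catchType) {
            this.startPc = startPc;
            this.endPc = endPc;
            this.handlerPc = handlerPc;
            this.catchType = catchType;
        }
    }

    /** Returns the handler offset of the first matching entry, or -1 if none matches. */
    static int findHandler(Entry[] table, int pc, Throwable thrown) {
        for (Entry e : table) {
            boolean inRange = pc >= e.startPc && pc < e.endPc;
            if (inRange && e.catchType.isInstance(thrown)) {
                return e.handlerPc;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Entry[] table = {
            new Entry(0, 8, 20, ArithmeticException.class),
            new Entry(0, 16, 30, RuntimeException.class)
        };
        System.out.println(findHandler(table, 4, new ArithmeticException()));
        System.out.println(findHandler(table, 10, new ArithmeticException()));
        System.out.println(findHandler(table, 4, new Exception()));
    }
}
```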

2.6 Program Counter

In a running Java program, each thread has its own program counter, also called the pc register. A program counter can hold either a native pointer or a returnAddress. While a thread executes a Java method, the value of the program counter is always the address of the next instruction to be executed. The address can be either a native pointer or an offset from the start of the method’s bytecode. If the thread is executing a native method, the value of the program counter is undefined.

2.7 Native Method Stack

Any native method interface uses some kind of native method stack. When a thread calls a Java method, the virtual machine creates a new stack frame and pushes it onto the Java stack. When the thread calls a native method, however, the virtual machine leaves the Java stack unchanged and does not push a new frame onto the thread’s Java stack; it simply links dynamically to the specified native method and invokes it directly.

The method area and heap are shared by all threads in the virtual machine instance. When the virtual machine loads a class file, it parses the type information from the binary data contained in the class file and places it in the method area. When the program runs, the virtual machine puts all objects created by the program at run time into the heap.

Like other run-time memory areas, the memory occupied by the native method stack can be dynamically expanded or contracted as needed.

3 Execution Engine

In the Java Virtual Machine specification, the behavior of the execution engine is defined in terms of an instruction set. The implementer of the execution engine decides how bytecode is actually executed: an implementation may interpret it, just-in-time compile it, execute it directly in hardware on a chip, or combine these techniques.

An execution engine can be understood as an abstract specification, a concrete implementation, or a running instance. The abstract specification defines the behavior of the execution engine in terms of the instruction set. Concrete implementations may use a variety of techniques, including software, hardware, or a combination of the two. As a run-time instance, an execution engine is a thread.

Each thread of a running Java program is a separate instance of the virtual machine’s execution engine. From the beginning to the end of its life, a thread is executing either bytecode or native methods.

3.1 Instruction set

The bytecode stream of a method is a sequence of Java virtual machine instructions. Each instruction consists of a one-byte opcode followed by zero or more operands. The opcode indicates the operation to be performed, and the operands supply the additional information the Java virtual machine needs to perform it. When the virtual machine executes an instruction, it may use an entry in the current constant pool, a value in a local variable of the current frame, or a value on top of the operand stack of the current frame.

The abstract execution engine executes bytecode one instruction at a time. Every thread (execution engine instance) of a program running in the Java virtual machine does this. The execution engine fetches an opcode and, if the opcode has operands, fetches them as well. It performs the action specified by the opcode and its operands, then fetches the next opcode. Bytecode execution continues until the thread completes, either by returning from its initial method or by failing to catch a thrown exception.
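The fetch-and-execute cycle described above can be sketched as a loop over a byte array. This is a minimal illustration, not a real execution engine: the class BytecodeLoop is hypothetical and supports only six instructions, though the opcode values it uses are the ones assigned by the Java Virtual Machine specification.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class BytecodeLoop {
    // Opcode values as assigned by the Java Virtual Machine specification.
    static final int ICONST_2 = 0x05;
    static final int BIPUSH = 0x10;
    static final int ILOAD_0 = 0x1a;
    static final int IADD = 0x60;
    static final int IMUL = 0x68;
    static final int IRETURN = 0xac;

    /** Interprets the supported subset until ireturn and returns the result. */
    static int execute(byte[] code, int[] locals) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // offset of the next instruction in the bytecode stream
        while (true) {
            int opcode = code[pc++] & 0xff;                        // fetch the one-byte opcode
            switch (opcode) {
                case ICONST_2: stack.push(2); break;               // no operands
                case BIPUSH:   stack.push((int) code[pc++]); break; // one signed operand byte follows
                case ILOAD_0:  stack.push(locals[0]); break;
                case IADD:     stack.push(stack.pop() + stack.pop()); break;
                case IMUL:     stack.push(stack.pop() * stack.pop()); break;
                case IRETURN:  return stack.pop();
                default:
                    throw new IllegalStateException(
                        "unsupported opcode 0x" + Integer.toHexString(opcode));
            }
        }
    }

    public static void main(String[] args) {
        // Computes (locals[0] + 10) * 2: iload_0, bipush 10, iadd, iconst_2, imul, ireturn.
        byte[] code = {0x1a, 0x10, 10, 0x60, 0x05, 0x68, (byte) 0xac};
        System.out.println(execute(code, new int[] {5}));
    }
}
```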

4 Native Method Interface

The Java native method interface, known as JNI (Java Native Interface), is designed for portability. It allows native methods to do the following:

- Pass and return data
- Operate on instance variables
- Operate on class variables or call class methods
- Operate on arrays
- Lock objects on the heap
- Load new classes
- Throw exceptions
- Catch exceptions thrown by Java methods that the native method called
- Catch asynchronous exceptions thrown by the virtual machine
- Indicate to the garbage collector that an object is no longer needed
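On the Java side, a native method is only a declaration; its body lives in a platform-specific library. The sketch below declares a hypothetical native method, addInNativeCode, in an illustrative class. No library providing it is loaded, so the call fails with UnsatisfiedLinkError, which shows the point at which the virtual machine tries to bind the declaration to native code.

```java
class NativeBridgeSketch {
    // Hypothetical native method: the implementation would be supplied by a native library.
    static native int addInNativeCode(int a, int b);

    /** Calls the native method and reports whether the JVM could bind it to native code. */
    static boolean isBound() {
        try {
            addInNativeCode(1, 2);
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false; // no loaded library provides the implementation
        }
    }

    public static void main(String[] args) {
        System.out.println("native implementation available: " + isBound());
    }
}
```

In a complete program, the class would typically load its library with System.loadLibrary before the first call to the native method.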