JVM basics: class file structure

Java can be “compiled once, run anywhere” for two reasons. First, a JVM implementation is provided for each operating system and platform. Second, source code is compiled into bytecode (.class) files with a fixed format that any JVM can load, regardless of platform.

public class Test {
    public int math() {
        int a = 1;
        int b = 2;
        int c = (a + b) * 10;
        return c;
    }

    public static void main(String[] args) {
        Test test = new Test();
        test.math();
    }
}

Compile it to a .class file and open the file in a hex editor such as the UltraEdit (UE) editor.

The hexadecimal bytecode above looks messy, but it follows a strict specification. The JVM specification requires each bytecode file to consist of ten parts in a fixed order, as shown below.

1. The magic number

The first four bytes of every class file are called the magic number.

CA FE BA BE

Its sole purpose is to determine whether the file is a class file acceptable to the virtual machine. The value is fixed at 0xCAFEBABE. Because the magic number sits at the very start of the file, the JVM can check it first and continue with subsequent operations only if the file looks like a .class file.

Many file formats use magic numbers for identification; image formats such as GIF and JPG, for example, carry one in their headers. Magic numbers are used instead of file extensions mainly for security reasons, since an extension can be changed at will.
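The check described above can be sketched in a few lines of Java. This is only an illustration, not how any particular JVM implements it; the class name MagicNumberCheck and the method hasClassMagic are hypothetical names chosen for this example.

```java
import java.nio.ByteBuffer;

final class MagicNumberCheck {
    // Magic number that every class file starts with.
    static final int CLASS_MAGIC = 0xCAFEBABE;

    // Hypothetical helper: returns true if the first four bytes equal 0xCAFEBABE.
    static boolean hasClassMagic(byte[] data) {
        if (data == null || data.length < 4) {
            return false;
        }
        // ByteBuffer reads big-endian by default, which matches the class file format.
        return ByteBuffer.wrap(data).getInt() == CLASS_MAGIC;
    }

    public static void main(String[] args) {
        byte[] classHeader = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 0x34};
        byte[] gifHeader = {'G', 'I', 'F', '8', '9', 'a'};
        System.out.println(hasClassMagic(classHeader)); // true
        System.out.println(hasClassMagic(gifHeader));   // false
    }
}
```

Note that renaming a GIF file to Test.class would not get it past this check, which is the point made above about extensions.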

2. The version number

The version number occupies the 4 bytes after the magic number:

00 00 00 34

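The first two bytes represent the minor version and the last two bytes represent the major version. Here the minor version is 0x0000 and the major version is 0x0034, which is 52 in decimal and corresponds to Java 8.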
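As a small sketch of how the two version fields could be read from the first eight bytes of the file, assuming the bytes are already in memory (the class ClassVersion and its methods are hypothetical names for this example):

```java
final class ClassVersion {
    // Reads an unsigned 16-bit big-endian value starting at the given offset.
    static int readU2(byte[] data, int offset) {
        return ((data[offset] & 0xFF) << 8) | (data[offset + 1] & 0xFF);
    }

    // Bytes 4-5 hold the minor version, bytes 6-7 the major version.
    static int minorVersion(byte[] data) {
        return readU2(data, 4);
    }

    static int majorVersion(byte[] data) {
        return readU2(data, 6);
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x00, 0x00, 0x34};
        System.out.println("minor = " + minorVersion(header)); // minor = 0
        System.out.println("major = " + majorVersion(header)); // major = 52
    }
}
```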

3. The constant pool

The constant pool begins at the byte immediately after the major version number. It stores two kinds of constants: literals and symbolic references.

Literals: text strings (“ABC”) and values of primitive data types (1, 1.0)
Symbolic references:
1. Fully qualified names of classes and interfaces
2. Field names and descriptors
3. Method names and descriptors

Class information can be thought of as a framework, and the constant pool holds the concrete data. Class names, field names, and method names are all stored in the constant pool, so wherever the rest of the file needs one of them, it stores only the index of the corresponding constant pool entry.

Constant pool composition

The constant pool is divided into two parts as a whole: the constant pool counter and the constant pool data area, as shown below:

1) Constant pool counter (constant_pool_count): Because the number of constants is not fixed, two bytes are placed at the start of the constant pool to record its capacity. In the example code above the counter is “0014”, which is 20 in decimal. Index 0 is excluded, so the class file has 19 constants, with indexes 1 to 19.

2) Constant pool data area: The data area consists of (constant_pool_count - 1) cp_info structures, one per constant. There are 14 types of cp_info in Java 8 bytecode (later versions add a few more), and the structure of each type is fixed, as shown in the following table:
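To make the counter and the fixed per-type layouts more concrete, here is a rough sketch that steps over a constant pool and counts its entries. The tag values and payload sizes follow the cp_info definitions for Java 8 in the JVM specification; the class ConstantPoolWalker and the method countEntries are hypothetical names, and the input is assumed to start at constant_pool_count. One detail worth knowing: a Long or Double constant occupies two index slots, so a pool that contains them has fewer structures than constant_pool_count - 1.

```java
import java.nio.ByteBuffer;

final class ConstantPoolWalker {
    // Hypothetical helper: poolBytes must start at constant_pool_count.
    // Returns the number of cp_info structures that were stepped over.
    static int countEntries(byte[] poolBytes) {
        ByteBuffer buf = ByteBuffer.wrap(poolBytes); // big-endian by default
        int constantPoolCount = buf.getShort() & 0xFFFF;
        int entries = 0;
        // Valid indexes run from 1 to constantPoolCount - 1.
        for (int index = 1; index < constantPoolCount; index++) {
            int tag = buf.get() & 0xFF;
            int payload;
            switch (tag) {
                case 1: // Utf8: u2 length followed by that many bytes
                    payload = buf.getShort() & 0xFFFF;
                    break;
                case 3: case 4: // Integer, Float
                    payload = 4;
                    break;
                case 5: case 6: // Long, Double take up two index slots
                    payload = 8;
                    index++;
                    break;
                case 7: case 8: case 16: // Class, String, MethodType
                    payload = 2;
                    break;
                case 9: case 10: case 11: case 12: case 18:
                    // Fieldref, Methodref, InterfaceMethodref, NameAndType, InvokeDynamic
                    payload = 4;
                    break;
                case 15: // MethodHandle
                    payload = 3;
                    break;
                default:
                    throw new IllegalArgumentException("Unknown constant pool tag " + tag);
            }
            buf.position(buf.position() + payload); // skip the entry's payload
            entries++;
        }
        return entries;
    }

    public static void main(String[] args) {
        // constant_pool_count = 4: Utf8 "Test", Class -> #1, Integer 10
        byte[] pool = {0x00, 0x04, 1, 0x00, 0x04, 'T', 'e', 's', 't', 7, 0x00, 0x01, 3, 0, 0, 0, 0x0A};
        System.out.println(countEntries(pool)); // 3
    }
}
```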

4. Access flags

The two bytes after the end of the constant pool describe whether the type is a class or an interface, and whether it is modified by public, abstract, final, and so on. The JVM specification defines eight access flags, as shown in the following table:

Note that the JVM does not enumerate every combination of access flags. Instead, the flags are combined with the bitwise OR operator. For example, if a class is declared public final, its access flags value is ACC_PUBLIC | ACC_FINAL, that is, 0x0001 | 0x0010 = 0x0011.
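The bitwise combination can be tried out directly. The sketch below uses flag values from the JVM specification; the class AccessFlags and the helper isSet are hypothetical names for this example. In practice javac also sets ACC_SUPER (0x0020) on classes, so the value seen in a real file is usually larger than the two-flag example here.

```java
final class AccessFlags {
    // A subset of the class access flags defined by the JVM specification.
    static final int ACC_PUBLIC = 0x0001;
    static final int ACC_FINAL = 0x0010;
    static final int ACC_SUPER = 0x0020;
    static final int ACC_INTERFACE = 0x0200;
    static final int ACC_ABSTRACT = 0x0400;

    // Hypothetical helper: tests whether a single flag bit is present.
    static boolean isSet(int accessFlags, int flag) {
        return (accessFlags & flag) != 0;
    }

    public static void main(String[] args) {
        int publicFinal = ACC_PUBLIC | ACC_FINAL;
        System.out.println(String.format("0x%04X", publicFinal)); // 0x0011
        System.out.println(isSet(publicFinal, ACC_FINAL));        // true
        System.out.println(isSet(publicFinal, ACC_ABSTRACT));     // false
    }
}
```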

5. Current class index

The two bytes after the access flags describe the fully qualified name of the current class. They hold an index into the constant pool, through which the fully qualified name of the class can be found.

6. Parent class index

The two bytes after the current class index describe the fully qualified name of the parent class. They also hold an index into the constant pool, through which the fully qualified name of the class’s parent can be found.

7. Interface index

The two bytes after the parent class index are the class’s interface counter, that is, the number of interfaces directly implemented by the current class. They are followed by one two-byte constant pool index per interface, each identifying one of those interfaces.
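Sections 4 to 7 describe consecutive two-byte fields, so they can be read in one pass. The following is a minimal sketch under the assumption that the input array starts at the access flags, right after the constant pool; ClassIndexes and parse are hypothetical names for this example.

```java
import java.nio.ByteBuffer;

final class ClassIndexes {
    final int accessFlags;
    final int thisClass;    // constant pool index of the current class
    final int superClass;   // constant pool index of the parent class
    final int[] interfaces; // constant pool indexes of the implemented interfaces

    private ClassIndexes(int accessFlags, int thisClass, int superClass, int[] interfaces) {
        this.accessFlags = accessFlags;
        this.thisClass = thisClass;
        this.superClass = superClass;
        this.interfaces = interfaces;
    }

    // Hypothetical helper: bytes must start at the access flags.
    static ClassIndexes parse(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes); // big-endian by default
        int accessFlags = buf.getShort() & 0xFFFF;
        int thisClass = buf.getShort() & 0xFFFF;
        int superClass = buf.getShort() & 0xFFFF;
        int interfacesCount = buf.getShort() & 0xFFFF;
        int[] interfaces = new int[interfacesCount];
        for (int i = 0; i < interfacesCount; i++) {
            interfaces[i] = buf.getShort() & 0xFFFF; // two bytes per interface
        }
        return new ClassIndexes(accessFlags, thisClass, superClass, interfaces);
    }

    public static void main(String[] args) {
        // Made-up sample: flags 0x0021, this_class #2, super_class #4, one interface at #6
        byte[] sample = {0x00, 0x21, 0x00, 0x02, 0x00, 0x04, 0x00, 0x01, 0x00, 0x06};
        ClassIndexes parsed = parse(sample);
        System.out.println(parsed.thisClass);         // 2
        System.out.println(parsed.superClass);        // 4
        System.out.println(parsed.interfaces.length); // 1
    }
}
```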

8. The field table

The field table describes the variables declared in a class or interface, including class-level (static) variables and instance variables, but not local variables declared inside methods. The field table is also divided into two parts:

The first part is two bytes describing the number of fields
The second part is a sequence of field_info structures with the details of each field. The structure of the field table is as follows:

9. The method table

The method table follows the field table and is also composed of two parts:

The first part is two bytes describing the number of methods
The second part provides the details of each method, including the method’s access flags, name, descriptor, and attributes
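The field table and the method table share the same layout in the JVM specification: a two-byte count, then one entry per member made up of access flags, a name index, a descriptor index, and a list of attributes. The sketch below reads such a table and skips over the attribute bodies. MemberTable, Member, and parse are hypothetical names, the input is assumed to start at the two-byte count, and attribute lengths are assumed to fit in a signed int.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class MemberTable {
    // One field_info or method_info entry, without its attribute bodies.
    static final class Member {
        final int accessFlags;
        final int nameIndex;
        final int descriptorIndex;
        final int attributesCount;

        Member(int accessFlags, int nameIndex, int descriptorIndex, int attributesCount) {
            this.accessFlags = accessFlags;
            this.nameIndex = nameIndex;
            this.descriptorIndex = descriptorIndex;
            this.attributesCount = attributesCount;
        }
    }

    // Hypothetical helper: tableBytes must start at the two-byte member count.
    static List<Member> parse(byte[] tableBytes) {
        ByteBuffer buf = ByteBuffer.wrap(tableBytes); // big-endian by default
        int count = buf.getShort() & 0xFFFF;
        List<Member> members = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            int accessFlags = buf.getShort() & 0xFFFF;
            int nameIndex = buf.getShort() & 0xFFFF;
            int descriptorIndex = buf.getShort() & 0xFFFF;
            int attributesCount = buf.getShort() & 0xFFFF;
            for (int a = 0; a < attributesCount; a++) {
                buf.getShort();            // attribute_name_index
                int length = buf.getInt(); // attribute_length (four bytes)
                buf.position(buf.position() + length); // skip the attribute body
            }
            members.add(new Member(accessFlags, nameIndex, descriptorIndex, attributesCount));
        }
        return members;
    }

    public static void main(String[] args) {
        // Made-up sample: one public member, name #5, descriptor #6,
        // with one attribute (name #7) whose body is two bytes long.
        byte[] table = {0x00, 0x01, 0x00, 0x01, 0x00, 0x05, 0x00, 0x06, 0x00, 0x01,
                        0x00, 0x07, 0x00, 0x00, 0x00, 0x02, (byte) 0xAA, (byte) 0xBB};
        List<Member> members = parse(table);
        System.out.println(members.size());           // 1
        System.out.println(members.get(0).nameIndex); // 5
    }
}
```

The name and descriptor indexes are again constant pool indexes, so turning them into readable names requires the constant pool to be parsed first.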

10. Additional attributes

The last part of the bytecode holds the basic information about the attributes defined for the class or interface in the file.
