1 Source
- Source: Java Virtual Machine JVM Fault Diagnosis and Performance Optimization — Ge Yiming
- Chapter: Chapter 9
These are notes on Chapter 9.
2 Overview
This article introduces the main parts of a Class file, including the magic number, version numbers, constant pool, access flags, and so on.
3 Class file overview
According to the JVM specification, a Class file can be described very rigorously as:
ClassFile {
    u4 magic;
    u2 minor_version;
    u2 major_version;
    u2 constant_pool_count;
    cp_info constant_pool[constant_pool_count-1];
    u2 access_flags;
    u2 this_class;
    u2 super_class;
    u2 interfaces_count;
    u2 interfaces[interfaces_count];
    u2 fields_count;
    field_info fields[fields_count];
    u2 methods_count;
    method_info methods[methods_count];
    u2 attributes_count;
    attribute_info attributes[attributes_count];
}
Each of these fields is described in detail below, in order.
4 Magic number
The magic number marks a file as a Class file so that the JVM can recognize it. It is a 4-byte unsigned integer with the fixed value 0xCAFEBABE. If a Class file does not begin with 0xCAFEBABE, the JVM refuses to load it and throws a java.lang.ClassFormatError.
On Linux, you can inspect a Class file with vim. For example, to open Test.class in binary mode, run:
vim -b Test.class
Then, inside vim, switch to a hex dump to see the magic number in the first four bytes:
:%!xxd
5 Version
The magic number is followed by the minor version and the major version of the Class file, which together identify the compiler version that generated it. Each takes two bytes. In the example hex dump, 0000 is the minor version and 0037 is the major version, which is 55 in decimal and corresponds to a compiler targeting JDK 11.
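The magic number and version checks can be sketched in a few lines of Java. This is only an illustration, not the JVM's own loading code: the class name ClassHeaderDemo and the method describe are made up for this example, and the JDK number is derived with the rule "major version minus 44", which is an assumption that holds for JDK 1.2 and later (52 is Java 8, 55 is Java 11).

```java
import java.nio.ByteBuffer;

final class ClassHeaderDemo {
    static final int MAGIC = 0xCAFEBABE;

    /** Reads magic, minor_version and major_version from the start of a class file. */
    static String describe(byte[] classBytes) {
        // ByteBuffer is big-endian by default, which matches the class file format
        ByteBuffer buf = ByteBuffer.wrap(classBytes);
        int magic = buf.getInt();                  // u4 magic
        if (magic != MAGIC) {
            throw new IllegalArgumentException(String.format("bad magic: 0x%08X", magic));
        }
        int minor = buf.getShort() & 0xFFFF;       // u2 minor_version
        int major = buf.getShort() & 0xFFFF;       // u2 major_version
        // Assumption: JDK release = major - 44 (valid for JDK 1.2 and later)
        return "major=" + major + " minor=" + minor + " (JDK " + (major - 44) + ")";
    }

    public static void main(String[] args) {
        // The first eight bytes of the example above: CAFEBABE 0000 0037
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x00, 0x00, 0x37};
        System.out.println(describe(header));      // major=55 minor=0 (JDK 11)
    }
}
```

To try it on a real file, the bytes could be read with java.nio.file.Files.readAllBytes and passed to describe.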
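6 Constant pool
The version number is followed by constant_pool_count and then the constant pool entries themselves. The pool is indexed from 1, so it holds constant_pool_count - 1 entries.
Each constant pool entry starts with a one-byte tag that identifies its type.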
The mapping is as follows:
- tag 3: the type is CONSTANT_Integer
- tag 4: the type is CONSTANT_Float
CONSTANT_Integer, for example, has the following structure:
CONSTANT_Integer_info {
    u1 tag;
    u4 bytes;
}
That is, a one-byte tag followed by four bytes that hold the value of the int constant in big-endian order. Most of the other entry types follow a similar tag-plus-payload layout; see the JVM specification for details.
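As a small illustration of the tag-plus-payload layout, the following sketch decodes a single entry with tag 3 or 4. It is not a full constant pool parser: the class ConstantIntegerDemo and its decode method are hypothetical names, the input is a hand-written byte array rather than a real Class file, and every other tag is simply rejected.

```java
import java.nio.ByteBuffer;

final class ConstantIntegerDemo {
    static final int CONSTANT_INTEGER = 3;
    static final int CONSTANT_FLOAT = 4;

    /** Decodes one CONSTANT_Integer_info or CONSTANT_Float_info entry: u1 tag, u4 bytes. */
    static String decode(byte[] entry) {
        ByteBuffer buf = ByteBuffer.wrap(entry);   // big-endian by default, like the class file
        int tag = buf.get() & 0xFF;                // u1 tag
        switch (tag) {
            case CONSTANT_INTEGER:
                return "Integer " + buf.getInt();  // u4 bytes read as an int
            case CONSTANT_FLOAT:
                return "Float " + buf.getFloat();  // u4 bytes read as an IEEE 754 float
            default:
                throw new IllegalArgumentException("tag " + tag + " is not handled in this sketch");
        }
    }

    public static void main(String[] args) {
        System.out.println(decode(new byte[] {3, 0, 0, 0, 42}));              // Integer 42
        System.out.println(decode(new byte[] {4, 0x3F, (byte) 0x80, 0, 0}));  // Float 1.0
    }
}
```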
7 Access flags
The access flags use two bytes to record access information about the class, such as whether it is public or abstract. The flags are as follows:
- ACC_PUBLIC: 0x0001, the class is public
- ACC_FINAL: 0x0010, the class is final
- ACC_SUPER: 0x0020, superclass methods are treated specially when invoked by the invokespecial instruction
- ACC_INTERFACE: 0x0200, the type is an interface
- ACC_ABSTRACT: 0x0400, the class is abstract
- ACC_SYNTHETIC: 0x1000, the class was generated by the compiler and has no counterpart in the source code
- ACC_ANNOTATION: 0x2000, the type is an annotation
- ACC_ENUM: 0x4000, the type is an enumeration
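Because each flag occupies its own bit, the two-byte value is a bitmask that can be decoded with a bitwise AND per flag. The sketch below uses the masks listed above; the class AccessFlagsDemo and the method decode are names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class AccessFlagsDemo {
    private static final int[] MASKS = {
        0x0001, 0x0010, 0x0020, 0x0200, 0x0400, 0x1000, 0x2000, 0x4000
    };
    private static final String[] NAMES = {
        "ACC_PUBLIC", "ACC_FINAL", "ACC_SUPER", "ACC_INTERFACE",
        "ACC_ABSTRACT", "ACC_SYNTHETIC", "ACC_ANNOTATION", "ACC_ENUM"
    };

    /** Returns the names of all class access flags that are set in accessFlags. */
    static List<String> decode(int accessFlags) {
        List<String> set = new ArrayList<>();
        for (int i = 0; i < MASKS.length; i++) {
            if ((accessFlags & MASKS[i]) != 0) {
                set.add(NAMES[i]);
            }
        }
        return set;
    }

    public static void main(String[] args) {
        System.out.println(decode(0x0021)); // [ACC_PUBLIC, ACC_SUPER]
        System.out.println(decode(0x0601)); // [ACC_PUBLIC, ACC_INTERFACE, ACC_ABSTRACT]
    }
}
```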
8 Current class, parent class, and interface
The format is as follows:
u2 this_class;
u2 super_class;
u2 interfaces_count;
u2 interfaces[interfaces_count];
Here this_class and super_class are two-byte unsigned integers, each the index of a CONSTANT_Class entry in the constant pool, representing the current class and its parent class. Because a class can implement multiple interfaces, the interface indexes are stored as an array. If the class implements no interfaces, interfaces_count is 0.
9 Fields
The format of the fields is as follows:
u2 fields_count;
field_info fields[fields_count];
fields_count is a two-byte unsigned integer giving the number of fields; it is followed by that many field entries. Each field is a field_info structure:
field_info {
    u2 access_flags;      // Access flags, similar to the class access flags; can mark public, private, static, and so on
    u2 name_index;        // Constant pool index of a CONSTANT_Utf8 entry holding the field name
    u2 descriptor_index;  // Constant pool index of a CONSTANT_Utf8 entry holding the descriptor, which describes the field type
    u2 attributes_count;  // Number of attributes
    attribute_info attributes[attributes_count];  // Attributes, such as a constant initial value or annotation information
}
attribute_info {
    u2 attribute_name_index;    // Attribute name, as a constant pool index
    u4 attribute_length;        // Length of info in bytes
    u1 info[attribute_length];  // The attribute data as a byte array
}
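The string behind descriptor_index uses a compact encoding of the field type. The notes above do not spell out the encoding, so the sketch below relies on the field descriptor grammar from the JVM specification: one letter per primitive type, L<class name>; for object types, and one leading [ per array dimension. DescriptorDemo and toJavaType are hypothetical names used only for this illustration.

```java
final class DescriptorDemo {
    /** Turns a field descriptor such as "[Ljava/lang/String;" into "java.lang.String[]". */
    static String toJavaType(String descriptor) {
        int dims = 0;
        while (dims < descriptor.length() && descriptor.charAt(dims) == '[') {
            dims++;                                // one '[' per array dimension
        }
        String element = descriptor.substring(dims);
        if (element.isEmpty()) {
            throw new IllegalArgumentException("bad descriptor: " + descriptor);
        }
        String base;
        switch (element.charAt(0)) {
            case 'B': base = "byte"; break;
            case 'C': base = "char"; break;
            case 'D': base = "double"; break;
            case 'F': base = "float"; break;
            case 'I': base = "int"; break;
            case 'J': base = "long"; break;
            case 'S': base = "short"; break;
            case 'Z': base = "boolean"; break;
            case 'L':                              // L<internal class name>;
                base = element.substring(1, element.length() - 1).replace('/', '.');
                break;
            default:
                throw new IllegalArgumentException("bad descriptor: " + descriptor);
        }
        StringBuilder sb = new StringBuilder(base);
        for (int i = 0; i < dims; i++) {
            sb.append("[]");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toJavaType("I"));                    // int
        System.out.println(toJavaType("[[D"));                  // double[][]
        System.out.println(toJavaType("[Ljava/lang/String;"));  // java.lang.String[]
    }
}
```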
10 Methods
10.1 Basic structure of the method
The format of the method is as follows:
u2 methods_count;
method_info methods[methods_count];
Each of these method_info structures represents a method:
method_info {
    u2 access_flags;      // Access flags; mark the method as public, private, and so on
    u2 name_index;        // The method name, as a constant pool index
    u2 descriptor_index;  // The method descriptor, also a constant pool index of a CONSTANT_Utf8 entry
    u2 attributes_count;  // Number of attributes
    attribute_info attributes[attributes_count];  // Like fields, methods carry attributes: a count followed by an array of attribute entries
}
10.2 Code attribute
The main content of a method is stored in its attributes. The most important of these is Code, which stores the method's bytecode and related information. Its structure is as follows:
Code_attribute {
    u2 attribute_name_index;    // Attribute name, as a constant pool index
    u4 attribute_length;        // Attribute length, excluding the first 6 bytes (u2 + u4)
    u2 max_stack;               // Maximum depth of the operand stack
    u2 max_locals;              // Size of the local variable table
    u4 code_length;             // Bytecode length in bytes
    u1 code[code_length];       // The bytecode itself
    u2 exception_table_length;  // Number of entries in the exception table
    {   u2 start_pc;            // If an exception of type catch_type is thrown at an offset
        u2 end_pc;              // in the range [start_pc, end_pc),
        u2 handler_pc;          // execution jumps to handler_pc
        u2 catch_type;          // Constant pool index of the exception class; 0 catches everything
    } exception_table[exception_table_length];  // Exception table
    u2 attributes_count;        // Number of nested attributes
    attribute_info attributes[attributes_count];  // Nested attributes
}
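How the exception_table is consulted can be sketched as follows. This is a simplified model rather than the JVM's implementation: ExceptionTableDemo, Entry and findHandler are hypothetical names, catch_type is represented by an already-resolved Class object instead of a constant pool index, and null stands in for catch_type 0 (catch everything). The entries are searched in order and the first match wins.

```java
final class ExceptionTableDemo {
    /** One exception_table entry, with catch_type already resolved to a class. */
    static final class Entry {
        final int startPc, endPc, handlerPc;
        final Class<? extends Throwable> catchType; // null models catch_type == 0

        Entry(int startPc, int endPc, int handlerPc, Class<? extends Throwable> catchType) {
            this.startPc = startPc;
            this.endPc = endPc;
            this.handlerPc = handlerPc;
            this.catchType = catchType;
        }
    }

    /** Returns handler_pc of the first matching entry, or -1 if no entry handles the exception. */
    static int findHandler(Entry[] table, int pc, Throwable thrown) {
        for (Entry e : table) {
            boolean inRange = e.startPc <= pc && pc < e.endPc;   // end_pc is exclusive
            boolean typeMatches = e.catchType == null || e.catchType.isInstance(thrown);
            if (inRange && typeMatches) {
                return e.handlerPc;
            }
        }
        return -1; // the exception propagates to the caller
    }

    public static void main(String[] args) {
        // Made-up offsets: a catch (ArithmeticException) block plus a catch-all entry
        Entry[] table = {
            new Entry(0, 8, 11, ArithmeticException.class),
            new Entry(0, 8, 20, null)
        };
        System.out.println(findHandler(table, 4, new ArithmeticException()));   // 11
        System.out.println(findHandler(table, 4, new IllegalStateException())); // 20
        System.out.println(findHandler(table, 8, new ArithmeticException()));   // -1
    }
}
```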
The Code attribute itself contains further attributes that store additional information, including:
- LineNumberTable
- LocalVariableTable
- StackMapTable
10.2.1 LineNumberTable
LineNumberTable records the mapping between bytecode offsets and line numbers. The structure is as follows:
LineNumberTable_attribute {
    u2 attribute_name_index;      // Attribute name, as a constant pool index
    u4 attribute_length;          // Attribute length
    u2 line_number_table_length;  // Number of entries
    {   u2 start_pc;              // Bytecode offset
        u2 line_number;           // Source line number for that bytecode offset
    } line_number_table[line_number_table_length];  // Table array; each element is a <start_pc, line_number> pair
}
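A typical use of this table is to translate a bytecode offset back to a source line, for example when reporting where an exception occurred. The sketch below assumes the table has already been parsed into {start_pc, line_number} pairs; LineNumberDemo and lineFor are hypothetical names, and the sample offsets are made up. Since the entries are not required to be sorted, it scans all of them and keeps the one with the largest start_pc that does not exceed the offset.

```java
final class LineNumberDemo {
    /**
     * Each row is {start_pc, line_number}. Returns the line number of the entry
     * with the largest start_pc that is <= pc, or -1 if there is no such entry.
     */
    static int lineFor(int[][] table, int pc) {
        int bestStart = -1;
        int line = -1;
        for (int[] row : table) {
            if (row[0] <= pc && row[0] > bestStart) {
                bestStart = row[0];
                line = row[1];
            }
        }
        return line;
    }

    public static void main(String[] args) {
        int[][] table = {{0, 5}, {8, 6}, {16, 7}};
        System.out.println(lineFor(table, 0));   // 5
        System.out.println(lineFor(table, 10));  // 6
        System.out.println(lineFor(table, 16));  // 7
    }
}
```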
10.2.2 LocalVariableTable
This attribute, also known as the local variable table, records the local variables of a method and is structured as follows:
LocalVariableTable_attribute {
    u2 attribute_name_index;         // Attribute name, as a constant pool index
    u4 attribute_length;             // Attribute length
    u2 local_variable_table_length;  // Number of entries in the local variable table
    {   u2 start_pc;                 // Bytecode offset where the local variable's scope starts
        u2 length;                   // Length of the scope (used to compute the end position)
        u2 name_index;               // Name of the local variable, as a constant pool index
        u2 descriptor_index;         // Type descriptor of the local variable, as a constant pool index
        u2 index;                    // Slot of the local variable in the local variable table of the current stack frame
    } local_variable_table[local_variable_table_length];
}
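The start_pc and length fields define the half-open range [start_pc, start_pc + length) in which a variable is in scope. A debugger-style query such as "which variables are visible at this offset" could look like the sketch below. LocalVariableDemo, Var and inScope are hypothetical names, the name and descriptor are assumed to be already resolved from the constant pool, and the sample table is invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class LocalVariableDemo {
    /** One local_variable_table entry with name and descriptor resolved to strings. */
    static final class Var {
        final int startPc, length, index;
        final String name, descriptor;

        Var(int startPc, int length, String name, String descriptor, int index) {
            this.startPc = startPc;
            this.length = length;
            this.name = name;
            this.descriptor = descriptor;
            this.index = index;
        }
    }

    /** Lists the variables in scope at pc, formatted as name:descriptor@slot. */
    static List<String> inScope(Var[] table, int pc) {
        List<String> result = new ArrayList<>();
        for (Var v : table) {
            if (v.startPc <= pc && pc < v.startPc + v.length) {
                result.add(v.name + ":" + v.descriptor + "@" + v.index);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up entries for a static method with a parameter n and locals total and i
        Var[] table = {
            new Var(0, 20, "n", "I", 0),
            new Var(2, 18, "total", "I", 1),
            new Var(4, 12, "i", "I", 2)
        };
        System.out.println(inScope(table, 3));   // [n:I@0, total:I@1]
        System.out.println(inScope(table, 10));  // [n:I@0, total:I@1, i:I@2]
    }
}
```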
10.2.3 StackMapTable
StackMapTable contains stack map frames. This data is not needed to execute the method; it is used only for type verification of the Class file. The structure is as follows:
StackMapTable_attribute {
    u2 attribute_name_index;  // Constant pool index; the name is always "StackMapTable"
    u4 attribute_length;      // Attribute length
    u2 number_of_entries;     // Number of stack map frames
    stack_map_frame entries[number_of_entries];  // The stack map frames
}
union stack_map_frame {  // Each stack map frame is exactly one of the following kinds
    same_frame;                       // See the JVM specification for the meaning of each kind:
    same_locals_1_stack_item_frame;   // https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.7.4
    same_locals_1_stack_item_frame_extended;
    chop_frame;
    same_frame_extended;
    append_frame;
    full_frame;
}
Each stack map frame describes the verification types in effect at a particular bytecode offset: the types of the local variables and the types of the entries on the operand stack.
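Which member of the stack_map_frame union a frame uses is determined by its first byte, the frame_type. The ranges in the sketch below come from section 4.7.4 of the JVM specification (linked above), not from these notes. StackMapFrameDemo and frameKind are names invented for this example.

```java
final class StackMapFrameDemo {
    /** Maps a frame_type byte (0..255) to the kind of stack map frame it introduces. */
    static String frameKind(int frameType) {
        if (frameType < 0 || frameType > 255) {
            throw new IllegalArgumentException("frame_type must fit in one unsigned byte: " + frameType);
        }
        if (frameType <= 63) return "same_frame";
        if (frameType <= 127) return "same_locals_1_stack_item_frame";
        if (frameType <= 246) return "reserved";
        if (frameType == 247) return "same_locals_1_stack_item_frame_extended";
        if (frameType <= 250) return "chop_frame";
        if (frameType == 251) return "same_frame_extended";
        if (frameType <= 254) return "append_frame";
        return "full_frame";
    }

    public static void main(String[] args) {
        System.out.println(frameKind(12));   // same_frame
        System.out.println(frameKind(252));  // append_frame
        System.out.println(frameKind(255));  // full_frame
    }
}
```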
Appendix: A simple use of ASM
ASM is a Java bytecode manipulation library that many well-known libraries rely on, such as AspectJ and CGLIB. ASM generally outperforms higher-level bytecode libraries such as CGLIB because it works at a lower level, closer to the bytecode itself, which also makes it more flexible and powerful.
Here is a simple example of using ASM to print Hello World:
package com.company;

import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

public class Main extends ClassLoader implements Opcodes {
    public static void main(String[] args) throws Exception {
        // Create the ClassWriter. COMPUTE_MAXS asks ASM to compute the maximum number of local
        // variables and the maximum operand stack depth; COMPUTE_FRAMES also computes the stack map frames.
        ClassWriter cw = new ClassWriter(ClassWriter.COMPUTE_MAXS | ClassWriter.COMPUTE_FRAMES);
        // Use the ClassWriter to set the basic information of the class: Java 11 format,
        // public access flag, class name Example, superclass java/lang/Object
        cw.visit(V11, ACC_PUBLIC, "Example", null, "java/lang/Object", null);
        // Generate the constructor for Example
        MethodVisitor mw = cw.visitMethod(ACC_PUBLIC, "<init>", "()V", null, null);
        mw.visitVarInsn(ALOAD, 0);
        mw.visitMethodInsn(INVOKESPECIAL, "java/lang/Object", "<init>", "()V", false);
        mw.visitInsn(RETURN);
        mw.visitMaxs(0, 0); // The arguments are ignored because of COMPUTE_MAXS
        mw.visitEnd();
        // Generate public static void main(String[] args) and the bytecode for main(),
        // which calls System.out.println() at run time and prints "Hello world!"
        mw = cw.visitMethod(ACC_PUBLIC + ACC_STATIC, "main", "([Ljava/lang/String;)V", null, null);
        mw.visitFieldInsn(GETSTATIC, "java/lang/System", "out", "Ljava/io/PrintStream;");
        mw.visitLdcInsn("Hello world!");
        mw.visitMethodInsn(INVOKEVIRTUAL, "java/io/PrintStream", "println", "(Ljava/lang/String;)V", false);
        mw.visitInsn(RETURN);
        mw.visitMaxs(0, 0);
        mw.visitEnd();
        // Get the binary representation
        byte[] code = cw.toByteArray();
        Main m = new Main();
        // Load the class into the JVM, call its main() method through reflection, and print the result
        Class<?> mainClass = m.defineClass("Example", code, 0, code.length);
        // Look main() up by name: the order of the array returned by getMethods() is not guaranteed
        mainClass.getMethod("main", String[].class).invoke(null, new Object[]{null});
    }
}