Java Virtual Machine Series 1: How are Java files loaded and executed
Java Virtual Machine Series 2: Class bytecode detailed analysis
Java Virtual Machine Series 3: Runtime data area parsing
One. Start from the problem
- What is class bytecode?
- How is bytecode executed by the JVM?
- What are common classes, member variables, methods, local variables?
- How do methods call other methods?
- How does the method return?
- What’s the difference between a stack-based virtual machine and a register-based virtual machine?
The entire analysis is based on class bytecode, which is quite different from Android dex bytecode. Class bytecode is organized one class per file, whereas a dex file holds a collection of classes.
What is class bytecode?
Bytecode is a set of structures defined by the JVM specification to describe the content of a class. Because this description is independent of any platform's instruction set, any language that can be compiled to bytecode, such as Java, Groovy, or Kotlin, can be executed by the virtual machine. The HotSpot virtual machine is the best-known implementation of the JVM specification.
1. Here’s the simplest example
Seeing is believing, so let’s define a class in Test.java:
public class Test {
    static String a = "hucaihua";
}
When you compile it to Test.class and open the file in 010 Editor, you can see that it is binary content organized in bytes; it is essentially a binary stream of 0s and 1s.
For readability, the editor displays the content in hexadecimal. Each hexadecimal digit stands for four bits, so each two-digit group below represents one byte (eight bits).
CA FE BA BE 00 00 00 37 00 14 0A 00 05 00 0F 08
00 10 09 00 04 00 11 07 00 12 07 00 13 01 00 01
61 01 00 12 4C 6A 61 76 61 2F 6C 61 6E 67 2F 53
74 72 69 6E 67 3B 01 00 06 3C 69 6E 69 74 3E 01
00 03 28 29 56 01 00 04 43 6F 64 65 01 00 0F 4C
69 6E 65 4E 75 6D 62 65 72 54 61 62 6C 65 01 00
08 3C 63 6C 69 6E 69 74 3E 01 00 0A 53 6F 75 72
63 65 46 69 6C 65 01 00 09 54 65 73 74 2E 6A 61
76 61 0C 00 08 00 09 01 00 08 68 75 63 61 69 68
75 61 0C 00 06 00 07 01 00 04 54 65 73 74 01 00
10 6A 61 76 61 2F 6C 61 6E 67 2F 4F 62 6A 65 63
74 00 21 00 04 00 05 00 00 00 01 00 08 00 06 00
07 00 00 00 02 00 01 00 08 00 09 00 01 00 0A 00
00 00 1D 00 01 00 01 00 00 00 05 2A B7 00 01 B1
00 00 00 01 00 0B 00 00 00 06 00 01 00 00 00 01
00 08 00 0C 00 09 00 01 00 0A 00 00 00 1E 00 01
00 00 00 00 00 06 12 02 B3 00 03 B1 00 00 00 01
00 0B 00 00 00 06 00 01 00 00 00 02 00 01 00 0D
00 00 00 02 00 0E
The official description of the class structure
We know that a class file is a formatted binary stream, that is, its content is laid out in the format defined by the JVM specification. The specification defines the class file as follows:
ClassFile {
    u4             magic;                                // Magic number, fixed value 0xCAFEBABE
    u2             minor_version;                        // Minor version number
    u2             major_version;                        // Major version number
    u2             constant_pool_count;                  // Number of constant pool entries plus one
    cp_info        constant_pool[constant_pool_count-1]; // Constant pool contents
    u2             access_flags;                         // Class access flags
    u2             this_class;                           // Constant pool index of the current class
    u2             super_class;                          // Constant pool index of the superclass
    u2             interfaces_count;                     // Number of interfaces
    u2             interfaces[interfaces_count];         // Constant pool indices of the interfaces
    u2             fields_count;                         // Number of fields
    field_info     fields[fields_count];                 // Field contents
    u2             methods_count;                        // Number of methods
    method_info    methods[methods_count];               // Method contents
    u2             attributes_count;                     // Number of attributes
    attribute_info attributes[attributes_count];         // Attribute contents
}
u4 and u2 are fixed-length field types: u4 is 4 bytes long and u2 is 2 bytes long.
cp_info, field_info, method_info, and attribute_info are structures; each has its own definition and varies in length.
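To make the u4 and u2 fields concrete, here is a minimal Java sketch that reads the first four fields of the ClassFile structure from the first 10 bytes of the Test.class dump above. The class name ClassHeaderDemo and the method readHeader are made up for this illustration; only java.nio.ByteBuffer comes from the JDK.

```java
import java.nio.ByteBuffer;

final class ClassHeaderDemo {
    // First 10 bytes of the Test.class dump shown above.
    static final byte[] HEADER = {
        (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, // u4 magic
        0x00, 0x00,                                         // u2 minor_version
        0x00, 0x37,                                         // u2 major_version
        0x00, 0x14                                          // u2 constant_pool_count
    };

    /** Returns {magic, minor_version, major_version, constant_pool_count}. */
    static long[] readHeader(byte[] bytes) {
        ByteBuffer in = ByteBuffer.wrap(bytes);  // big-endian by default, like the class file
        long magic = in.getInt() & 0xFFFFFFFFL;  // u4, widened so it stays unsigned
        int minor = in.getShort() & 0xFFFF;      // u2
        int major = in.getShort() & 0xFFFF;      // u2
        int cpCount = in.getShort() & 0xFFFF;    // u2
        return new long[] {magic, minor, major, cpCount};
    }

    public static void main(String[] args) {
        long[] h = readHeader(HEADER);
        System.out.printf("magic=0x%X minor=%d major=%d constant_pool_count=%d%n",
                h[0], h[1], h[2], h[3]);
    }
}
```

For the dump above this reports major version 55 (0x37), which corresponds to Java 11, and a constant_pool_count of 20, meaning the pool holds entries #1 to #19.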
2. Description of key structures
2.1 cp_info (Description of the 17 constant types)
cp_info is usually the largest part of a class file in terms of bytes. As of Java SE 11 (class file major version 55, the version of the dump above), the JVM specification defines 17 constant types, each with its own structure and each identified by a leading tag byte.
The following figure analyzes the actual content of a CONSTANT_Class_info entry. The other entries can be decoded in the same way.
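The same decoding can be sketched in Java. The snippet below walks the constant pool bytes of the Test.class dump above, dispatching on the tag byte of each entry. ConstantPoolDemo, parse, and fromHex are hypothetical names chosen for this example, and strings are decoded as plain UTF-8, which matches the class file's modified UTF-8 only for simple text such as the ASCII names in this dump.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class ConstantPoolDemo {
    // Tag value -> constant type name, per the JVM specification for Java SE 11.
    static final String[] TAG_NAMES = {
        null, "Utf8", null, "Integer", "Float", "Long", "Double", "Class", "String",
        "Fieldref", "Methodref", "InterfaceMethodref", "NameAndType", null, null,
        "MethodHandle", "MethodType", "Dynamic", "InvokeDynamic", "Module", "Package"
    };

    // Bytes between constant_pool_count and access_flags in the dump above.
    static final String POOL_HEX =
          "0A0005000F" + "080010" + "0900040011" + "070012" + "070013"  // #1..#5
        + "01000161"                                                    // #6  "a"
        + "0100124C6A6176612F6C616E672F537472696E673B"                  // #7  "Ljava/lang/String;"
        + "0100063C696E69743E" + "010003282956" + "010004436F6465"      // #8..#10
        + "01000F4C696E654E756D6265725461626C65"                        // #11 "LineNumberTable"
        + "0100083C636C696E69743E" + "01000A536F7572636546696C65"       // #12, #13
        + "010009546573742E6A617661" + "0C00080009"                     // #14, #15
        + "0100086875636169687561" + "0C00060007"                       // #16 "hucaihua", #17
        + "01000454657374"                                              // #18 "Test"
        + "0100106A6176612F6C616E672F4F626A656374";                     // #19 "java/lang/Object"

    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    /** Decodes entries #1 .. #(count-1); index 0 of the result is a placeholder. */
    static List<String> parse(byte[] pool, int count) {
        ByteBuffer in = ByteBuffer.wrap(pool);
        List<String> entries = new ArrayList<>();
        entries.add("(unused)"); // constant pool indices start at 1
        for (int i = 1; i < count; i++) {
            int tag = in.get() & 0xFF;
            String name = tag < TAG_NAMES.length ? TAG_NAMES[tag] : null;
            if (name == null) throw new IllegalArgumentException("unknown tag " + tag);
            switch (tag) {
                case 1: { // Utf8: u2 length followed by that many bytes
                    byte[] utf = new byte[in.getShort() & 0xFFFF];
                    in.get(utf);
                    entries.add(name + " " + new String(utf, StandardCharsets.UTF_8));
                    break;
                }
                case 3: case 4: // Integer, Float: one u4 of raw bits
                    entries.add(name + " bits=" + in.getInt());
                    break;
                case 5: case 6: // Long, Double: 8 bytes, and the entry takes two slots
                    entries.add(name + " bits=" + in.getLong());
                    entries.add("(unused)");
                    i++;
                    break;
                case 15: // MethodHandle: u1 reference kind + u2 index
                    entries.add(name + " kind=" + (in.get() & 0xFF) + " #" + (in.getShort() & 0xFFFF));
                    break;
                case 7: case 8: case 16: case 19: case 20: // a single u2 index
                    entries.add(name + " #" + (in.getShort() & 0xFFFF));
                    break;
                default: // Fieldref, Methodref, InterfaceMethodref, NameAndType, Dynamic, InvokeDynamic
                    entries.add(name + " #" + (in.getShort() & 0xFFFF) + " #" + (in.getShort() & 0xFFFF));
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        List<String> entries = parse(fromHex(POOL_HEX), 20);
        for (int i = 1; i < entries.size(); i++) {
            System.out.println("#" + i + " = " + entries.get(i));
        }
    }
}
```

Entry #4, for example, decodes to Class #18, and #18 is the Utf8 string Test, which is how the class file names the current class.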
2.2 field_info (Field Description)
The definition of field_info is as follows:
field_info {
    u2             access_flags;                 // Access flags
    u2             name_index;                   // Index of the field name in the constant pool
    u2             descriptor_index;             // Index of the field descriptor in the constant pool
    u2             attributes_count;             // Number of attributes
    attribute_info attributes[attributes_count]; // Attribute list
}
- access_flags: the access flags of the field. The following nine flags are defined for fields, including the common public, private, protected, and static:
Flag Name Value Interpretation
ACC_PUBLIC 0x0001 Declared public; may be accessed from outside its package.
ACC_PRIVATE 0x0002 Declared private; usable only within the defining class.
ACC_PROTECTED 0x0004 Declared protected; may be accessed within subclasses.
ACC_STATIC 0x0008 Declared static.
ACC_FINAL 0x0010 Declared final; never directly assigned to after object construction (JLS §17.5).
ACC_VOLATILE 0x0040 Declared volatile; cannot be cached.
ACC_TRANSIENT 0x0080 Declared transient; not written or read by a persistent object manager.
ACC_SYNTHETIC 0x1000 Declared synthetic; not present in the source code.
ACC_ENUM 0x4000 Declared as an element of an enum.
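Because access_flags is a bit mask, several flags can be combined in one u2 value. The following Java sketch decodes such a value using the names and masks from the table above; FieldAccessFlagsDemo and decode are illustrative names, not part of any JDK API.

```java
import java.util.ArrayList;
import java.util.List;

final class FieldAccessFlagsDemo {
    // Flag names and mask values copied from the table above, in the same order.
    static final String[] NAMES = {
        "ACC_PUBLIC", "ACC_PRIVATE", "ACC_PROTECTED", "ACC_STATIC", "ACC_FINAL",
        "ACC_VOLATILE", "ACC_TRANSIENT", "ACC_SYNTHETIC", "ACC_ENUM"
    };
    static final int[] MASKS = {
        0x0001, 0x0002, 0x0004, 0x0008, 0x0010, 0x0040, 0x0080, 0x1000, 0x4000
    };

    /** Lists every flag whose bit is set in the given access_flags value. */
    static List<String> decode(int accessFlags) {
        List<String> set = new ArrayList<>();
        for (int i = 0; i < MASKS.length; i++) {
            if ((accessFlags & MASKS[i]) != 0) {
                set.add(NAMES[i]);
            }
        }
        return set;
    }

    public static void main(String[] args) {
        System.out.println(decode(0x0008)); // field a in the dump: [ACC_STATIC]
        System.out.println(decode(0x0019)); // [ACC_PUBLIC, ACC_STATIC, ACC_FINAL]
    }
}
```

The value 0x0008 is what the dump stores for the field a, which is declared static with no other modifier.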
2.3 attribute_info (Attribute Description)
Attributes can be attached to ClassFile, field_info, method_info, and Code_attribute structures.
Familiar features such as generic signatures and annotations are stored as attribute_info entries.
It is defined as follows:
attribute_info {
    u2 attribute_name_index;   // Index of the attribute name in the constant pool, e.g. Signature for generics, RuntimeVisibleAnnotations for annotations
    u4 attribute_length;       // Attribute length
    u1 info[attribute_length]; // Attribute information
}
The following figure uses an example to illustrate attributes.
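Since every attribute carries its own length, a reader can load or skip an attribute without understanding its payload. As a small sketch, the Java code below reads the class-level attribute list from the last 10 bytes of the Test.class dump above. AttributeInfoDemo, RawAttribute, and readAttributes are names invented for this example, and attribute_length is assumed to fit in a signed int.

```java
import java.nio.ByteBuffer;

final class AttributeInfoDemo {
    // Last 10 bytes of the Test.class dump: attributes_count, then one attribute_info.
    static final byte[] TAIL = {
        0x00, 0x01,             // u2 attributes_count
        0x00, 0x0D,             // u2 attribute_name_index (#13 = "SourceFile")
        0x00, 0x00, 0x00, 0x02, // u4 attribute_length
        0x00, 0x0E              // info: u2 index (#14 = "Test.java")
    };

    /** One attribute: the constant pool index of its name plus its undecoded payload. */
    static final class RawAttribute {
        final int nameIndex;
        final byte[] info;

        RawAttribute(int nameIndex, byte[] info) {
            this.nameIndex = nameIndex;
            this.info = info;
        }
    }

    /** Reads attributes_count and then that many attribute_info structures. */
    static RawAttribute[] readAttributes(ByteBuffer in) {
        int count = in.getShort() & 0xFFFF;
        RawAttribute[] result = new RawAttribute[count];
        for (int i = 0; i < count; i++) {
            int nameIndex = in.getShort() & 0xFFFF;
            int length = in.getInt();       // u4; assumed to be below 2^31 in this sketch
            byte[] info = new byte[length];
            in.get(info);                   // the payload is copied without being interpreted
            result[i] = new RawAttribute(nameIndex, info);
        }
        return result;
    }

    public static void main(String[] args) {
        for (RawAttribute a : readAttributes(ByteBuffer.wrap(TAIL))) {
            System.out.println("name #" + a.nameIndex + ", " + a.info.length + " bytes of info");
        }
    }
}
```

The name index 13 points at the Utf8 constant SourceFile, and the two info bytes hold constant pool index 14, the string Test.java.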
2.4 method_info (Method Description)
method_info is defined the same way as field_info, so the individual fields are not described again:
method_info {
    u2             access_flags;
    u2             name_index;
    u2             descriptor_index;
    u2             attributes_count;
    attribute_info attributes[attributes_count];
}
The biggest difference between a method description (method_info) and a field description (field_info) lies in their attributes.
Every method_info that has a body (that is, every method that is neither abstract nor native) contains an attribute_info named Code, which holds the instructions of the method.
The following figure shows the instruction information in the GET method we defined
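As a rough sketch of what a Code attribute holds, the following Java snippet decodes the Code attribute of the <clinit> method from the Test.class dump above. CodeAttributeDemo and disassemble are made-up names, and the decoder knows only the five opcodes that occur in that dump, so it is not a general disassembler.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class CodeAttributeDemo {
    // info[] of the Code attribute of <clinit> in the dump (attribute_length = 0x1E = 30).
    static final byte[] CLINIT_CODE_INFO = {
        0x00, 0x01,                         // u2 max_stack
        0x00, 0x00,                         // u2 max_locals
        0x00, 0x00, 0x00, 0x06,             // u4 code_length
        0x12, 0x02,                         // ldc #2
        (byte) 0xB3, 0x00, 0x03,            // putstatic #3
        (byte) 0xB1,                        // return
        0x00, 0x00,                         // u2 exception_table_length
        0x00, 0x01,                         // u2 attributes_count
        0x00, 0x0B, 0x00, 0x00, 0x00, 0x06, // LineNumberTable: name index, length
        0x00, 0x01, 0x00, 0x00, 0x00, 0x02  // one entry: start_pc 0 maps to line 2
    };

    /** Returns a header line followed by one line per instruction in code[]. */
    static List<String> disassemble(byte[] info) {
        ByteBuffer in = ByteBuffer.wrap(info);
        int maxStack = in.getShort() & 0xFFFF;
        int maxLocals = in.getShort() & 0xFFFF;
        int codeLength = in.getInt();
        int codeStart = in.position();
        List<String> out = new ArrayList<>();
        out.add("max_stack=" + maxStack + " max_locals=" + maxLocals);
        while (in.position() < codeStart + codeLength) {
            int pc = in.position() - codeStart; // offset of this instruction within code[]
            int opcode = in.get() & 0xFF;
            switch (opcode) {
                case 0x12: // ldc takes a one-byte constant pool index
                    out.add(pc + ": ldc #" + (in.get() & 0xFF));
                    break;
                case 0x2A:
                    out.add(pc + ": aload_0");
                    break;
                case 0xB1:
                    out.add(pc + ": return");
                    break;
                case 0xB3: // putstatic takes a two-byte constant pool index
                    out.add(pc + ": putstatic #" + (in.getShort() & 0xFFFF));
                    break;
                case 0xB7: // invokespecial takes a two-byte constant pool index
                    out.add(pc + ": invokespecial #" + (in.getShort() & 0xFFFF));
                    break;
                default:
                    throw new IllegalArgumentException("opcode not covered by this sketch: " + opcode);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        disassemble(CLINIT_CODE_INFO).forEach(System.out::println);
    }
}
```

In the dump's constant pool, #2 is the String constant hucaihua and #3 is the Fieldref for Test.a, so these three instructions push the string onto the operand stack, store it into the static field a, and return. That is the compiled form of the initializer static String a = "hucaihua".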