The JVM in full

Class File Overview

  • Java is a cross-platform language: write once, run anywhere.
  • The Java Virtual Machine is a cross-language platform. It is tied only to the class file, a specific binary file format, and not to any source language. Code written in any language can run on the JVM as long as its source is compiled into a valid class file.

  • The front-end compiler converts Java code that conforms to the Java syntax specification into bytecode files that conform to the JVM specification.
  1. javac is the front-end compiler provided by default.
  2. It performs lexical analysis, syntax analysis, semantic analysis, and bytecode generation.
  3. It is a full compiler: every build recompiles everything rather than only what changed.
  • Each class file describes exactly one class or interface, but a class or interface does not have to exist as a file on disk. A class file is a binary stream organized in 8-bit bytes.

The order and number of bytes are strictly specified: which byte means what, how long each item is, and in what order the items appear cannot be changed.

  • Class files store data in a layout similar to a C struct, with only two kinds of data: unsigned numbers and tables.
  1. An unsigned number is a basic data type used to describe numbers, index references, quantity values, or UTF-8 encoded strings. u1, u2, u4, and u8 denote unsigned numbers of 1, 2, 4, and 8 bytes.
  2. A table is a composite data type built from multiple unsigned numbers or other tables. Table names conventionally end in _info. Tables describe hierarchical composite structures; because a table has no fixed length, it is usually preceded by a count.
  3. A class file as a whole is essentially one table.
  • Bytecode
  1. Compiling source code produces one or more bytecode files, one for each class.
  2. A bytecode instruction consists of a one-byte opcode that identifies a particular operation, followed by zero or more operands that supply the parameters the operation needs.
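The "one-byte opcode plus zero or more operands" layout can be illustrated with a tiny decoder. This is only a sketch: MiniDisassembler is a hypothetical name, it covers just five opcodes from the JVM instruction set, and the sample bytes are hand-written rather than taken from a real class file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; a real disassembler covers about 200 opcodes.
class MiniDisassembler {

    /** Decodes a code array into one "offset mnemonic [operand]" line per instruction. */
    static List<String> disassemble(byte[] code) {
        List<String> lines = new ArrayList<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc] & 0xFF;              // the opcode is one unsigned byte
            switch (opcode) {
                // Instructions with no operands: 1 byte in total.
                case 0x08: lines.add(pc + " iconst_5"); pc += 1; break;
                case 0x4c: lines.add(pc + " astore_1"); pc += 1; break;
                case 0x3d: lines.add(pc + " istore_2"); pc += 1; break;
                // Instructions followed by a u2 constant pool index: 3 bytes in total.
                case 0xb8: lines.add(pc + " invokestatic #" + u2(code, pc + 1)); pc += 3; break;
                case 0xb6: lines.add(pc + " invokevirtual #" + u2(code, pc + 1)); pc += 3; break;
                default:
                    throw new IllegalArgumentException(
                            "opcode not covered by this sketch: 0x" + Integer.toHexString(opcode));
            }
        }
        return lines;
    }

    /** Reads a big-endian unsigned 2-byte value. */
    static int u2(byte[] bytes, int at) {
        return ((bytes[at] & 0xFF) << 8) | (bytes[at + 1] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] code = {0x08, (byte) 0xb8, 0x00, 0x02, 0x4c};
        disassemble(code).forEach(System.out::println);
    }
}
```

The offsets in the output jump from 1 to 4 because invokestatic carries a two-byte operand, which is why bytecode listings are numbered by byte offset rather than by instruction count.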

Structure

  • Structure as defined in the official specification
ClassFile {
    u4             magic;                                 // magic number
    u2             minor_version;                         // minor version
    u2             major_version;                         // major version
    u2             constant_pool_count;                   // constant pool counter
    cp_info        constant_pool[constant_pool_count-1];  // constant pool
    u2             access_flags;                          // access flags: class/interface plus modifiers
    u2             this_class;                            // class index
    u2             super_class;                           // superclass index
    u2             interfaces_count;                      // interface counter
    u2             interfaces[interfaces_count];          // interface index array
    u2             fields_count;                          // field counter
    field_info     fields[fields_count];                  // field table
    u2             methods_count;                         // method counter
    method_info    methods[methods_count];                // method table
    u2             attributes_count;                      // attribute counter
    attribute_info attributes[attributes_count];          // attribute table
}
  • Magic number: the first four bytes of every class file are CA FE BA BE. The VM uses them to decide whether the file is a valid class file that it can accept.
  • Version number: bytes 5 and 6 hold the minor version, bytes 7 and 8 the major version. 00 00 00 34 means version 52.0, which is JDK 8. A newer JVM can run class files compiled for older versions (backward compatibility).
  • Access flags: a two-byte value that carries class-level or interface-level access information. The final value is the bitwise OR (union) of the flags that apply.
  1. Classes have ACC_SUPER set by default.
  2. A class does not have ACC_INTERFACE. If ACC_INTERFACE is set, ACC_ABSTRACT must be set as well.
  3. If ACC_ANNOTATION is set, ACC_INTERFACE must be set as well.
  4. Example for an annotation type: flags: (0x2601) ACC_PUBLIC, ACC_INTERFACE, ACC_ABSTRACT, ACC_ANNOTATION
  5. public class TestDemo4 corresponds to the bytes 00 21, that is ACC_PUBLIC (0x0001) | ACC_SUPER (0x0020).

  • Indexes: the class index, the superclass index, and the interface index array

Each index points to a symbolic reference in the constant pool.
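The header fields above can be decoded with a few lines of Java. The following is a minimal sketch, not a full parser: ClassHeader is a hypothetical name, the input is a hand-written 8-byte header, and only a subset of the class-level access flags is listed.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
class ClassHeader {

    /** Checks the magic number and returns "major.minor" from the first 8 bytes. */
    static String version(byte[] header) {
        ByteBuffer in = ByteBuffer.wrap(header);   // ByteBuffer is big-endian by default
        int magic = in.getInt();                   // u4 magic
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        int minor = in.getShort() & 0xFFFF;        // u2 minor_version (bytes 5-6)
        int major = in.getShort() & 0xFFFF;        // u2 major_version (bytes 7-8)
        return major + "." + minor;
    }

    /** Splits an access_flags value into flag names (subset of the class-level flags). */
    static List<String> flagNames(int flags) {
        List<String> names = new ArrayList<>();
        if ((flags & 0x0001) != 0) names.add("ACC_PUBLIC");
        if ((flags & 0x0010) != 0) names.add("ACC_FINAL");
        if ((flags & 0x0020) != 0) names.add("ACC_SUPER");
        if ((flags & 0x0200) != 0) names.add("ACC_INTERFACE");
        if ((flags & 0x0400) != 0) names.add("ACC_ABSTRACT");
        if ((flags & 0x2000) != 0) names.add("ACC_ANNOTATION");
        return names;
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x00, 0x00, 0x34};
        System.out.println(version(header));       // 52.0
        System.out.println(flagNames(0x0021));     // [ACC_PUBLIC, ACC_SUPER]
        System.out.println(flagNames(0x2601));
    }
}
```

Because each flag occupies its own bit, testing with a bitwise AND recovers exactly the flags that were ORed together.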

Constant pool

  • The constant pool consists of a constant pool counter followed by the constant pool table. It is the richest part of the class file, acting as its resource repository and holding the information that fields and methods refer to.
  1. Constant pool counter: counting starts at 1, so the actual number of entries is one less than the counter. Index 0 is left empty so that an index field that must express "refers to no constant pool entry" can use the value 0.
  2. Constant pool table: stores the literals and symbolic references generated during compilation. After class loading they enter the runtime constant pool in the method area. The entry types introduced in JDK 7 (marked with a red box in the original figure) support dynamic languages. Each entry starts with a one-byte tag that indicates the type of the entry.
  3. Tags 3, 4, 5, and 6 (Integer, Float, Long, Double) hold the values of variables declared final. byte, short, char, and boolean values are all stored as int (4 bytes, 32 bits).
  4. See the official specification for the exact format of each entry.
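To make the counter and tag rules concrete, here is a small sketch that walks a constant pool. ConstantPoolWalk is a hypothetical name, the sample pool is hand-written, and only tags 1 and 3 through 8 are handled; other tags are rejected rather than guessed at.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
class ConstantPoolWalk {

    /** Expects constant_pool_count followed by the entries; returns one line per entry. */
    static List<String> describe(byte[] data) {
        ByteBuffer in = ByteBuffer.wrap(data);         // big-endian by default
        int count = in.getShort() & 0xFFFF;            // u2 constant_pool_count
        List<String> entries = new ArrayList<>();
        for (int i = 1; i < count; i++) {              // valid indexes are 1 .. count-1
            int tag = in.get() & 0xFF;                 // u1 tag selects the entry layout
            switch (tag) {
                case 1: {                              // CONSTANT_Utf8: u2 length + bytes
                    byte[] utf = new byte[in.getShort() & 0xFFFF];
                    in.get(utf);
                    // Simplification: class files use modified UTF-8, which matches
                    // standard UTF-8 for the ASCII sample used here.
                    entries.add("#" + i + " Utf8 " + new String(utf, StandardCharsets.UTF_8));
                    break;
                }
                case 3: entries.add("#" + i + " Integer " + in.getInt()); break;
                case 4: entries.add("#" + i + " Float " + in.getFloat()); break;
                case 5: entries.add("#" + i + " Long " + in.getLong()); i++; break;     // uses two slots
                case 6: entries.add("#" + i + " Double " + in.getDouble()); i++; break; // uses two slots
                case 7: entries.add("#" + i + " Class -> #" + (in.getShort() & 0xFFFF)); break;
                case 8: entries.add("#" + i + " String -> #" + (in.getShort() & 0xFFFF)); break;
                default:
                    throw new IllegalArgumentException("tag not covered by this sketch: " + tag);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        byte[] pool = {
            0x00, 0x04,                                // count = 4, so entries #1..#3
            0x01, 0x00, 0x04, 'D', 'e', 'm', 'o',      // #1 Utf8 "Demo"
            0x07, 0x00, 0x01,                          // #2 Class, name at #1
            0x03, 0x00, 0x00, 0x00, 0x05               // #3 Integer 5
        };
        describe(pool).forEach(System.out::println);
    }
}
```

The loop starting at 1 and stopping before count reflects the rule that the counter is one larger than the number of entries.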

Relevant concepts

  • The VM links classes dynamically when it loads class files, so a class file does not store the final memory layout of its methods and fields. Symbolic references to those fields and methods cannot be used by the VM directly without conversion. At run time the VM fetches the symbolic reference from the constant pool and, during the resolution phase of class loading, replaces it with a direct reference, that is, translates it into a concrete memory address.
  1. Symbolic reference: a set of symbols that describes the referenced target. It can be any form of literal, as long as it locates the target unambiguously.
  2. Direct reference: a pointer straight to the target, a relative offset, or a handle that locates the target indirectly. Direct references depend on the memory layout of the particular VM implementation.
  • Literals
  1. Text string
  2. A constant value declared as final
  • Symbolic reference
  1. Fully qualified names of classes and interfaces
  2. The name and descriptor of the field
  3. The name and descriptor of the method
  • Fully qualified name

The full class name com.java.Demo becomes the fully qualified name com/java/Demo (dots replaced by slashes). A semicolon is appended to mark where the name ends.

  • Simple name: the name of a method or field
  • Descriptor: describes the data type of a field, or the parameter list (number, types, and order) and return value of a method

// [Ljava.lang.Object;@12a3a380
Object[] objects = new Object[10];
// [[Ljava.lang.String;@29453f44
String[][] strings = new String[10][10];
// [[D@5cad8086
double[][] doubles = new double[10][10];
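Fully qualified names and descriptors can be inspected from ordinary Java code. The sketch below uses Class.getName() and java.lang.invoke.MethodType, both standard APIs; the Descriptors class and its internalName helper are hypothetical names introduced for illustration.

```java
import java.lang.invoke.MethodType;

// Hypothetical helper for illustration only.
class Descriptors {

    /** Converts a dotted class name into the slash-separated form used in class files. */
    static String internalName(String className) {
        return className.replace('.', '/');
    }

    public static void main(String[] args) {
        System.out.println(internalName("com.java.Demo"));   // com/java/Demo

        // Array class names use descriptor syntax: one '[' per dimension.
        System.out.println(Object[].class.getName());        // [Ljava.lang.Object;
        System.out.println(double[][].class.getName());      // [[D

        // Method descriptor for: int m(String, long[])
        MethodType type = MethodType.methodType(int.class, String.class, long[].class);
        System.out.println(type.toMethodDescriptorString()); // (Ljava/lang/String;[J)I
    }
}
```

Note that the method descriptor lists the parameter types inside the parentheses and puts the return type after them, and that neither the method name nor the parameter names appear in it.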

Field table collection

  • Describes the variables declared in an interface or class, including class-level variables and instance variables.
  • The name and data type of a field have no fixed form, so both are described by referring to constants in the constant pool.
  • Each entry points to constant pool indexes and describes complete information about one field: its access modifiers, whether it is a class variable (static) or an instance variable, whether it is a constant (final), and so on.
  • The collection does not list fields inherited from superclasses or implemented interfaces, but it may contain fields that do not exist in the source code. For example, to keep access to its outer class, an inner class holds a compiler-added field that points to the outer instance.

  • In Java source, fields cannot be overloaded: two fields cannot share a name. In bytecode, however, two fields with the same name are legal as long as their descriptors differ.
  • Field counter: the number of entries in the field table.
  • Field table: each entry is itself a table structure that stores the complete information of one field.
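Reflection exposes roughly the same information as the field table, which makes the points above easy to check. This is a sketch under assumptions: FieldTableDemo, Parent, and Child are hypothetical classes, and compiler-generated (synthetic) fields are skipped so the output does not depend on the compiler version.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical classes for illustration only.
class FieldTableDemo {

    static class Parent {
        protected int inherited;
    }

    static class Child extends Parent {
        public static final int MAX = 10;   // class variable and constant
        private String name;                // instance variable
    }

    /** Maps each field declared directly in the class to "modifiers type", sorted by name. */
    static Map<String, String> declaredFields(Class<?> type) {
        Map<String, String> fields = new TreeMap<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue;                   // skip compiler-generated fields
            }
            fields.put(field.getName(),
                    Modifier.toString(field.getModifiers()) + " " + field.getType().getSimpleName());
        }
        return fields;
    }

    public static void main(String[] args) {
        // The inherited field does not appear: it belongs to Parent's field table.
        System.out.println(declaredFields(Child.class));
    }
}
```

getDeclaredFields() is used instead of getFields() because it reports only what the class itself declares, which mirrors the "no inherited fields" rule.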

The structure of each field

  1. Access flags
  2. Field name index: an index into the constant pool where the field name is found
  3. Field descriptor index: the data type of the field
  4. Attribute table collection: an attribute counter plus the attribute tables, holding initial values, annotation information, and so on. For example, the value of a variable declared final is stored as a constant.

  • ConstantValue attribute structure: attribute name index / attribute length (always 2 for this attribute) / constant value index (into the constant pool)

Method table collection

  • Each entry points to constant pool indexes and describes the signature of one method, including its modifiers, return type, and parameter information.
  • The access flags also show whether the method is abstract or native.
  • Methods inherited from a superclass or interface are not listed unless they are overridden, but automatically generated methods may appear, such as class initialization <clinit> and instance initialization <init>.
  • Method overloading: overloads share a simple name and must differ in their signature. The return value is not part of the signature, so a method cannot be overloaded by changing only its return type.

Signature: the set of field symbolic references in the constant pool for each parameter of a method. In class bytecode, however, methods with the same name and signature are legal and can coexist as long as their descriptors differ.
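One place where same-name, same-parameter methods with different descriptors really do coexist is the bridge method that javac generates for a covariant return type. The sketch below shows this through reflection; BridgeDemo, Base, and Derived are hypothetical names, and the behavior described is that of javac-compiled classes.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical classes for illustration only.
class BridgeDemo {

    static class Base {
        Object value() { return "base"; }
    }

    static class Derived extends Base {
        @Override
        String value() { return "derived"; }   // covariant return type
    }

    /** Lists the methods with the given name declared directly in the class, sorted. */
    static List<String> methodsNamed(Class<?> type, String name) {
        List<String> found = new ArrayList<>();
        for (Method method : type.getDeclaredMethods()) {
            if (method.getName().equals(name)) {
                found.add(method.getReturnType().getSimpleName() + " " + method.getName() + "()"
                        + (method.isBridge() ? " [bridge]" : ""));
            }
        }
        Collections.sort(found);
        return found;
    }

    public static void main(String[] args) {
        // Two entries: the declared String value() and a generated Object value() bridge.
        methodsNamed(Derived.class, "value").forEach(System.out::println);
    }
}
```

Source code could never declare both methods by hand, since they differ only in return type, yet the method table of Derived holds both because their descriptors ()Ljava/lang/String; and ()Ljava/lang/Object; differ.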

  • Method table counter: two bytes
  • Method table structure: a complete description of each method

  1. Access flags (modifiers)
  2. Method name index: an index into the constant pool
  3. Method descriptor index
  4. Attribute table collection, described below. In the current example the <init> method contains one attribute, Code, and that Code attribute in turn contains two more attributes: the line number table and the local variable table.

Attribute table collection

  • The attribute collection that follows the method table collection carries auxiliary information about the class file, such as its source file name and annotations. The JVM uses it for verification, execution, and debugging.
  • Attribute tables also appear inside field tables and method tables, where they describe information specific to that field or method.
  • Common format of an attribute table: a u2 attribute name index, a u4 attribute length, and then attribute_length bytes of attribute data

  • SourceFile attribute table structure
SourceFile_attribute {
    u2 attribute_name_index;    // attribute name index
    u4 attribute_length;        // always 2
    u2 sourcefile_index;        // source file name index
}

Attributes in the method table

  • Code attribute

  • Holds the bytecode of the method. Each opcode occupies one byte and may be followed by operand bytes.

  • LineNumberTable: maps bytecode offsets to source line numbers
LineNumberTable_attribute {
    u2 attribute_name_index;        // attribute name index
    u4 attribute_length;            // attribute length
    u2 line_number_table_length;    // number of entries in the line number table
    {   u2 start_pc;                // bytecode offset at which the line starts
        u2 line_number;             // source line number
    } line_number_table[line_number_table_length];
}
  • LocalVariableTable: describes the local variables
LocalVariableTable_attribute {
    u2 attribute_name_index;
    u4 attribute_length;
    u2 local_variable_table_length;
    {   u2 start_pc;                // bytecode offset at which the scope starts
        u2 length;                  // scope length
        u2 name_index;              // variable name index
        u2 descriptor_index;        // variable descriptor index
        u2 index;                   // slot in the local variable array
    } local_variable_table[local_variable_table_length];
}
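The way a debugger or stack trace might turn a bytecode offset into a source line using (start_pc, line_number) pairs can be sketched as follows. LineNumberLookup is a hypothetical name and the table values are invented for illustration; they are not taken from a real class file.

```java
// Hypothetical helper for illustration only.
class LineNumberLookup {

    /**
     * Each row is {start_pc, line_number}. The line for a given pc is the one whose
     * start_pc is the largest value not exceeding pc. Returns -1 if no row matches.
     */
    static int lineFor(int[][] table, int pc) {
        int bestStart = -1;
        int line = -1;
        for (int[] row : table) {
            if (row[0] <= pc && row[0] > bestStart) {
                bestStart = row[0];
                line = row[1];
            }
        }
        return line;
    }

    public static void main(String[] args) {
        // Invented table: offsets 0-9 belong to line 5, 10-14 to line 6, 15 onward to line 7.
        int[][] table = {{0, 5}, {10, 6}, {15, 7}};
        System.out.println(lineFor(table, 12));   // 6
    }
}
```

The scan does not assume the rows are sorted, since the table only promises a start offset per entry.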

Final analysis

  • The source code
package com.java;

public class TestDemo4 {
    public int num = 1;

    public int add() {
        num = num + 2;
        return num;
    }
}

Cases

The basic case

  • The source code
public class TestDemo4 {
    public int num = 1;

    public int add() {
        num = num + 2;
        return num;
    }
}
  • The bytecode decompiled and opened in IDEA. Compared with the source it gains a default constructor, explicit this qualifiers on field accesses, and full class names.
public class TestDemo4 {
    public int num = 1;

    public TestDemo4() {
    }

    public int add() {
        this.num += 2;
        return this.num;
    }
}

The Integer case

  • The source code
Integer x = 5;
int y = 5;
System.out.println(x == y); // true

// Both values are fetched from the IntegerCache array
Integer i1 = 10;
Integer i2 = 10;
System.out.println(i1 == i2); // true

// Both values are new objects
Integer i3 = 128;
Integer i4 = 128;
System.out.println(i3 == i4); // false
  • Under the hood, Integer.valueOf returns the object stored in the IntegerCache array when the value lies between low (-128) and high (127); otherwise it creates a new object.
public static Integer valueOf(int i) {
    if (i >= IntegerCache.low && i <= IntegerCache.high)
        return IntegerCache.cache[i + (-IntegerCache.low)];
    return new Integer(i);
}
  • The bytecode
// Integer x = 5;
 0 iconst_5                  // push the int 5 onto the operand stack
 1 invokestatic #2 <java/lang/Integer.valueOf:(I)Ljava/lang/Integer;>    // call valueOf to box
 4 astore_1                  // store the reference
// int y = 5;
 5 iconst_5
 6 istore_2                  // store the int
 7 getstatic #3 <java/lang/System.out:Ljava/io/PrintStream;>
// x == y
10 aload_1                   // load the Integer reference
11 invokevirtual #4 <java/lang/Integer.intValue:()I>    // call intValue to unbox
14 iload_2
15 if_icmpne 22 (+7)         // jump to offset 22 if the two ints are not equal
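The caching behavior can be checked directly. In this sketch IntegerCacheDemo is a hypothetical name; the identity result for 128 assumes the default cache range, since the upper bound can be raised with a JVM option, so only the results for values between -128 and 127 are guaranteed.

```java
// Hypothetical helper for illustration only.
class IntegerCacheDemo {

    /** Boxes the same value twice and compares the references. */
    static boolean sameObject(int value) {
        Integer a = value;   // autoboxing compiles to Integer.valueOf(value)
        Integer b = value;
        return a == b;       // reference comparison, not value comparison
    }

    /** Boxes the same value twice and compares the values. */
    static boolean sameValue(int value) {
        Integer a = value;
        Integer b = value;
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(sameObject(10));    // true: both come from the cache
        System.out.println(sameObject(128));   // false with the default cache range
        System.out.println(sameValue(128));    // true: equals compares the int values
    }
}
```

For boxed values, equals (or comparing intValue results) is the reliable comparison; == only happens to work inside the cached range.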

The polymorphism case

public class TestDemo3 {
    public static void main(String[] args) {
        Father father = new Son();
        System.out.println(father.x);
    }
}

class Father {
    int x = 10;

    public Father() {
        this.print();
        this.x = 20;
    }

    public void print() {
        System.out.println("father" + x);
    }
}

class Son extends Father {
    int x = 30;

    public Son() {
        this.print();
        this.x = 40;
    }

    @Override
    public void print() {
        System.out.println("Son" + x);
    }
}
  • The output
  1. When a Son is created, the Father constructor runs first and calls print(). Because Son overrides print(), the call dispatches to Son's print method. At that point Son's x has only been allocated and still holds the zero value, so the output is Son0.
  2. Superclass initialization always finishes before subclass initialization. Once the Father constructor returns, Son's x is set to 30 and the Son constructor prints Son30. The last line prints father.x, which is resolved through the static type Father and is therefore 20.
Son0
Son30
20    // member variables are not overridden
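The difference between fields and methods in this case can be isolated in a smaller example. This is a sketch with hypothetical classes (FieldHidingDemo, Base, Derived): field access is resolved from the static type of the expression, while method calls are dispatched on the runtime type of the object.

```java
// Hypothetical classes for illustration only.
class FieldHidingDemo {

    static class Base {
        String name = "base";
        String who() { return "base"; }
    }

    static class Derived extends Base {
        String name = "derived";            // hides Base.name, does not override it
        @Override
        String who() { return "derived"; }  // overrides Base.who()
    }

    static String describe() {
        Base b = new Derived();
        return b.name                       // field chosen by static type Base
                + " " + b.who()             // method chosen by runtime type Derived
                + " " + ((Derived) b).name; // cast changes the static type, so Derived.name
    }

    public static void main(String[] args) {
        System.out.println(describe());     // base derived derived
    }
}
```

The Derived object actually contains both name fields; the static type of the expression decides which one a given access reads.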