Part of this article is excerpted from Understanding the Java Virtual Machine

Introduction

Java virtual machine instructions consist of opcodes and operands. An opcode is a one-byte number that identifies a specific operation, and the operands are the zero or more parameters that the operation requires. Because the Java virtual machine has an operand-stack-oriented architecture rather than a register-oriented one, most instructions carry no operands and consist of the opcode alone.

Because JVM opcodes are limited to one byte (0 to 255), the instruction set can contain at most 256 opcodes. The Class file format also gives up operand alignment in compiled code, so when the virtual machine processes data wider than one byte it has to reconstruct the value from its individual bytes at run time. This costs some performance, but it also removes a lot of padding and gaps, which keeps the compiled code as short as possible.
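
To make the byte-by-byte reconstruction concrete, here is a minimal sketch of how a two-byte operand could be rebuilt from an unaligned code array. The class and method names (OperandReader, readU2, readS2) are hypothetical and do not come from any real virtual machine; the only thing taken as given is the big-endian combination (byte1 << 8) | byte2 that the JVM specification describes for 16-bit operands.

```java
// Hypothetical helper, for illustration only: rebuilds 16-bit operands
// from the individual bytes of a code array (big-endian, no alignment).
class OperandReader {
    // Unsigned 16-bit operand, for example a constant pool index.
    static int readU2(byte[] code, int offset) {
        return ((code[offset] & 0xFF) << 8) | (code[offset + 1] & 0xFF);
    }

    // Signed 16-bit operand, for example a branch offset or the sipush value.
    static int readS2(byte[] code, int offset) {
        return (short) readU2(code, offset);
    }

    public static void main(String[] args) {
        // 0x11 is the sipush opcode; its operand bytes 0x01 0x2C encode 300.
        byte[] code = {0x11, 0x01, 0x2C};
        System.out.println(readS2(code, 1)); // prints 300
    }
}
```

The masking with 0xFF matters because Java bytes are signed; without it, a byte such as 0xFE would be sign-extended before the shift and corrupt the result.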

Bytecode and data types

In the Java virtual machine instruction set, most instructions encode the data type they operate on, and each data type is represented by a special character in the opcode mnemonic: i for int, l for long, s for short, b for byte, c for char, f for float, d for double, and a for reference. However, Java virtual machine opcodes are only one byte long, and if every type-related instruction supported all of the Java virtual machine's runtime data types, the number of instructions would exceed what a single byte can represent.

As a result, the Java virtual machine provides only a limited set of type-specific instructions for each operation; in other words, there is not an instruction for every combination of data type and operation. The following table shows which data types each operation supports. The T in the opcode column is replaced by the character for the corresponding data type. A blank cell indicates that the operation is not supported for that data type.

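| opcode | byte | short | int | long | float | double | char | reference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Tipush | bipush | sipush |  |  |  |  |  |  |
| Tconst |  |  | iconst | lconst | fconst | dconst |  | aconst |
| Tload |  |  | iload | lload | fload | dload |  | aload |
| Tstore |  |  | istore | lstore | fstore | dstore |  | astore |
| Tinc |  |  | iinc |  |  |  |  |  |
| Taload | baload | saload | iaload | laload | faload | daload | caload | aaload |
| Tastore | bastore | sastore | iastore | lastore | fastore | dastore | castore | aastore |
| Tadd |  |  | iadd | ladd | fadd | dadd |  |  |
| Tsub |  |  | isub | lsub | fsub | dsub |  |  |
| Tmul |  |  | imul | lmul | fmul | dmul |  |  |
| Tdiv |  |  | idiv | ldiv | fdiv | ddiv |  |  |
| Trem |  |  | irem | lrem | frem | drem |  |  |
| Tneg |  |  | ineg | lneg | fneg | dneg |  |  |
| Tshl |  |  | ishl | lshl |  |  |  |  |
| Tshr |  |  | ishr | lshr |  |  |  |  |
| Tushr |  |  | iushr | lushr |  |  |  |  |
| Tand |  |  | iand | land |  |  |  |  |
| Tor |  |  | ior | lor |  |  |  |  |
| Txor |  |  | ixor | lxor |  |  |  |  |
| i2T | i2b | i2s |  | i2l | i2f | i2d |  |  |
| l2T |  |  | l2i |  | l2f | l2d |  |  |
| f2T |  |  | f2i | f2l |  | f2d |  |  |
| d2T |  |  | d2i | d2l | d2f |  |  |  |
| Tcmp |  |  |  | lcmp |  |  |  |  |
| Tcmpl |  |  |  |  | fcmpl | dcmpl |  |  |
| Tcmpg |  |  |  |  | fcmpg | dcmpg |  |  |
| if_TcmpOP |  |  | if_icmpOP |  |  |  |  | if_acmpOP |
| Treturn |  |  | ireturn | lreturn | freturn | dreturn |  | areturn |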

As you can see, most instructions do not support byte, char, short, or boolean. The compiler sign-extends byte and short values to int at compile time or run time, and zero-extends boolean and char values to int; the values are then processed with the bytecode instructions for int. Therefore, most operations on boolean, byte, short, and char data are actually performed as int operations.
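
The effect of this promotion is visible from Java source code. The following sketch uses hypothetical names (IntPromotionDemo and its methods); the instruction mentioned in the comments is what javac typically emits and can be checked with javap -c.

```java
// Illustrative sketch: values of sub-int types are computed as int.
class IntPromotionDemo {
    // a + b is evaluated as an int addition (typically iadd),
    // so the expression has type int, not byte.
    static int addBytes(byte a, byte b) {
        return a + b;
    }

    // A byte is sign-extended when it becomes an int.
    static int widenByte(byte b) {
        return b;
    }

    // A char is zero-extended when it becomes an int.
    static int widenChar(char c) {
        return c;
    }

    public static void main(String[] args) {
        System.out.println(addBytes((byte) 100, (byte) 100)); // prints 200
        System.out.println(widenByte((byte) 0xFF));           // prints -1
        System.out.println(widenChar('\uFFFF'));              // prints 65535
    }
}
```

The sum 200 does not fit in a byte, yet no overflow occurs, because the addition is carried out on int values.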

Load and store instructions

Load and store instructions are used to transfer data back and forth between local variables in a stack frame and the operand stack. These instructions include:

  • Load a local variable onto the operand stack: iload, iload_<n>, lload, lload_<n>, fload, fload_<n>, dload, dload_<n>, aload, aload_<n>
  • Store a value from the operand stack into the local variable table: istore, istore_<n>, lstore, lstore_<n>, fstore, fstore_<n>, dstore, dstore_<n>, astore, astore_<n>
  • Load a constant onto the operand stack: bipush, sipush, ldc, ldc_w, ldc2_w, aconst_null, iconst_m1, iconst_<i>, lconst_<l>, fconst_<f>, dconst_<d>
  • Extend the index range used to access the local variable table: wide

Some of the instruction mnemonics listed above end in angle brackets, such as iload_<n>. Each of these stands for a group of instructions; iload_<n>, for example, represents iload_0, iload_1, iload_2, and iload_3. These are special forms of a generic instruction with the operand built in: iload_0 is equivalent to iload 0, iload_1 is equivalent to iload 1, and so on. They omit the explicit operand, so no operand has to be fetched, yet they are semantically identical to the generic instruction.
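
As a rough illustration of how these instructions appear in practice, the sketch below annotates two small methods with the bytecode javac typically produces for them. The class name LoadStoreDemo is hypothetical, and the listed instructions are an expectation rather than a guarantee; running javap -c on the compiled class shows the actual output of a given compiler.

```java
class LoadStoreDemo {
    // In a static method the parameters occupy local variable slots 0 and 1,
    // and c occupies slot 2. A typical translation is:
    //   iload_0    // push a
    //   iload_1    // push b
    //   iadd       // pop both, push a + b
    //   istore_2   // pop the sum into c
    //   iload_2    // push c
    //   ireturn
    static int add(int a, int b) {
        int c = a + b;
        return c;
    }

    // Constants are usually pushed with the shortest suitable instruction.
    static int constants() {
        int a = 5;       // typically iconst_5
        int b = 100;     // typically bipush 100
        int c = 1000;    // typically sipush 1000
        int d = 100000;  // typically ldc (value taken from the constant pool)
        return a + b + c + d;
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3));    // prints 5
        System.out.println(constants());  // prints 101105
    }
}
```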

Arithmetic instructions

Arithmetic instructions perform a specific operation on two values on the operand stack and store the result back on top of the operand stack. The arithmetic instructions are:

  • Addition instructions: iadd, ladd, fadd, dadd
  • Subtraction instructions: isub, lsub, fsub, dsub
  • Multiplication instructions: imul, lmul, fmul, dmul
  • Division instructions: idiv, ldiv, fdiv, ddiv
  • Remainder instructions: irem, lrem, frem, drem
  • Negation instructions: ineg, lneg, fneg, dneg
  • Shift instructions: ishl, ishr, iushr, lshl, lshr, lushr
  • Bitwise OR instructions: ior, lor
  • Bitwise AND instructions: iand, land
  • Bitwise XOR instructions: ixor, lxor
  • Local variable increment instruction: iinc
  • Comparison instructions: dcmpg, dcmpl, fcmpg, fcmpl, lcmp
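
The list above maps fairly directly onto Java operators. The sketch below, with the hypothetical class name ArithmeticDemo, notes in comments which instruction each operator typically compiles to; the mapping is what javac usually produces, not something the code itself verifies.

```java
class ArithmeticDemo {
    static int remainder(int a, int b) { return a % b; }         // irem
    static int arithmeticShift(int v, int n) { return v >> n; }  // ishr keeps the sign bit
    static int logicalShift(int v, int n) { return v >>> n; }    // iushr fills with zeros
    static int xor(int a, int b) { return a ^ b; }               // ixor
    static boolean less(long a, long b) { return a < b; }        // lcmp, then a conditional branch

    static int sumTo(int n) {
        int sum = 0;
        for (int i = 1; i <= n; i++) {  // i++ on a local variable: iinc
            sum += i;                   // iadd
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(remainder(-7, 3));       // prints -1
        System.out.println(arithmeticShift(-8, 1)); // prints -4
        System.out.println(logicalShift(-8, 1));    // prints 2147483644
        System.out.println(sumTo(4));               // prints 10
    }
}
```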

Type conversion instructions

Conversion instructions convert a value from one numeric type to another. They are typically used to implement explicit type conversions in user code, or to deal with the problem mentioned at the beginning: the type-specific instructions in the bytecode instruction set do not correspond one-to-one with the data types.

Java supports safe widening conversions from a smaller type to a larger type, such as int to long, float, or double, without an explicit conversion. Narrowing conversions, by contrast, must be written explicitly and use conversion instructions. These instructions include i2b, i2c, i2s, l2i, f2i, f2l, d2i, d2l, and d2f. A narrowing conversion may result in a loss of numerical precision.
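
A few concrete narrowing conversions show what this loss looks like. The sketch below uses hypothetical names (NarrowingDemo and its methods); the instruction in each comment is the one javac typically emits for the cast.

```java
class NarrowingDemo {
    static byte toByte(int v) { return (byte) v; }          // i2b keeps the low 8 bits
    static char toChar(int v) { return (char) v; }          // i2c keeps the low 16 bits, unsigned
    static int longToInt(long v) { return (int) v; }        // l2i keeps the low 32 bits
    static int doubleToInt(double v) { return (int) v; }    // d2i rounds toward zero

    public static void main(String[] args) {
        System.out.println(toByte(300));                 // prints 44
        System.out.println((int) toChar(-1));            // prints 65535
        System.out.println(longToInt(0x100000001L));     // prints 1
        System.out.println(doubleToInt(3.99));           // prints 3
        System.out.println(doubleToInt(Double.NaN));     // prints 0
        System.out.println(doubleToInt(1e20));           // prints 2147483647
    }
}
```

Note that none of these conversions throws an exception; out-of-range and NaN inputs are mapped silently to a representable value.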

Object creation and access instructions

Although class instances and arrays are both objects, the Java virtual machine uses different bytecode instructions to create and manipulate them. After an object has been created, object access instructions can be used to read or write the fields of a class instance or the elements of an array:

  • Instruction for creating a class instance: new
  • Instructions for creating arrays: newarray, anewarray, multianewarray
  • Instructions for accessing class fields (static fields, or class variables) and instance fields (non-static fields, or instance variables): getfield, putfield, getstatic, putstatic
  • Instructions for loading an array element onto the operand stack: baload, caload, saload, iaload, laload, faload, daload, aaload
  • Instructions for storing a value from the operand stack into an array element: bastore, castore, sastore, iastore, lastore, fastore, dastore, aastore
  • Instruction for getting the length of an array: arraylength
  • Instructions for checking the type of a class instance: instanceof, checkcast
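
The following sketch touches most of the instructions in this list from ordinary Java code. The class ObjectAccessDemo and its members are hypothetical, and the instruction names in the comments describe what javac typically generates.

```java
class ObjectAccessDemo {
    static int instances = 0;   // static field: getstatic / putstatic
    int value;                  // instance field: getfield / putfield

    ObjectAccessDemo(int value) {
        this.value = value;     // putfield
        instances++;            // getstatic, then putstatic
    }

    static int sumFirstAndLast(int[] data) {
        // arraylength reads the length; iaload loads the int elements
        return data[0] + data[data.length - 1];
    }

    static String describe(Object o) {
        if (o instanceof String) {      // instanceof
            String s = (String) o;      // checkcast
            return "String of length " + s.length();
        }
        return "not a String";
    }

    public static void main(String[] args) {
        ObjectAccessDemo d = new ObjectAccessDemo(7);  // new
        int[] data = new int[3];                       // newarray
        String[] names = new String[2];                // anewarray
        int[][] grid = new int[2][3];                  // multianewarray
        data[0] = 4;                                   // iastore
        data[2] = d.value;                             // getfield, then iastore
        names[0] = "jvm";                              // aastore
        System.out.println(sumFirstAndLast(data));     // prints 11
        System.out.println(describe(names[0]));        // prints String of length 3
        System.out.println(grid[1].length);            // prints 3
    }
}
```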

Operand stack management instructions

As with the stack in a normal data structure, the Java virtual machine provides instructions for manipulating the operand stack directly, including:

  • Remove one or two elements from the top of the operand stack: pop, pop2
  • Duplicate one or two values on the top of the stack and push the duplicated value, or two copies of it, back onto the stack: dup, dup2, dup_x1, dup2_x1, dup_x2, dup2_x2
  • Swap the top two values of the stack: swap
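
The behaviour of a few of these instructions can be modelled with an ordinary stack. The sketch below is a toy model only: the class OperandStackModel is hypothetical, it holds nothing but int values, and it ignores the distinction the real virtual machine makes between one-slot and two-slot values (which is what pop2, dup2, and the _x2 forms deal with).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of an operand stack that holds only int values.
class OperandStackModel {
    private final Deque<Integer> stack = new ArrayDeque<>();

    void push(int v) { stack.push(v); }

    // pop: discard and return the top value.
    int pop() { return stack.pop(); }

    // dup: ..., v  ->  ..., v, v
    void dup() { stack.push(stack.peek()); }

    // dup_x1: ..., v2, v1  ->  ..., v1, v2, v1
    void dupX1() {
        int v1 = stack.pop();
        int v2 = stack.pop();
        stack.push(v1);
        stack.push(v2);
        stack.push(v1);
    }

    // swap: ..., v2, v1  ->  ..., v1, v2
    void swap() {
        int v1 = stack.pop();
        int v2 = stack.pop();
        stack.push(v1);
        stack.push(v2);
    }

    int size() { return stack.size(); }

    public static void main(String[] args) {
        OperandStackModel s = new OperandStackModel();
        s.push(1);
        s.push(2);
        s.swap();
        System.out.println(s.pop()); // prints 1
        System.out.println(s.pop()); // prints 2
    }
}
```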

Control transfer instructions

Control transfer instructions let the Java virtual machine conditionally or unconditionally continue execution from an instruction at a specified location, instead of from the instruction that follows the control transfer instruction. Conceptually, a control transfer instruction can be thought of as conditionally or unconditionally modifying the value of the PC register:

  • Conditional branches: ifeq, iflt, ifle, ifne, ifgt, ifge, ifnull, ifnonnull, if_icmpeq, if_icmpne, if_icmplt, if_icmpgt, if_icmple, if_icmpge, if_acmpeq, and if_acmpne
  • Compound conditional branches: tableswitch and lookupswitch
  • Unconditional branches: goto, goto_w, jsr, jsr_w, ret
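
In Java source these instructions come from if statements, loops, and switch statements. The sketch below uses the hypothetical class ControlTransferDemo; the choice between tableswitch and lookupswitch noted in the comments reflects what javac usually does for dense versus sparse case labels and is an assumption worth confirming with javap -c.

```java
class ControlTransferDemo {
    // Dense, consecutive case labels: typically compiled to tableswitch.
    static String dense(int k) {
        switch (k) {
            case 1: return "one";
            case 2: return "two";
            case 3: return "three";
            default: return "other";
        }
    }

    // Widely spaced case labels: typically compiled to lookupswitch.
    static String sparse(int k) {
        switch (k) {
            case 1: return "one";
            case 1000: return "thousand";
            case 1000000: return "million";
            default: return "other";
        }
    }

    // The comparison becomes an if_icmp<cond> branch, and the loop
    // jumps back with goto.
    static int countBelow(int[] values, int limit) {
        int count = 0;
        for (int i = 0; i < values.length; i++) {
            if (values[i] < limit) {
                count++;
            }
        }
        return count;
    }

    // A null test becomes ifnull or ifnonnull.
    static boolean isMissing(Object o) {
        return o == null;
    }

    public static void main(String[] args) {
        System.out.println(dense(2));                           // prints two
        System.out.println(sparse(1000));                       // prints thousand
        System.out.println(countBelow(new int[]{1, 5, 9}, 6));  // prints 2
    }
}
```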

Method invocation and return instructions

Method invocation instructions are independent of data type, whereas method return instructions are distinguished by the type of the return value: ireturn, lreturn, freturn, dreturn, and areturn, plus return for methods declared void.

  • invokevirtual: invokes an instance method of an object, dispatching on the object's actual type
  • invokeinterface: invokes an interface method; at run time it searches the object that implements the interface for the appropriate method to invoke
  • invokespecial: invokes instance methods that require special handling, including instance initialization methods, private methods, and superclass methods
  • invokestatic: invokes class (static) methods
  • invokedynamic: dynamically resolves, at run time, the method referenced by the call site specifier and then executes it
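
The following sketch exercises each kind of invocation from Java source. All type names (Shape, Polygon, Square, InvokeDemo) are hypothetical, and the instruction in each comment is the one javac typically emits for that call; the lambda line assumes a compiler that implements lambdas with invokedynamic, as javac has done since Java 8.

```java
import java.util.function.IntSupplier;

interface Shape {
    int sides();
}

class Polygon implements Shape {
    private final int n;

    Polygon(int n) { this.n = n; }           // implicit super(): invokespecial

    public int sides() { return n; }

    @Override
    public String toString() { return "Polygon(" + sides() + ")"; }
}

class Square extends Polygon {
    Square() { super(4); }                   // invokespecial Polygon.<init>

    @Override
    public String toString() {
        return "Square/" + super.toString(); // superclass method: invokespecial
    }
}

class InvokeDemo {
    static int twice(int x) { return 2 * x; }

    public static void main(String[] args) {
        Shape s = new Square();              // constructor call: invokespecial
        int a = s.sides();                   // invokeinterface
        Polygon p = new Square();
        String t = p.toString();             // invokevirtual, dispatches to Square.toString
        int b = twice(a);                    // invokestatic
        IntSupplier f = () -> b + 1;         // invokedynamic creates the lambda object
        System.out.println(t);               // prints Square/Polygon(4)
        System.out.println(f.getAsInt());    // prints 9
    }
}
```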

Exception handling instructions

Explicitly throwing an exception in a Java program (the throw statement) is implemented by the athrow instruction. In addition to exceptions thrown explicitly, the Java Virtual Machine specification requires many runtime exceptions to be thrown automatically when other Java virtual machine instructions detect an exceptional condition. Catching exceptions (the catch clause) is implemented not with bytecode instructions but with exception tables.
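
The sketch below contrasts an explicit throw, which compiles to the athrow instruction, with an exception raised automatically by another instruction, and shows a catch clause that the compiler records in the method's exception table. The class name ExceptionDemo and its methods are hypothetical.

```java
class ExceptionDemo {
    // The throw statement compiles to athrow.
    static int checkedDivide(int a, int b) {
        if (b == 0) {
            throw new IllegalArgumentException("divisor is zero");
        }
        return a / b;
    }

    // Here nothing is thrown explicitly: the idiv instruction itself raises
    // ArithmeticException when the divisor is zero. The catch clause becomes
    // an exception table entry rather than an instruction.
    static int safeDivide(int a, int b) {
        try {
            return a / b;
        } catch (ArithmeticException e) {
            return 0;
        }
    }

    public static void main(String[] args) {
        System.out.println(safeDivide(10, 2)); // prints 5
        System.out.println(safeDivide(10, 0)); // prints 0
    }
}
```

The exception table of safeDivide can be inspected with javap -c, which lists the protected instruction range, the handler location, and the caught type.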

Synchronization instructions

The Java virtual machine supports both method-level synchronization and synchronization of a sequence of instructions within a method. Both are implemented with a monitor (more commonly called a lock).

Method-level synchronization is implicit: it needs no bytecode instructions and is carried out as part of method invocation and return. The virtual machine can tell whether a method is declared synchronized from the ACC_SYNCHRONIZED access flag in the method table structure in the method constant pool. When a method is invoked, the invoking instruction checks whether the method's ACC_SYNCHRONIZED flag is set. If it is, the executing thread must first acquire the monitor, then executes the method, and finally releases the monitor when the method completes. While the method executes, the executing thread holds the monitor and no other thread can acquire the same monitor. If an exception is thrown during the execution of a synchronized method and is not handled within the method, the monitor held by the method is released automatically as the exception propagates out of the method.

Synchronizing a sequence of instructions is usually expressed with a synchronized block in the Java language. The Java virtual machine provides the monitorenter and monitorexit instructions to support the semantics of synchronized. The sequence of instructions to be synchronized is enclosed between these two instructions to achieve the synchronization effect.
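
The two forms of synchronization look like this in Java source. The class SyncDemo and its members are hypothetical; Thread.holdsLock is used only to make it observable that the current thread owns the monitor inside the synchronized region and not outside it.

```java
class SyncDemo {
    private final Object lock = new Object();
    private int counter = 0;

    // Method-level synchronization: the method is marked ACC_SYNCHRONIZED and
    // its body contains no monitorenter or monitorexit. The monitor is the one
    // associated with this.
    synchronized boolean holdsThisInMethod() {
        return Thread.holdsLock(this);
    }

    // Block-level synchronization: the body is enclosed by monitorenter and
    // monitorexit on the lock object.
    boolean holdsLockInBlock() {
        synchronized (lock) {
            counter++;
            return Thread.holdsLock(lock);
        }
    }

    // Outside any synchronized region the monitor is not held.
    boolean holdsLockOutside() {
        return Thread.holdsLock(lock);
    }

    int counter() {
        return counter;
    }

    public static void main(String[] args) {
        SyncDemo demo = new SyncDemo();
        System.out.println(demo.holdsThisInMethod()); // prints true
        System.out.println(demo.holdsLockInBlock());  // prints true
        System.out.println(demo.holdsLockOutside());  // prints false
    }
}
```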