JVM bytecode instructions

Part of this article is excerpted from Understanding the Java Virtual Machine

Introduction

A Java virtual machine instruction consists of an opcode and zero or more operands. The opcode is a one-byte number that identifies a specific operation, and the operands are the parameters that operation requires. Because the Java virtual machine has an operand-stack-oriented architecture rather than a register-oriented one, most instructions carry no operands and consist of the opcode alone.

Because JVM opcodes are limited to one byte (0 to 255), the instruction set cannot contain more than 256 opcodes. The Class file format also gives up operand alignment in compiled code, so when the virtual machine handles an operand longer than one byte, it has to rebuild the value from individual bytes at run time. This costs some performance, but it removes the need for padding and separator bytes and keeps the compiled code as short as possible.
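To make the byte-by-byte reconstruction concrete, here is a minimal sketch of how a two-byte operand could be rebuilt from a code array. The class and method names (OperandDecoder, readU2, readS2) are made up for this illustration and are not part of any JVM API; the only facts relied on are that multi-byte operands are stored high byte first and that 0x11 is the opcode of sipush.

```java
class OperandDecoder {
    // Rebuilds an unsigned 16-bit operand from two consecutive bytes (high byte first).
    static int readU2(byte[] code, int offset) {
        return ((code[offset] & 0xFF) << 8) | (code[offset + 1] & 0xFF);
    }

    // Rebuilds a signed 16-bit operand, the form used by sipush.
    static int readS2(byte[] code, int offset) {
        return (short) readU2(code, offset);
    }

    public static void main(String[] args) {
        // 0x11 is the sipush opcode; 0x03 0xE8 is its operand (1000).
        byte[] code = {0x11, 0x03, (byte) 0xE8};
        System.out.println(readS2(code, 1)); // prints 1000
    }
}
```

The masking with 0xFF matters because Java bytes are signed; without it, a byte such as 0xE8 would be sign-extended before the shift and corrupt the result.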

Bytecode and data types

In the Java virtual machine instruction set, most instructions encode the data type they operate on, and each data type is represented by a letter in the mnemonic (for example, i for int, l for long, f for float, d for double, and a for reference). However, opcodes are only one byte long, and if every type-related instruction supported all of the Java virtual machine's runtime data types, the number of instructions would exceed what a single byte can represent.

As a result, the Java virtual machine provides type-specific instructions for only a limited set of operation and type combinations; not every operation has an instruction for every data type. The following table shows which data types each operation supports. The T in the instruction template is replaced by the letter for the corresponding data type, and an empty cell means the operation has no instruction for that data type.

opcode	byte	short	int	long	float	double	char	reference
Tipush	bipush	sipush
Tconst			iconst	lconst	fconst	dconst		aconst
Tload			iload	lload	fload	dload		aload
Tstore			istore	lstore	fstore	dstore		astore
Tinc			iinc
Taload	baload	saload	iaload	laload	faload	daload	caload	aaload
Tastore	bastore	sastore	iastore	lastore	fastore	dastore	castore	aastore
Tadd			iadd	ladd	fadd	dadd
Tsub			isub	lsub	fsub	dsub
Tmul			imul	lmul	fmul	dmul
Tdiv			idiv	ldiv	fdiv	ddiv
Trem			irem	lrem	frem	drem
Tneg			ineg	lneg	fneg	dneg
Tshl			ishl	lshl
Tshr			ishr	lshr
Tushr			iushr	lushr
Tand			iand	land
Tor			ior	lor
Txor			ixor	lxor
i2T	i2b	i2s		i2l	i2f	i2d
l2T			l2i		l2f	l2d
f2T			f2i	f2l		f2d
d2T			d2i	d2l	d2f
Tcmp				lcmp
Tcmpl					fcmpl	dcmpl
Tcmpg					fcmpg	dcmpg
if_TcmpOP			if_icmpOP					if_acmpOP
Treturn			ireturn	lreturn	freturn	dreturn		areturn

As you can see, most instructions do not support byte, char, short, or boolean. The compiler sign-extends byte and short values to int at compile time or run time, and zero-extends boolean and char values to int; the values are then processed with the int-typed bytecode instructions. Therefore, most operations on boolean, byte, short, and char data are actually performed as int operations.
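The following sketch illustrates this promotion to int. The class name NarrowTypes is invented for the example, and the bytecode listed in the comment is what javac typically emits (it can be checked with javap -c), not something the virtual machine guarantees.

```java
class NarrowTypes {
    // There is no "badd" instruction. javac typically compiles the body to:
    //   iload_0, iload_1, iadd, i2b, ireturn
    static byte add(byte a, byte b) {
        return (byte) (a + b);
    }

    // byte is sign-extended when widened to int.
    static int widenByte(byte b) {
        return b;
    }

    // char is zero-extended when widened to int.
    static int widenChar(char c) {
        return c;
    }

    public static void main(String[] args) {
        System.out.println(add((byte) 100, (byte) 100)); // -56: the int sum 200 is narrowed by i2b
        System.out.println(widenByte((byte) 0xFF));      // -1
        System.out.println(widenChar((char) 0xFFFF));    // 65535
    }
}
```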

Load and store instructions

Load and store instructions are used to transfer data back and forth between local variables in a stack frame and the operand stack. These instructions include:

Load a local variable onto the operand stack: iload, iload_<n>, lload, lload_<n>, fload, fload_<n>, dload, dload_<n>, aload, aload_<n>
Store a value from the operand stack into the local variable table: istore, istore_<n>, lstore, lstore_<n>, fstore, fstore_<n>, dstore, dstore_<n>, astore, astore_<n>
Load a constant onto the operand stack: bipush, sipush, ldc, ldc_w, ldc2_w, aconst_null, iconst_m1, iconst_<i>, lconst_<l>, fconst_<f>, dconst_<d>
Extend the index used to access the local variable table: wide

Some of the instruction mnemonics listed above end in angle brackets, such as iload_<n>. Each of these stands for a family of instructions: iload_<n> represents iload_0, iload_1, iload_2, and iload_3. They are special forms of the generic instruction with the operand built into the opcode: iload_0 is equivalent to iload 0, iload_1 is equivalent to iload 1, and so on. They carry no explicit operand, so no operand fetch is needed, and they are semantically identical to the generic instruction.
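As a small illustration of loads, stores, and constant pushes, consider the sketch below. LoadStoreDemo is a hypothetical class, and the instruction sequences in the comments are what javac typically produces for static methods like these; running javap -c on the compiled class shows the actual output for a given compiler.

```java
class LoadStoreDemo {
    // Typical javac output:
    //   iload_0     push parameter a (local variable 0)
    //   iload_1     push parameter b (local variable 1)
    //   iadd
    //   istore_2    pop the sum into local variable 2 (c)
    //   iload_2     push c again
    //   ireturn
    static int sum(int a, int b) {
        int c = a + b;
        return c;
    }

    // Constants of different sizes are pushed with different instructions.
    static int constants() {
        int small = 5;        // iconst_5
        int oneByte = 100;    // bipush 100
        int twoBytes = 1000;  // sipush 1000
        int large = 100000;   // ldc (the value is read from the constant pool)
        return small + oneByte + twoBytes + large;
    }

    public static void main(String[] args) {
        System.out.println(sum(2, 3));    // 5
        System.out.println(constants());  // 101105
    }
}
```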

Arithmetic instructions

Arithmetic instructions perform a specific operation on two values on the operand stack and store the result back on top of the operand stack. The arithmetic instructions are:

Addition instructions: iadd, ladd, fadd, dadd
Subtraction instructions: isub, lsub, fsub, dsub
Multiplication instructions: imul, lmul, fmul, dmul
Division instructions: idiv, ldiv, fdiv, ddiv
Remainder instructions: irem, lrem, frem, drem
Negation instructions: ineg, lneg, fneg, dneg
Shift instructions: ishl, ishr, iushr, lshl, lshr, lushr
Bitwise OR instructions: ior, lor
Bitwise AND instructions: iand, land
Bitwise XOR instructions: ixor, lxor
Local variable increment instruction: iinc
Comparison instructions: dcmpg, dcmpl, fcmpg, fcmpl, lcmp
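A few of these instructions behave in ways that are easy to get wrong, so the sketch below exercises them from Java source. ArithmeticDemo is an invented class name; the mapping from each operator to an instruction in the comments reflects typical javac output for int and double operands.

```java
class ArithmeticDemo {
    static int div(int a, int b) { return a / b; }          // idiv: truncates toward zero
    static int rem(int a, int b) { return a % b; }          // irem: result takes the sign of the dividend
    static int shr(int a, int n) { return a >> n; }         // ishr: arithmetic shift, keeps the sign bit
    static int ushr(int a, int n) { return a >>> n; }       // iushr: logical shift, fills with zeros
    static int shl(int a, int n) { return a << n; }         // ishl: only the low 5 bits of n are used
    static double fdiv(double a, double b) { return a / b; } // ddiv: never throws

    public static void main(String[] args) {
        System.out.println(div(-7, 2));    // -3
        System.out.println(rem(-7, 2));    // -1
        System.out.println(shr(-8, 1));    // -4
        System.out.println(ushr(-8, 28));  // 15
        System.out.println(shl(1, 33));    // 2, because 33 & 31 == 1
        System.out.println(fdiv(1.0, 0.0)); // Infinity
        try {
            div(1, 0);                     // idiv throws on a zero divisor
        } catch (ArithmeticException e) {
            System.out.println("ArithmeticException");
        }
    }
}
```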

Type conversion instructions

Type conversion instructions convert a value from one numeric type to another. They are typically used to implement explicit casts in user code, or to work around the fact, mentioned earlier, that the type-specific instructions in the bytecode instruction set do not cover every data type.

Widening conversions (safe conversions from a smaller type to a larger type, such as int to long, float, or double) need no explicit cast in source code; the compiler inserts instructions such as i2l, i2f, or i2d where they are needed. Narrowing conversions, by contrast, must be requested explicitly and are performed by the instructions i2b, i2c, i2s, l2i, f2i, f2l, d2i, d2l, and d2f. A narrowing conversion may change the sign or magnitude of the value and may lose precision.
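The effect of narrowing is easiest to see with concrete values. In this sketch the class name ConversionDemo is made up, and the instruction named in each comment is what javac typically emits for the cast; parameters are used instead of literals so that the compiler cannot fold the cast away at compile time.

```java
class ConversionDemo {
    static byte toByte(int x) { return (byte) x; }      // i2b: keeps the low 8 bits, then sign-extends
    static short toShort(int x) { return (short) x; }   // i2s: keeps the low 16 bits, then sign-extends
    static char toChar(int x) { return (char) x; }      // i2c: keeps the low 16 bits, zero-extends
    static int toInt(long x) { return (int) x; }        // l2i: keeps the low 32 bits
    static int toInt(double x) { return (int) x; }      // d2i: rounds toward zero, saturates, NaN becomes 0
    static long widen(int x) { return x; }              // i2l: inserted by the compiler, no cast needed

    public static void main(String[] args) {
        System.out.println(toByte(300));            // 44
        System.out.println(toShort(40000));         // -25536
        System.out.println(toChar(65));             // A
        System.out.println(toInt(0x1_0000_0001L));  // 1
        System.out.println(toInt(3.99));            // 3
        System.out.println(toInt(-3.99));           // -3
        System.out.println(toInt(1e20));            // 2147483647
        System.out.println(toInt(Double.NaN));      // 0
        System.out.println(widen(-1));              // -1
    }
}
```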

Object creation and access instructions

Although class instances and arrays are both objects, the Java virtual machine uses different bytecode instructions to create and manipulate them. Once an object or array has been created, its fields or elements can be read and written with the access instructions:

Create a class instance: new
Create an array: newarray, anewarray, multianewarray
Access class fields (static fields, or class variables) and instance fields (non-static fields, or instance variables): getfield, putfield, getstatic, putstatic
Load an array element onto the operand stack: baload, caload, saload, iaload, laload, faload, daload, aaload
Store a value from the operand stack into an array element: bastore, castore, sastore, iastore, lastore, fastore, dastore, aastore
Get the length of an array: arraylength
Check the type of a class instance: instanceof, checkcast
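The list above maps fairly directly onto everyday Java statements. The sketch below uses an invented class, ObjectDemo, and annotates each line with the instruction javac typically emits for it; the exact sequence may vary by compiler version.

```java
class ObjectDemo {
    static int counter;   // static field: accessed with getstatic / putstatic
    int value;            // instance field: accessed with getfield / putfield

    static int run() {
        ObjectDemo obj = new ObjectDemo();   // new, dup, invokespecial <init>
        obj.value = 7;                       // putfield
        counter = obj.value + 1;             // getfield, then putstatic

        int[] ints = new int[3];             // newarray int
        String[] names = new String[2];      // anewarray
        int[][] grid = new int[2][3];        // multianewarray
        ints[0] = counter;                   // getstatic, iastore
        names[1] = "x";                      // aastore

        Object o = names;
        boolean isArray = o instanceof String[];  // instanceof
        String[] back = (String[]) o;             // checkcast

        // iaload reads ints[0]; arraylength reads each length.
        return ints[0] + ints.length + grid[1].length + back.length + (isArray ? 1 : 0);
    }

    public static void main(String[] args) {
        System.out.println(run()); // 8 + 3 + 3 + 2 + 1 = 17
    }
}
```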

Operand stack management instructions

As with the stack in a normal data structure, the Java virtual machine provides instructions for manipulating the operand stack directly, including:

Remove one or two elements from the top of the operand stack: pop, pop2
Duplicate the top one or two values of the stack and push the copy (or copies) back onto the stack, optionally inserting them below other values: dup, dup2, dup_x1, dup2_x1, dup_x2, dup2_x2
Swap the top two values of the stack: swap
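The stack effects of these instructions can be modeled with an ordinary deque. The following is only a toy simulation under simplifying assumptions: OperandStackSim is a hypothetical class, every value is an int occupying one slot, and the two-slot handling of long and double (which is what pop2 and the dup2 family exist for) is ignored.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class OperandStackSim {
    private final Deque<Integer> stack = new ArrayDeque<>();

    void push(int v) { stack.push(v); }

    // pop: ..., v  ->  ...
    int pop() { return stack.pop(); }

    // dup: ..., v  ->  ..., v, v
    void dup() { stack.push(stack.peek()); }

    // swap: ..., v2, v1  ->  ..., v1, v2
    void swap() {
        int v1 = stack.pop();
        int v2 = stack.pop();
        stack.push(v1);
        stack.push(v2);
    }

    // dup_x1: ..., v2, v1  ->  ..., v1, v2, v1
    void dupX1() {
        int v1 = stack.pop();
        int v2 = stack.pop();
        stack.push(v1);
        stack.push(v2);
        stack.push(v1);
    }

    // Returns the stack contents from bottom to top, for inspection.
    List<Integer> bottomToTop() {
        List<Integer> out = new ArrayList<>();
        stack.descendingIterator().forEachRemaining(out::add);
        return out;
    }

    public static void main(String[] args) {
        OperandStackSim s = new OperandStackSim();
        s.push(1);
        s.push(2);
        s.dupX1();
        System.out.println(s.bottomToTop()); // [2, 1, 2]
    }
}
```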

Control transfer instructions

Control transfer instructions make the Java virtual machine conditionally or unconditionally continue execution from a specified instruction, rather than from the instruction that follows the control transfer instruction. Conceptually, a control transfer instruction conditionally or unconditionally modifies the value of the PC register:

Conditional branches: ifeq, iflt, ifle, ifne, ifgt, ifge, ifnull, ifnonnull, if_icmpeq, if_icmpne, if_icmplt, if_icmpgt, if_icmple, if_icmpge, if_acmpeq, and if_acmpne
Compound conditional branches: tableswitch and lookupswitch
Unconditional branches: goto, goto_w, jsr, jsr_w, ret
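To see where these branches come from, the sketch below shows a loop and two switch statements. ControlDemo is an invented class, and the comments describe what javac typically generates: the choice between tableswitch and lookupswitch is a compiler heuristic based on how densely the case values are packed, so javap -c is the way to confirm it for a particular build.

```java
class ControlDemo {
    // Typical shape: the loop test compiles to if_icmpge (exit when i >= n),
    // i++ compiles to iinc, and the jump back to the test is a goto.
    static int sumTo(int n) {
        int total = 0;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    // Consecutive case values: usually compiled to tableswitch (a jump table indexed by k).
    static int dense(int k) {
        switch (k) {
            case 1: return 10;
            case 2: return 20;
            case 3: return 30;
            default: return 0;
        }
    }

    // Widely spaced case values: usually compiled to lookupswitch (sorted key/offset pairs).
    static int sparse(int k) {
        switch (k) {
            case 1: return 1;
            case 1000: return 2;
            case 1000000: return 3;
            default: return 0;
        }
    }

    public static void main(String[] args) {
        System.out.println(sumTo(5));      // 10
        System.out.println(dense(2));      // 20
        System.out.println(sparse(1000));  // 2
    }
}
```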

Method invocation and return instructions

Method invocation instructions are independent of data type, whereas method return instructions are distinguished by the type of the return value (ireturn, lreturn, freturn, dreturn, areturn, and return for void methods).

invokevirtual: invokes an instance method of an object, dispatching on the object's actual (runtime) type
invokeinterface: invokes an interface method; at run time it searches for an object that implements the interface method and finds the appropriate method to call
invokespecial: invokes instance methods that require special handling, including instance initialization methods, private methods, and superclass methods
invokestatic: invokes a class (static) method
invokedynamic: dynamically resolves, at run time, the method referenced by the call site specifier and executes it
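Each kind of invocation can be triggered from plain Java source. The sketch below uses invented names (InvokeDemo, Shape, Square, DoubledSquare, twice), and the instruction noted beside each call is what javac typically emits on Java 8 or later; the lambda line assumes the compiler implements lambdas with invokedynamic, which is the usual javac behavior.

```java
import java.util.function.IntUnaryOperator;

class InvokeDemo {
    interface Shape { int area(); }

    static class Square implements Shape {
        private final int side;
        Square(int side) { this.side = side; }   // implicit super(): invokespecial Object.<init>
        public int area() { return side * side; }
    }

    static class DoubledSquare extends Square {
        DoubledSquare(int side) { super(side); } // invokespecial Square.<init>
        @Override
        public int area() { return 2 * super.area(); } // superclass call: invokespecial
    }

    static int twice(int x) { return 2 * x; }

    static int run() {
        Square sq = new DoubledSquare(3);        // new, dup, invokespecial <init>
        int viaClass = sq.area();                // invokevirtual: runs DoubledSquare.area, 18
        Shape shape = sq;
        int viaInterface = shape.area();         // invokeinterface: also 18
        int viaStatic = twice(4);                // invokestatic: 8
        IntUnaryOperator inc = x -> x + 1;       // invokedynamic creates the function object
        int viaLambda = inc.applyAsInt(1);       // invokeinterface on IntUnaryOperator: 2
        return viaClass + viaInterface + viaStatic + viaLambda;
    }

    public static void main(String[] args) {
        System.out.println(run()); // 18 + 18 + 8 + 2 = 46
    }
}
```

Note that although sq is declared as Square, invokevirtual selects DoubledSquare.area because dispatch is based on the runtime type of the receiver.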

Exception handling instructions

In a Java program, explicitly throwing an exception (the throw statement) is implemented by the athrow instruction. In addition to explicit throws, the Java Virtual Machine Specification specifies that many runtime exceptions are thrown automatically when other instructions detect an exceptional condition. Catching an exception (the catch clause) is not implemented with bytecode instructions; it is implemented with exception tables.
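A short sketch can show both paths: an exception raised by an ordinary instruction and one raised explicitly. ExceptionDemo and its methods are hypothetical names, and the comments describe typical javac output; the exception table itself is visible under "Exception table:" in the output of javap -c.

```java
class ExceptionDemo {
    // The division compiles to idiv, which throws ArithmeticException by itself
    // when the divisor is zero. The catch clause produces no instruction of its
    // own; it becomes an exception table entry that points at the handler code.
    static int safeDivide(int a, int b) {
        try {
            return a / b;
        } catch (ArithmeticException e) {
            return -1;
        }
    }

    // An explicit throw statement: the exception object is created with
    // new / dup / invokespecial <init> and then thrown with athrow.
    static void requirePositive(int x) {
        if (x <= 0) {
            throw new IllegalArgumentException("x must be positive");
        }
    }

    public static void main(String[] args) {
        System.out.println(safeDivide(6, 3)); // 2
        System.out.println(safeDivide(1, 0)); // -1
        try {
            requirePositive(0);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // x must be positive
        }
    }
}
```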

Synchronization instructions

The Java virtual machine supports both method-level synchronization and synchronization of a sequence of instructions within a method. Both are implemented using a monitor (more commonly called a lock).

Method-level synchronization is implicit: it needs no bytecode instructions and is carried out as part of method invocation and return. The virtual machine can tell whether a method is declared synchronized from the ACC_SYNCHRONIZED access flag in the method table structure. When a method is invoked, the invoking instruction checks whether the method's ACC_SYNCHRONIZED flag is set. If it is, the executing thread must first acquire the monitor before it executes the method, and it releases the monitor when the method completes. While the method executes, the executing thread holds the monitor, and no other thread can acquire the same monitor. If an exception is thrown during the execution of a synchronized method and is not handled within the method, the monitor held by the method is released automatically as the exception propagates out of the synchronized method.

Synchronizing a sequence of instructions is usually expressed with a synchronized block in the Java language. The Java virtual machine provides the monitorenter and monitorexit instructions to support the semantics of the synchronized keyword. The sequence of instructions to be synchronized is wrapped between these two instructions to achieve the synchronization effect.
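The two forms of synchronization look like this in source code. SyncDemo is an invented class for illustration; the comments describe typical javac output, where the synchronized method carries only the ACC_SYNCHRONIZED flag and the synchronized block is bracketed by monitorenter and monitorexit (with an extra monitorexit on the exception path). Both forms lock the same monitor here because a synchronized instance method locks this.

```java
class SyncDemo {
    private int count;

    // Method-level synchronization: no monitor instructions in the body,
    // only the ACC_SYNCHRONIZED flag on the method.
    synchronized void incrementMethod() {
        count++;
    }

    // Block-level synchronization: monitorenter before the block,
    // monitorexit after it.
    void incrementBlock() {
        synchronized (this) {
            count++;
        }
    }

    synchronized int get() {
        return count;
    }

    // Runs two threads that each increment the shared counter perThread times.
    static int runTwoThreads(int perThread) {
        SyncDemo demo = new SyncDemo();
        Thread a = new Thread(() -> {
            for (int i = 0; i < perThread; i++) demo.incrementMethod();
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < perThread; i++) demo.incrementBlock();
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return demo.get();
    }

    public static void main(String[] args) {
        System.out.println(runTwoThreads(100_000)); // 200000
    }
}
```

Without the synchronized keyword on either method, the two threads could interleave their read-modify-write steps and the final count could fall short of the expected total.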