The structure of the Class file

When you open a compiled Java Class file in a hex editor, you can see the raw bytes that make up the file.

What is a class file?

A Class file is a binary stream made up of 8-bit bytes, in which the data items are packed tightly in a strict order with no delimiters between them. The Java Virtual Machine specification describes Class files with a C-like pseudo-structure that has only two kinds of data, unsigned numbers and tables, both of which are covered below.

  • Unsigned numbers: Unsigned numbers are the basic data types. u1, u2, u4, and u8 represent unsigned numbers of 1, 2, 4, and 8 bytes respectively. They can be used to describe numbers, index references, quantity values, or UTF-8 encoded string values.
  • Table: A table is a compound data type made up of multiple unsigned numbers or other tables as data items. By convention, table names end in "_info".
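As a minimal sketch of how these unsigned numbers can be read in Java: Class file data is big-endian, which matches what java.io.DataInputStream reads. The class name UnsignedReader and its helper methods are hypothetical names chosen for this illustration, not part of any library.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper for illustration: reads u1, u2 and u4 values from a stream.
class UnsignedReader {
    // u1: one unsigned byte, range 0..255
    static int readU1(DataInputStream in) throws IOException {
        return in.readUnsignedByte();
    }

    // u2: two bytes, big-endian, range 0..65535
    static int readU2(DataInputStream in) throws IOException {
        return in.readUnsignedShort();
    }

    // u4: four bytes, big-endian; widened to long so large values stay positive
    static long readU4(DataInputStream in) throws IOException {
        return in.readInt() & 0xFFFFFFFFL;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {
            (byte) 0xFF,                                         // a u1
            0x01, 0x02,                                          // a u2
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE   // a u4
        };
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        System.out.println(readU1(in));                    // prints 255
        System.out.println(readU2(in));                    // prints 258
        System.out.println(Long.toHexString(readU4(in)));  // prints cafebabe
    }
}
```

A table would then simply be a sequence of such reads performed in the order its fields are defined.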

The Romance of Java Programmers

0xCAFEBABE

The first four bytes of every Class file are the magic number 0xCAFEBABE. The next four bytes are the version number of the Class file: the fifth and sixth bytes are the minor version number, and the seventh and eighth bytes are the major version number.
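The header layout can be sketched in a few lines of Java. This is only an illustration under stated assumptions: ClassHeader is a hypothetical name, and the sample bytes are hand-written rather than taken from a real compiled file.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical holder for the first eight bytes of a Class file.
class ClassHeader {
    final int minor;
    final int major;

    ClassHeader(int minor, int major) {
        this.minor = minor;
        this.major = major;
    }

    static ClassHeader parse(byte[] classBytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes));
        int magic = in.readInt();              // bytes 1-4
        if (magic != 0xCAFEBABE) {
            throw new IOException("Not a Class file: bad magic number");
        }
        int minor = in.readUnsignedShort();    // bytes 5-6
        int major = in.readUnsignedShort();    // bytes 7-8
        return new ClassHeader(minor, major);
    }

    public static void main(String[] args) throws IOException {
        // Hand-written sample header: magic number, minor 0, major 0x34 (52, i.e. Java 8).
        byte[] header = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
            0x00, 0x00,
            0x00, 0x34
        };
        ClassHeader h = parse(header);
        System.out.println("major=" + h.major + " minor=" + h.minor); // prints major=52 minor=0
    }
}
```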

A short story about CAFEBABE

Constant pool

The constant pool entry begins at byte 9. Within the Class file, the constant pool is:

  • The data item most often referenced by other items;
  • The data item that occupies the most space in the Class file;
  • The first table-type data item to appear in the Class file.

CPC (constant_pool_count) specifies the size of the constant pool, and it counts from 1 rather than 0. For example, if CPC = 22, the pool holds 21 constants with index values 1 to 21. Index 0 is deliberately left empty: when data that points into the constant pool needs to express "does not reference any constant pool item", it sets its index value to 0.
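The off-by-one convention is easy to get wrong, so here is a small sketch of it. ConstantPoolCount and its methods are hypothetical names, and the sample bytes are hand-written for the CPC = 22 case described above.

```java
// Hypothetical helper showing how constant_pool_count maps to usable indices.
class ConstantPoolCount {
    // constant_pool_count is the u2 stored in bytes 9 and 10 (array offsets 8 and 9).
    static int readCount(byte[] classBytes) {
        return ((classBytes[8] & 0xFF) << 8) | (classBytes[9] & 0xFF);
    }

    // Valid indices run from 1 to count - 1; index 0 means "no constant pool item".
    static int lastValidIndex(int constantPoolCount) {
        return constantPoolCount - 1;
    }

    public static void main(String[] args) {
        // Hand-written sample: magic number, version 52.0, then constant_pool_count = 0x0016 (22).
        byte[] bytes = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
            0x00, 0x00, 0x00, 0x34,
            0x00, 0x16
        };
        int count = readCount(bytes);
        System.out.println(count);                 // prints 22
        System.out.println(lastValidIndex(count)); // prints 21
    }
}
```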

The constant pool records all tokens (class names, member variable names, and so on, which are covered next) and symbolic references (method references, member variable references, and so on) that appear in the code. It holds the following two kinds of constants:

  • Literal: Close to the notion of a constant at the Java language level, including
    • Text strings
    • Constant values declared as final
  • Symbolic reference: A set of symbols describing the referenced target, including
    • Fully qualified names of classes and interfaces
    • The names and descriptors of fields
    • The names and descriptors of methods

Each constant in the constant pool is stored as a table. There are 14 constant types as of Java SE 7 (later releases add a few more), and the trouble is that each type has its own structure. We'll cover only two in detail here: CONSTANT_Class_info and CONSTANT_Utf8_info.

CONSTANT_Class_info has the following storage structure:

... [ tag = 7 ] [ name_index ] ...
... [ 1 byte  ] [ 2 bytes    ] ...

tag = 7 indicates that the entry is a CONSTANT_Class_info. name_index is an index into the constant pool that points to a constant of type CONSTANT_Utf8_info. CONSTANT_Utf8_info constants are typically used to hold fully qualified class names, method names, and field names. Its storage structure is as follows:

... [ tag = 1 ] [ len: length of the string in bytes ] [ string value of the constant ] ...
... [ 1 byte  ] [ 2 bytes                            ] [ len bytes                    ] ...
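To tie the two structures together, the following sketch parses a tiny hand-built constant pool containing one CONSTANT_Class_info and the CONSTANT_Utf8_info it points to. MiniConstantPool, its methods, and the sample data are all assumptions made for this illustration, and every other tag is deliberately rejected. It relies on the fact that DataInputStream.readUTF reads a two-byte length followed by modified UTF-8 bytes, which is the same layout as a CONSTANT_Utf8_info after its tag.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical sketch: handles only CONSTANT_Utf8_info (tag 1) and CONSTANT_Class_info (tag 7).
class MiniConstantPool {
    static final int TAG_UTF8 = 1;
    static final int TAG_CLASS = 7;

    // Each slot holds a String (Utf8 entry) or an Integer (name_index of a Class entry).
    static Object[] parse(byte[] poolBytes, int constantPoolCount) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(poolBytes));
        Object[] entries = new Object[constantPoolCount]; // slot 0 stays null on purpose
        for (int i = 1; i < constantPoolCount; i++) {
            int tag = in.readUnsignedByte();               // u1 tag
            if (tag == TAG_UTF8) {
                entries[i] = in.readUTF();                 // u2 length + string bytes
            } else if (tag == TAG_CLASS) {
                entries[i] = in.readUnsignedShort();       // u2 name_index
            } else {
                throw new IOException("Tag " + tag + " is not handled in this sketch");
            }
        }
        return entries;
    }

    // Follows name_index from a Class entry to the Utf8 entry holding its name.
    static String className(Object[] entries, int classIndex) {
        int nameIndex = (Integer) entries[classIndex];
        return (String) entries[nameIndex];
    }

    // Builds sample pool bytes: #1 = Class (name_index 2), #2 = Utf8 "java/lang/Object".
    static byte[] sample() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeByte(TAG_CLASS);
        out.writeShort(2);
        out.writeByte(TAG_UTF8);
        out.writeUTF("java/lang/Object");
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Object[] pool = parse(sample(), 3);     // constant_pool_count = 3 means indices 1 and 2
        System.out.println(className(pool, 1)); // prints java/lang/Object
    }
}
```

Storing entries in an Object array keeps the sketch short; a fuller parser would more likely use one small class per constant type.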