This is the 13th day of my participation in the August More Text Challenge
An understanding of Java is essential for Android developers. Kotlin is what is being promoted today, but the underlying mechanics are the same. Knowing how Java works underneath helps you get development work done and solve problems.
1. The cornerstone of language independence
The Java virtual machine is not bound to any language, Java included. It is tied only to a specific binary file format, the “Class file”, which contains the Java virtual machine instruction set (bytecode), a symbol table, and some other auxiliary information. Any sufficiently capable language can be compiled into a valid Class file that the Java virtual machine accepts, so the virtual machine does not care which source language a Class came from.
2. Structure of the Class file
Each Class file corresponds to the definition of exactly one class or interface. The reverse does not hold: a class or interface does not have to be defined in a file (for example, it can also be generated directly by a class loader).
- A Class file is a binary stream organized in 8-bit bytes. The data items are laid out in a strict order and packed tightly, with no separators between them.
- A data item that needs more than 8 bits is split into several bytes and stored in big-endian order, with the high-order byte first.
- The Class file format stores data in a pseudo-structure similar to a C struct, with only two data types: unsigned numbers and tables.
- An unsigned number is a basic data type. u1, u2, u4, and u8 denote unsigned numbers of 1, 2, 4, and 8 bytes respectively. Unsigned numbers describe numeric values, index references, counts, and strings encoded as UTF-8.
- A table is a compound data type made up of unsigned numbers or other tables as data items. By convention, table names end with “_info”.
- The entire Class file is essentially one table.
- Because the Class file has no separators, the order, count, and byte length of every data item are strictly defined.
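To make the byte order concrete, here is a minimal Java sketch that reassembles unsigned values from raw bytes, high-order byte first. The names `UnsignedReader` and `readUnsigned` are made up for illustration and are not part of any JDK API.

```java
final class UnsignedReader {
    // Reassembles a big-endian unsigned value of `size` bytes (1, 2, or 4 here).
    // A long is used so that a u4 such as 0xCAFEBABE does not turn negative.
    static long readUnsigned(byte[] data, int offset, int size) {
        long value = 0;
        for (int i = 0; i < size; i++) {
            // Mask with 0xFF because Java bytes are signed.
            value = (value << 8) | (data[offset + i] & 0xFF);
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] bytes = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x34};
        System.out.println(Long.toHexString(readUnsigned(bytes, 0, 4))); // cafebabe
        System.out.println(readUnsigned(bytes, 4, 2));                   // 52
    }
}
```

In practice `java.io.DataInputStream` and `java.nio.ByteBuffer` already read big-endian values, so a hand-written loop like this is only useful for seeing what happens underneath.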
2.1 Magic number and Class file version
- The first four bytes of every Class file are the magic number, whose only purpose is to identify the file as a Class file that the virtual machine can accept. The value is fixed at 0xCAFEBABE ("cafe babe", a nod to coffee).
- The four bytes after the magic number hold the version: bytes 5 and 6 are the minor version, and bytes 7 and 8 are the major version. A VM can run Class files up to the version it supports and must refuse a Class file whose version is higher.
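The first eight bytes can be checked with a few lines of Java. This is a sketch rather than a full parser: `ClassHeader` and `parse` are hypothetical names, and the sample bytes are hand-written (major version 52 is what a Java 8 compiler emits).

```java
import java.nio.ByteBuffer;

final class ClassHeader {
    final int minorVersion;
    final int majorVersion;

    private ClassHeader(int minorVersion, int majorVersion) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
    }

    // Validates the magic number and reads the two version fields.
    static ClassHeader parse(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // ByteBuffer is big-endian by default
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("Not a Class file: bad magic number");
        }
        int minor = buf.getShort() & 0xFFFF; // bytes 5-6
        int major = buf.getShort() & 0xFFFF; // bytes 7-8
        return new ClassHeader(minor, major);
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0x00, 0x00, 0x00, 0x34};
        ClassHeader h = parse(header);
        System.out.println(h.majorVersion + "." + h.minorVersion); // 52.0
    }
}
```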
2.2 Constant pool
The constant pool is the resource repository of the Class file. It is the data item most referenced by other items in the Class file structure, and it is also one of the largest items in the file.
- The number of constants in the constant pool is not fixed, so the pool begins with a u2 value giving its capacity. This count starts from 1 rather than 0: a value of 22 means 21 constants, with indexes 1 to 21.
The constant pool mainly stores two types of constants:
- Literals: close to the notion of constants at the Java language level, such as text strings and the values of constants declared final.
- Symbolic references: a concept from compiler theory, covering three kinds of constants: fully qualified names of classes and interfaces, field names and descriptors, and method names and descriptors.
Unlike C and C++, Java has no separate link step when javac compiles the code. Linking happens dynamically when the virtual machine loads the Class file. In other words, a Class file does not store the final memory layout of its methods and fields, so the symbolic references to them cannot be used by the virtual machine directly. At run time the virtual machine fetches the symbolic references from the constant pool and resolves them into concrete memory addresses, either when the class is created or later during execution.
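The sketch below walks a constant pool and follows one symbolic reference by hand: a CONSTANT_Class entry points to a CONSTANT_Utf8 entry that holds the class name. The tag numbers and entry sizes are the standard ones; everything else (`ConstantPoolSketch`, `sample`, the hand-built three-entry pool) is hypothetical. It also simplifies by decoding names as plain UTF-8, whereas Class files use a modified UTF-8 that differs for a few characters.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class ConstantPoolSketch {
    final String[] utf8;        // text of CONSTANT_Utf8 entries, null elsewhere
    final int[] classNameIndex; // name_index of CONSTANT_Class entries, 0 elsewhere

    // Expects the buffer positioned at constant_pool_count (right after the version).
    ConstantPoolSketch(ByteBuffer buf) {
        int count = buf.getShort() & 0xFFFF; // valid indexes are 1 .. count - 1
        utf8 = new String[count];
        classNameIndex = new int[count];
        for (int i = 1; i < count; i++) {
            int tag = buf.get() & 0xFF;
            switch (tag) {
                case 1: { // Utf8: u2 length, then that many bytes
                    byte[] raw = new byte[buf.getShort() & 0xFFFF];
                    buf.get(raw);
                    utf8[i] = new String(raw, StandardCharsets.UTF_8);
                    break;
                }
                case 7: // Class: u2 name_index
                    classNameIndex[i] = buf.getShort() & 0xFFFF;
                    break;
                case 5: case 6: // Long, Double: 8 bytes, and they occupy two pool slots
                    skip(buf, 8);
                    i++;
                    break;
                case 15: // MethodHandle
                    skip(buf, 3);
                    break;
                case 8: case 16: case 19: case 20: // String, MethodType, Module, Package
                    skip(buf, 2);
                    break;
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    skip(buf, 4); // Integer, Float, the *ref kinds, NameAndType, (Invoke)Dynamic
                    break;
                default:
                    throw new IllegalArgumentException("Unknown constant pool tag " + tag);
            }
        }
    }

    private static void skip(ByteBuffer buf, int n) {
        buf.position(buf.position() + n);
    }

    // Follows the symbolic reference: Class entry -> Utf8 entry.
    String className(int classIndex) {
        return utf8[classNameIndex[classIndex]];
    }

    // Hand-built pool: #1 Utf8 "Demo", #2 Class -> #1, #3 Utf8 "java/lang/Object".
    static ByteBuffer sample() {
        ByteBuffer buf = ByteBuffer.allocate(64);
        buf.putShort((short) 4); // count is 4 for three entries, because indexing starts at 1
        buf.put((byte) 1).putShort((short) 4).put("Demo".getBytes(StandardCharsets.US_ASCII));
        buf.put((byte) 7).putShort((short) 1);
        buf.put((byte) 1).putShort((short) 16)
           .put("java/lang/Object".getBytes(StandardCharsets.US_ASCII));
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        ConstantPoolSketch pool = new ConstantPoolSketch(sample());
        System.out.println(pool.className(2)); // Demo
        System.out.println(pool.utf8[3]);      // java/lang/Object
    }
}
```

Note that the names are still only text at this point. Turning them into loaded classes and memory addresses is the resolution work the virtual machine does at run time.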
2.3 Access Flags
The two bytes after the constant pool are the access flags, which carry class-level or interface-level access information: whether this Class is a class or an interface, whether it is public, whether it is abstract, and, for a class, whether it is declared final.
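Each flag is a single bit in the u2 value, so decoding is a matter of masking. The sketch below uses the standard class-level flag values; the class name `ClassAccessFlags` and its `decode` method are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

final class ClassAccessFlags {
    private static final int[] MASKS = {
        0x0001, 0x0010, 0x0020, 0x0200, 0x0400, 0x1000, 0x2000, 0x4000, 0x8000
    };
    private static final String[] NAMES = {
        "ACC_PUBLIC", "ACC_FINAL", "ACC_SUPER", "ACC_INTERFACE", "ACC_ABSTRACT",
        "ACC_SYNTHETIC", "ACC_ANNOTATION", "ACC_ENUM", "ACC_MODULE"
    };

    // Returns the names of all flags whose bit is set, in ascending mask order.
    static List<String> decode(int accessFlags) {
        List<String> set = new ArrayList<>();
        for (int i = 0; i < MASKS.length; i++) {
            if ((accessFlags & MASKS[i]) != 0) {
                set.add(NAMES[i]);
            }
        }
        return set;
    }

    public static void main(String[] args) {
        System.out.println(decode(0x0021)); // [ACC_PUBLIC, ACC_SUPER]
        System.out.println(decode(0x0601)); // [ACC_PUBLIC, ACC_INTERFACE, ACC_ABSTRACT]
    }
}
```

The value 0x0021 is typical of an ordinary public class, and 0x0601 of a public interface.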
2.4 Class index, superclass index, and interface index set
The class index and the superclass index are each a u2 value, while the interface index set is a group of u2 values. The Class file uses these three items to determine the inheritance relationships of the class. The class index gives the fully qualified name of the class, and the superclass index gives the fully qualified name of its parent. Because Java does not allow multiple inheritance, there is only one superclass index. Every class except java.lang.Object has a parent, so Object is the only class whose superclass index is 0. The interface index set describes the interfaces the class implements, in left-to-right order as they appear after the implements keyword (or after extends, if the Class file itself describes an interface).
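The same three pieces of information are visible at the Java level through reflection, which is a convenient way to see what these indexes encode without parsing bytes. In this sketch the types `Base`, `Derived`, `Named`, and `Tagged` are made up for the demonstration.

```java
import java.util.ArrayList;
import java.util.List;

final class InheritanceDemo {
    interface Named {}
    interface Tagged {}
    static class Base {}
    static class Derived extends Base implements Named, Tagged {}

    // Summarizes the class, its single parent, and its interfaces.
    static String describe(Class<?> type) {
        Class<?> parent = type.getSuperclass(); // null for java.lang.Object
        List<String> interfaces = new ArrayList<>();
        for (Class<?> i : type.getInterfaces()) { // same order as the implements clause
            interfaces.add(i.getSimpleName());
        }
        return type.getSimpleName()
                + " extends " + (parent == null ? "(none)" : parent.getSimpleName())
                + " implements " + interfaces;
    }

    public static void main(String[] args) {
        System.out.println(describe(Derived.class)); // Derived extends Base implements [Named, Tagged]
        System.out.println(describe(Object.class));  // Object extends (none) implements []
    }
}
```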
2.5 Field table collection
The field table (field_info) describes the variables declared in an interface or class.
- Fields include class-level variables as well as instance-level variables, but not local variables declared inside methods.
- The description covers the field's access scope (public, private, ...), whether it is an instance variable or a class variable (the static modifier), whether it is final, and so on.
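A quick way to see which variables end up in the field table is to ask reflection, which reports the fields and modifiers recorded for a class. This is only an illustration: `FieldSketch` and its nested `Sample` class are hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

final class FieldSketch {
    static class Sample {
        public static final int MAX = 10; // class variable: has a field entry
        private String name;              // instance variable: has a field entry

        void work() {
            int local = 1;                // local variable: no field entry
        }
    }

    // Returns "modifiers type name" for a declared field, or a note if it does not exist.
    static String describe(Class<?> owner, String fieldName) {
        try {
            Field f = owner.getDeclaredField(fieldName);
            return Modifier.toString(f.getModifiers())
                    + " " + f.getType().getSimpleName()
                    + " " + f.getName();
        } catch (NoSuchFieldException e) {
            return "no such field: " + fieldName;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(Sample.class, "MAX"));   // public static final int MAX
        System.out.println(describe(Sample.class, "name"));  // private String name
        System.out.println(describe(Sample.class, "local")); // no such field: local
    }
}
```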
2.6 Method table collection
- The Class file format describes methods in almost exactly the same way as it describes fields.
- The Java code inside a method is compiled into bytecode instructions and stored in an attribute named “Code” in the method table's attribute table collection.
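As a rough picture of what a “Code” attribute holds, the sketch below decodes a hand-written one. The bytes correspond to what javac typically emits for `static int add(int a, int b) { return a + b; }`. The opcode values are the standard ones, but `CodeAttributeSketch` is a hypothetical helper that only knows these four operand-less instructions, and the name index #9 is an invented constant pool slot.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class CodeAttributeSketch {
    static String mnemonic(int opcode) {
        switch (opcode) {
            case 0x1a: return "iload_0";
            case 0x1b: return "iload_1";
            case 0x60: return "iadd";
            case 0xac: return "ireturn";
            default:   return String.format("0x%02x", opcode); // not handled by this sketch
        }
    }

    // Reads the attribute header and lists the instructions in the code array.
    // Assumes every instruction is one byte long, which holds for the sample only.
    static List<String> disassemble(ByteBuffer buf) {
        buf.getShort();                // attribute_name_index
        buf.getInt();                  // attribute_length
        buf.getShort();                // max_stack
        buf.getShort();                // max_locals
        int codeLength = buf.getInt(); // code_length
        List<String> instructions = new ArrayList<>();
        for (int i = 0; i < codeLength; i++) {
            instructions.add(mnemonic(buf.get() & 0xFF));
        }
        return instructions;
    }

    static ByteBuffer sample() {
        byte[] bytes = {
            0x00, 0x09,                    // attribute_name_index (hypothetical slot holding "Code")
            0x00, 0x00, 0x00, 0x10,        // attribute_length = 16
            0x00, 0x02,                    // max_stack
            0x00, 0x02,                    // max_locals
            0x00, 0x00, 0x00, 0x04,        // code_length
            0x1a, 0x1b, 0x60, (byte) 0xac, // the bytecode itself
            0x00, 0x00,                    // exception_table_length
            0x00, 0x00                     // attributes_count
        };
        return ByteBuffer.wrap(bytes);
    }

    public static void main(String[] args) {
        System.out.println(disassemble(sample())); // [iload_0, iload_1, iadd, ireturn]
    }
}
```

For real Class files, `javap -c` prints the same kind of listing and handles every instruction.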
2.7 Attribute table collection
Class files, field tables, and method tables can each carry their own attribute table collection to describe information specific to certain scenarios.
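Every attribute starts with the same header: a u2 index of its name in the constant pool, followed by a u4 length. That shared header is what lets a reader step over attributes it does not understand. A minimal sketch, assuming hand-built bytes and hypothetical name indexes #5 and #7:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

final class AttributeTableSketch {
    // Lists each attribute's name index and body length, skipping the body itself.
    static List<String> list(ByteBuffer buf) {
        int count = buf.getShort() & 0xFFFF;         // attributes_count
        List<String> found = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            int nameIndex = buf.getShort() & 0xFFFF; // attribute_name_index
            int length = buf.getInt();               // attribute_length (assumed to fit in an int)
            buf.position(buf.position() + length);   // step over the attribute body
            found.add("#" + nameIndex + " (" + length + " bytes)");
        }
        return found;
    }

    static ByteBuffer sample() {
        byte[] bytes = {
            0x00, 0x02,                                     // two attributes
            0x00, 0x05, 0x00, 0x00, 0x00, 0x02, 0x00, 0x06, // name #5, 2-byte body
            0x00, 0x07, 0x00, 0x00, 0x00, 0x00              // name #7, empty body
        };
        return ByteBuffer.wrap(bytes);
    }

    public static void main(String[] args) {
        System.out.println(list(sample())); // [#5 (2 bytes), #7 (0 bytes)]
    }
}
```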