Why write this article

Readers have asked me how to debug the javac source code, and I read a lot of that code while writing about the JVM. There are also few articles on the web covering this area, so here is one on debugging the javac source, as part of a series on javac.

Debugging the javac source is relatively simple: javac is itself written in Java, which makes its internal logic easier for us to follow.

Setup process

Environment: IntelliJ IDEA and JDK 8

1. Download and import the javac source code

If you don’t want to download from OpenJDK, you can skip this step and get the code directly from my GitHub: github.com/arthur-zhan…

To download from OpenJDK, open hg.openjdk.java.net/jdk8/jdk8/l… and click zip or gz on the left.

Create a new project named javac-source-code-reading in IntelliJ, copy the entire src/share/classes/com directory from the downloaded source into the project’s src directory, and delete the javadoc directory, which is not needed.

2. Find the javac main function entry

The code is in src/com/sun/tools/javac/Main.java.

Run the main function. Since no source file has been given to compile, it should be no surprise that the console only prints javac’s usage information.

Create a new HelloWorld.java file and add its absolute path to Program Arguments in the run configuration.

Running Main.java again generates HelloWorld.class in the same directory as HelloWorld.java.
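For reference, a minimal HelloWorld.java for this step might look like the following. The greeting() helper is only there for illustration and is not part of the original setup; any class that compiles will do.

```java
// HelloWorld.java: a tiny compilation target for the javac run configuration.
public class HelloWorld {

    // Hypothetical helper, kept separate so there is a method body to step into later.
    static String greeting() {
        return "Hello, World!";
    }

    public static void main(String[] args) {
        System.out.println(greeting());
    }
}
```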

3. Add breakpoints

If you set a breakpoint in Main.java and start debugging, you’ll find that no matter what you do, the debugger steps into tools.jar instead of the source code you just imported.

IntelliJ then shows decompiled code from tools.jar, which is not as readable as the original source.

Open the Project Structure page (File -> Project Structure), switch to the Dependencies tab, and move the module source entry above the project JDK.

Debug again, and the breakpoint is now hit in the project’s own source code.
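One way to double-check which copy of javac is actually running is to print where the class was loaded from. The sketch below is my own addition rather than part of the original setup: the class name WhereIsJavac and the helper locationOf are hypothetical, and it relies only on the standard java.security.CodeSource API. Run inside the project on JDK 8, I would expect it to print the project's output directory once the dependency order is fixed, and a path to tools.jar otherwise.

```java
import java.security.CodeSource;

public class WhereIsJavac {

    // Returns the location a class was loaded from, or a note when the JVM reports none.
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(no code source reported)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Looked up by name so this file compiles even when javac is not on the compile classpath.
        Class<?> javacMain = Class.forName("com.sun.tools.javac.Main");
        System.out.println(locationOf(javacMain));
    }
}
```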

Javac bytecode case 1: how javac chooses between tableswitch and lookupswitch

Why does the following switch-case compile to lookupswitch instead of tableswitch? Isn’t it said that “if the case values are compact, with few or no gaps in between, tableswitch is used to implement switch-case”?

public static void foo() {
    int a = 0;
    switch (a) {
        case 0:
            System.out.println("# 0");
            break;
        case 1:
            System.out.println("# 1");
            break;
        default:
            System.out.println("default");
            break;
    }
}

Corresponding bytecode

public static void foo();
 0: iconst_0
 1: istore_0
 2: iload_0
 3: lookupswitch  { // 2
               0: 28
               1: 39
         default: 50
    }

This is an interesting question. It comes down to how javac estimates the cost of tableswitch versus lookupswitch; the code is in src/com/sun/tools/javac/jvm/Gen.java.

In the case of 0 and 1

// hi = 1, lo = 0, nlabels = 2
// table_space_cost = 4 + (1 - 0 + 1) = 6
long table_space_cost = 4 + ((long) hi - lo + 1); // words
// table_time_cost = 3
long table_time_cost = 3; // comparisons
// lookup_space_cost = 3 + 2 * 2 = 7
long lookup_space_cost = 3 + 2 * (long) nlabels;
// lookup_time_cost = 2
long lookup_time_cost = nlabels;
// table_space_cost + 3 * table_time_cost = 6 + 3 * 3 = 15
// lookup_space_cost + 3 * lookup_time_cost = 7 + 3 * 2 = 13
// opcode = 15 <= 13 ? tableswitch : lookupswitch
int opcode =
    nlabels > 0 &&
    table_space_cost + 3 * table_time_cost <=
    lookup_space_cost + 3 * lookup_time_cost
    ? tableswitch : lookupswitch;

So with only cases 0 and 1, table_space_cost + 3 * table_time_cost > lookup_space_cost + 3 * lookup_time_cost (15 > 13). lookupswitch is cheaper, so javac chooses lookupswitch.

What about cases 0, 1, and 2?

// hi = 2, lo = 0, nlabels = 3
// table_space_cost = 4 + (2 - 0 + 1) = 7
long table_space_cost = 4 + ((long) hi - lo + 1); // words
// table_time_cost = 3
long table_time_cost = 3; // comparisons
// lookup_space_cost = 3 + 2 * 3 = 9
long lookup_space_cost = 3 + 2 * (long) nlabels;
// lookup_time_cost = 3
long lookup_time_cost = nlabels;
// table_space_cost + 3 * table_time_cost = 7 + 3 * 3 = 16
// lookup_space_cost + 3 * lookup_time_cost = 9 + 3 * 3 = 18
// opcode = 16 <= 18 ? tableswitch : lookupswitch
int opcode =
    nlabels > 0 &&
    table_space_cost + 3 * table_time_cost <=
    lookup_space_cost + 3 * lookup_time_cost
    ? tableswitch : lookupswitch;

So with cases 0, 1, and 2, table_space_cost + 3 * table_time_cost < lookup_space_cost + 3 * lookup_time_cost (16 < 18). tableswitch is cheaper, so javac chooses tableswitch.

When there are only a few cases, there is not much difference between the two instructions; javac’s cost estimate simply happens to favor lookupswitch for the two-case switch.
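The comparison can be tried out in isolation. The following is a minimal sketch that copies the cost formulas quoted from Gen.java into a standalone method; the class name SwitchCostSketch, the method chooseSwitchOpcode, and the string return values are my own assumptions for illustration, since javac itself works with opcode constants.

```java
public class SwitchCostSketch {

    // Mirrors the cost comparison quoted above; returns the mnemonic as a string.
    static String chooseSwitchOpcode(int lo, int hi, int nlabels) {
        long tableSpaceCost = 4 + ((long) hi - lo + 1); // words
        long tableTimeCost = 3;                          // comparisons
        long lookupSpaceCost = 3 + 2 * (long) nlabels;
        long lookupTimeCost = nlabels;
        boolean useTable = nlabels > 0
                && tableSpaceCost + 3 * tableTimeCost
                        <= lookupSpaceCost + 3 * lookupTimeCost;
        return useTable ? "tableswitch" : "lookupswitch";
    }

    public static void main(String[] args) {
        System.out.println(chooseSwitchOpcode(0, 1, 2));   // cases 0, 1
        System.out.println(chooseSwitchOpcode(0, 2, 3));   // cases 0, 1, 2
        System.out.println(chooseSwitchOpcode(0, 100, 3)); // sparse cases, e.g. 0, 50, 100
    }
}
```

With these formulas, a wide gap between lo and hi inflates the table's space cost, which is why sparse case values fall back to lookupswitch.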

Javac bytecode case 2: choosing the instruction that loads an integer onto the stack

We know there are several instructions for loading an integer onto the stack, such as iconst_0, bipush, sipush, and ldc. How does javac choose among them?

public static void foo() {
    int a = 0;
    int b = 6;
    int c = 130;
    int d = 33000;
}

The corresponding bytecode:

 0: iconst_0
 1: istore_0
 2: bipush 6
 4: istore_1
 5: sipush 130
 8: istore_2
 9: ldc #2 // int 33000
11: istore_3

Set a breakpoint in the load() function in com/sun/tools/javac/jvm/Items.java.

You can see that the instruction is selected in the following order of priority:

  • From -1 to 5, select iconst_n
  • From -128 to 127, select bipush
  • From -32768 to 32767, select sipush
  • For all other integers, select ldc

This is consistent with the bytecode instruction documentation in the Java Virtual Machine specification.
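The selection order can be sketched as a small standalone method. This is an illustration under my own assumptions, not javac's actual code: the class name IntLoadSketch and the method selectLoadInstruction are hypothetical, and the method returns mnemonics as strings instead of emitting bytecode.

```java
public class IntLoadSketch {

    // Picks the load instruction for an int constant, checking the narrowest range first.
    static String selectLoadInstruction(int value) {
        if (-1 <= value && value <= 5) {
            // iconst_m1 covers -1; iconst_0 through iconst_5 cover the rest.
            return value == -1 ? "iconst_m1" : "iconst_" + value;
        } else if (Byte.MIN_VALUE <= value && value <= Byte.MAX_VALUE) {
            return "bipush"; // one-byte immediate operand
        } else if (Short.MIN_VALUE <= value && value <= Short.MAX_VALUE) {
            return "sipush"; // two-byte immediate operand
        }
        return "ldc";        // loaded from the constant pool
    }

    public static void main(String[] args) {
        int[] samples = {0, 6, 130, 33000};
        for (int sample : samples) {
            System.out.println(sample + " -> " + selectLoadInstruction(sample));
        }
    }
}
```

The four sample values are the ones from foo() above, so the printed mnemonics can be compared against the bytecode listing.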

Afterword

Reading javac turns up a lot of interesting things. I hope you will share other fun findings in the comments.