Preface

Ever since I first came into contact with the Java language, I have heard that Java is an interpreted language. As for what an interpreted language actually is, I had no idea. You just write your business code into .java files, let the IDE compile them into a bunch of .class files, package those into a .jar file, and throw it onto the server. What does it matter whether Java is a compiled or an interpreted language? When a problem comes up, I solve it. I just want to slack off at work and ride my beloved second-hand mountain bike along the Qinhuai River afterwards. That is a comfortable life.

That said, none of this stops the crazy, endless rat race. Today you are asked the difference between ArrayList and LinkedList, tomorrow the source code of HashMap, and the day after that Thread. So far this is only a rat race at the level of the JDK source, and since we write Java for a living, that is understandable. Then come volatile and synchronized, not to mention the Java garbage collectors. At that point the race has reached the JVM, and eventually it reaches the way Java itself is executed.

I offered my heart to the bright moon, but the moon shines on the ditch instead.

Preparation

Now that you're ready to debug the JVM, let me pour some cold water on the idea first. The entire JVM is written in a small amount of C and a large amount of C++. You need some C/C++ basics, from basic syntax up to more advanced topics such as pointers. If you don't understand multilevel pointers and function pointers, you should probably forget about it. Wouldn't that time be better spent reading interview questions or grinding algorithm problems, or even playing a mobile game? Why put yourself through this pain? Besides, even if you do get the source environment working, the likely outcome is that the enthusiasm fades and you are drawn away by some other trendy technology.

However, if all you want is to compile your own JDK, that is perfectly doable. For example, when we read the source code of HashMap, we open it and see a pile of English comments. If we want to write our own understanding into the source in Chinese, we need to compile our own JDK. For that you don't need to understand C/C++; understanding Java is enough. It's a personal choice.

Why did I choose OpenJDK 12? At first I chose OpenJDK 8. It compiled and everything worked, but I could not get Chinese comments into the JDK source. I put a lot of effort into it, but a rookie is a rookie, and I never fixed it. Then I chose OpenJDK 11, and that worked. I had only just got a foot in the door when I upgraded my PC, and I switched to OpenJDK 12 along the way. I have not kept up with later versions, but I have tried compiling them, and they all built successfully.

Download the source code

I do not recommend cloning the source code directly with Git, for a simple reason: this is a large project, and your download speed will never keep up with the rate at which it is updated. It is better to download the source of a released version directly and have fun building it yourself. jdk.java.net/ is the entry point for OpenJDK releases.

Select a version and follow the link marked with the red box below. It leads to the complete source code, which you can download directly.

Build environment

After unpacking the source code, you will find a doc directory under the root. Inside it is a building.html file, which details the environment and tools required for the build. Some people advise against doing this on Windows, and honestly I don't recommend it either. Next, let me describe my own difficult journey.

At first I set up a Windows virtual machine on my Windows system, to avoid messing up the physical machine. I installed Cygwin as described in building.html, and during the Cygwin installation I selected the make, autoconf, zip, and unzip packages. Then I installed Visual Studio 2015, the VC++ IDE for Windows. Finally, on the command line provided by Cygwin, I ran bash configure. Once configuration succeeded, I ran make images to build my own JVM and JDK. It takes only a few words to describe, but it cost me two full days and three reinstalls of the virtual machine. In the end I settled for second best: I read the source code with Source Insight on the physical machine, pushed changes to the virtual machine to compile, then copied the JDK back to the physical machine and debugged it using logs. The whole process was tedious and compilation took a long time, so I gradually abandoned it. Being a rookie, I could not debug properly, and I did not record the process of setting up the environment on Windows. All I have kept is the virtual machine with the Windows build environment. If you are interested, feel free to ask me for it.

By chance, I switched to a Mac and picked up the source code again. What follows records the process of building OpenJDK on a Mac.

The JVM is written in C/C++, so you need a compiler that supports those languages: GCC/g++ on Linux, VC++ on Windows, and on a Mac simply install Xcode. Besides the C/C++ toolchain, building the JDK also depends on an existing JDK, called the Boot JDK. We won't worry about the chicken-and-egg problem here; just have one ready. The Boot JDK has one more requirement: its version should be lower than the version you are building, and as a rule of thumb building.html recommends the immediately preceding major version (for example, JDK 11 to build JDK 12). All of this is described in building.html, so I won't go into more detail.
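The Boot JDK rule can be written down as a tiny check. This is only a sketch of the rule of thumb from building.html (a Boot JDK one major version below the JDK being built). The real check is done by configure, and the class and method names here are made up for illustration.

```java
final class BootJdkCheck {
    /** Rule of thumb: the Boot JDK is one major version below the JDK being built. */
    static boolean isRecommendedBootJdk(int bootMajor, int targetMajor) {
        return bootMajor == targetMajor - 1;
    }

    public static void main(String[] args) {
        int targetMajor = 12; // the version built in this post
        // Runtime.version().feature() requires Java 10 or later.
        int runningMajor = Runtime.version().feature();
        System.out.println("Running JDK " + runningMajor
                + ", recommended as Boot JDK for JDK " + targetMajor + ": "
                + isRecommendedBootJdk(runningMajor, targetMajor));
    }
}
```

Running it with the JDK you intend to pass as the Boot JDK tells you whether that JDK matches the rule of thumb for a JDK 12 build.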

Build and debug

Configuration and Build

Go to the source root and run bash configure to start the pre-build configuration and checks: for example, whether the make tool is available, whether a Boot JDK exists, and whether the Boot JDK version satisfies the rules. Incidentally, on Linux, software such as MySQL and Nginx also needs a configuration check before it is built from source, and each ships a similar configure script in its source tree. As for the bash prefix: bash is a shell interpreter and is what I use here. There are others, such as sh and zsh, which I won't go into.

Running the configuration check produced an error:

What is this saying? The message points to the --with-boot-jdk option. I have multiple versions of the JDK installed on my machine (OpenJDK 11 and OpenJDK 12 among them), and the error occurred because I had not specified which one to use as the Boot JDK. The fix is simple: specify it explicitly and rerun:

bash configure --with-boot-jdk='/Users/opensoftware/custome-jdk-11'

When the configuration check completes, the following log appears on the screen:


====================================================
A new configuration has been successfully created in
/Users/xxx/openjdk/build/macosx-x86_64-server-release
using configure arguments '--with-boot-jdk=/Users/opensoftware/custome-jdk-11'.

Configuration summary:
* Debug level:    release
* HS debug level: product
* JVM variants:   server
* JVM features:   server: 'aot cds cmsgc compiler1 compiler2 dtrace epsilongc g1gc graal jfr jni-check jvmci jvmti management nmt parallelgc serialgc services shenandoahgc vm-structs'
* OpenJDK target: OS: macosx, CPU architecture: x86, address length: 64
* Version string: 12-internal+0-adhoc.username.openjdk (12-internal)

Tools summary:
* Boot JDK:       openjdk version "11-internal" 2018-09-25 OpenJDK Runtime Environment (build 11-internal+0-adhoc.username.openjdk11) OpenJDK 64-Bit Server VM (build 11-internal+0-adhoc.username.openjdk11, mixed mode)  (at /Users/opensoftware/custome-jdk-11)
* Toolchain:      clang (clang/LLVM from Xcode 12.5.1)
* C Compiler:     Version 12.0.5 (at /usr/bin/clang)
* C++ Compiler:   Version 12.0.5 (at /usr/bin/clang++)

Build performance summary:
* Cores to use:   16
* Memory limit:   16384 MB


In other words, a new configuration has been created under /Users/xxx/openjdk/build/macosx-x86_64-server-release, using the configure argument '--with-boot-jdk=/Users/opensoftware/custome-jdk-11'. Below that is a list of configuration values. The one worth noting is Debug level: release, which means this configuration produces a release build with some debugging information turned off. Now, to start building the JVM and JDK, the command is simple:

make images

Then it went wrong again:

The complete code looks like this:

// src/hotspot/share/runtime/sharedRuntime.cpp:2873:85
double locs_buf[20];
buffer.insts()->initialize_shared_locs((relocInfo*)locs_buf, sizeof(locs_buf) / sizeof(relocInfo));

sizeof is a C operator (a keyword) that computes the size of a variable or type. It is not a function, much as synchronized(obj) in Java is not a method call. I will leave the exact cause of this problem aside for now; it is enough to know that this is a C++ compiler warning. Because OpenJDK's build is configured for strict checking, the warning is turned into an error. In C++, the usual way to suppress a warning is a directive in the code such as #pragma warning(disable: XXX) (MSVC syntax), but that does not help here. We are not really in a position to patch the source: fix one warning and ten more appear. Don't ask how I know; I tried changing the code directly and it fell apart. (The problem also exists on Windows, so I am curious how the OpenJDK developers deal with it.)

Does configure have an option that suppresses these warnings directly? Run bash configure --help to view the help:

The help does list such an option. It is enabled by default and turns warnings during the build into errors. The fix is simple: disable it when running configure. Don't forget the Boot JDK parameter that fixed the earlier error. Rerun bash configure:

bash configure --with-boot-jdk='/Users/opensoftware/custome-jdk-11' --disable-warnings-as-errors

Run make images again. If your computer is slow, this is going to take a long, long time. After a successful build, the following message is displayed:

After the build completes, the source root contains a build/os-bits-variant-debuglevel folder. It is created when configure runs and is where the generated JDK is stored once the build finishes. The name os-bits-variant-debuglevel depends on the operating system, the CPU architecture, the JVM variant (server, client, minimal, ...), and the debug level. For more details, see building.html and bash configure --help. Below is my local directory with the generated JDK:

Go to the JDK directory and run bin/java -version:

Now that you have successfully built your own JDK, the first stage is complete. Congratulations: you can now add this JDK to your environment variables and use it for your future development.
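Once the JDK is on your PATH, a quick way to confirm that a program really runs on your own build is to print a few standard system properties. This is a minimal sketch; the class and method names are made up for illustration, and the property keys (java.version, java.vm.name, java.home) are standard ones.

```java
final class WhichJdk {
    /** Builds a one-line summary of the JDK that is running this program. */
    static String describeRuntime() {
        return System.getProperty("java.version")
                + " | " + System.getProperty("java.vm.name")
                + " | " + System.getProperty("java.home");
    }

    public static void main(String[] args) {
        // On a self-built JDK, java.home should point into the build directory.
        System.out.println(describeRuntime());
    }
}
```

If java.home points somewhere other than your build directory, a different JDK is being picked up from the environment.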

Hacking the JDK source code

In Java 1.8, when a collision chain in a HashMap bucket reaches 8 nodes, the chain is converted into a red-black tree (provided the table already has at least 64 buckets; otherwise the table is resized instead). I have also heard from others that the threshold of eight nodes is backed by the Poisson distribution. Only the top players of the rat race know this sort of thing, so naturally, as a rookie, I want this to be the first sentence I write into the HashMap source. Every time I jump into the HashMap source I will see it and can bow down in worship.

Now that we have built our own JDK, it is simply a matter of finding the HashMap source code and adding the comment directly, like this:
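The Poisson argument can be made concrete with a few lines of code. This is a minimal sketch, assuming the parameter of about 0.5 that the implementation notes in the HashMap source give for bucket sizes under the default load factor of 0.75. The class and method names are made up for illustration.

```java
final class PoissonBins {
    /** Poisson probability mass function: P(k) = e^(-lambda) * lambda^k / k!. */
    static double poisson(double lambda, int k) {
        double p = Math.exp(-lambda);
        for (int i = 1; i <= k; i++) {
            p *= lambda / i; // builds lambda^k / k! one factor at a time
        }
        return p;
    }

    public static void main(String[] args) {
        double lambda = 0.5; // assumption: the value quoted in the HashMap source comment
        for (int k = 0; k <= 8; k++) {
            System.out.printf("%d: %.8f%n", k, poisson(lambda, k));
        }
    }
}
```

Under these assumptions the probability of a bucket holding 8 entries comes out at roughly 6 in 100 million, which is the usual justification for treating such a chain as exceptional and switching it to a tree.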

Save, go back to the source root, and rebuild by running make images. Then the following happens:

What's going on here? Put simply, our Chinese text is not made of ASCII characters. That's all. Find the place where the .java files are compiled and add -encoding utf-8 after the javac command, and things will be OK. Go to the make directory under the source root, open the CompileJavaModules.gmk file, and find line 41. It currently looks like this:

After adding the option, it looks like this:

Save, go back to the source root, and rebuild by running make images.

Note that this change only takes effect for the java.base module. For other modules, such as java.sql and java.instrument, the changes are also made in CompileJavaModules.gmk, which contains many entries of the form java.xxx_ADD_JAVAC_FLAGS +=. Append -encoding utf-8 to entries of this form to do the trick.
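The encoding problem can be reproduced in miniature without rebuilding anything. This sketch asks the standard charset encoders whether a piece of text can be represented. The class and method names are made up for illustration, and the Chinese sample is written with Unicode escapes so that the example itself compiles under any -encoding setting.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingCheck {
    /** Returns true if every character of the text can be encoded in the given charset. */
    static boolean encodableIn(Charset charset, String text) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String english = "Poisson distribution";
        String chinese = "\u6cca\u677e\u5206\u5e03"; // the same phrase in Chinese
        System.out.println("English in ASCII: " + encodableIn(StandardCharsets.US_ASCII, english));
        System.out.println("Chinese in ASCII: " + encodableIn(StandardCharsets.US_ASCII, chinese));
        System.out.println("Chinese in UTF-8: " + encodableIn(StandardCharsets.UTF_8, chinese));
    }
}
```

The Chinese text cannot be represented in ASCII but can be in UTF-8, which is why the comment only compiles once the compiler is told to read the sources as UTF-8.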

Having written this far, you may wonder how a rookie like me knows these things. If you have ever written a Makefile, you know where to look. Alas, the dragon slayer has become the dragon!

Preparation before debugging the JVM

If you only want to play with the JDK, you can skip this section. The IDE I use is CLion; I use it not only to debug the JVM but also to debug Redis. To be clear up front: the JVM's build tool is make, while CLion works with CMake, so we need to convert the make project into a CMake project. For that we need a tool called compiledb, which is a Python project. If pip is already installed, run sudo pip install compiledb. Once it is installed, run compiledb --help to check that the installation worked.

Once installed, it’s time to recompile. Go back to the source code root and run:

make CONF=macosx-x86_64-server-release compile-commands
make CONF=macosx-x86_64-server-release

Then go to build/macosx-x86_64-server-release, where you will find a file called compile_commands.json. This is the file generated by the commands above:

To import the project, open CLion and configure Toolchains. The resulting configuration looks like this:

The Name in the red box must be set to the name of the os-bits-variant-debuglevel folder in the build directory; in my case that is macosx-x86_64-server-release. As for the Make, C Compiler, and C++ Compiler fields below it, the thing to note is that the Debugger at the bottom is based on LLDB, and the debugging problems solved later in this post assume LLDB. After confirming, go back to CLion's main screen and select New CMake Project from Sources:

Then find the directory that contains your compile_commands.json file and select Open:

After you click OK, a dialog pops up; click OK again. Once the project is open, you should find an entry in the debug toolbar:

If the name of your Toolchains configuration does not follow the rule above, you will have to remove that toolchain, reconfigure it, and import again. At this point the project root is your build directory. To reset the root directory in CLion, select Tools->CMake->Change Project Root:

Select your own OpenJDK source root directory in the popup:

Finally, select Tools->CMake->Reload CMake Project:

At this point, the source-reading environment has been set up successfully.

Start JVM debugging

The environment we have set up is a pseudo-CMake project, so a few more things need to be configured. What do you want to debug the JVM for? Say you want to know how the java command executes. The current configuration is not enough for that, because building the JVM produces many executables besides java, such as jar, javadoc, and so on. Also, to debug java you need a .java file to start with. Begin by writing one:

public class Main {
    public static void main(String[] args) {
        System.out.println("Hello World!!!!!!!!");
    }
}

The file can be stored anywhere you like. Now to debug the configuration, click:

Select Edit Configurations…:

Pay attention to three fields here: Executable, Program arguments, and Before launch. Executable must be set to the java you compiled yourself: in the build/os-bits-variant-debuglevel directory, find the java command inside the jdk/bin directory. Program arguments should be set to the full path of the .java file you just wrote. Finally, if you are not familiar with the C/C++ build mechanism, you can delete the Build entry under Before launch.
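Put together, the run configuration amounts to running your own java binary with the source file as its only argument. The sketch below just assembles that command as a list of strings. The class name, method names, and the /path/to/... locations are hypothetical placeholders, not anything CLion or OpenJDK provides.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

final class DebugLaunch {
    /** Joins the four parts of a build configuration name, e.g. macosx-x86_64-server-release. */
    static String confName(String os, String cpu, String variant, String debugLevel) {
        return String.join("-", os, cpu, variant, debugLevel);
    }

    /** Builds the command line that the run configuration corresponds to. */
    static List<String> launchCommand(Path sourceRoot, String conf, Path javaFile) {
        Path javaExe = sourceRoot.resolve("build").resolve(conf)
                .resolve("jdk").resolve("bin").resolve("java");
        return List.of(javaExe.toString(), javaFile.toString());
    }

    public static void main(String[] args) {
        // Hypothetical locations; replace them with your own paths.
        Path sourceRoot = Paths.get("/path/to/openjdk");
        Path javaFile = Paths.get("/path/to/Main.java");
        String conf = confName("macosx", "x86_64", "server", "release");
        System.out.println(String.join(" ", launchCommand(sourceRoot, conf, javaFile)));
    }
}
```

The first element corresponds to the Executable field and the second to Program arguments.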

At this point, click the lovely little bug icon in the debug toolbar and have fun. But there is a problem: even though we have not set any breakpoints, the program keeps stopping by itself, like this:

And this:

StubRoutines::call_stub() in particular deserves a mention, because this call is very, very important. It is the entry point through which the JVM calls into Java methods: your main method, your ordinary methods, your static methods, almost every call that starts from the VM goes through it. Of course, that is not the point of this post, so back to our problem. After the program stops on its own, you can click the small triangle to continue past it:

Finally, in the console, you can see the message printed by the .java file we wrote:

Now to solve the automatic breakpoint problem. I think I know the cause, but I cannot guarantee that I have it right, so I won't embarrass myself here. The solution I know was also copied from elsewhere, combined with my own bit-by-bit exploration. The first step is to create a new .lldbinit file in your source directory with the following content, which sets a one-shot, auto-continuing breakpoint on main whose command tells LLDB to pass SIGSEGV and SIGBUS through to the process without stopping:

br set -n main -o true -G true -C "pro hand -p true -s false SIGSEGV SIGBUS"

Step 2: go to your home directory, create a .lldbinit file there as well, and put this in it:

settings set target.load-cwd-lldbinit true

Then start debugging again and the problem is gone. From here on, explore at your own pace.

OpenJDK environment setup postscript

This took two days to finish. After all, I am still a hopeless rookie, and many of the problems were solved with the help of other people's articles. Here are some of the articles I referred to:

  • Building the JDK: the official OpenJDK build document
  • Develop OpenJDK in CLion with Pleasure: the main source for this post
  • LLDB: Read .lldbinit in project root by default

It's all set up, but there are bound to be problems while reading the source: unfamiliarity with C/C++ syntax, unfamiliarity with how things run, not being able to find the entry point (the main function), and disgustingly long macro definitions. I will write some articles on these problems later so we can learn together. Since everyone's environment is different, the problems will differ too. For example, when I built the environment on Windows I had to change settings in Visual Studio first, and there were all kinds of nasty dependency packages.

Even at this point, I still have to pour some cold water: studying the JVM source code inevitably means learning C/C++, and not superficially. Although this code base does not use the latest C++ syntax, it is still a challenge for us ordinary people. Beyond C/C++, you also need to study operating systems and CPU instruction sets, and learn assembly language. You also need the ability to read first-hand material. Even if you rely on translation software, that ability has to be developed. That said, all of these are simply the frustrations I ran into while studying the JVM source. If you give up, that's fine too. It is already nice to be able to compile your own JDK and add your own insights and comments to it.

I will translate several articles about the internal workings of the JVM and publish them when I have time.