
Translation of the original post: huziketang.com/blog/posts/… "A crash course in Just-in-time (JIT) compilers"

Please indicate the source and keep the original link and author information when reposting.


This is the second article in the WebAssembly series. If you are not familiar with WebAssembly, I suggest you read the previous article first.

JavaScript started out slow, but a JIT can make it much faster. So how does a JIT work?

How does JavaScript work in a browser?

When you, as a developer, add JavaScript to a page, you have a goal and a problem.

Goal: Tell the computer what you want to do.

Problem: You and the computer speak different languages and cannot communicate.

You speak a human language, and the computer speaks machine language. Even if you don't think of JavaScript or other high-level programming languages as human languages, that's really what they are: they were designed for human cognition, not for machine cognition.

So, the job of a JavaScript engine is to translate human language into a language that machines can understand.

It’s like the interaction between humans and aliens in the movie Arrival.

In the movie, humans and aliens not only speak different languages; they also have different ways of seeing the world. The relationship between humans and machines is similar (more on that later).

So how does translation work?

In the world of code, there are generally two ways to translate into machine language: using an interpreter or using a compiler.

With an interpreter, the code is translated and executed line by line, on the fly.

A compiler translates the entire source code into object code ahead of time, and the object code then runs directly on any platform that supports it, with no further translation needed.

Both of these translation methods have their own advantages and disadvantages.

Pros and cons of interpreters

An interpreter starts up and begins executing faster. You don't have to wait for an entire compilation pass to finish before your code runs: translation starts at the first line and execution follows right along.

For this reason, an interpreter seems like a natural fit for JavaScript. It's important for a web developer to be able to run code quickly and see the results.

That’s why the first browsers used JavaScript interpreters.

When you run the same code more than once, however, the interpreter's shortcomings show up. For example, in a loop, the interpreter has to translate the same lines over and over on every iteration, which is inefficient.
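
For example, a loop as simple as the following (a made-up snippet, not from the original article) forces a pure interpreter to translate its body anew on every one of the ten thousand iterations:

var total = 0;
for (var i = 0; i < 10000; i++) {
  total += i; // a pure interpreter re-translates this line on every iteration
}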

The pros and cons of compilers

Compilers have the opposite problem.

It takes some time up front to compile the entire source code and generate an object file that the machine can execute. But code with loops runs faster, because the loop body does not have to be re-translated on every iteration.

Another difference is that a compiler can take more time to optimize the code so that it runs faster. An interpreter does its work at runtime, so it can't afford to spend much time optimizing during translation.

Just-in-time compiler: combines the best of both

To address the inefficiency of the interpreter, later browsers introduced compilers into the mix.

Different browsers implement this in different ways, but the basic idea is the same: add a monitor (also called a profiler) to the JavaScript engine. The monitor watches the code as it runs, keeping track of how many times each piece has been run, what types are used, and so on.

Initially, the monitor monitors all code passing through the interpreter.

If the same line of code is run several times, the snippet is marked as “warm”, and if it is run many times, it is marked as “hot”.
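
Conceptually, you can picture the monitor as a table of execution counters with a couple of thresholds. The sketch below is purely illustrative: real engines implement this in native code, and the names and threshold values here are invented.

// Illustrative sketch of a JIT monitor (profiler); the thresholds and
// names are invented, not taken from any real engine.
var WARM_THRESHOLD = 10;
var HOT_THRESHOLD = 1000;
var runCounts = {};

function recordExecution(codeId) {
  runCounts[codeId] = (runCounts[codeId] || 0) + 1;
  if (runCounts[codeId] >= HOT_THRESHOLD) {
    return "hot";   // hand the code to the optimizing compiler
  }
  if (runCounts[codeId] >= WARM_THRESHOLD) {
    return "warm";  // hand the code to the baseline compiler
  }
  return "cold";    // keep interpreting
}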

Baseline compiler

If a piece of code becomes “warm”, the JIT sends it off to the baseline compiler and stores the compiled result.

Each statement in the code snippet is compiled into a “stub”, indexed by line number and variable type. If the monitor later sees the same code executed with the same variable types, the engine simply reuses the stored compiled version.
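
As a rough mental model (not how any particular engine actually stores its stubs), you can think of the stored results as a cache keyed by line number plus the operand types:

// Illustrative stub cache: the key combines the source line with the
// variable types seen when the stub was compiled. Names are invented.
var stubCache = {};

function getOrCompileStub(lineNumber, types, compile) {
  var key = lineNumber + ":" + types.join(",");
  if (!stubCache[key]) {
    stubCache[key] = compile(types); // compile once for this type combination
  }
  return stubCache[key];             // reuse it on later runs with the same types
}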

You can speed up execution by doing this, but as I said earlier, the compiler can also find ways to execute code more efficiently by doing optimizations.

The baseline compiler can do some of these optimizations (I'll show some examples below), but it can't spend too long on them, because optimizing would hold up execution of the program.

But if the code is really “hot” (that is, almost all of the execution time is spent there), then it’s worth the time to optimize it.

Optimizing compiler

If a piece of code becomes “very hot”, the monitor sends it to the optimizing compiler, which generates an even faster, more efficient version of the code and stores it.

To produce a faster version of the code, the optimizing compiler has to make some assumptions. For example, it assumes that all objects created by the same constructor have the same shape, that is, the same property names added in the same order, so it can optimize for that pattern.

The optimizing compiler works from the information the monitor has gathered about how the code has been executing. If the objects in every previous iteration of a loop had the same shape, it assumes the objects in later iterations will have the same shape too. But JavaScript never guarantees this: the first 99 objects may share a shape, and the 100th may be missing a property.
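
A hypothetical example of exactly that situation:

// 99 objects share the shape { x, y }, so the optimizing compiler can
// assume that shape inside the loop. The 100th object is missing `y`,
// which breaks the assumption.
var points = [];
for (var i = 0; i < 99; i++) {
  points.push({ x: i, y: i * 2 });
}
points.push({ x: 99 }); // a different shape: no `y` property

var total = 0;
for (var j = 0; j < points.length; j++) {
  total += points[j].y || 0; // the shape check fails on the last object
}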

Because of this, the compiled code needs to check whether its assumptions still hold before it runs. If they do, the optimized code runs; if not, the JIT concludes it made a wrong assumption and throws the optimized code away.

At that point (when the optimized code is discarded), execution falls back to the interpreter or the baseline compiler, a process called deoptimization (or bailing out).

The optimizing compiler usually makes code faster, but in some situations it can cause unexpected performance problems. If your code keeps cycling between being optimized and deoptimized, it will end up slower than it would be with just the baseline compiler.

Most browsers have limits in place to break out of this optimize/deoptimize cycle when it occurs. For example, if the JIT has optimized a piece of code and thrown the result away more than 10 times, it will stop trying to optimize it.

An example of optimization: Type Specialization

There are many kinds of optimizations, but I'll introduce just one so you can see how optimization works. One of the biggest wins in optimizing compilers comes from something called type specialization, explained below.

JavaScript's dynamic type system requires extra work at runtime. For example, consider the following code:

function arraySum(arr) {
  var sum = 0;
  for (var i = 0; i < arr.length; i++) {
    sum += arr[i];
  }
  return sum;
}

The += step in this loop looks like a single, simple calculation, but because of dynamic typing it involves more work than you might expect.
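
To see why, here is a rough sketch in plain JavaScript of the decisions hidden inside a single dynamic +=. Real engines do this in native code and handle more cases; the function below is only an illustration.

// Illustrative only: the engine must inspect both operand types before
// it knows which operation a `+` actually is.
function dynamicAdd(a, b) {
  if (typeof a === "number" && typeof b === "number") {
    return a + b;                 // numeric addition
  }
  if (typeof a === "string" || typeof b === "string") {
    return String(a) + String(b); // string concatenation
  }
  return a + b;                   // other cases go through further conversion rules
}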

Let's assume that arr is an array of 100 integers. When the code is marked “warm”, the baseline compiler generates a stub for each operation in the function. sum += arr[i] gets its own stub, which treats the += as integer addition.

However, sum and arr[i] are not guaranteed to stay integers. Because JavaScript is dynamically typed, arr[i] could just as easily be a string in a later iteration. Integer addition and string concatenation are two completely different operations that compile to different machine code.

The JIT handles this by compiling multiple baseline stubs. If a piece of code is monomorphic (always called with the same types), only one stub is generated. If it is polymorphic (the types change from one call to the next), a stub is generated for each combination of types the operation has seen.
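
For instance, two hypothetical calls are enough to turn the += in arraySum from monomorphic into polymorphic:

arraySum([1, 2, 3]);       // `sum += arr[i]` is integer addition here
arraySum(["a", "b", "c"]); // after the first iteration the same line
                           // performs string concatenation instead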

This means the JIT has to make a multi-way choice before picking a stub, rather like walking a decision tree: it asks a series of questions about the types before deciding which stub to use.

Because each line of code has its own set of stubs in the baseline compiler, the JIT has to check the data types every time a line is executed, and it repeats this branch selection on every iteration of the loop.

If the JIT didn't have to repeat these checks every time the code runs, execution would be faster. That is one of the jobs of the optimizing compiler.

In the optimizing compiler, the whole function is compiled as a unit, so the type checks can be done once, before the loop starts executing.
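
Here is a rough sketch in JavaScript of what the optimizing compiler is aiming for. This is not real engine output, and a real engine checks the array's internal element type rather than scanning every element as done here.

function arraySumOptimized(arr) {
  // One up-front check instead of a check on every iteration.
  if (!arr.every(Number.isInteger)) {
    return arraySum(arr); // assumption broken: fall back to the generic version
  }
  var sum = 0;
  for (var i = 0; i < arr.length; i++) {
    sum += arr[i]; // specialized: both operands are known to be integers
  }
  return sum;
}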

Some browsers' JITs are more sophisticated still. In Firefox, for example, some arrays are classified as containing only a specific element type, such as integers. If arr is one of these arrays, the JIT does not need to check whether arr[i] is an integer at all, which means all the type checking can happen before the loop is entered.
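
Typed arrays make a similar guarantee explicit at the language level. This is not the same mechanism as Firefox's internal classification of plain arrays, but it illustrates the same idea: when the element type is known up front, no per-element check is needed inside the loop.

var ints = new Int32Array([1, 2, 3, 4, 5]);
arraySum(ints); // every element is guaranteed to be a 32-bit integer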

Conclusion

So what is a JIT, in short? It makes JavaScript run faster by monitoring the code as it runs and optimizing the hot code (code that is executed many times). This can improve the performance of JavaScript applications several times over.

To make execution faster, the JIT adds overhead of its own, including:

  • The cost of optimizing and then deoptimizing code
  • Memory used by the monitor to keep its bookkeeping information
  • Memory used to store the recovery information needed when a bailout (deoptimization) occurs
  • Memory used to store the baseline and optimized versions of a function

There is still room to do better, though: eliminate this overhead. Removing it is how further performance gains can be achieved, and that is one of the things WebAssembly does.


I'm currently writing a little book about React.js.