This article is a translation.
Original title: A crash course in just-in-time (JIT) compilers
Original author: Lin Clark
Original address: hacks.mozilla.org/2017/02/a-c…
This is the second part of a series of articles on WebAssembly. If you haven’t read the others, I suggest you start from the very beginning.
JavaScript was slow at first, but thanks to something called JIT, it got faster. So how does JIT work?
How JavaScript runs in the browser
When you, as a developer, add JavaScript to a page, you have a goal and a problem.
Goal: Tell the computer what to do.
Problem: You don’t speak the same language as the computer.
You speak human language, and computers speak machine language. Although many people don’t like to believe it, JavaScript or any other high-level programming language is a human language. They are designed for human cognition, not machines.
So the JavaScript engine’s job is to translate your human language into something that machines can understand.
I think of this like the movie Arrival [2], where humans and aliens try to communicate with each other.
In that movie, the humans and aliens can’t simply do word-for-word translation. The two groups have different ways of thinking about the world. The same is true of humans and machines (which I’ll explain further in the next article).
So, how does translation work?
In programming, there are usually two ways to translate into machine language. We can use an interpreter or a compiler.
With the interpreter, this translation happens almost line by line in real time.
The compiler uses a different approach, which translates ahead of time and records the translation, rather than doing it on the fly.
Each of these two kinds of translation has its advantages and disadvantages.
Advantages and disadvantages of the interpreter
Interpreters are quick to get up and running. You don’t have to go through a whole compilation phase before you can start running your code. You just start translating the first line and running it.
This makes an interpreter seem like a natural fit for JavaScript, because it’s important for Web developers to be able to get going and run their code quickly.
This is indeed why browsers used JavaScript interpreters in the first place.
However, the interpreter’s shortcomings arise when the same code is run more than once. For example, in a loop, it will have to make the same translation over and over again.
Advantages and disadvantages of compilers
Compilers have the opposite trade-offs from interpreters.
A compiler takes more time to start up because the compile phase has to happen first. But because it doesn’t have to repeat the translation on every pass through a loop, the code in the loop runs faster.
Another difference is that the compiler has more time to analyze and modify the code so that it can run faster. This modification is called optimization.
The interpreter works at runtime, so there is not enough time to calculate these optimizations during translation.
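The cost of repeated translation can be sketched with a toy model. This is only an illustration under stated assumptions: the little “add/mul” source language and the names translate, interpret, and compile are hypothetical and have nothing to do with how a real JavaScript engine is built. The counter simply records how often a source line has to be translated.

```javascript
// Toy "source language": each line is "<op> <number>", applied to an accumulator.
// The language and all function names here are hypothetical.
let translations = 0; // how many times a source line had to be translated

function translate(line) {
  translations++;
  const [op, raw] = line.split(" ");
  const n = Number(raw);
  if (op === "add") return (acc) => acc + n;
  if (op === "mul") return (acc) => acc * n;
  throw new Error("unknown op: " + op);
}

// Interpreter style: translate each line every time it is executed.
function interpret(lines, acc, repeat) {
  for (let r = 0; r < repeat; r++) {
    for (const line of lines) {
      acc = translate(line)(acc);
    }
  }
  return acc;
}

// Compiler style: translate everything up front, then reuse the result.
function compile(lines) {
  const steps = lines.map(translate);
  return (acc, repeat) => {
    for (let r = 0; r < repeat; r++) {
      for (const step of steps) {
        acc = step(acc);
      }
    }
    return acc;
  };
}
```

Running a two-line program three times costs six translations in the interpreter style and two in the compiler style, but the compiler style cannot run anything until every line has been translated.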
Just-in-time compilers: Get the best of both worlds
To get rid of the interpreter’s inefficiency, where it has to retranslate the code every time it goes through the loop, browsers started mixing compilers in.
Different browsers do this in slightly different ways, but the basic idea is the same. They add a new component to the JavaScript engine, called a monitor (aka a profiler) [3]. The monitor watches the code as it runs and records how many times it is run and what types are used.
At first, the monitor just runs everything through the interpreter.
If the same lines of code are run a few times, that segment of code is called warm. If it is run a lot, it is called hot [4].
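A minimal sketch of the bookkeeping such a monitor might do is shown here. The class name, the record method, and both thresholds are assumptions made for illustration; real engines pick and tune their own thresholds and track far more detail.

```javascript
// Hypothetical thresholds; real engines choose their own values.
const WARM_THRESHOLD = 3;
const HOT_THRESHOLD = 10;

class Monitor {
  constructor() {
    // code id -> { count, types }: run count and the argument types seen
    this.profile = new Map();
  }

  // Record one execution of a piece of code and return its current tier.
  record(codeId, ...args) {
    const entry = this.profile.get(codeId) || { count: 0, types: new Set() };
    entry.count++;
    entry.types.add(args.map((a) => typeof a).join(","));
    this.profile.set(codeId, entry);

    if (entry.count >= HOT_THRESHOLD) return "hot";
    if (entry.count >= WARM_THRESHOLD) return "warm";
    return "cold";
  }
}
```

With these thresholds, the first two calls to record for a given id report “cold”, the third reports “warm”, and the tenth reports “hot”.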
Baseline compiler
When a function starts getting warm, the JIT sends it off to be compiled and then stores the compiled result.
Each line of the function is compiled into a “stub” [5]. The stubs are indexed by line number and variable type (I’ll explain why that is important later). If the monitor sees that execution is hitting the same code again with the same variable types, it simply pulls out the compiled version.
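As a rough sketch of what “indexed by line number and variable type” could look like, the following keeps a cache of stubs for a single `a + b` operation. Everything here is an assumption made for illustration: stubCache, compileAddStub, and executeAdd are hypothetical names, and the “compiled” stub is just a JavaScript closure standing in for machine code.

```javascript
// Hypothetical baseline stub cache: one stub per (line number, operand types).
const stubCache = new Map();
let compileCount = 0; // how many stubs had to be compiled

// Stand-in for real compilation of `a + b` for one combination of types.
function compileAddStub(typeA, typeB) {
  compileCount++;
  let kind = "generic";
  if (typeA === "number" && typeB === "number") {
    kind = "number-add"; // would become numeric addition in machine code
  } else if (typeA === "string" || typeB === "string") {
    kind = "string-concat"; // would become string concatenation
  }
  return { kind, run: (a, b) => a + b };
}

function executeAdd(lineNumber, a, b) {
  const key = `${lineNumber}:${typeof a},${typeof b}`;
  let stub = stubCache.get(key);
  if (!stub) {
    // First time this line is seen with these types: compile and store a stub.
    stub = compileAddStub(typeof a, typeof b);
    stubCache.set(key, stub);
  }
  return stub.run(a, b); // same line, same types: reuse the stored stub
}
```

Calling executeAdd for the same line with two numbers twice compiles only one stub. Calling it again with a number and a string compiles a second one.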
This helps speed things up. But as I mentioned earlier, a compiler can do more than that. It can take some time to work out the most efficient way to do things, that is, to make optimizations.
The baseline compiler makes some of these optimizations (I’ll give an example later). It can’t take too long over them, though, because that would hold up execution.
However, if the code is really hot, that is, it runs a lot, then it is worth the extra time to optimize it.
Optimizing compiler
When a piece of code is hot, the monitor sends it to the optimizing compiler. This creates and stores another, even faster version of the function.
To make the code run faster, the optimizing compiler has to make some assumptions.
For example, if it can assume that all objects created by a particular constructor have the same shape (that is, they always have the same property names, added in the same order), then it can take some shortcuts based on that assumption.
The optimizing compiler makes these judgments from the information the monitor has gathered by watching the code execute. If something has been true on every previous pass through a loop, it assumes it will continue to be true.
But, of course, with JavaScript there are no guarantees. Even if 99 objects have the same shape, the 100th might be missing a property.
So the compiled code has to check, before it runs, that the assumptions still hold. If they do, the compiled code runs. If they don’t, the JIT concludes that it made the wrong assumptions and throws away the optimized code.
Execution then falls back to the interpreter or to the version produced by the baseline compiler. This process is called deoptimization (or bailing out) [6].
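The check-then-fall-back flow can be sketched as follows. This is a simplified model under stated assumptions: the “shape” is approximated by the ordered list of property names, and shapeOf, getXGeneric, and makeOptimizedGetX are hypothetical names. In a real engine the fast path would be specialized machine code rather than the same property read.

```javascript
// General path: works for any object; stands in for interpreter/baseline code.
function getXGeneric(obj) {
  return obj.x;
}

// Hypothetical notion of "shape": property names in the order they were added.
function shapeOf(obj) {
  return Object.keys(obj).join(",");
}

// Build an "optimized" accessor that assumes every object has expectedShape.
function makeOptimizedGetX(expectedShape) {
  const state = { optimized: true, deopts: 0 };

  function getX(obj) {
    if (state.optimized) {
      // Guard: verify the assumption before using the fast path.
      if (shapeOf(obj) === expectedShape) {
        return obj.x; // fast path: the assumption held
      }
      // The assumption failed: throw away the optimized version.
      state.optimized = false;
      state.deopts++;
    }
    return getXGeneric(obj); // deoptimized: use the general path
  }

  return { getX, state };
}
```

An object with properties added in the order x, y passes the guard for the shape "x,y". One with the order y, x fails it, triggers a deoptimization, and is still handled correctly by the general path.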
In general, optimizing compilers speed up code, but sometimes they cause unexpected performance problems. If code is repeatedly optimized and then deoptimized, it ends up running slower than if it had simply stayed on the baseline-compiled version.
Most browsers add limits to break out of these optimization/deoptimization cycles when they occur. If the JIT has made, say, more than 10 attempts at optimizing and keeps having to throw the result away, it stops trying.
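One way to picture such a limit is a per-function attempt counter. The sketch below is an assumption-laden illustration: the class FunctionState, its methods, and the constant MAX_OPTIMIZATION_ATTEMPTS (set to 10 only to echo the “say, 10” above) are all hypothetical.

```javascript
// Hypothetical limit, echoing the "say, 10" in the text.
const MAX_OPTIMIZATION_ATTEMPTS = 10;

class FunctionState {
  constructor() {
    this.attempts = 0; // optimization attempts so far
    this.tier = "baseline"; // "baseline" or "optimized"
    this.gaveUp = false; // true once the JIT stops trying
  }

  // Called when the monitor reports the function as hot.
  tryOptimize() {
    if (this.gaveUp || this.tier === "optimized") return false;
    if (this.attempts >= MAX_OPTIMIZATION_ATTEMPTS) {
      this.gaveUp = true; // break the optimize/deoptimize cycle
      return false;
    }
    this.attempts++;
    this.tier = "optimized";
    return true;
  }

  // Called when a guard in the optimized code fails.
  deoptimize() {
    this.tier = "baseline";
  }
}
```

A function that is optimized and deoptimized ten times in a row is refused on the eleventh attempt and stays on the baseline version from then on.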
An optimization example: Type Specialization
There are many different kinds of optimization, but I want to look at one of them so you can get a feel for how optimization happens. One of the biggest wins in optimizing compilers comes from something called type specialization.
The dynamic type system that JavaScript uses requires a little extra work at runtime. For example, consider the following code:
function arraySum(arr) {
  var sum = 0;
  for (var i = 0; i < arr.length; i++) {
    sum += arr[i];
  }
}
The += step in the loop may seem simple enough to complete the calculation in one step, but due to the use of dynamic typing, there are more steps than you might expect.
Suppose arr is an array of 100 integers. Once the code warms up, the baseline compiler creates a stub for each operation in the function. So there will be a stub for sum += arr[i], which handles the += operation as integer addition.
However, sum and arr[i] are not guaranteed to be integers. Because types are dynamic in JavaScript, arr[i] could turn out to be a string on a later iteration of the loop. Integer addition and string concatenation are two very different operations, so they compile to very different machine code.
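The difference is easy to observe from JavaScript itself. The snippet below repeats the function from the example, with a return statement added (not part of the original listing) so the result can be inspected.

```javascript
function arraySum(arr) {
  var sum = 0;
  for (var i = 0; i < arr.length; i++) {
    sum += arr[i];
  }
  return sum; // added for illustration so the result is visible
}

// All integers: every += is a numeric addition.
console.log(arraySum([1, 2, 3])); // prints 6

// One string in the middle: from that element on, += concatenates strings.
console.log(arraySum([1, "2", 3])); // prints 123 (the string "123")
```

The same line of source code performs addition in the first call and concatenation in the second, which is why a single integer-addition stub cannot cover every case.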
The JIT handles this by compiling multiple baseline stubs. If a piece of code is monomorphic (that is, always called with the same types), it gets one stub. If it is polymorphic (called with different types from one pass through the code to another), it gets a stub for each combination of types that has come through that operation.
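To make the monomorphic/polymorphic distinction concrete, here is a small sketch that counts the distinct type combinations seen at one `a + b` operation. The helper classifySite and its return shape are hypothetical. It only models the counting, not any real engine’s data structures.

```javascript
// Hypothetical helper: given the operand pairs observed at one `a + b` site,
// report how many stubs the baseline compiler would need for it.
function classifySite(observedOperands) {
  const combos = new Set(
    observedOperands.map(([a, b]) => `${typeof a}+${typeof b}`)
  );

  let kind = "unexecuted";
  if (combos.size === 1) kind = "monomorphic";
  else if (combos.size > 1) kind = "polymorphic";

  return { kind, stubsNeeded: combos.size };
}
```

For the operand pairs produced by summing [1, 2] the site is monomorphic and needs one stub. For the pairs produced by summing [1, "2", 3] it sees number+number, number+string, and string+number, so it is polymorphic and needs three.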
This means that the JIT has to ask a lot of questions before it chooses a stub.
Because each line of code has its own set of stubs in the baseline compiler, the JIT needs to check the types each time the line is executed. So on every iteration through the loop, it has to ask the same questions.
The code would execute much faster if the JIT didn’t need to repeat those checks. That is one of the things the optimizing compiler does.
In the optimizing compiler, the whole function is compiled together. The type checks can then be moved so that they happen before the loop.
Some JITs optimize this even further. For example, in Firefox there is a special classification for arrays that contain only integers. If arr is one of these arrays, the JIT doesn’t need to check whether arr[i] is an integer. This means the JIT can do all of the type checks before it enters the loop.
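The effect of moving the checks out of the loop can be imitated in plain JavaScript. This is only an analogy under stated assumptions: Int32Array stands in for the engine’s internal “integers only” array classification, and typeChecks, genericAdd, sumWithPerIterationChecks, and sumWithHoistedCheck are hypothetical names.

```javascript
let typeChecks = 0; // how many type questions were asked

// Stand-in for the general, slower addition path.
function genericAdd(a, b) {
  return a + b;
}

// Baseline style: the same type question is asked on every iteration.
function sumWithPerIterationChecks(arr) {
  let sum = 0;
  for (let i = 0; i < arr.length; i++) {
    typeChecks++;
    sum = typeof arr[i] === "number" ? sum + arr[i] : genericAdd(sum, arr[i]);
  }
  return sum;
}

// Optimized style: one check before the loop, none inside it.
function sumWithHoistedCheck(arr) {
  typeChecks++;
  if (!(arr instanceof Int32Array)) {
    return sumWithPerIterationChecks(arr); // assumption does not hold
  }
  let sum = 0;
  for (let i = 0; i < arr.length; i++) {
    sum += arr[i]; // every element is known to be an integer
  }
  return sum;
}
```

Summing three elements costs three checks in the per-iteration version and a single check in the hoisted version when the input is an Int32Array.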
Conclusion
That is the JIT in a nutshell. It makes JavaScript run faster by monitoring the code as it runs and sending hot code paths off to be optimized. This has improved the performance of most JavaScript applications many times over.
Even with these improvements, though, JavaScript performance can be unpredictable. And to make things faster, the JIT has added some overhead at runtime, including:
- Optimization and deoptimization
- Memory used by the monitor to record its bookkeeping information
- Memory used to store recovery information for deoptimization
- Memory used to store the baseline-compiled and optimized versions of a function
There is room for improvement here: that overhead could be removed, making performance more predictable. And that is what WebAssembly can do.
In the next article, I’ll explain more about assembly and how compilers work with it.
[1] The original word is “translate”, used in the general sense of translating or converting.
[2] Arrival (2016, directed by Denis Villeneuve).
[3] The original reads “monitor (aka a profiler)”.
[4] “Warm” and “hot” describe how frequently the code is executed. Chinese uses similar “heat” words to describe this kind of thing.
[5] A stub (or method stub) is a piece of code that stands in for some other functionality. A stub can simulate the behavior of existing code (such as a procedure on a remote machine) or act as a temporary substitute for code that has not been written yet. Stubs are therefore very useful in porting, distributed computing, and general software development and testing. Here it should be understood as “a piece of code that can be called”. For more discussion of stubs, see www.zhihu.com/question/24…
[6] The original reads “deoptimization or bailing out”.