The previous article (JS engine, run-time call stack overview) mainly gave an overview of the JS engine and the run-time call stack. This article delves into the internals of Google’s V8 JavaScript engine. It also offers some advice on how to write better JavaScript code.
Overview
A JavaScript engine is a program or interpreter that executes JavaScript code. A JavaScript engine can be implemented as a standard interpreter, or as a just-in-time compiler that compiles JavaScript to bytecode in some form.
Here is a list of some of the more popular projects implementing JavaScript engines:
- V8 – open source, developed by Google and written in C++
- Rhino – managed by the Mozilla Foundation, open source and developed entirely in Java
- SpiderMonkey – the first JavaScript engine, formerly used in Netscape browsers and now used in Firefox
- JavaScriptCore – open source, marketed as Nitro, developed by Apple for Safari
- KJS – the engine for KDE, originally developed by Harri Porten for the KDE project’s Konqueror web browser
- Chakra (JScript9) – Internet Explorer
- Chakra (JavaScript) – Microsoft Edge
- Nashorn – open source as part of OpenJDK, written by the Oracle Java Languages and Tool Group
- JerryScript – a lightweight engine for the Internet of Things
Why was the V8 engine built?
The V8 engine built by Google is an open source project written in C++ for use inside Google Chrome. However, unlike other engines, V8 is also used in the popular Node.js runtime.
V8 was originally designed to improve the performance of JavaScript running inside the browser. To gain speed, V8 translates JavaScript code into more efficient machine code instead of using an interpreter. Like other JavaScript engines such as SpiderMonkey or Rhino (Mozilla), V8 implements a just-in-time (JIT) compiler that compiles JavaScript code to machine code at execution time. The main difference is that V8, in its original design, did not generate bytecode or any other intermediate code.
V8 used to have two compilers
Before V8 5.9 came out, the engine used two compilers:
- Full-codegen – a simple, very fast compiler that produces simple and relatively slow machine code
- Crankshaft – a more complex (JIT) optimizing compiler that produces highly optimized code
The V8 engine also uses several threads internally:
- The main thread does exactly what you’d expect: it takes your code, compiles it, and executes it
- There is also another thread for compilation, so that the main thread can continue executing while the code is being optimized
- A profiler thread tells the runtime which methods consume a lot of time so that Crankshaft can optimize them
- Several threads handle garbage collection sweeps
When JavaScript code is first executed, V8 uses Full-codegen to translate the parsed JavaScript directly into machine code without any intermediate transformation, which allows the engine to start executing code very quickly. It is worth noting that, in this pipeline, V8 does not use an intermediate bytecode representation, so there is no need for an interpreter.
After your code has been running for a while, the profiler thread has collected enough data to tell which methods should be optimized.
Next, Crankshaft starts optimizing in another thread. It translates the JavaScript abstract syntax tree into a high-level static single-assignment (SSA) representation called Hydrogen and tries to optimize that Hydrogen graph. Most optimizations are done at this level.
Inlining
The first optimization is to inline as much code as possible ahead of time. Inlining replaces the call site (the line of code where the function is called) with the body of the called function. This simple step makes the optimizations that follow more meaningful.
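To make the idea concrete, here is a small hand-written illustration of what inlining means. The function names are made up for this example, and the “inlined” version is written by hand; the engine performs this transformation internally on its own representation, not on your source code.

```javascript
// A small helper that is called from another function.
function square(n) {
  return n * n;
}

// Original code: two call sites of square().
function sumOfSquares(a, b) {
  return square(a) + square(b);
}

// Conceptually, after inlining, each call site is replaced
// by the body of square(), so no calls remain.
function sumOfSquaresInlined(a, b) {
  return a * a + b * b;
}
```

Both versions compute the same result; the inlined one simply avoids the call overhead and exposes the arithmetic to further optimization.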
Hidden Classes
JavaScript is a prototype-based language: there are no classes, and objects are created by cloning. JavaScript is also a dynamic programming language, meaning that properties can easily be added to or removed from an object after it is instantiated.
Most JavaScript interpreters use dictionary-like structures (based on hash functions) to store the location of an object’s property values in memory. This structure makes retrieving the value of a property computationally more expensive in JavaScript than in non-dynamic languages such as Java or C#. In Java, all object properties are determined by a fixed object layout before compilation and cannot be added or removed dynamically at run time (well, C# has a dynamic type, but that is a topic for another day). As a result, property values (or pointers to those properties) can be stored in a contiguous buffer in memory, with a fixed offset between each. The length of an offset can easily be determined from the property type, which is not possible in JavaScript, where a property’s type can change at run time.
Because using dictionaries to locate object properties in memory is inefficient, V8 uses a different approach: hidden classes. Hidden classes work much like the fixed object layouts (classes) used in Java, except that they are created at run time. Now let’s see what they look like:
function Point(x, y) {
    this.x = x;
    this.y = y;
}
var p1 = new Point(1, 2);
Once “new Point(1, 2)” is executed, V8 creates a hidden class called “C0”.
No properties have been defined for Point yet, so “C0” is empty.
As soon as the first statement “this.x = x” is executed (inside the “Point” function), V8 creates a second hidden class called “C1”, based on “C0”. “C1” describes the location in memory (relative to the object pointer) where the property “x” can be found. Here, “x” is stored at offset 0, which means that when a Point object is viewed as a contiguous buffer in memory, the first offset corresponds to the property “x”. V8 also updates “C0” with a “class transition”, which states that if the property “x” is added to a Point object, the hidden class should switch from “C0” to “C1”. The hidden class for the Point object is now “C1”.
Each time a new property is added to an object, the old hidden class is updated with a transition path to the new hidden class. Hidden class transitions are important because they allow hidden classes to be shared among objects that are created in the same way. If two objects share a hidden class and the same property is added to both of them, the transitions ensure that both objects receive the same new hidden class and all the optimized code that comes with it.
This process is repeated when the statement “this.y = y” is executed (again inside the Point function, after the “this.x = x” statement).
A new hidden class named “C2” is created, and a class transition is added to “C1” stating that if the property “y” is added to a Point object (which already contains the property “x”), the hidden class should change to “C2”. The hidden class of the Point object is then updated to “C2”.
Hidden class transitions depend on the order in which properties are added to an object. Take a look at the following code snippet:
function Point(x, y) {
    this.x = x;
    this.y = y;
}
var p1 = new Point(1, 2);
p1.a = 5;
p1.b = 6;
var p2 = new Point(3, 4);
p2.b = 7;
p2.a = 8;
Now you might assume that “p1” and “p2” use the same hidden classes and transitions. Not so. For “p1”, the property “a” is added first, followed by the property “b”. For “p2”, however, “b” is assigned first, then “a”. As a result, “p1” and “p2” end up with different hidden classes because they followed different transition paths. It is therefore best to initialize dynamic properties in the same order, so that hidden classes can be reused.
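Hidden classes cannot be observed directly from JavaScript, so the following is only a toy model of the transition mechanism described above, not V8’s actual implementation. The `Shape` class, the `shapeFor` helper and the variable names are all hypothetical and exist only for this sketch.

```javascript
// Toy model of a hidden class: it records the slot offset of each
// property and the transitions to the shapes that extend it.
class Shape {
  constructor(offsets = new Map()) {
    this.offsets = offsets;       // property name -> slot offset
    this.transitions = new Map(); // property name -> next Shape
  }

  // Returns the shape reached by adding `name`, creating it on first use
  // so that later objects following the same path share it.
  add(name) {
    let next = this.transitions.get(name);
    if (!next) {
      const offsets = new Map(this.offsets);
      offsets.set(name, offsets.size); // new property takes the next free slot
      next = new Shape(offsets);
      this.transitions.set(name, next);
    }
    return next;
  }
}

// The empty starting shape, playing the role of "C0".
const C0 = new Shape();

// Follows the transition path for properties added in the given order.
function shapeFor(...names) {
  return names.reduce((shape, name) => shape.add(name), C0);
}

const p1Shape = shapeFor("x", "y", "a", "b"); // like p1 in the snippet
const p2Shape = shapeFor("x", "y", "b", "a"); // like p2: different order
const p3Shape = shapeFor("x", "y", "a", "b"); // same order as p1

console.log(p1Shape === p3Shape); // same path, shared shape
console.log(p1Shape === p2Shape); // different path, different shape
```

In this model, the two orderings also place “a” and “b” at different offsets, which is why code specialized for one shape cannot be reused for the other.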
Inline cache
V8 leverages another technique called inline caching to optimize dynamically typed languages. Inline caching relies on repeated calls of the same method on the same object type. An in-depth explanation of the inline cache can be found here.
We’ll cover the general concept of inline caching (in case you don’t have time to read the in-depth explanation above).
How does inline caching work? V8 maintains a cache of the object types of arguments passed in recent method calls, using this information to make assumptions about the object types of arguments passed in the future. If V8 can make good assumptions about the type of object passed to the method, it can bypass the process of guessing how to access object properties and instead use the previously saved information to find the object’s hidden class.
So how do hidden classes and inline caches relate? Whenever a method is called on a particular object, the V8 engine has to look up that object’s hidden class to determine the offset for accessing a particular property. After two successful calls to the same method with the same hidden class, V8 skips the hidden class lookup and simply adds the offset of the property to the object pointer. On all subsequent calls to that method, the V8 engine assumes that the hidden class has not changed and jumps directly to the memory address of that property using the offset stored from the previous lookups. This greatly increases execution speed.
Inline caching is also why it is so important for objects of the same type to share hidden classes. If you create two objects of the same type but with different hidden classes (as in the previous example), V8 cannot use inline caching because, even though the two objects are of the same type, their hidden classes assign different offsets to their properties.
The objects are essentially the same, but the “a” and “b” properties were created in a different order.
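The relationship between hidden classes and inline caches can be sketched with a toy model. This is a simplification under stated assumptions, not how V8 is implemented: the `shape`/`slots` object layout, `createLoadSite` and the hit/miss counters are all invented for illustration, and the cache here is filled after a single lookup for brevity.

```javascript
// Two toy "hidden classes" that store the same properties at different offsets.
const shapeAB = { offsets: { a: 0, b: 1 } };
const shapeBA = { offsets: { b: 0, a: 1 } };

// A toy object: a pointer to its shape plus a flat array of slots.
function makeObject(shape, slots) {
  return { shape, slots };
}

// Models one property-load site in the code, with its own inline cache.
function createLoadSite(name) {
  let cachedShape = null;
  let cachedOffset = -1;
  const stats = { hits: 0, misses: 0 };

  function load(obj) {
    if (obj.shape === cachedShape) {
      // Fast path: same shape as last time, reuse the stored offset.
      stats.hits++;
      return obj.slots[cachedOffset];
    }
    // Slow path: look the offset up in the shape and remember it.
    stats.misses++;
    cachedShape = obj.shape;
    cachedOffset = obj.shape.offsets[name];
    return obj.slots[cachedOffset];
  }

  return { load, stats };
}

const site = createLoadSite("a");
const o1 = makeObject(shapeAB, [5, 6]); // a = 5, b = 6
const o2 = makeObject(shapeAB, [7, 8]); // same shape as o1
const o3 = makeObject(shapeBA, [6, 5]); // a = 5, but stored at offset 1

site.load(o1); // miss: the cache is empty
site.load(o2); // hit: same shape, offset reused
site.load(o3); // miss: different shape, cache must be refilled
console.log(site.stats);
```

The third load still returns the right value, but only by falling back to the slow path, which mirrors why objects with differing hidden classes defeat the cache.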
Compile to machine code
Once the Hydrogen graph has been optimized, Crankshaft lowers it to a lower-level representation called Lithium. Most of the Lithium implementation is architecture-specific. Register allocation happens at this level.
Finally, Lithium is compiled into machine code. Then something else happens, called OSR: on-stack replacement. Before an obviously long-running method is compiled and optimized, it has most likely already been running. V8 does not throw away what it has just slowly executed in order to start again with the optimized version. Instead, it transforms all the context it has (stack, registers) so that it can switch to the optimized version in the middle of execution. This is a very complex task, bearing in mind that, among other optimizations, V8 has already inlined code. V8 is not the only engine capable of doing this.
There is a safeguard called deoptimization, which performs the reverse transformation and reverts to the unoptimized code in case an assumption the engine made no longer holds.
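As a rough illustration of the kind of assumption that can stop holding, consider a function that is called many times with numbers and then once with strings. Whether and when an engine actually optimizes or deoptimizes this code is an internal decision that cannot be observed from plain JavaScript; the example only shows that the program’s results stay correct either way. The function and variable names are made up for this sketch.

```javascript
// A function whose arguments have been numbers on every call so far.
function add(a, b) {
  return a + b;
}

// Many calls with numbers: an optimizing compiler may specialize
// add() for numeric addition based on this observed behavior.
let total = 0;
for (let i = 0; i < 10000; i++) {
  total = add(total, 1);
}

// A call with strings breaks the "always numbers" assumption.
// An engine that specialized add() would have to fall back to
// more general code here; the result is still correct.
const label = add("v", 8);

console.log(total, label);
```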
Garbage collection
For garbage collection, V8 uses the traditional generational approach, with mark-and-sweep to clean the old generation. The marking phase is supposed to stop JavaScript execution. To control GC costs and make execution more stable, V8 uses incremental marking: instead of walking the whole heap and trying to mark every possible object, it walks only part of the heap and then resumes normal execution. The next GC stop continues from where the previous heap walk left off. This allows only very short pauses during normal execution. As mentioned earlier, the sweep phase is handled by separate threads.
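The idea of incremental marking can be sketched as a marker that does a bounded amount of work per step. This is a minimal toy model, not V8’s collector: the `refs` field, `createIncrementalMarker` and the per-step `budget` are assumptions made for the example, and it ignores the fact that a real collector must also track objects the program modifies between steps.

```javascript
// Creates a marker that walks the object graph a few objects at a time.
function createIncrementalMarker(roots) {
  const marked = new Set();
  const worklist = [...roots]; // objects discovered but not yet visited

  // Marks at most `budget` objects, then returns so that normal
  // execution can resume. Returns true once marking is complete.
  function step(budget) {
    let visited = 0;
    while (worklist.length > 0 && visited < budget) {
      const obj = worklist.pop();
      if (marked.has(obj)) continue; // already handled via another path
      marked.add(obj);
      visited++;
      for (const child of obj.refs) {
        if (!marked.has(child)) worklist.push(child);
      }
    }
    return worklist.length === 0;
  }

  return { step, marked };
}

// A tiny toy heap: a -> b -> c, a -> c, and d is unreachable.
const c = { name: "c", refs: [] };
const b = { name: "b", refs: [c] };
const a = { name: "a", refs: [b, c] };
const d = { name: "d", refs: [a] };

const marker = createIncrementalMarker([a]);
let steps = 0;
let done = false;
while (!done) {
  done = marker.step(1); // one object per step, "program" runs in between
  steps++;
}

// Anything not marked (here: d) would be reclaimed by the sweep phase.
console.log(steps, [...marker.marked].map((o) => o.name));
```

Each call to `step` picks up where the previous one stopped, which is what keeps the individual pauses short.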
Ignition and TurboFan
V8 5.9, released in 2017, introduced a new execution pipeline. This new pipeline achieves even greater performance improvements and significant memory savings in real JavaScript projects.
The new pipeline is built on top of Ignition, V8’s interpreter, and TurboFan, V8’s newest optimizing compiler.
You can see the V8 team’s blog on the subject here.
Since V8 version 5.9 was released, Full-codegen and Crankshaft (the technologies that had served V8 since 2010) are no longer used by V8 for JavaScript execution, because the V8 team struggled to keep up with new JavaScript language features and the optimizations those features require.
This means that overall V8 will have a much simpler and more maintainable architecture in the future.
These improvements are just the beginning. The new Ignition and TurboFan pipeline paves the way for further optimizations that will boost JavaScript performance and reduce V8’s footprint in both Chrome and Node.js in the coming years.
Finally, here are some tips and tricks on how to write well-optimized and better JavaScript. You can easily draw these conclusions from the above, but here’s a summary for your convenience:
How to write performance-optimized code
- Order of object properties: Always instantiate object properties in the same order, so that hidden classes and the optimized code that follows from them can be shared.
- Dynamic properties: Adding properties to an object after instantiation forces a hidden class change and slows down any methods that were optimized for the previous hidden class. Instead, assign all of an object’s properties in its constructor.
- Methods: Code that executes the same method repeatedly runs faster than code that executes many different methods only once (because of inline caching).
- Arrays: Avoid sparse arrays, where the keys are not consecutive numbers. A sparse array that does not have every element inside it is stored as a hash table, and its elements are more expensive to access. Also avoid pre-allocating large arrays; it is better to grow them as needed. Finally, do not delete elements from an array, as this makes the keys sparse.
- Tagged values: V8 represents objects and numbers with 32 bits. One bit is used as a tag to tell whether the value is an object (flag = 1) or an integer (flag = 0); such an integer is called an SMI (SMall Integer) because it has 31 bits. If a numeric value does not fit in 31 bits, V8 boxes the number, turning it into a double and creating a new object to hold it. Try to use 31-bit signed numbers whenever possible to avoid the expensive boxing operation into a JS object.
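Two of the tips above can be illustrated with a few lines of code. The helper `fitsIn31BitSmi` and the constant names are hypothetical and only encode the 31-bit signed range described in the list; the actual small-integer range depends on the V8 build and platform. The array part shows how `delete` leaves a hole, which is the observable side of an array becoming sparse.

```javascript
// Range of a 31-bit signed integer, as described in the tips above.
const SMI_MIN = -(2 ** 30);
const SMI_MAX = 2 ** 30 - 1;

// Returns true if `value` is an integer inside the 31-bit signed range.
// Negative zero is excluded because it cannot be stored as a plain integer.
function fitsIn31BitSmi(value) {
  return (
    Number.isInteger(value) &&
    !Object.is(value, -0) &&
    value >= SMI_MIN &&
    value <= SMI_MAX
  );
}

// Dense array: grown as needed, every index from 0 to length - 1 is present.
const dense = [];
for (let i = 0; i < 5; i++) {
  dense.push(i * 2);
}

// Deleting an element leaves a hole: the length stays the same,
// but index 2 no longer exists on the array.
const holey = [0, 2, 4, 6, 8];
delete holey[2];

console.log(fitsIn31BitSmi(SMI_MAX), fitsIn31BitSmi(SMI_MAX + 1));
console.log(2 in dense, 2 in holey);
```

If an element has to be dropped, building a new array (for example with `filter`) keeps the result free of holes.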
The article translation: blog.sessionstack.com/how-javascr…