🥳 Welcome, interested partners: let's do something meaningful together! Translator: Dao Li
I run a weekly translation project at github.com and fedarling.github.io
We are still short of like-minded partners; this is purely a personal-interest project, though it will of course also help improve your English and front-end skills. Requirements: reasonable English, proficiency with GitHub, persistence, modesty, and responsibility for what you do.
If you want to participate, you can click here to learn more, or open an issue on the repository. My blog also lists my contact information: daodaolee.cn
This article describes some key basics that are common to JavaScript engines — not just V8. As a JavaScript developer, understanding how the JavaScript engine works can help you write better code.
If you haven't read the previous article, JavaScript Engine Basics (Part 1): Shapes and Inline Caches, please check it out first; it introduces many of the terms used here.
In the previous article, we discussed how JavaScript engines optimize object and array access by using Shapes and inline caches. This article looks at the trade-offs in optimization pipelines and at how engines speed up access to prototype properties.
Optimization tiers and execution trade-offs
The previous article discussed how modern JavaScript engines have the same pipeline:
We also pointed out that while the high-level pipeline is similar between engines, the optimization pipelines often differ. Why is that? Why do some engines have more optimization tiers than others? It turns out there is a trade-off between getting code running quickly and making that code run fast:
An interpreter can produce bytecode quickly, but bytecode is usually inefficient to execute. An optimizing compiler, on the other hand, takes longer to produce code, but the resulting machine code is much more efficient.
This is exactly the model V8 uses. V8's interpreter, called Ignition, is the fastest of these interpreters (in terms of raw bytecode execution speed). V8's optimizing compiler, called TurboFan, eventually generates highly optimized machine code:
This trade-off between startup latency and execution speed is why some JavaScript engines choose to add an optimization tier in between. For example, SpiderMonkey adds a Baseline tier between its interpreter and the IonMonkey optimizing compiler:
The interpreter generates bytecode quickly, but the bytecode executes relatively slowly. Baseline takes a bit longer to generate code, but that code runs faster. Finally, the IonMonkey optimizing compiler takes the longest to produce machine code, but that code runs very efficiently.
Let's look at a concrete example of how the pipelines in the different engines handle it. Here is some code that is repeated often in a hot loop:
let result = 0;
for (let i = 0; i < 4242424242; ++i) {
result += i;
}
console.log(result);
V8 starts by running the bytecode in the Ignition interpreter. At some point, the engine determines the code is hot and starts up the TurboFan frontend, the part of the engine that integrates profiling data and builds a basic machine representation of the code. This is then sent to the TurboFan optimizer on a different thread for further improvement:
While the optimizer is busy, V8 keeps executing the bytecode in Ignition. At some point the optimizer is done, we have executable machine code, and execution continues with that machine code.
Starting with Chrome 91 (released in 2021), V8 adds a baseline compiler called Sparkplug between the Ignition interpreter and the TurboFan optimizing compiler.
The SpiderMonkey engine also starts by running bytecode in its interpreter. But it has an additional Baseline tier: hot code is first sent to the Baseline compiler, which generates Baseline code on the main thread and continues execution once it is ready.
After the Baseline code has been running for a while, SpiderMonkey eventually fires up the IonMonkey optimizer, much like V8. While IonMonkey optimizes, the code keeps running in the Baseline tier. Finally, when the optimizer is done, the optimized code executes instead of the Baseline code.
Chakra's architecture is very similar to SpiderMonkey's, except that Chakra tries to run more things concurrently to avoid blocking the main thread. Instead of running any part of the compiler on the main thread, Chakra copies the bytecode and the profiling data the compiler might need and sends it to a dedicated compiler process:
When the generated code is ready, the engine starts running the SimpleJIT code instead of the bytecode, and likewise later for FullJIT. The advantage of this approach is that the pause for copying is usually much shorter than running the full compiler on the main thread. The downside is that the copy may omit some information needed for optimization, so it trades some code quality for lower latency.
In JavaScriptCore, all optimizing compilers run fully concurrently with the main JavaScript thread; note that there is no copying! Instead, the main thread merely triggers compilation jobs on another thread. The compilers then access the profiling data from the main thread using a complicated locking scheme:
The advantage of this approach is that it reduces the lag caused by JavaScript optimizations on the main thread, while the disadvantage is that it requires dealing with complex multithreading issues and incurring some locking costs for various operations.
So far, we've discussed the trade-off between generating code quickly (interpreter) and generating fast code (optimizing compiler). But there is another trade-off: memory usage! To illustrate, here is a simple function that adds two numbers:
function add(x, y) {
return x + y;
}
add(1, 2);
Here is the bytecode we generated for the add function using the Ignition interpreter in V8:
StackCheck
Ldar a1
Add a0, [0]
Return
Without fully understanding the bytecode, it is easy to see that it has only four instructions.
Once the code becomes hot, TurboFan generates the following highly optimized machine code:
leaq rcx,[rip+0x0]
movq rcx,[rcx-0x37]
testb [rcx+0xf],0x1
jnz CompileLazyDeoptimizedCode
push rbp
movq rbp,rsp
push rsi
push rdi
cmpq rsp,[r13+0xe88]
jna StackOverflow
movq rax,[rbp+0x18]
test al,0x1
jnz Deoptimize
movq rbx,[rbp+0x10]
testb rbx,0x1
jnz Deoptimize
movq rdx,rbx
shrq rdx, 32
movq rcx,rax
shrq rcx, 32
addl rdx,rcx
jo Deoptimize
shlq rdx, 32
movq rax,rdx
movq rsp,rbp
pop rbp
ret 0x18
Wow, that's a lot of code, especially compared to the four bytecode instructions! In general, bytecode is much more compact than machine code, especially optimized machine code. On the other hand, bytecode requires an interpreter to run, while optimized code can be executed directly by the processor.
That’s why JavaScript engines don’t just “optimize all code”. As we saw earlier, it takes a long time to generate optimized machine code, and on top of that, we just learned that optimized machine code also requires more memory.
Summary: JavaScript engines have different optimization tiers because of the fundamental trade-off between generating code quickly (interpreter) and generating fast code (optimizing compiler). It's a spectrum, and adding more optimization tiers allows more fine-grained decisions at the cost of additional complexity and overhead. In addition, there is a trade-off between the optimization level and the memory usage of the generated code. This is why JavaScript engines try to optimize only hot functions.
Optimizing prototype property access
The previous article explained how JavaScript engines use Shapes and Inline Caches (ICs) to optimize object property loads. As a refresher, the engine stores an object's Shape separately from its values:
Shapes enable an optimization called Inline Caches (ICs). Combined, Shapes and ICs speed up repeated property accesses from the same place in your code.
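As a small illustration of that idea (a sketch only; the Shape sharing itself is not observable from JavaScript):

```javascript
// Both objects are created with the same property layout, so internally
// the engine lets them share one Shape. The property access in getX then
// always sees the same Shape and can be served from its inline cache.
function getX(o) {
  return o.x;
}

const a = { x: 1, y: 2 };
const b = { x: 3, y: 4 };

console.log(getX(a)); // → 1
console.log(getX(b)); // → 3
```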
Classes and prototype-based programming
Now that we know how to quickly access properties on JavaScript objects, let’s look at a recent addition to JavaScript: Classes. The JavaScript class syntax looks like this:
class Bar {
  constructor(x) {
    this.x = x;
  }
  getX() {
    return this.x;
  }
}
Although this may look like a new concept in JavaScript, it is just syntactic sugar for the prototype-based programming that JavaScript has always had:
function Bar(x) {
this.x = x;
}
Bar.prototype.getX = function getX() {
return this.x;
};
Here we assign a getX property on the Bar.prototype object. This works exactly like any other object, because prototypes are just objects in JavaScript too! In prototype-based programming languages like JavaScript, methods are shared via prototypes, while fields are stored on the actual instances.
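That sharing is easy to verify directly (a minimal sketch using the Bar constructor from above):

```javascript
function Bar(x) {
  this.x = x;
}
Bar.prototype.getX = function getX() {
  return this.x;
};

const a = new Bar(1);
const b = new Bar(2);

// The method exists once, on the prototype; the field lives per instance.
console.log(a.getX === b.getX);                               // → true
console.log(Object.prototype.hasOwnProperty.call(a, 'getX')); // → false
console.log(a.getX() + b.getX());                             // → 3
```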
Let's look at what happens behind the scenes when we create a new Bar instance named foo:
const foo = new Bar(true);
The instance created by running this code has a Shape with a single property "x". foo's prototype is Bar.prototype:
Bar.prototype has its own Shape containing a single property "getX" whose value is the function getX, which returns this.x when called. The prototype of Bar.prototype is Object.prototype. Object.prototype is the root of the prototype tree, so its prototype is null.
If you create another instance of the same class, both instances share the Shape we discussed earlier, and both point to the same Bar.prototype object.
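The chain described above can be checked directly in code; this sketch walks it link by link:

```javascript
class Bar {
  constructor(x) { this.x = x; }
  getX() { return this.x; }
}

const foo = new Bar(true);

// Walk the prototype chain up to its root.
console.log(Object.getPrototypeOf(foo) === Bar.prototype);              // → true
console.log(Object.getPrototypeOf(Bar.prototype) === Object.prototype); // → true
console.log(Object.getPrototypeOf(Object.prototype));                   // → null
```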
Prototype property access
Ok, now we know what happens when we define a class and create a new instance. But what happens if we call methods on instances?
class Bar {
  constructor(x) { this.x = x; }
  getX() { return this.x; }
}

const foo = new Bar(true);
const x = foo.getX();
//        ^^^^^^^^^^
Let’s break it up:
const x = foo.getX();
// Two steps
const $getX = foo.getX;
const x = $getX.call(foo);
The first step is to load the method, which is just a property on the prototype (whose value happens to be a function); the second step is to call the function with the instance as its this value. Let's look at the first step: loading the method getX from the instance foo:
The engine starts at the foo instance and finds there is no "getX" property on foo's Shape, so it has to walk up the prototype chain. We get to Bar.prototype, look at its Shape, and see that it has a "getX" property at offset 0. We look up the value at this offset in Bar.prototype and find the JSFunction getX we were looking for. That's the whole process!
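Spelled out in plain JavaScript, that lookup looks roughly like this sketch (the loop models the chain walk; actual engines do this internally via Shapes):

```javascript
class Bar {
  constructor(x) { this.x = x; }
  getX() { return this.x; }
}
const foo = new Bar(true);

// Step 1: walk the chain until an object's own properties
// (conceptually, its Shape) contain 'getX'.
let holder = foo;
while (!Object.getOwnPropertyDescriptor(holder, 'getX')) {
  holder = Object.getPrototypeOf(holder);
}
console.log(holder === Bar.prototype); // → true

// Step 2: call the function that was found, with foo as `this`.
const $getX = holder.getX;
console.log($getX.call(foo)); // → true
```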
But JavaScript's flexibility means the prototype chain link can change, for example:
const foo = new Bar(true);
foo.getX();
// → true

Object.setPrototypeOf(foo, null);
foo.getX();
// → Uncaught TypeError: foo.getX is not a function
In this example, we call foo.getX() twice, but each time it means something completely different. This is why, even though prototypes are just objects in JavaScript, speeding up prototype property access is even harder for engines than speeding up access to an object's own properties.
Looking at this code, loading a prototype property is a very frequent operation: it happens every time a method is called!
class Bar {
  constructor(x) { this.x = x; }
  getX() { return this.x; }
}

const foo = new Bar(true);
const x = foo.getX();
//        ^^^^^^^^^^
Earlier, we discussed how engines optimize loads of regular, own properties by using Shapes and inline caching. So how can we optimize repeated loads of prototype properties on objects with similar Shapes? We saw above how the property load happens:
To speed up repeated loads in this particular case, we need to know these three things:

1. foo's Shape does not contain 'getX' and has not changed. This means no one altered the object foo by adding or removing a property, or by changing a property attribute.
2. foo's prototype is still the original Bar.prototype. This means no one changed foo's prototype using Object.setPrototypeOf() or by assigning to the special __proto__ property.
3. Bar.prototype's Shape contains 'getX' and has not changed. This means no one altered Bar.prototype by adding or removing a property, or by changing a property attribute.
In general, this means we must perform one check on the instance itself, plus two checks per prototype, up to and including the prototype that holds the property we're looking for. 1+2N checks (where N is the number of prototypes involved) may not sound too bad here, because the prototype chain is relatively shallow, but engines often have to deal with much longer prototype chains, as with common DOM classes. For example:
const anchor = document.createElement('a');
// → HTMLAnchorElement

const title = anchor.getAttribute('title');
We have an HTMLAnchorElement and call the getAttribute() method on it. This involves a chain of six prototypes! Most useful DOM methods are not on the immediate HTMLAnchorElement.prototype, but higher up the chain:
The getAttribute() method is found on Element.prototype. That means every time we call anchor.getAttribute(), the JavaScript engine needs to do the following:
- check that 'getAttribute' is not on the anchor object itself,
- check that the direct prototype is HTMLAnchorElement.prototype,
- assert the absence of 'getAttribute' there,
- check that the next prototype is HTMLElement.prototype,
- assert the absence of 'getAttribute' there as well,
- check that the next prototype is Element.prototype,
- and that 'getAttribute' is present there.
That's seven checks in total! Since this kind of code is very common on the web, engines apply tricks to reduce the number of checks needed to load properties from prototypes.
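Since the DOM isn't available outside the browser, here is a plain-object sketch of such a chain; the *Proto names are illustrative stand-ins for the real DOM prototypes:

```javascript
// 'getAttribute' lives several links up the chain, just as it does on
// Element.prototype in the real DOM.
const EventTargetProto = Object.create(Object.prototype);
const NodeProto = Object.create(EventTargetProto);
const ElementProto = Object.create(NodeProto);
ElementProto.getAttribute = function (name) {
  return this.attrs[name] ?? null;
};
const HTMLElementProto = Object.create(ElementProto);
const HTMLAnchorElementProto = Object.create(HTMLElementProto);

const anchor = Object.create(HTMLAnchorElementProto);
anchor.attrs = { title: 'hello' };

// The lookup has to walk past several empty prototypes before it
// finds getAttribute on ElementProto.
console.log(anchor.getAttribute('title')); // → 'hello'
```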
Going back to the previous example, we performed a total of three checks when accessing ‘getX’ on foo:
class Bar {
  constructor(x) { this.x = x; }
  getX() { return this.x; }
}

const foo = new Bar(true);
const $getX = foo.getX;
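The 1+2N counting can be modeled in code; countChecks below is an illustrative helper, not an engine API:

```javascript
class Bar {
  constructor(x) { this.x = x; }
  getX() { return this.x; }
}

// Hypothetical model of the engine's work: one Shape check on the
// instance, plus two checks (prototype link + Shape) per prototype
// visited until the property is found.
function countChecks(obj, key) {
  let checks = 1; // Shape check on the instance itself
  let current = obj;
  while (!Object.prototype.hasOwnProperty.call(current, key)) {
    current = Object.getPrototypeOf(current);
    checks += 2; // prototype-link check + Shape check on that prototype
  }
  return checks;
}

const foo = new Bar(true);
console.log(countChecks(foo, 'getX')); // → 3, i.e. 1 + 2N with N = 1
```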
For every object in the chain up to and including the prototype that carries the property, we need to do Shape checks for absence. It would be nice if we could reduce the number of checks by folding the prototype check into the absence check. And that is essentially what engines do with a simple trick: instead of storing the prototype link on the instance itself, the engine stores it on the Shape.
Each Shape points to a prototype. This also means that whenever foo's prototype changes, the engine transitions to a new Shape. Now we only need to check an object's Shape both to assert the absence of certain properties and to guard the prototype link.
With this approach, the number of checks required for a prototype property access drops from 1+2N to 1+N. That is still not cheap, though, since it is still linear in the length of the prototype chain. Engines implement further tricks to reduce the number of checks even more, especially for subsequent executions of the same property load.
ValidityCell
V8 treats prototype Shapes specially for exactly this purpose. Each prototype has a unique Shape that is not shared with any other object (in particular not with other prototypes), and each of these prototype Shapes has a special ValidityCell associated with it.
This ValidityCell is invalidated whenever someone changes the associated prototype or any prototype above it. Let's see how that works.
To speed up subsequent loads of prototype properties, V8 places an inline cache with four fields:
When this code first runs and warms up the inline cache, V8 remembers the offset at which the property was found in the prototype, the prototype on which the property was found (Bar.prototype here), the Shape of the instance (foo's Shape here), and a link to the current ValidityCell of the immediate prototype linked from the instance Shape (also Bar.prototype here).
The next time the inline cache is hit, the engine checks the instance's Shape and the ValidityCell. If the cell is still valid, the engine can reach directly to the remembered offset on the remembered prototype, skipping the extra lookups:
When the prototype changes, a new Shape is allocated and the previous ValidityCell is invalidated. So the inline cache misses the next time it executes, resulting in worse performance.
Going back to the DOM element example, this means that any change to, say, Object.prototype invalidates not only the inline caches for Object.prototype itself, but also for any prototype below it, including EventTarget.prototype, Node.prototype, Element.prototype, and so on, all the way down to HTMLAnchorElement.prototype:
In practice, modifying Object.prototype while your code is running means throwing performance out the window. Don't do that!
Let's explore this further with a concrete example. Suppose we have a class Bar, and a function loadX that calls a method on Bar objects. We call loadX several times with instances of the same class:
class Bar { /* … */ }

function loadX(bar) {
  return bar.getX(); // IC for getX on Bar instances.
}

loadX(new Bar(true));
loadX(new Bar(false));
// The IC in loadX now links to the ValidityCell for Bar.prototype.

Object.prototype.newMethod = y => y;
// The ValidityCell in the loadX IC is now invalid,
// because Object.prototype changed.
The inline cache in loadX now points to the ValidityCell for Bar.prototype. If you later do something like mutating Object.prototype, the ValidityCell is invalidated, and existing inline caches miss the next time they are hit, resulting in worse performance.
Avoid mutating Object.prototype, since it invalidates every inline cache for prototype loads that the engine has set up so far. Here is another example of what not to do:
Object.prototype.foo = function() { /* … */ };
someObject.foo();
delete Object.prototype.foo;
We extend Object.prototype, which invalidates any inline caches for prototype loads the engine has placed so far. Then we run some code that uses the new prototype method. The engine has to start from scratch and set up new inline caches for all prototype property accesses. Finally, we "clean up after ourselves" and remove the prototype method we added earlier.
"Cleaning up" sounds like a good idea, but in this case it makes things worse! Deleting the property modifies Object.prototype, so all the inline caches are invalidated again and the engine must start from scratch once more.
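A safer pattern, sketched below, is to keep such helpers as plain functions (or on your own objects) instead of patching Object.prototype at all; the name describe is purely illustrative:

```javascript
// A standalone helper never touches Object.prototype, so no prototype
// Shapes change and no ValidityCells are invalidated.
function describe(obj) {
  return `object with ${Object.keys(obj).length} own keys`;
}

const someObject = { a: 1, b: 2 };
console.log(describe(someObject)); // → 'object with 2 own keys'
```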
Summary: although prototypes are just objects, they are treated specially by JavaScript engines to optimize the performance of method lookups on prototypes. Leave your prototypes alone! Or, if you really do need to touch prototypes, do it before other code runs, so that you at least don't invalidate all of the engine's optimizations while your code is running.
Conclusion
We’ve seen how the JavaScript engine stores objects and classes, and how Shapes, Inline Caches, and ValidityCells help optimize prototype operations. Based on this knowledge, we’ve identified a practical JavaScript coding trick that can help improve performance: Don’t mess with prototypes (or if you really need to, at least do so before the rest of the code runs).
References
JavaScript engine fundamentals: optimizing prototypes
Translation plan