As a JavaScript developer, an in-depth understanding of how the JavaScript engine works helps you understand the performance characteristics of your code. This article covers some of the key basics that are common in all JavaScript engines, not just V8.

JavaScript Engine workflow

It all starts with the JavaScript code you write. The JavaScript engine parses the source code and converts it into an abstract syntax tree (AST). Based on the AST, the interpreter can start working and generate bytecode. At this point, the engine actually starts running JavaScript code. To make it run faster, the bytecode can be sent to the optimizing compiler along with profiling data. The optimizing compiler makes certain assumptions based on that profiling data and then generates highly optimized machine code.

If at some point one of the assumptions proves to be incorrect, the optimization compiler cancels the optimization and returns to the interpreter stage.

Interpreter/compiler workflow in the JavaScript engine

Now, let’s look at the part of the process that actually executes JavaScript code, the part where the code is interpreted and optimized, and discuss some of the differences between the major JavaScript engines.

Generally speaking, JavaScript engines have a processing pipeline that includes an interpreter and an optimizing compiler. The interpreter can quickly generate unoptimized bytecode, while the optimizing compiler takes longer but ultimately produces highly optimized machine code.

This generic pipeline is almost exactly how V8, the JavaScript engine used in Chrome and Node.js, works. The interpreter in V8, called Ignition, is responsible for generating and executing bytecode. As it runs the bytecode, it collects profiling data that can be used later to speed up execution. When a function becomes hot, for example because it runs frequently, the generated bytecode and profiling data are passed to the optimizing compiler, TurboFan, which generates highly optimized machine code based on that data.

SpiderMonkey, the JavaScript engine Mozilla uses in Firefox and SpiderNode, is a little different. It has two optimizing compilers instead of one. The interpreter first hands code to the Baseline compiler, which produces somewhat optimized code. Then, combined with the profiling data collected while the code runs, the IonMonkey compiler can generate much more highly optimized code. If a speculative optimization fails, IonMonkey falls back to the Baseline code.

Chakra, Microsoft’s JavaScript engine used in Edge, is very similar in that it also has two optimizing compilers. The interpreter hands hot code to SimpleJIT (JIT stands for Just-In-Time compiler), which produces slightly optimized code. Combined with the profiling data, FullJIT can then generate more heavily optimized code.

JavaScriptCore (abbreviated JSC), Apple’s JavaScript engine used in Safari and React Native, takes it to the extreme with three different optimizing compilers. The Low-Level Interpreter, LLInt, hands hot code to the Baseline compiler, which can hand it to the DFG (Data Flow Graph) compiler, which in turn can hand it to the FTL (Faster Than Light) compiler.

Why do some engines have more optimizing compilers than others? It’s a matter of balancing pros and cons. An interpreter can generate bytecode quickly, but bytecode is often inefficient. An optimizing compiler, on the other hand, takes longer but ultimately produces more efficient machine code. There is a trade-off between getting code running quickly (the interpreter) and taking more time so the code ultimately runs with peak performance (the optimizing compiler). Some engines choose to add multiple optimizing compilers with different time/efficiency characteristics, allowing more fine-grained control over these trade-offs at the cost of additional complexity. Another trade-off has to do with memory usage, which will be covered in a future article.

We’ve just highlighted the major differences in the interpreter and optimized compiler flow in each JavaScript engine. Despite these differences, at a high level, all JavaScript engines have the same architecture: a parser and some sort of interpreter/compiler flow.

JavaScript object model

Let’s zoom in on some aspects of the implementation to see what else JavaScript engines have in common.

For example, how do JavaScript engines implement the JavaScript object model, and what tricks do they use to speed up access to properties of JavaScript objects? It turns out that all the major engines implement this very similarly.

The ECMAScript specification essentially defines all objects as dictionaries that map string keys to property attributes.

In addition to [[Value]] itself, the specification defines these attributes:

  • [[Writable]] determines whether the property can be reassigned,
  • [[Enumerable]] determines whether the property shows up in for-in loops,
  • [[Configurable]] determines whether the property can be deleted.

The double-bracket notation [[…]] looks a bit peculiar, but it is exactly how the specification marks attributes that are not directly exposed to JavaScript. From JavaScript you can still reach them for a given object property through the Object.getOwnPropertyDescriptor API:

const object = { foo: 42 };
Object.getOwnPropertyDescriptor(object, 'foo');
// → { value: 42, writable: true, enumerable: true, configurable: true }

That’s how JavaScript defines objects, but what about arrays?

You can think of an array as a special object, but one of the differences is that arrays do special things to array indexes. The array index here is a special term in the ECMAScript specification. In JavaScript, an array is limited to 2³²−1 elements, and the array index is any valid index within that range, that is, any integer from 0 to 2³²−2.
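A small sketch of that boundary (the exact numbers follow directly from the spec’s definition of an array index):

```javascript
const arr = [];

// 2 ** 32 − 2 (4294967294) is the largest valid array index:
arr[2 ** 32 - 2] = 'last valid array index';
console.log(arr.length); // → 4294967295, i.e. 2 ** 32 − 1

// 2 ** 32 − 1 is NOT an array index; it becomes a regular named
// property and does not affect length:
arr[2 ** 32 - 1] = 'just a regular named property';
console.log(arr.length); // → still 4294967295
```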

Another difference is that arrays also have a special length attribute.

const array = ['a', 'b'];
array.length; // → 2
array[2] = 'c';
array.length; // → 3

In this example, the array is created with a length of 2. When we assign another element to the position with index 2, the length is automatically updated.

JavaScript defines arrays in a similar way to objects. For example, all keys, including array indexes, are explicitly represented as strings: the first element of an array is stored under the key '0'. The 'length' property is just another property that happens to be non-enumerable and non-configurable. When an element is added to an array, the engine automatically updates the [[Value]] attribute of the 'length' property.
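Both of these facts are easy to observe from JavaScript itself:

```javascript
const array = ['a'];

// Array index keys are really strings:
console.log(Object.keys(array)); // → ['0']

// 'length' is writable, but non-enumerable and non-configurable:
console.log(Object.getOwnPropertyDescriptor(array, 'length'));
// → { value: 1, writable: true, enumerable: false, configurable: false }
```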

Optimizing property access

Now that you know how objects are defined in JavaScript, let’s take a closer look at how JavaScript engines use objects efficiently. In general, accessing properties is by far the most common operation in JavaScript programs. Therefore, it is critical that the JavaScript engine be able to access properties quickly.

Shapes

In JavaScript programs, it is common for multiple objects to have the same property keys. Such objects are said to have the same shape.

const object1 = { x: 1, y: 2 };
const object2 = { x: 3, y: 4 };
// object1 and object2 have the same shape.

It is also very common to access the same properties of objects with the same shape:

function logX(object) {
    console.log(object.x);
}

const object1 = { x: 1, y: 2 };
const object2 = { x: 3, y: 4 };

logX(object1);
logX(object2);

With this in mind, JavaScript engines can optimize access to objects’ properties based on their shapes. Here’s how it works.

Suppose we have an object with properties 'x' and 'y' that uses the dictionary data structure we discussed earlier: it contains string keys that point to their respective property attributes.

If you access a property, say object.y, the JavaScript engine looks up the key 'y' in the JSObject, loads the corresponding property attributes, and returns the [[Value]].

But where are these property values stored in memory? Should they be stored as part of the JSObject? If we expect to see more objects with the same shape later, storing a full dictionary of property names and attributes in every JSObject would be wasteful, since the property names would be repeated for every object with that shape. That is a lot of duplication and unnecessary memory usage.

As an optimization, the engine stores the Shape of the object separately. The Shape contains all the property names and attributes, except for their [[Value]]s. In addition, the Shape stores the offset of each value inside the JSObject, so the JavaScript engine knows where to find each value. Every JSObject with the same shape points to the same Shape instance, and each JSObject only needs to store the values that are unique to that object.

The benefit becomes obvious once we have multiple objects: no matter how many objects there are, as long as they share a shape, we only need to store the shape and property information once!
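To make the idea concrete, here is a toy model in plain JavaScript. This is purely illustrative — `shape`, `values`, and `getProperty` are made-up names for this sketch, not real engine internals:

```javascript
// One shared shape stores the property names and their offsets once.
const shape = { x: { offset: 0 }, y: { offset: 1 } };

// Each "object" stores only its own values, plus a pointer to the shape.
const object1 = { shape, values: [1, 2] };
const object2 = { shape, values: [3, 4] };

// Property access: look up the offset in the shape, then read the value.
function getProperty(obj, key) {
    return obj.values[obj.shape[key].offset];
}

console.log(getProperty(object1, 'y')); // → 2
console.log(getProperty(object2, 'x')); // → 3
```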

All JavaScript engines use Shapes as optimizations, but with different names:

  • Academic papers refer to them as Hidden Classes (easily confused with Classes in JavaScript)
  • V8 calls them Maps (easily confused with Maps in JavaScript)
  • Chakra calls them Types (easily confused with dynamic typing and Typeof in JavaScript)
  • JavaScriptCore calls them Structures
  • SpiderMonkey calls them Shapes

For this article, we will continue to use the term Shapes.

Transformation chains and trees

What happens if you have an object with a particular Shape and you add a property to it? How does the JavaScript engine find this new shape?

const object = {};
object.x = 5;
object.y = 6;

These shapes form so-called Transition chains in the JavaScript engine. Here’s an example:

The object starts out with no properties, so it points to the empty Shape. The next statement adds a property 'x' with the value 5, so the engine transitions to a Shape that contains 'x' and stores the value 5 in the JSObject at offset 0. The last line adds a property 'y'; the engine transitions to yet another Shape containing both 'x' and 'y', and stores the value 6 in the JSObject (at offset 1).

We don’t even need to store a complete property table for each shape. Instead, each Shape only needs to know about the new property it introduces. For example, in this case we don’t have to store the information about 'x' in the last shape, because it can be found earlier in the chain. To make this work, each shape links back to its previous shape:

If you write o.x in your JavaScript code, the JavaScript engine walks up the transition chain until it finds the Shape that introduced the property 'x'.
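The chain walk can be sketched with a similar toy model (again, the names here are illustrative only, not engine internals):

```javascript
// Each shape records only the property it introduced, its offset,
// and a link back to its parent shape.
const emptyShape = { property: null, offset: -1, parent: null };
const shapeX = { property: 'x', offset: 0, parent: emptyShape };
const shapeXY = { property: 'y', offset: 1, parent: shapeX };

// Finding a property's offset means walking up the chain.
function findOffset(shape, key) {
    while (shape !== null) {
        if (shape.property === key) return shape.offset;
        shape = shape.parent;
    }
    return -1; // not found
}

console.log(findOffset(shapeXY, 'x')); // → 0, found one link up the chain
console.log(findOffset(shapeXY, 'y')); // → 1, found immediately
```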

But what if there is no way to create a chain of transformations? For example, what if you have two empty objects and you add different attributes to each object?

const object1 = {};
object1.x = 5;
const object2 = {};
object2.y = 6;

In this case, we have to branch, and we end up with a transformation tree instead of a transformation chain.

Here, we create an empty object a and add a property 'x' to it. We end up with one JSObject containing the unique value, and two Shapes: the empty Shape and the Shape containing only the property 'x'.

The second example also starts with an empty object b, but we add a different property 'y' to it. We end up with two shape chains, and three shapes in total.

Does that mean we always need to start with the empty shape? Not necessarily. Engines apply some optimizations to object literals that already contain properties. Say we either add the property 'x' starting from an empty object literal, or we have an object literal that already contains 'x':

const object1 = {};
object1.x = 5;
const object2 = { x: 6 };

In the first example, we start with an empty Shape and move to a shape that contains x, as we have seen before.

In the object2 example, it makes sense to generate the object with the X attribute directly from the beginning, rather than an empty object.

Object literals that contain the property 'x' start with the Shape containing 'x', effectively skipping the empty Shape. V8 and SpiderMonkey (at least) both do exactly this. This optimization shortens transition chains and makes constructing objects from literals more efficient.

Here is an example of a 3D point object with the properties 'x', 'y', and 'z'.

const point = {};
point.x = 4;
point.y = 5;
point.z = 6;

As we saw earlier, this creates an object with three shapes in memory (not counting the empty shape). To access the property 'x', for example when you write point.x in your program, the JavaScript engine has to follow the linked list: it starts at the Shape at the bottom and works its way up until it finds the Shape that introduced 'x' at the top.

This gets very slow as such lookups become more frequent, especially when objects have many properties: the time to find a property is O(n), linear in the number of properties on the object. To speed up property lookups, JavaScript engines add a ShapeTable data structure. The ShapeTable is a dictionary that maps property keys to the Shapes that describe the corresponding properties.
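A toy sketch of such a table (illustrative only, not engine code):

```javascript
// The ShapeTable maps each property key directly to a record describing
// that property, so a lookup is a single dictionary access instead of
// an O(n) walk up the transition chain.
const shapeTable = new Map([
    ['x', { offset: 0 }],
    ['y', { offset: 1 }],
    ['z', { offset: 2 }],
]);

console.log(shapeTable.get('z').offset); // → 2, found in one step
```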

But wait, now we’re back to dictionary lookups, which is exactly what we added Shapes to avoid! So why bother with Shapes at all? The reason is that Shapes enable another optimization called Inline Caches.

Inline Caches (ICs)

The main motivation behind Shapes is the concept of Inline Caches, or ICs. ICs are the key to making JavaScript run fast! JavaScript engines use ICs to memorize where object properties are found, reducing the number of expensive lookups.

Here is a function getX that takes an object and loads its property 'x':

function getX(o) {
    return o.x;
}

If we run this function in JSC, it produces the following bytecode:

The first get_by_id instruction loads the property 'x' from the first argument (arg1) and stores the result in loc0. The second instruction returns the contents of loc0.

JSC also embeds an Inline Cache into the get_by_id instruction, consisting of two uninitialized slots.

Now, let’s assume getX is called with an object { x: 'a' }. As we know, this object has a Shape that contains the property 'x', and that Shape stores the offset and attributes of 'x'. The first time the function executes, the get_by_id instruction looks up the property 'x' and finds that its value is stored at offset 0.

The IC embedded in the get_by_id instruction memorizes the shape and the offset at which the property was found:

For subsequent runs, IC only needs to compare the shape and, if the shape is the same as before, simply load the value from the stored offset. Specifically, if the JavaScript engine sees that an object’s Shape was previously recorded by the IC, it doesn’t need to touch the property information at all, and can skip the expensive property information lookup altogether. This is much faster than looking up attributes every time.
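Here is a toy model of a (monomorphic) inline cache in plain JavaScript. This is a sketch of the idea, not how engines actually implement it; `ic`, `shape`, and `values` are made-up names:

```javascript
// One shared shape, and an object that stores only its values.
const shapeA = { x: { offset: 0 } };
const objA = { shape: shapeA, values: [42] };

// The inline cache: one slot for the shape, one for the offset.
const ic = { shape: null, offset: -1 };

function getX(o) {
    if (ic.shape === o.shape) {
        // Fast path: same shape as last time, reuse the cached offset.
        return o.values[ic.offset];
    }
    // Slow path: full lookup through the shape, then fill the cache.
    const offset = o.shape.x.offset;
    ic.shape = o.shape;
    ic.offset = offset;
    return o.values[offset];
}

console.log(getX(objA)); // → 42 (slow path, fills the cache)
console.log(getX(objA)); // → 42 (fast path, cached shape matched)
```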

Efficient storage array

For arrays, the most common properties are array-indexed properties, whose values are called array elements. Storing full property attributes for every element of every array would waste a lot of memory. Instead, because array-indexed properties are writable, enumerable, and configurable by default, JavaScript engines exploit this and store array elements separately from other named properties.

Consider the following array:

const array = [
    '#jsconfeu',
];

The engine stores the array length (1) and points to a Shape that contains the offset and attributes of the 'length' property.

It’s similar to what we’ve seen before… But where do the values of the array go?

Each array has a separate elements backing store that contains all the array-indexed property values. The JavaScript engine doesn’t have to store any property attributes for the array elements, because they are usually all writable, enumerable, and configurable.

So what happens in the unusual case? What if you change the attribute properties of an array element?

// Please don’t ever do this!
const array = Object.defineProperty(
    [],
    '0',
    {
        value: 'Oh noes!!1',
        writable: false,
        enumerable: false,
        configurable: false,
    }
);

The code snippet above defines a property named '0' (which happens to be an array index), but sets its attributes to non-default values.

In this edge case, the JavaScript engine represents the entire elements backing store as a dictionary that maps array indexes to property attributes.

Even when just a single array element has non-default attributes, the entire array’s backing store falls into this slow and inefficient mode. Avoid using Object.defineProperty on array indexes!

Advice

We’ve seen how the JavaScript engine stores objects and arrays, and how Shape and ICs optimize common operations on them. Based on this knowledge, we identified some practical JavaScript coding tricks that can help improve performance:

  • Always initialize objects the same way so that they do not have different shapes.
  • Do not mess with the attribute properties of array elements so that they can be stored and manipulated efficiently.
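For example, based on the shape mechanics above (the first pair of objects shares one shape; the second pair creates extra shapes and transition chains):

```javascript
// Good: both objects get the same shape, because their properties are
// added with the same names, in the same order.
const a = { x: 1, y: 2 };
const b = { x: 3, y: 4 };

// Less good: same properties, but a different order and a late addition
// produce different shapes, so property accesses on these objects can't
// share the same inline-cache fast path as on a and b.
const c = { y: 2, x: 1 };
const d = { x: 5 };
d.y = 6;
```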