First, introduction

V8 is developed in C++ and used in Google Chrome. Before running JavaScript, V8 compiles it to native machine code (for IA-32, x86-64, ARM, or MIPS CPUs), whereas other JavaScript engines convert it to bytecode (a compact binary form of the program) or interpret it directly. Techniques such as inline caching further improve performance. With these features, JavaScript programs running in V8 can approach the speed of compiled binary programs.

Second, background

1. Bytecode: A binary format made up of a sequence of opcode and data pairs that together form an executable program. It is intermediate code: a compiler compiles the source code into bytecode, and a virtual machine on a particular platform then translates the bytecode into machine code that can be executed directly.

Machine code: Machine code, also known as native code, is data that a computer’s CPU can read directly. It is the fastest form of code because the CPU executes it without further translation. A machine-code program consists entirely of instruction codes made up of 0s and 1s.

Bytecode to native machine code: the source code is first compiled into bytecode by a compiler, and a platform-specific virtual machine (such as the Java Virtual Machine) then translates the bytecode into instructions that can be executed directly. Java bytecode is a typical example.
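To make the idea of "opcode and data pairs executed by a virtual machine" concrete, here is a minimal sketch of a stack-based interpreter. The opcode set (`PUSH`, `ADD`, `MUL`, `HALT`) and the `run` function are invented for illustration; they are not the bytecode of V8, the JVM, or any real engine.

```javascript
// Hypothetical opcodes for a toy stack machine (not any real engine's bytecode).
const OP = { PUSH: 0, ADD: 1, MUL: 2, HALT: 3 };

// Bytecode for (2 + 3) * 4: a flat sequence of opcodes and operands.
const program = [
  OP.PUSH, 2,
  OP.PUSH, 3,
  OP.ADD,
  OP.PUSH, 4,
  OP.MUL,
  OP.HALT,
];

// A tiny "virtual machine": reads one opcode at a time and acts on a stack.
function run(bytecode) {
  const stack = [];
  let pc = 0; // program counter: index of the next opcode
  while (true) {
    const op = bytecode[pc++];
    switch (op) {
      case OP.PUSH:
        stack.push(bytecode[pc++]); // the operand follows the opcode
        break;
      case OP.ADD: {
        const b = stack.pop();
        const a = stack.pop();
        stack.push(a + b);
        break;
      }
      case OP.MUL: {
        const b = stack.pop();
        const a = stack.pop();
        stack.push(a * b);
        break;
      }
      case OP.HALT:
        return stack.pop();
      default:
        throw new Error(`Unknown opcode ${op} at index ${pc - 1}`);
    }
  }
}

const result = run(program);
console.log(result); // 20
```

A real virtual machine would translate hot bytecode into machine instructions rather than interpret every opcode in a loop, but the program format (opcodes followed by their operands) is the same idea.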

2. The JavaScript engine

Language compilation process:



Other JS engines now follow the same path: source code -> abstract syntax tree -> bytecode -> JIT -> native code

(For example, Apple’s JavaScriptCore engine, which introduced SquirrelFish in 2008 and implements a register-based bytecode machine; Mozilla’s SpiderMonkey; Microsoft’s Chakra; and so on.)
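The "abstract syntax tree -> bytecode" step of this pipeline can be sketched as a post-order walk over the tree. The AST below is written by hand (node names loosely follow the ESTree convention), and the instruction names `PUSH`, `ADD`, and `MUL` are hypothetical; no real engine's bytecode format is implied.

```javascript
// Hand-written AST for the expression (2 + 3) * 4; a real parser would build this.
const ast = {
  type: 'BinaryExpression',
  operator: '*',
  left: {
    type: 'BinaryExpression',
    operator: '+',
    left: { type: 'Literal', value: 2 },
    right: { type: 'Literal', value: 3 },
  },
  right: { type: 'Literal', value: 4 },
};

// Hypothetical mapping from source operators to stack-machine instructions.
const OPERATOR_TO_INSTRUCTION = { '+': 'ADD', '*': 'MUL' };

// Walk the tree in post-order: operands first, then the operator.
function compile(node, out = []) {
  if (node.type === 'Literal') {
    out.push(['PUSH', node.value]);
  } else if (node.type === 'BinaryExpression') {
    const instruction = OPERATOR_TO_INSTRUCTION[node.operator];
    if (!instruction) {
      throw new Error(`Unsupported operator: ${node.operator}`);
    }
    compile(node.left, out);
    compile(node.right, out);
    out.push([instruction]);
  } else {
    throw new Error(`Unsupported node type: ${node.type}`);
  }
  return out;
}

const bytecode = compile(ast);
console.log(bytecode.map((ins) => ins.join(' ')).join('\n'));
// PUSH 2
// PUSH 3
// ADD
// PUSH 4
// MUL
```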

Abstract syntax tree



Originally, the V8 engine had no intermediate bytecode. The abstract syntax tree was converted directly into native code by the JIT compiler, and a profiler then collected information to optimize that native code. Although less optimization is performed at this stage, the conversion time is greatly reduced.

In version 5.9 of V8, however, the Ignition bytecode interpreter was added and enabled by default, in order to reduce the memory footprint of machine code, speed up code startup, and refactor V8 to reduce code complexity.

3. The V8 engine

3.1 Versions before 5.9: direct compilation to native machine code, with performance improved by methods such as inline caching

3.1.1 Data representation

JavaScript is a dynamically typed language. The type of a variable cannot be determined at compile time, only at execution time. Statically typed languages such as C++ and Java know the type of a variable at compile time and can determine the address where the variable is stored. Evaluating and determining types at run time is one cause of slower execution.

Variable access is extremely common during code execution. A JS object has to find a value by matching the property name, which requires more operations and more memory. V8 uses a special approach: the internal representation of data consists of the actual content of the data and a handle to the data.

The actual content of the data varies in length and type, whereas the handle has a fixed size and contains a pointer to the data. When a variable is accessed, the data is found, and can be modified, through the pointer in the handle. A handle object in V8 is 4 bytes on 32-bit devices and 8 bytes on 64-bit devices; in JavaScriptCore it is 8 bytes.

A JS object in V8 holds three kinds of pointers: a hidden class pointer, which points to the hidden class that V8 creates for the JS object; a pointer to the table of property values that the object contains; and a pointer to the table of elements that the object contains.

3.1.2 Working process

In V8, JS code is compiled only when it needs to be executed, rather than all at once; this improves response time. The parser turns the source code into an abstract syntax tree (AST), and the JIT compiler’s full code generator then generates native executable code directly from it. To improve performance, V8 uses a profiler to gather information after the native code is generated and optimizes the native code based on that information; if the optimized code performs worse, V8 rolls the optimization back.

Before compiling, V8 builds a number of global objects and loads some built-in libraries, such as the Math library, to build the runtime environment.

3.1.3 Optimized Rollback

V8 originally performed no optimization on an intermediate representation, such as determining variable types at compile time, so the Crankshaft compiler was introduced to analyze and optimize hot functions, working from the JavaScript source code. For performance, Crankshaft assumes by default that the code is stable and that variable types do not change, which lets it generate efficient native code. However, if a variable's type changes during execution, V8 rolls back the compiler’s optimizations and recompiles from the source.

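var counter = 0;

function test(x, y) {
    counter++;
    // Hot path: taken for the first 999,999 calls.
    if (counter < 1000000) {
        // do something
        return 'jeri';
    }
    // Cold path: reached only after the function has long been "hot".
    var unknown = new Date();
    console.log(unknown);
}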

The type of the variable unknown is not determined until new Date() is executed, at which point V8 can only roll back the optimization for this part of the code. Optimization rollback is a time-consuming operation.
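The same situation can be sketched with a simpler call pattern. The function `add` below is a hypothetical example; plain JavaScript cannot observe whether the engine actually rolls back an optimization, so the comments describe the expected engine behavior rather than something the script measures. (In Node.js, the V8 option `--trace-deopt` is one way to inspect this, though its output format varies between versions.)

```javascript
// Hypothetical hot function: an optimizing compiler can specialize it
// for the argument types it has observed so far.
function add(a, b) {
  return a + b;
}

// Phase 1: the function only ever sees numbers, so the types look stable.
let total = 0;
for (let i = 0; i < 100000; i++) {
  total = add(total, 1);
}

// Phase 2: a call with strings breaks the "numbers only" assumption.
// An engine that specialized add() for numbers has to fall back to
// more generic code to handle this call.
const label = add('jeri', '!');

console.log(total); // 100000
console.log(label); // jeri!
```

Keeping hot functions on a single set of argument types avoids giving the engine a reason to discard its optimized code.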

3.1.4 Hidden classes



Point is used to construct two objects, p and q, which have the same property names. V8 assigns them to the same hidden class, which records the offset of each property, and p and q share this information. When a property is accessed, only the offset information in the hidden class is needed. But if q.z = 5 is later executed on the object q, then p and q no longer have the same hidden class, and q moves to a new hidden class.

The hidden class conversion depends on the order in which attributes are added to the object:

function Point(x, y) {
    this.x = x;
    this.y = y;
}

// p adds a first, then b
var p = new Point(1, 2);
p.a = 5;
p.b = 6;

// q adds b first, then a
var q = new Point(3, 4);
q.b = 7;
q.a = 8;

In this code, p and q add their properties in a different order, so the offsets recorded in their hidden classes differ, and the two objects end up with two different hidden classes.
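The effect of insertion order can be modeled with a small simulation. This is only a toy: `HiddenClass`, `ToyObject`, and `makePoint` are invented names, and V8's real hidden classes are internal structures that are considerably more involved. The sketch assumes that each hidden class maps property names to slot offsets and remembers which hidden class is reached when a given property is added.

```javascript
// Toy model of hidden classes (illustrative only; not V8's real data structures).
class HiddenClass {
  constructor(offsets = new Map()) {
    this.offsets = offsets;       // property name -> slot index
    this.transitions = new Map(); // property name -> next HiddenClass
  }

  // Return the hidden class reached by adding `name`, creating it only once.
  addProperty(name) {
    if (!this.transitions.has(name)) {
      const next = new Map(this.offsets);
      next.set(name, next.size); // new property takes the next free slot
      this.transitions.set(name, new HiddenClass(next));
    }
    return this.transitions.get(name);
  }
}

const ROOT = new HiddenClass();

class ToyObject {
  constructor() {
    this.hiddenClass = ROOT;
    this.slots = [];
  }

  set(name, value) {
    if (!this.hiddenClass.offsets.has(name)) {
      this.hiddenClass = this.hiddenClass.addProperty(name);
    }
    this.slots[this.hiddenClass.offsets.get(name)] = value;
  }

  get(name) {
    const offset = this.hiddenClass.offsets.get(name);
    return offset === undefined ? undefined : this.slots[offset];
  }
}

function makePoint(x, y) {
  const o = new ToyObject();
  o.set('x', x);
  o.set('y', y);
  return o;
}

const p = makePoint(1, 2);
const q = makePoint(3, 4);
const sharedBefore = p.hiddenClass === q.hiddenClass;

p.set('a', 5); p.set('b', 6); // a first, then b
q.set('b', 7); q.set('a', 8); // b first, then a
const sharedAfter = p.hiddenClass === q.hiddenClass;

console.log(sharedBefore, sharedAfter); // true false
console.log(p.hiddenClass.offsets.get('a'), q.hiddenClass.offsets.get('a')); // 2 3
```

In this model both objects hold the same four properties, yet `a` sits at a different offset in each, so nothing learned about one layout can be reused for the other.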

3.1.5 Inline cache

The normal process of accessing an object's property is: first get the address of the hidden class, then look up the offset based on the property name, and then compute the address of the property. Accessing variables repeatedly means repeating this process, which is time-consuming. V8 therefore provides inline caching, which stores the hidden class and offset from the first lookup so that the next lookup on an object with the same hidden class can skip the address calculation. However, the more properties an object has, the higher the probability of a cache miss: when the type of a property changes, the hidden class of the object changes too, so it no longer matches the cached entry and the address has to be recalculated.
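A rough sketch of this caching scheme follows. It assumes a simplified model in which every object carries a `shape` field standing in for its hidden class, and each property-access site keeps a single cached (shape, offset) pair. The names `makePropertyLoad`, `shape`, and `slots` are hypothetical and do not correspond to V8 internals.

```javascript
// Toy model: `shape` stands in for a hidden class and maps property names
// to slot offsets. Illustrative only.
const shapeXY = { offsets: { x: 0, y: 1 } };
const shapeYX = { offsets: { y: 0, x: 1 } };

const p1 = { shape: shapeXY, slots: [1, 2] };
const p2 = { shape: shapeXY, slots: [3, 4] };
const p3 = { shape: shapeYX, slots: [6, 5] };

// One inline cache per access site: it remembers the last shape and offset seen.
function makePropertyLoad(name) {
  const site = { cachedShape: null, cachedOffset: -1, hits: 0, misses: 0 };
  site.load = function (obj) {
    if (obj.shape === site.cachedShape) {
      site.hits++; // fast path: no lookup by name
      return obj.slots[site.cachedOffset];
    }
    site.misses++; // slow path: look up by name, then refill the cache
    site.cachedShape = obj.shape;
    site.cachedOffset = obj.shape.offsets[name];
    return obj.slots[site.cachedOffset];
  };
  return site;
}

const loadX = makePropertyLoad('x');
const values = [p1, p2, p1, p3, p1].map((o) => loadX.load(o));

console.log(values);                   // [ 1, 3, 1, 5, 1 ]
console.log(loadX.hits, loadX.misses); // 2 3
```

The first access always misses; after that, objects with the cached shape hit, while the single differently shaped object (`p3`) causes two misses: one for itself and one for the next object with the original shape.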

3.1.6 Memory Management

V8's garbage collection mechanism limits the amount of memory that JS can use (if the available memory is too large, garbage collection takes more resources and time), so memory has to be managed, that is, allocated and reclaimed.

Memory management consists of two parts: allocation and reclamation. V8 memory is divided as follows:

Zone: manages small memory blocks. A Zone first requests a block of memory and then manages and hands out small pieces of it. A small piece cannot be reclaimed individually once it has been allocated; all of the memory allocated by the Zone can only be reclaimed together, at one time. If a process requires a lot of memory, the Zone has to allocate a large amount of memory that cannot be reclaimed promptly, which can lead to insufficient memory.
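The all-or-nothing behavior of a Zone resembles what is usually called an arena (or bump) allocator. The sketch below illustrates that general technique over a fixed buffer; `ToyZone`, `allocate`, and `releaseAll` are hypothetical names, not V8's API.

```javascript
// Toy zone (arena) allocator over a fixed buffer. Illustrative only.
class ToyZone {
  constructor(capacity) {
    this.buffer = new Uint8Array(capacity);
    this.top = 0; // bump pointer: index of the next free byte
  }

  // Hand out `size` bytes by bumping the pointer; there is no per-block free.
  allocate(size) {
    if (this.top + size > this.buffer.length) {
      throw new RangeError('zone exhausted');
    }
    const start = this.top;
    this.top += size;
    return this.buffer.subarray(start, start + size);
  }

  // The only way to reclaim memory: drop every allocation at once.
  releaseAll() {
    this.top = 0;
  }
}

const zone = new ToyZone(64);
const first = zone.allocate(16);
const second = zone.allocate(32);
console.log(first.length, second.length, zone.top); // 16 32 48

let exhausted = false;
try {
  zone.allocate(32); // only 16 bytes are left
} catch (e) {
  exhausted = e instanceof RangeError;
}
console.log(exhausted); // true

zone.releaseAll();
console.log(zone.top); // 0
```

Allocation is very cheap because it is a single addition, but a long-lived zone that keeps growing holds on to all of its memory until the final release.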

Heap: manages the data used by JavaScript, generated code, hash tables, and so on. To facilitate garbage collection, the heap is divided into three parts:

Young generation: allocates memory for newly created objects and is garbage-collected often. To make collection easier, the young generation is divided into two halves: one half is used for allocation, and the other half receives copies of the objects that need to be retained when a collection runs.

Old generation: stores older objects, pointers, code, and other data as needed, and is garbage-collected less often.

Large objects: allocates memory for objects that require a large amount of memory; it may also contain memory allocated for data, code, and so on. Only one object is allocated per page.
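The two-halves scheme of the young generation can be sketched as follows. This is a deliberately simplified model: the class `ToyYoungGeneration` is hypothetical, objects are tracked in arrays instead of raw memory, and liveness is given as a flat list of roots rather than discovered by tracing references.

```javascript
// Toy young generation with two semispaces. Illustrative only.
class ToyYoungGeneration {
  constructor() {
    this.fromSpace = []; // the half used for new allocations
    this.toSpace = [];   // the half kept empty until the next collection
  }

  allocate(value) {
    const obj = { value };
    this.fromSpace.push(obj);
    return obj;
  }

  // Copy every object still listed in `roots` into the empty half,
  // then swap the halves; everything left behind is garbage.
  collect(roots) {
    const live = new Set(roots);
    for (const obj of this.fromSpace) {
      if (live.has(obj)) this.toSpace.push(obj);
    }
    const reclaimed = this.fromSpace.length - this.toSpace.length;
    this.fromSpace = this.toSpace;
    this.toSpace = [];
    return reclaimed;
  }
}

const young = new ToyYoungGeneration();
const kept = young.allocate('kept');
young.allocate('temporary 1');
young.allocate('temporary 2');

const reclaimed = young.collect([kept]);
console.log(reclaimed);              // 2
console.log(young.fromSpace.length); // 1
```

The cost of a collection in this scheme depends on how many objects survive, not on how much garbage there is, which suits a region where most objects die young.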



3.2 Version 5.9 and later

3.2.1 Reasons for introducing the Ignition bytecode interpreter



1). V8 compiles code only when it is executed, so the code has to be parsed multiple times: the green code (the whole script) is parsed once, the yellow code is parsed again when new Person is called, and the red code is parsed again when doWork is called. Therefore, if closures are nested n layers deep, the innermost code ends up being parsed by V8 at least n times under the Crankshaft design.

2). Machine code takes up a lot of space. Caching all of the machine code for a page's JS would take a lot of memory and disk space. For this reason, Chrome’s cache only covers the outermost layer of JS code, and the code that carries the actual execution logic is not cached, so memory resources are consumed and largely wasted.
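The color-coded figure for the first reason is not available here, so the following is a hypothetical reconstruction of the kind of nesting it describes. Only the names `Person` and `doWork` come from the text; the bodies are invented, and the comments restate the parsing behavior described above rather than anything the script itself can observe.

```javascript
// Outermost (top-level) code: parsed when the script is first loaded.
function Person(name) {
  // Body of Person: under lazy parsing, fully parsed only when
  // `new Person(...)` is first called.
  this.name = name;

  this.doWork = function () {
    // Body of doWork: fully parsed only when doWork() is first called.
    return this.name + ' is working';
  };
}

const person = new Person('jeri'); // first call of Person
const message = person.doWork();   // first call of doWork
console.log(message); // jeri is working
```

Each extra level of nesting adds another point at which the inner source text has to be scanned again before it can run.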

Bytecode is more compact than machine code, which reduces the memory footprint of code.



(source: docs.google.com/presentatio…).

Because bytecode reduces the memory footprint and speeds up startup, V8 can aim to compile all JS code ahead of time and cache the bytecode, so that a site opens faster when it is visited again. This also makes it possible to retire Crankshaft, the old compiler, and introduce the new TurboFan, which optimizes directly from bytecode and, when needed, deoptimizes directly back to bytecode instead of deoptimizing machine code back to JS source code.



(source: the original docs.google.com/presentatio…).

Third, summary

Based on the V8 design features summarized above, pay attention to the following when writing code:

  1. Type. Because JS is a dynamically typed language, JavaScriptCore and V8 both use hidden classes and inline caches to improve performance. To reduce the probability of deoptimization, a function should work with as few data types as possible; for objects, try to store data of the same type.
  2. Memory. Release unused memory promptly by setting references to objects that are no longer used to null.
  3. Optimization rollback. Do not nest closures too many layers deep, and try not to change an object's type after the code has executed many times.
  4. Dynamic properties. When creating the same kind of object multiple times with new, initialize the dynamic properties in the same order so that the hidden classes can be reused.
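Points 1 and 4 can be illustrated with a short sketch. The class `Point` and the function `lengthSquared` are hypothetical examples, and the comments describe the intent of each choice; how much an engine actually gains depends on its version and on the surrounding code.

```javascript
// Initialize every property in the constructor, always in the same order,
// so that all instances start out with the same set of properties.
class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
    this.label = null; // reserve the property up front instead of adding it later
  }
}

// Keep the function on one kind of input: it only ever receives Point
// instances whose x and y are numbers.
function lengthSquared(point) {
  return point.x * point.x + point.y * point.y;
}

const points = [new Point(1, 2), new Point(3, 4)];
points[1].label = 'far'; // assigns an existing property; no new property is added

const sum = points.reduce((acc, pt) => acc + lengthSquared(pt), 0);
console.log(sum); // 30
```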