What is V8

JavaScript is an interpreted scripting language that is dynamically typed, weakly typed, and prototype-based, with a set of built-in types. Its interpreter is called a JavaScript engine and is part of the browser.
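
The three properties named above can be seen in a few lines. This is only an illustrative sketch; the variable names are made up for the example.

```javascript
// Dynamic typing: the same variable can hold values of different types.
let value = 42;
const firstType = typeof value;   // "number"
value = "forty-two";
const secondType = typeof value;  // "string"

// Weak typing: operands are converted implicitly.
const concatenated = 1 + "2";     // the number is converted to a string
const product = "3" * "4";        // both strings are converted to numbers

// Prototype-based: an object inherits directly from another object.
const animal = {
  describe() {
    return `${this.name} makes a sound`;
  },
};
const dog = Object.create(animal);
dog.name = "Rex";
const description = dog.describe();
const inherits = Object.getPrototypeOf(dog) === animal;

console.log(firstType, secondType, concatenated, product, description, inherits);
```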

V8 is one such engine: it is the JavaScript engine that Chrome runs JavaScript on.

Interpreted languages

  • The source code is not translated directly into machine language ahead of time; it is first translated into intermediate code, which an interpreter then executes.

  • Programs do not need to be compiled in advance; they are translated at runtime, every time they are executed. Examples: Python, JavaScript, Shell, Ruby, MATLAB.

Compiled languages

  • The source code is compiled into machine language only once, and later runs reuse the compiled result instead of recompiling, so execution is relatively efficient. Examples: C, C++.

Origin of the name

The name V8 comes from the V8 car engine. V8 engines were developed mainly in the United States and are known for their power. By choosing this name, Google signals to users that V8 is a powerful and fast JavaScript engine.

Background

Before V8, the mainstream JavaScript engine was JavaScriptCore. JavaScriptCore mainly serves the WebKit browser engine, which is developed and open-sourced by Apple. Because Google was not satisfied with the development pace and runtime speed of JavaScriptCore and WebKit, it started developing its own JavaScript engine and browser engine. This is how V8 and Chromium were born, and they have since become the most popular browser-related software.

Compilation in JavaScriptCore and V8

V8

The compile phase is: source code -> AST -> native code. The step from the abstract syntax tree to native code uses the Full-codegen JIT compiler, which converts the abstract syntax tree into native code that each hardware platform can run directly. In this respect it resembles the way C/C++ is compiled.

JavaScriptCore

The compile phase is: source code -> AST -> bytecode -> native code. This is similar to Java, except that there is no ahead-of-time step in which to optimize. As a result, much of the optimization work is deferred to JIT compilation at runtime: JavaScriptCore keeps optimizing the bytecode with the DFG JIT and LLVM.

Where V8 is used

V8 was developed for Chrome, and because it performs very well it has gradually been adopted elsewhere, for example in the popular Node.js, Weex, Quick Apps, and early React Native.

V8's early architecture

V8 is optimized primarily for speed and memory reclamation.

JavaScriptCore's architecture generates bytecode and then executes it. Google decided this was not a good architecture: generating bytecode wasted time compared with generating machine code directly. V8's early architecture was therefore quite radical and compiled straight to machine code. Later practice showed that this design did improve speed, but it also caused memory consumption problems. Take a look at V8's initial flow chart:

Early V8 had two compilers, Full-codegen and Crankshaft. V8 first compiled all code once with Full-codegen to generate machine code. As the JS executed, V8's built-in profiler identified hot functions, recorded type feedback for their parameters, and handed them to Crankshaft for optimization. In short, Full-codegen generated unoptimized machine code and Crankshaft generated optimized machine code.
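
The interplay of baseline code, profiler, and optimizer can be sketched as a toy model. None of the names below (HOT_THRESHOLD, makeTiered, the stats object) are V8 APIs. They are hypothetical, and the "optimized" function is simply a second JS function standing in for specialized machine code.

```javascript
// Toy model of two-tier execution; every name here is hypothetical.
const HOT_THRESHOLD = 3;

function makeTiered(baseline, optimizedCode) {
  let calls = 0;
  const seenSignatures = new Set(); // type feedback gathered by the "profiler"
  let optimized = null;
  const stats = { baselineRuns: 0, optimizedRuns: 0 };

  function tiered(...args) {
    const signature = args.map((arg) => typeof arg).join(",");
    // Fast path: only valid when the argument types match the feedback
    // that the optimized code was specialized for.
    if (optimized && optimized.signature === signature) {
      stats.optimizedRuns++;
      return optimized.code(...args);
    }
    // Baseline path: run unoptimized code and record feedback.
    stats.baselineRuns++;
    calls++;
    seenSignatures.add(signature);
    if (!optimized && calls >= HOT_THRESHOLD && seenSignatures.size === 1) {
      optimized = { signature, code: optimizedCode };
    }
    return baseline(...args);
  }

  tiered.stats = stats;
  return tiered;
}

const add = makeTiered((x, y) => x + y, (x, y) => x + y);
add(1, 2);
add(3, 4);
add(5, 6);     // third baseline run: the function is now considered hot
add(7, 8);     // served by the "optimized" version
add("a", "b"); // types differ from the feedback, so the baseline runs again
console.log(add.stats);
```

The last call shows why type feedback matters: code specialized for numbers cannot be reused for strings, so the model falls back to the baseline version.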

Defects

As new versions were released and web pages became more complex, V8's architectural flaws began to show:

  • Full-codegen compiles directly to machine code, which results in a large memory footprint
  • Full-codegen compiles directly to machine code, which results in long compilation time and slow startup
  • Crankshaft cannot optimize code blocks delimited by keywords such as try, catch, and finally
  • Adding support for new syntax to Crankshaft requires writing code for each supported CPU architecture

V8's current architecture

To address these shortcomings, V8 adopted JavaScriptCore's approach of generating bytecode. Does it feel like Google came full circle here? The overall process with bytecode generation is shown below:

Ignition is V8's interpreter, and the original motivation behind it was to reduce memory consumption on mobile devices. Prior to Ignition, the code generated by V8's Full-codegen baseline compiler typically occupied almost a third of Chrome's overall JavaScript heap, which left less room for the web application's actual data.

TurboFan can use the bytecode generated by Ignition directly to produce optimized machine code, rather than having to recompile from source as Crankshaft did. Ignition's bytecode provides a cleaner and less error-prone baseline execution model in V8, which simplifies the deoptimization mechanism, a key feature of V8's adaptive optimization. Finally, since generating bytecode is faster than generating Full-codegen's baseline compiled code, activating Ignition generally improves script startup time, which in turn improves web page loading.

TurboFan is V8's optimizing compiler. The TurboFan project was originally started in late 2013 to address Crankshaft's shortcomings. Crankshaft can only optimize a subset of the JavaScript language. For example, it was not designed to optimize JavaScript code that uses structured exception handling, that is, blocks of code delimited by JavaScript's try, catch, and finally keywords. It was also difficult to add support for new language features to Crankshaft, because these almost always required architecture-specific code for the nine supported platforms.

Advantages of adopting the new architecture

Conclusion: the Ignition + TurboFan architecture clearly reduces memory usage by more than half compared with the Full-codegen + Crankshaft architecture.

Let's take a look at each stage of the current architecture:

Lexical analysis and syntax analysis in V8

A JS file is just source code, which the machine cannot execute directly. Lexical analysis splits the source code into strings and produces a sequence of tokens. As the following figure shows, different strings correspond to different token types.
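
As a rough illustration of this step, here is a minimal tokenizer sketch. The token type names and the small set of rules are assumptions made for the example; they are not the token names or rules used by V8's scanner.

```javascript
// Minimal tokenizer sketch. Token type names are illustrative only.
function tokenize(source) {
  const rules = [
    ["whitespace", /^\s+/],
    ["keyword", /^(?:function|return|var|let|const)\b/],
    ["identifier", /^[A-Za-z_$][A-Za-z0-9_$]*/],
    ["number", /^\d+(?:\.\d+)?/],
    ["punctuator", /^[{}()\[\];,+\-*\/=]/],
  ];
  const tokens = [];
  let rest = source;
  while (rest.length > 0) {
    let matched = false;
    for (const [type, pattern] of rules) {
      const match = pattern.exec(rest);
      if (match) {
        // Whitespace separates tokens but is not a token itself.
        if (type !== "whitespace") tokens.push({ type, value: match[0] });
        rest = rest.slice(match[0].length);
        matched = true;
        break;
      }
    }
    if (!matched) throw new SyntaxError(`Unexpected character: ${rest[0]}`);
  }
  return tokens;
}

const tokens = tokenize("var sum = a + 10;");
console.log(tokens.map((token) => `${token.type}:${token.value}`).join(" "));
```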

After lexical analysis comes syntax analysis. Its input is the output of lexical analysis, and its output is the abstract syntax tree (AST). If the program contains a syntax error, V8 throws an exception during parsing.
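
To make the token-to-tree step concrete, here is a small recursive-descent parser for expressions built from identifiers, integers, +, * and parentheses. It is a sketch only: the node shapes loosely follow the ESTree convention rather than V8's internal AST classes, and the inline regular expression is a stand-in for a real lexer (it silently skips characters it does not recognize).

```javascript
// Tiny recursive-descent parser; node shapes are illustrative, not V8's.
function parseExpression(source) {
  const tokens = source.match(/[A-Za-z_$][\w$]*|\d+|[+*()]/g) || [];
  let pos = 0;

  function parsePrimary() {
    const token = tokens[pos++];
    if (token === undefined) throw new SyntaxError("Unexpected end of input");
    if (/^\d+$/.test(token)) return { type: "Literal", value: Number(token) };
    if (/^[A-Za-z_$]/.test(token)) return { type: "Identifier", name: token };
    if (token === "(") {
      const inner = parseSum();
      if (tokens[pos++] !== ")") throw new SyntaxError("Expected ')'");
      return inner;
    }
    throw new SyntaxError(`Unexpected token ${token}`);
  }

  // "*" binds tighter than "+", so products are parsed one level below sums.
  function parseProduct() {
    let left = parsePrimary();
    while (tokens[pos] === "*") {
      pos++;
      left = { type: "BinaryExpression", operator: "*", left, right: parsePrimary() };
    }
    return left;
  }

  function parseSum() {
    let left = parseProduct();
    while (tokens[pos] === "+") {
      pos++;
      left = { type: "BinaryExpression", operator: "+", left, right: parseProduct() };
    }
    return left;
  }

  const ast = parseSum();
  if (pos < tokens.length) throw new SyntaxError(`Unexpected token ${tokens[pos]}`);
  return ast;
}

const ast = parseExpression("a + b * 2");
console.log(JSON.stringify(ast, null, 2));
```

Calling parseExpression("a +") throws a SyntaxError instead of returning a tree, which mirrors the way a syntax error surfaces during parsing, before anything runs.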

V8's abstract syntax tree (AST) (see the AST object documentation)

Tree structure

After V8's parse phase, the next step is to generate bytecode from the abstract syntax tree. The bytecode generated for the add function is shown below:

The BytecodeGenerator class generates bytecode from the abstract syntax tree. Each kind of node has a corresponding bytecode-generating function whose name starts with Visit. The bytecode generation function for + is shown below:

void BytecodeGenerator::VisitArithmeticExpression(BinaryOperation* expr) {
  FeedbackSlot slot = feedback_spec()->AddBinaryOpICSlot();
  Expression* subexpr;
  Smi* literal;
  if (expr->IsSmiLiteralOperation(&subexpr, &literal)) {
    VisitForAccumulatorValue(subexpr);
    builder()->SetExpressionPosition(expr);
    builder()->BinaryOperationSmiLiteral(expr->op(), literal,
                                         feedback_index(slot));
  } else {
    Register lhs = VisitForRegisterValue(expr->left());
    VisitForAccumulatorValue(expr->right());
    builder()->SetExpressionPosition(expr);  // Save the source location for debugging
    builder()->BinaryOperation(expr->op(), lhs, feedback_index(slot));  // Generate Add bytecode
  }
}
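
The same visitor idea can be sketched in JavaScript. This is a simplified model, not V8's generator: it handles only +, checks only the right operand for a literal, ignores feedback slots and source positions, and always copies the left operand into a fresh register. The mnemonics imitate Ignition's (Ldar, Star, Add, AddSmi, LdaSmi), but the output format and register allocation are hypothetical.

```javascript
// Simplified bytecode generator for "+" expressions (illustrative only).
function generate(ast, params) {
  const code = [];
  let nextRegister = 0;

  // Emits code that leaves the value of `node` in the accumulator.
  function visitForAccumulator(node) {
    switch (node.type) {
      case "Literal":
        code.push(`LdaSmi [${node.value}]`);
        break;
      case "Identifier": {
        const index = params.indexOf(node.name);
        if (index < 0) throw new Error(`Unknown parameter: ${node.name}`);
        code.push(`Ldar a${index}`);
        break;
      }
      case "BinaryExpression":
        visitArithmetic(node);
        break;
      default:
        throw new Error(`Unsupported node type: ${node.type}`);
    }
  }

  function visitArithmetic(node) {
    if (node.operator !== "+") throw new Error("Only + is supported");
    if (node.right.type === "Literal") {
      // Literal operand: it is encoded in the instruction, no register needed.
      visitForAccumulator(node.left);
      code.push(`AddSmi [${node.right.value}]`);
    } else {
      // Left side goes into a register, right side into the accumulator.
      visitForAccumulator(node.left);
      const register = `r${nextRegister++}`;
      code.push(`Star ${register}`);
      visitForAccumulator(node.right);
      code.push(`Add ${register}`);
    }
  }

  visitForAccumulator(ast);
  code.push("Return");
  return code;
}

const id = (name) => ({ type: "Identifier", name });
const plus = (left, right) => ({ type: "BinaryExpression", operator: "+", left, right });

const addCode = generate(plus(id("x"), id("y")), ["x", "y"]);
const incrementCode = generate(plus(id("x"), { type: "Literal", value: 1 }), ["x"]);
console.log(addCode.join("\n"));
console.log(incrementCode.join("\n"));
```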

The source position is recorded as well; the following diagram shows how source code positions map to bytecode positions:

Bytecode

Ignition interprets and executes the bytecode. In this respect it works like the Java JVM: it is essentially a virtual machine.

There are generally two kinds of virtual machine: stack-based and register-based. Stack-based virtual machines such as the JVM are the more common implementation, whereas Ignition in V8 is a register-based virtual machine. Register-based VMs typically execute faster than stack-based VMs, but their instructions are longer.

First, some properties of V8 bytecode:

  1. Each bytecode specifies its inputs and outputs as register operands
  2. Ignition has registers r0, r1, r2, … and an accumulator register
  3. Registers: function parameters and local variables are stored in registers visible to the user
  4. Accumulator: a register not visible to the user that holds intermediate results

Bytecode execution

  1. Start by converting the source code to bytecode.
  2. Initialize f.
  3. Load the small integer 100 into the accumulator. LdaSmi can be thought of as a predefined handler function; its operand, #100, follows it.
  4. Combine the 150 stored in a2 with the value in the accumulator (150 - 100) and store the result, 50, in the accumulator.
  5. Store the 50 held in the accumulator into register r0, so r0 now has the value 50.
  6. Load the value of register a1, which holds parameter b and has the value 2, into the accumulator. (a0, a1, and a2 are also registers.)
  7. Multiply the value in register r0 by the value in the accumulator and store the result in the accumulator.
  8. Add the value in register a0 to the value in the accumulator and store the result in the accumulator.
  9. Return, like the statements above, is itself a predefined handler function. It returns the value of the accumulator.
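
The walkthrough above can be replayed with a toy accumulator machine. The bytecode listing is reconstructed from the steps, reading the third and fourth steps as loading 100 and subtracting it from a2 (150 - 100 = 50). The value a0 = 5 is an assumption, since the steps do not state it, and the run function is a hypothetical model rather than Ignition's real dispatch loop.

```javascript
// Toy register machine with an accumulator (illustrative model only).
function run(bytecode, args) {
  // a0..a2 hold the arguments; r0, r1, ... are created on first store.
  const registers = { a0: args[0], a1: args[1], a2: args[2] };
  let accumulator;
  for (const [op, operand] of bytecode) {
    switch (op) {
      case "LdaSmi": accumulator = operand; break;                          // load small integer
      case "Ldar": accumulator = registers[operand]; break;                 // register -> accumulator
      case "Star": registers[operand] = accumulator; break;                 // accumulator -> register
      case "Sub": accumulator = registers[operand] - accumulator; break;
      case "Mul": accumulator = registers[operand] * accumulator; break;
      case "Add": accumulator = registers[operand] + accumulator; break;
      case "Return": return accumulator;
      default: throw new Error(`Unknown bytecode: ${op}`);
    }
  }
  return undefined;
}

const bytecode = [
  ["LdaSmi", 100], // accumulator = 100
  ["Sub", "a2"],   // accumulator = a2 - accumulator = 50
  ["Star", "r0"],  // r0 = 50
  ["Ldar", "a1"],  // accumulator = 2
  ["Mul", "r0"],   // accumulator = 50 * 2 = 100
  ["Add", "a0"],   // accumulator = a0 + 100
  ["Return"],
];

// a0 = 5 is assumed; a1 = 2 and a2 = 150 come from the steps above.
const result = run(bytecode, [5, 2, 150]);
console.log(result);
```

Every instruction names at most one register explicitly; the accumulator is the implicit second operand and result, which keeps the instructions short.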

As V8 evolves, the generated bytecode may differ slightly from version to version, but these predefined handlers generally do not change much.

TurboFan

TurboFan generates optimized machine code from bytecode and the type feedback collected for hot functions. Much of TurboFan's optimization process matches the back-end optimizations described in compiler theory, and it uses a sea-of-nodes representation.

Optimizing the add function:

function add(x, y) {
  return x + y;
}
add(1, 2);
%OptimizeFunctionOnNextCall(add);
add(1, 2);

V8 provides a function that can be called directly to request optimization of a specific function: executing %OptimizeFunctionOnNextCall actively asks TurboFan to optimize add. The optimization is based on the type feedback from the previous call, which here is clearly integers, so TurboFan generates machine code that assumes integer arguments, and the next call runs that optimized machine code directly. (Note that V8 must be started with --allow-natives-syntax. OptimizeFunctionOnNextCall is a built-in function, and JS can only call built-in functions when --allow-natives-syntax is set; otherwise an error is raised.)
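
Because the % syntax is a syntax error without the flag, a script that uses it directly cannot even be loaded by a default V8. The sketch below works around this by compiling the call through new Function and checking for a SyntaxError; the helper name optimizeOnNextCall is made up for this example.

```javascript
function add(x, y) {
  return x + y;
}

// %OptimizeFunctionOnNextCall only parses when V8 runs with
// --allow-natives-syntax, so it is compiled lazily via `new Function`
// to keep this file loadable without the flag.
let optimizeOnNextCall = null;
try {
  optimizeOnNextCall = new Function("fn", "%OptimizeFunctionOnNextCall(fn);");
} catch (err) {
  if (!(err instanceof SyntaxError)) throw err;
}

add(1, 2); // first call collects integer feedback
if (optimizeOnNextCall) optimizeOnNextCall(add);
const sum = add(1, 2);

console.log(optimizeOnNextCall ? "natives syntax enabled" : "natives syntax disabled", sum);
```

When run under Node.js, passing the flag on the command line (for example node --allow-natives-syntax followed by the script name) should take the first branch; without the flag the script still runs and reports that natives syntax is disabled.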

The JS add function generates the corresponding machine code as follows:

Garbage collection

All objects in V8 are allocated on the heap: when code declares a variable and assigns it a value, the object's memory is allocated on the heap. If the allocated heap memory is not enough, V8 keeps requesting more until the heap reaches V8's limit, at which point garbage collection is triggered.

V8 adopts a generational collection strategy: it divides heap memory into generations and applies a different garbage collection algorithm to each, according to its characteristics. The two generations that matter most are the new generation and the old generation.

The new generation is small and collected frequently; it uses the Scavenge algorithm, which trades space for time. The old generation holds long-lived objects and takes up much more memory; it mainly uses the Mark-Sweep and Mark-Compact strategies, which save space.
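
The "space for time" trade-off of a scavenging collector can be sketched with a toy semi-space copy in the style of Cheney's algorithm. This is a hypothetical model, not V8's implementation: objects are plain JS objects with a name and a refs array, and to-space is just an array.

```javascript
// Toy semi-space copying collector (Cheney style); illustrative only.
function scavenge(roots) {
  const toSpace = [];
  const forwarding = new Map(); // from-space object -> its copy in to-space

  // Copies an object on first visit; later visits return the same copy.
  function copy(obj) {
    if (!forwarding.has(obj)) {
      const clone = { name: obj.name, refs: obj.refs };
      forwarding.set(obj, clone);
      toSpace.push(clone);
    }
    return forwarding.get(obj);
  }

  const newRoots = roots.map(copy);
  // Scan to-space in order; copying children appends to it as we go,
  // so the loop ends once every reachable object has been processed.
  for (let scan = 0; scan < toSpace.length; scan++) {
    toSpace[scan].refs = toSpace[scan].refs.map(copy);
  }
  // Anything not copied is unreachable; from-space can be dropped wholesale.
  return { toSpace, roots: newRoots };
}

const a = { name: "a", refs: [] };
const b = { name: "b", refs: [a] };
const garbage = { name: "garbage", refs: [b] };
a.refs.push(b); // a and b reference each other

const survivors = scavenge([a]);
console.log(survivors.toSpace.map((obj) => obj.name));
```

Only live objects are touched, so the cost scales with the survivors rather than with the size of the space. The price is that half of the new generation stays empty as the copy target, which is the space being traded for time.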
