Delicious value: 🌟🌟🌟🌟🌟
Taste: Fat beef with tomato
Canteen owner's wife: Boss, do interviewers really ask how the Chrome V8 engine works?
Canteen owner: Not only does it come up in interviews, it is also the foundation of the front-end ecosystem: how the JS engine works, how JavaScript itself runs, how Babel's lexical and syntax analysis work, how ESLint checks syntax, and how frameworks such as React and Vue work under the hood. In short, learning how the engine works kills many birds with one stone.
Canteen owner's wife: OK, OK, enough talk, let's get started~
Look at V8 from a macro perspective
V8 is home turf for us front-end engineers. Written in C++, it is Google's open-source, high-performance JavaScript and WebAssembly engine, used in Chrome, Node.js, Electron, and more.
Before we start talking about our main character, V8, let’s take a macro view of where V8 stands and build a worldview.
Today, with the rapid development of information technology, this crazy big world is filled with all kinds of electronic devices. We use mobile phones, computers, electronic watches, smart speakers and now more and more electric cars on the road.
As software engineers, we can view them all as one kind of "computer": each is composed of a central processing unit (CPU), storage, and input/output devices. The CPU acts like a cook working through orders on a recipe; storage is like a refrigerator, holding the data and the instructions to be executed (the ingredients).
When a computer is powered on, the CPU starts reading instructions from a location in storage and executes them one by one. Computers can also be connected to a variety of external devices, such as a mouse, keyboard, or screen. The CPU does not have to understand every capability of these devices; it is only responsible for exchanging data with their ports, and device manufacturers provide driver software that works with the CPU. With that, we have the basic computer: the architecture that John von Neumann, the "father of the computer," proposed in 1945.
But machine instructions are so unfriendly to humans, hard to read and hard to remember, that programming languages and compilers were invented: a compiler translates a language humans can understand into machine instructions. We also need operating systems to solve the problem of software governance. As we know, there are many operating systems, such as Windows, macOS, Linux, Android, iOS, HarmonyOS, and so on, running on countless devices. To smooth over this diversity of clients, enable cross-platform development, and provide a uniform programming interface, the browser was born.
So, we can think of the browser as an operating system on top of an operating system, and for the JavaScript code that front-end engineers are most familiar with, the browser engine (e.g. V8) is the world.
Planet’s most powerful JavaScript engine
Without a doubt, V8 is the most popular and powerful JavaScript engine out there, and its name was inspired by the engines of the classic ‘muscle cars’ of the 1950s.
Programming Languages Software Award
V8 has also been recognized by the academic community, winning the ACM SIGPLAN Programming Languages Software Award.
Mainstream JS engine
The mainstream JavaScript engines include:
- V8 (Google)
- SpiderMonkey (Mozilla)
- JavaScriptCore (Apple)
- Chakra (Microsoft)
- Duktape (IoT)
- JerryScript (IoT)
- QuickJS
- Hermes (Facebook, for React Native)
V8 Release Cycle
The V8 team uses four Chrome release channels to push new releases to users.
- Canary Releases (daily)
- Dev Releases (weekly)
- Beta Releases
- Stable Releases (every 6 weeks)
For more information, check out the V8 release process.
The history of V8 architecture
V8 was open-sourced on September 2, 2008, the same day Chrome was released, with its earliest code commits dating back to June 30, 2008. You can see a visual evolution of the V8 codebase at the link below.
- Visualize the evolution of the V8 code base created with Gource
The V8 architecture at the time was crude, with a single Codegen compiler.
In 2010, the Crankshaft optimizing compiler was added to V8, which significantly improved runtime performance: Crankshaft generated machine code that ran twice as fast as the previous Codegen compiler's output while being 30% smaller.
In 2015, to further improve performance, V8 introduced the TurboFan optimizing compiler.
Next came a watershed. Until then, V8 had used an architecture that compiled source code directly into machine code. But as Chrome grew popular on mobile devices, the V8 team discovered that this architecture had fatal problems: compilation took too long, and the machine code consumed too much memory.
So the V8 team refactored the engine architecture, introducing the Ignition interpreter and bytecode in 2016.
In 2017, the new compilation pipeline (Ignition + TurboFan) was switched on by default, and Full-codegen and Crankshaft were removed.
A high-performance JS engine needs more than a highly optimized compiler like TurboFan; there is also plenty of room for optimization before the compiler ever gets a chance to run.
In 2021, V8 introduced a new compiler, Sparkplug, into the pipeline.
To learn more, see the Sparkplug blog post.
- For more on the history of the V8 architecture, see "Celebrating 10 years of V8".
Canteen owner's wife: So the V8 architecture has gone through that many changes!
Canteen owner: Yes, the V8 team has made great efforts to continuously optimize the performance of the engine.
V8 working mechanism
Time to knock on the blackboard: we have reached the focus of this article.
Canteen owner's wife: Let me grab my little notebook and take notes.
V8’s core process of executing JavaScript code is divided into two phases:
- Compilation
- Execution
The compilation phase is when V8 converts JavaScript into bytecode or binary machine code, and the execution phase is when the interpreter interprets and executes the bytecode, or the CPU executes the binary machine code directly.
To get a better understanding of how V8 works as a whole, let’s understand a few concepts.
Machine language, assembly language, high-level language
The CPU's instruction set is machine language, and the CPU can only recognize binary instructions. Binary, however, is hard for humans to read and remember, so people mapped it to a language that can be recognized and remembered: assembly language. Assembly instructions are converted into machine instructions by an assembler.
Different CPUs have different instruction sets, so assembly programming has to account for different CPU architectures, such as ARM and MIPS, and the learning cost is relatively high. Because assembly does not abstract enough, high-level languages emerged; they shield programmers from the details of the computer architecture and are compatible with many different CPU architectures.
CPUs cannot understand high-level languages either. There are two ways to execute high-level language code:
- Interpreted execution
- Compiled execution
Interpreted execution and compiled execution
In interpreted execution, a parser first converts the input source code into intermediate code, and an interpreter then directly interprets and executes that intermediate code to produce the result.
In compiled execution, the source code is likewise converted into intermediate code, but a compiler then compiles the intermediate code into machine code, usually stored as a binary file; executing that file produces the result. The compiled machine code can also be kept in memory and executed directly from there.
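To make the contrast concrete, here is a toy sketch (not V8's actual code): the `ast` shape and the `interpret`/`compile` helpers are invented for illustration. One strategy walks the intermediate tree every time it runs; the other translates the tree into directly executable code once, then reuses it.

```javascript
// Toy intermediate representation for the expression (2 + 4) * 3
const ast = { op: '*', left: { op: '+', left: 2, right: 4 }, right: 3 }

// Interpreted execution: walk the tree node by node on every run
function interpret(node) {
  if (typeof node === 'number') return node
  const l = interpret(node.left)
  const r = interpret(node.right)
  return node.op === '+' ? l + r : l * r
}

// Compiled execution: translate the tree into target code once,
// then execute that code directly (new Function stands in for
// machine-code generation here)
function compile(node) {
  const emit = (n) =>
    typeof n === 'number' ? String(n) : `(${emit(n.left)} ${n.op} ${emit(n.right)})`
  return new Function(`return ${emit(node)}`)
}

console.log(interpret(ast)) // 18
const fn = compile(ast)     // slow one-time translation step...
console.log(fn())           // 18 ...then fast repeated execution
```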
JIT (Just In Time)
Interpreted execution starts fast but runs slowly, while compiled execution starts slowly but runs fast.
V8 uses both interpreted and compiled execution, a trade-off known as JIT (just-in-time compilation).
When V8 executes JavaScript source code, the parser first parses the source into an AST (abstract syntax tree), and the interpreter (Ignition) converts the AST into bytecode, interpreting and executing it as it goes.
The interpreter also records how many times each piece of code has been executed. Once the count exceeds a certain threshold, the code is marked as hot. This information is fed back to TurboFan, the optimizing compiler, which optimizes and compiles the bytecode to generate optimized machine code.
Canteen owner's wife: So when this code runs again, the interpreter can execute the optimized machine code directly, without interpreting it all over again. That should improve performance a lot, right?
Canteen owner: Right!
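That feedback loop can be sketched as a toy simulation. Everything here (the threshold of 3, the `run` helper, the Maps) is invented for illustration; V8's real heuristics are far more sophisticated.

```javascript
// Toy sketch of the Ignition/TurboFan feedback loop (illustrative only)
const HOT_THRESHOLD = 3
const callCounts = new Map()
const optimizedCode = new Map()

function run(name, interpretFn, compileFn) {
  // If optimized machine code already exists, execute it directly
  if (optimizedCode.has(name)) return optimizedCode.get(name)()

  const count = (callCounts.get(name) || 0) + 1
  callCounts.set(name, count)
  if (count >= HOT_THRESHOLD) {
    // "TurboFan": compile the hot code once, reuse it on later calls
    console.log(`${name} is hot; handing it to the optimizing compiler`)
    optimizedCode.set(name, compileFn())
  }
  return interpretFn() // "Ignition": interpret the bytecode
}

// add(2, 4) is interpreted for the first few calls, then optimized
for (let i = 0; i < 5; i++) {
  run('add', () => 2 + 4, () => () => 6)
}
console.log(optimizedCode.has('add')) // true
```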
The names of V8's interpreter and compiler carry a nice connotation: Ignition is the igniter, and TurboFan is the turbocharger's fan. Code starts up on the ignition, and once it runs hot, the turbo kicks in to accelerate it.
Now that you’ve seen how V8 works in general, let’s dig a little deeper and take a look at how V8’s core modules work.
Working principle of the V8 core module
V8’s core modules include:
- Parser: converts JavaScript source code into an AST (abstract syntax tree).
- Ignition: the interpreter, which converts the AST into bytecode and gathers the profiling information TurboFan needs for optimization.
- TurboFan: the optimizing compiler, which converts bytecode into optimized machine code using the information gathered by the interpreter.
V8 cannot run code until it has been parsed and compiled, so performance during parsing and compilation matters a great deal.
The parser (Parser)
The parsing process of a parser is divided into two stages:
- Lexical analysis
- Syntax analysis (pre-parsing and full parsing)
Lexical analysis
The Scanner consumes a stream of Unicode characters, breaks it into tokens, and supplies them to the parser.
Take the following code for example:
```javascript
let myName = 'Baby oba'
```

This will be tokenized into let, myName, =, and 'Baby oba', which are a keyword, an identifier, an assignment operator, and a string, respectively.
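A scanner of this kind can be sketched with a few regular expressions. This is a toy for illustration only, not V8's actual Scanner; the rule set and token names are made up.

```javascript
// Minimal toy scanner: splits source text into (type, value) tokens.
// Rules are tried in order; a null type means "skip this match".
const rules = [
  [/^\s+/, null],                    // whitespace
  [/^(let|const|var)\b/, 'Keyword'],
  [/^[A-Za-z_$][\w$]*/, 'Identifier'],
  [/^=/, 'AssignmentOperator'],
  [/^'[^']*'/, 'String'],
]

function tokenize(source) {
  const tokens = []
  while (source.length > 0) {
    const rule = rules.find(([re]) => re.test(source))
    if (!rule) throw new SyntaxError(`Unexpected input: ${source}`)
    const [re, type] = rule
    const value = source.match(re)[0]
    if (type) tokens.push({ type, value })
    source = source.slice(value.length)
  }
  return tokens
}

console.log(tokenize("let myName = 'Baby oba'"))
// yields four tokens: Keyword 'let', Identifier 'myName',
// AssignmentOperator '=', String "'Baby oba'"
```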
Syntax analysis
Next, syntax analysis converts the tokens generated in the previous step into an AST according to the grammar rules. If the source code contains a syntax error, this stage terminates and throws a SyntaxError.
You can view the structure of the AST through this website: astexplorer.net/
You can also generate a diagram directly via the link resources.jointjs.com/demos/javas…
Given the AST, V8 generates an execution context for that code.
Lazy parsing
Major JavaScript engines use lazy parsing, because fully parsing all source code before execution takes too long and consumes extra memory and disk space.
Lazy parsing means that when the engine encounters a function that is not executed immediately, it only pre-parses it, deferring full parsing until the function is actually called.
The pre-parser validates the function's syntax, records function declarations, and determines function scopes, but it does not generate an AST; that is the job of the full parser.
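As a rough illustration, here is how these parsing modes apply to a few declarations. The comments describe commonly cited V8 heuristics, not guaranteed behavior.

```javascript
// Which functions get fully parsed, and when? (Illustrative comments only.)

function calledLater(a, b) { // pre-parsed at load: syntax checked, declaration
  return a + b               // and scope recorded, but no AST until first call
}

const result = (function immediate() { // the leading '(' hints at an IIFE, so
  return calledLater(2, 4)             // the engine tends to fully parse it
})()                                   // right away instead of pre-parsing

console.log(result) // 6
```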
The interpreter Ignition
Given the AST and execution context, the interpreter then converts the AST into bytecode and executes it.
Canteen owner's wife: Why introduce bytecode?
Introducing bytecode is an engineering trade-off: machine code generated from just a few kilobytes of source can occupy a surprisingly large amount of memory.
Bytecode not only uses less memory than machine code, it is also much faster to generate, which improves startup speed. And although bytecode does not execute as fast as machine code, the execution cost is well worth it.
In addition, bytecode is independent of any particular CPU's machine code; the interpreter converts it to machine code at run time, which makes V8 easier to port across different CPU architectures.
You can view the bytecode generated for JavaScript code by running the following command:

```shell
node --print-bytecode index.js
```
It can also be viewed via the following link:
- The V8 interpreter's header file, which lists all bytecodes
Let’s look at some code:
```javascript
// index.js
function add(a, b) {
  return a + b
}

add(2, 4)
```
After executing the command, the above code generates the following bytecode:
```
[generated bytecode for function: add (0x1d3fb97c7da1 <SharedFunctionInfo add>)]
Parameter count 3
Register count 0
Frame size 0
   25 S> 0x1d3fb97c8686 @ 0 : 25 02       Ldar a1
   34 E> 0x1d3fb97c8688 @ 2 : 34 03 00    Add a0, [0]
   37 S> 0x1d3fb97c868b @ 5 : aa          Return
Constant pool (size = 0)
Handler Table (size = 0)
Source Position Table (size = 8)
0x1d3fb97c8691 <ByteArray[8]>
```
Here Parameter count 3 means three parameters are passed in: a, b, and the implicit this. The bytecodes break down as follows:
```
Ldar a1      // load the value of register a1 into the accumulator
Add a0, [0]  // load the value of register a0, add it to the accumulator,
             // and put the result back into the accumulator
Return       // end execution of the current function, returning the
             // accumulator's value to the caller
```
Each line of bytecode corresponds to a specific function, and the lines of bytecode are built like Lego blocks to form a complete program.
Interpreters generally come in two designs: stack-based and register-based. Early V8 interpreters were stack-based as well; today's V8 interpreter uses a register-based design, operating on registers directly and using them to hold parameters and intermediate results.
When executing bytecode, the Ignition interpreter mainly uses general-purpose registers and an accumulator register: function parameters and local variables live in the general-purpose registers, while the accumulator holds intermediate results.
During instruction execution, the CPU needs to read and write data. Reading and writing directly to memory would seriously hurt performance, so CPUs introduced registers and keep frequently used intermediate data in them to speed up execution.
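The three bytecodes above can be modeled as a toy register machine. The `execute` function and its instruction encoding are invented for illustration; real Ignition bytecode handlers are nothing like this.

```javascript
// Toy accumulator-based machine executing the bytecode for add(2, 4).
// Registers a0/a1 hold the arguments; `acc` is the accumulator.
function execute(bytecodes, registers) {
  let acc
  for (const [op, reg] of bytecodes) {
    switch (op) {
      case 'Ldar':   // load a register's value into the accumulator
        acc = registers[reg]
        break
      case 'Add':    // add a register's value to the accumulator
        acc = acc + registers[reg]
        break
      case 'Return': // hand the accumulator's value back to the caller
        return acc
    }
  }
}

const result = execute(
  [['Ldar', 'a1'], ['Add', 'a0'], ['Return']],
  { a0: 2, a1: 4 }
)
console.log(result) // 6
```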
The compiler TurboFan
On the compilation side, the V8 team has also done a lot of optimization work. Let's look at two techniques: inlining and escape analysis.
Inlining
For inlining, let’s look at some code:
```javascript
function add(a, b) {
  return a + b
}

function foo() {
  return add(2, 4)
}
```
As shown in the code above, the foo function calls the add function, which takes the arguments a and b and returns their sum. Without compiler optimization, machine code is generated for each function separately.
To improve performance, the TurboFan optimization compiler inlines the above two functions and then compiles them. The inlined function looks like this:
```javascript
function fooAddInlined() {
  var a = 2
  var b = 4
  var addReturnValue = a + b
  return addReturnValue
}
```

The compiler can then apply further optimizations, such as constant folding, reducing the function to:

```javascript
function fooAddInlined() {
  return 6
}
```
With inline optimization, the compiled machine code is much smaller and the execution efficiency is greatly improved.
Escape Analysis
Escape analysis is also easy to understand: it analyzes whether an object's lifetime is confined to the current function. Consider the following code:
```javascript
function add(a, b) {
  const obj = { x: a, y: b }
  return obj.x + obj.y
}
```
If an object is defined inside a function and is only ever used inside that function, it is considered "not escaped." The code above can then be optimized as follows:
```javascript
function add(a, b) {
  const obj_x = a
  const obj_y = b
  return obj_x + obj_y
}
```
After this optimization the object definition is gone, and the variables can be loaded directly into registers instead of reading object properties from memory, which reduces memory consumption and improves execution efficiency.
Regarding escape analysis, Chrome once exposed a security flaw that ended up slowing down the entire Internet. If you're interested, see "A bug from the V8 team that slowed down the entire Internet".
Beyond the optimizations and modules discussed above, V8 has many more, such as hidden classes for fast object property access, inline caches to speed up function execution, the Orinoco garbage collector, and the Liftoff WebAssembly compiler; these are not covered in this article.
Summary
This article introduced V8 from a macro perspective: what it is, the evolution of its architecture, its working mechanism, and how its core modules work. We can see that Chrome and Node.js are, in a sense, just bridges: they ship the JavaScript code we front-end engineers write to its final destination, converting it into machine code for the target machine and executing it. The V8 team has put enormous effort into making that journey fast, and they deserve our utmost respect.
Although the CPU's instruction set is limited, the programs we software engineers write with it are not, and it is these programs, ultimately executed by the CPU, that have the potential to change the world.
You're the best, writers of world-changing apps!
Canteen owner's wife: Tong Tong, you're the fattest! ^_^
Standing on the shoulders of giants
- How does V8 execute JavaScript code? — Rayken
- Google V8 – Li Bing
- Architecture class by Xu Shiwei
- v8.dev/blog
❤️ love triple strike
1. If you find the food and drink in this canteen to your taste, a "like" is the biggest encouragement you can give me.
2. Follow the official account "Front-end Canteen" and eat well at every meal!
3. Likes, comments, and shares === a push for more posts!