Preface

This is the third article in a series on the V8 engine. It explains what bytecode is and the important role bytecode has played in the evolution of V8, and along the way it walks through V8's architecture to help you understand it better. Links to the other articles in the series are at the end; the series is still being updated, so stay tuned.

Bytecode concept

What is bytecode

Wikipedia describes bytecode as follows:

Bytecode usually refers to intermediate code that has already been compiled but is independent of any specific machine code, and that must be translated by an interpreter before it becomes machine code. Bytecode is usually not human-readable like source code; it is an encoded sequence of numeric constants, references, instructions, and so on.

My own understanding of bytecode is roughly this: computers can only execute binary machine code, and machine code (an instruction set) is not suited to being written or read by humans. Different CPU architectures also have completely different instruction sets. To overcome this, experts created human-friendly "high-level" languages, which are close to natural language and mathematical notation and hide the differences between CPU architectures. The gap between a high-level language and machine code is so large that converting directly from one to the other is cumbersome, which is where an intermediate form comes in: bytecode.
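To make the idea of "an encoded sequence of numeric constants and instructions" concrete, here is a minimal sketch of a toy stack-based bytecode. The opcode names, their numeric values, and the `compile` and `run` helpers are all invented for this illustration; they are not the real format used by V8 or the JVM.

```javascript
// Toy bytecode for illustration only. Opcodes and encoding are made up
// and do not correspond to V8's or the JVM's real instruction formats.
const OP = { PUSH: 0, ADD: 1, MUL: 2 };

// "Compile" a tiny expression tree into a flat array of numbers.
function compile(node, out = []) {
  if (typeof node === "number") {
    out.push(OP.PUSH, node); // opcode followed by its operand
  } else {
    compile(node.left, out);
    compile(node.right, out);
    out.push(node.op === "+" ? OP.ADD : OP.MUL);
  }
  return out;
}

// Interpret the bytecode using an operand stack.
function run(code) {
  const stack = [];
  for (let pc = 0; pc < code.length; pc++) {
    switch (code[pc]) {
      case OP.PUSH:
        stack.push(code[++pc]); // consume the operand
        break;
      case OP.ADD: {
        const b = stack.pop();
        const a = stack.pop();
        stack.push(a + b);
        break;
      }
      case OP.MUL: {
        const b = stack.pop();
        const a = stack.pop();
        stack.push(a * b);
        break;
      }
      default:
        throw new Error("unknown opcode " + code[pc]);
    }
  }
  return stack.pop();
}

// The expression (1 + 2) * 3 as a hand-written tree.
const ast = { op: "*", left: { op: "+", left: 1, right: 2 }, right: 3 };
const bytecode = compile(ast); // [0, 1, 0, 2, 1, 0, 3, 2]
const result = run(bytecode);  // 9
```

The array of numbers is meaningless to a human reader but trivial for `run` to execute, and nothing in it depends on a particular CPU.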

Advantages of bytecode

The most intuitive way to understand the benefits of bytecode is to look at what it did for Java. Java's early slogan was "Write Once, Run Anywhere". Java source code is compiled into bytecode files with a .class extension. The JVM (Java Virtual Machine), which must be installed on the target machine, then translates the bytecode into that machine's instructions. Using bytecode in this way partly solves the problem of inefficient execution in interpreted languages, and because bytecode is not tied to a particular machine, Java programs can run on many different computers without being recompiled.

The advantages of bytecode can be summarized as follows:

  • It is independent of any specific CPU architecture
  • It is faster to translate into machine code than the original high-level source

The evolution of V8

V8’s early architecture

Before V8, the most popular JavaScript engine was JavaScriptCore. JavaScriptCore works by generating bytecode and then converting the bytecode to machine code. V8 was created to achieve extreme performance, and Google felt that generating bytecode as an intermediate step was a waste of time.

(Image source: Blog.itpub.net/69912579/vi…)

Let’s take a look at how the early V8 architecture executed JS code:

  • First, the JS source code is parsed into an AST (abstract syntax tree).
  • Second, the Full-codegen compiler compiles the AST directly into machine code, which is then executed.
  • Third, while the machine code runs, functions that execute repeatedly are marked as hot; the marked code is recompiled by the Crankshaft optimizing compiler into more efficient machine code, which is used the next time the function runs.

The early architecture also cached the compiled machine code (in memory and on disk) to avoid repeated compilation, and it did deliver a speed improvement at first.

Why bytecode

As the web grew more complex and mobile devices became widespread, the early architecture ran into a number of problems:

1. Memory usage

At the heart of the problem is space usage. The early V8 converted JS source code into machine code and kept that machine code in memory; when the process exited, the machine code was stored on disk. Machine code takes far more space than the source it was generated from: a 1 MB JS source file might produce more than ten MB of machine code. Early mobile phones generally had little memory, so this excessive footprint significantly degraded performance.

2. Excessive code complexity

As mentioned above, different CPU architectures have completely different instruction sets, and there are many CPU architectures on the market. Both Full-codegen, which converted the AST to machine code, and the optimizing compiler Crankshaft therefore had to be written separately for each architecture. The resulting complexity and workload are easy to imagine. Compiling to bytecode first greatly reduces this workload, because the step that produces bytecode does not depend on the CPU.

3. A caching bug

The person who reported the bug reloaded Facebook in Chrome with various monitoring tools open and found that V8.CompileScript took 165 ms on the first load. On reload, the JS code that was actually expensive to compile had not been cached, so compilation took roughly as long as it had the first time.

V8 does not compile all of the code up front; it compiles only the outermost (top-level) code, and the code inside a function is compiled the first time that function is called.
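The following sketch marks where compilation would happen under the lazy strategy just described. The function names and the `calls` array are hypothetical, and the comments describe engine-internal behavior that cannot be observed from JavaScript itself; the code only records the order in which the functions run.

```javascript
// Top-level code: this is the "outer layer" that is compiled
// when the script is first loaded.
const calls = [];

function outer() {
  // With lazy compilation, this body is not fully compiled at load time;
  // that work is deferred until the first call to `outer`.
  calls.push("outer");

  function inner() {
    // Likewise, this body is only compiled when `inner` is first called.
    calls.push("inner");
  }
  return inner;
}

const inner = outer(); // first call: the body of `outer` is compiled here
inner();               // first call: the body of `inner` is compiled here
```

In this example, a cache that stores only the top-level result would hold almost nothing useful, because the real work is inside `outer` and `inner`.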

If the browser caches only the outermost layer of code, then the heavily modularized code typical of modern front-end engineering, where the important logic lives inside functions, never has its critical code cached. This was the main cause of the bug above.

V8’s current architecture

To address these issues, V8 began introducing bytecode and gradually evolved into the architecture shown below:

(Image source: Blog.itpub.net/69912579/vi…)


  • First, the JS source code is parsed into an AST (abstract syntax tree).
  • Second, the Ignition interpreter compiles the AST into bytecode and then executes the bytecode one instruction at a time.
  • Third, while interpreting, V8 marks hot code that is executed repeatedly and compiles it with the TurboFan optimizing compiler into more efficient machine code. The next time the function runs, only that optimized machine code is executed, and the bytecode is no longer interpreted.
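The switch from interpreting to running optimized code can be sketched roughly as follows. This is a simplified model, not V8's implementation: `makeTiered`, `HOT_THRESHOLD`, and the `log` array are hypothetical names, the threshold of 3 is arbitrary, and ordinary JavaScript functions stand in for "interpreting bytecode" and "optimized machine code".

```javascript
// Arbitrary threshold for this sketch; V8's real heuristics are more involved.
const HOT_THRESHOLD = 3;

// Wrap a function so that it starts in a slow "interpreted" tier and
// switches to an "optimized" tier once it has been called often enough.
function makeTiered(interpret, optimize) {
  let calls = 0;
  let optimized = null;
  const log = []; // records which tier handled each call

  function call(...args) {
    if (optimized) {
      log.push("optimized");
      return optimized(...args);
    }
    calls++;
    log.push("interpreted");
    const result = interpret(...args);
    if (calls >= HOT_THRESHOLD) {
      optimized = optimize(); // the function is hot: compile it once
    }
    return result;
  }

  return { call, log };
}

const add = makeTiered(
  (a, b) => a + b,      // stand-in for interpreting the bytecode
  () => (a, b) => a + b // stand-in for the optimized machine code
);

const results = [];
for (let i = 0; i < 5; i++) {
  results.push(add.call(i, 1));
}
// add.log: the first three calls are "interpreted", the last two "optimized"
```

The caller sees the same results from both tiers; only the path used to compute them changes.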

Introducing bytecode into V8’s architecture addressed the following problems:

  • Long startup time: at startup V8 only needs to compile the source to bytecode and then interpret it instruction by instruction, and compiling to bytecode is much faster than compiling to machine code.
  • Large memory footprint: bytecode takes up much less space than machine code.
  • Excessive code complexity: bytecode greatly reduces the amount of code V8 needs in order to support different CPUs.

Finally, let’s take a look at the effect of the new architecture compared with the old one:

  • Memory footprint

    (Image source: Blog.itpub.net/69912579/vi…)

  • Page speed

Conclusion

This article used bytecode as a lens for understanding how the V8 engine's architecture evolved, and analyzed the reasons behind that evolution, as groundwork for the rest of the series. If you find any errors, please point them out and discuss them. If you found this article helpful, please give it a like. Thank you very much.

References

Blog.itpub.net/69912579/vi…
Time.geekbang.org/column/arti…

Series

  • V8 engine details (1): Overview
  • V8 engine details (2): AST
  • V8 engine details (3): Bytecode evolution
  • V8 engine details (4): Bytecode execution
  • V8 engine details (5): Inline caching
  • V8 engine details (6): Memory structure
  • V8 engine details (7): Garbage collection mechanism
  • V8 engine details (8): Message queue
  • V8 engine details (9): Coroutines & generator functions