JavaScript engine
A JavaScript engine is a virtual machine that specializes in executing JavaScript and typically ships with a web browser.
Put simply, a JavaScript engine is a translator: it turns JavaScript, a programming language that humans understand, into machine language that the computer understands.
There are many JavaScript engines:
- Rhino – Managed by the Mozilla Foundation, open source and developed entirely in Java.
- SpiderMonkey – The first JavaScript engine, originally used in Netscape Navigator and now in Mozilla Firefox.
- V8 – Open source, developed by Google in Denmark, used in Google Chrome.
- JavaScriptCore – Open source, developed for Safari.
- KJS – Engine for KDE, originally developed by Harri Porten for the Konqueror web browser in the KDE project.
- Chakra (JScript Engine) – For Internet Explorer.
- Chakra (JavaScript Engine) – For Microsoft Edge.
V8 is currently the most widely used JavaScript engine; it powers both Chrome and Node.js.
V8 engine internals
V8 Workflow
JavaScript is a high-level language, while computers only understand ones and zeros. There are generally two ways to execute code written in a high-level language.
The first is interpreted execution: a parser turns the source code into intermediate code, and an interpreter then executes that intermediate code directly and outputs the result. This approach starts quickly but runs slowly.
The second is compiled execution: a parser turns the source code into intermediate code, and a compiler then translates the intermediate code into machine code, usually stored in a binary file, which is executed directly. This approach starts slowly but runs quickly.
Rather than relying on just one of these, V8 uses just-in-time (JIT) compilation, a mixture of compiled and interpreted execution. JIT is a trade-off: V8 interprets code during startup, but once a piece of code has been executed more often than some threshold, V8 hands it to an optimizing compiler that turns it into machine code that executes more efficiently.
V8 engine workflow:
- V8 first initializes the execution environment: the stack, the global execution context, built-in functions, and so on.
- V8 receives the source code, parses it, and generates an abstract syntax tree (AST).
- While generating the AST, V8 also generates the associated scopes. It then generates bytecode from them, which the interpreter can execute directly.
- When the interpreter notices that a piece of bytecode has been executed many times, it hands it to the compiler, which compiles the bytecode into machine code for execution.
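The hand-over from interpreter to compiler can be pictured with a toy model. This is only a sketch under stated assumptions: makeTiered, HOT_THRESHOLD and the two callbacks are hypothetical names, the threshold value is made up, and V8's real heuristics are far more involved.

```javascript
// Toy model of tiered execution. All names and the threshold are hypothetical.
const HOT_THRESHOLD = 3; // assumed value, not V8's real heuristic

function makeTiered(interpret, compile) {
  let calls = 0;
  let optimized = null; // filled in once the function becomes "hot"

  return function run(...args) {
    if (optimized) return optimized(...args); // fast path: stands in for machine code
    calls += 1;
    if (calls >= HOT_THRESHOLD) optimized = compile(); // hand over to the "compiler"
    return interpret(...args); // slow path: stands in for the bytecode interpreter
  };
}

const tiers = []; // records which tier served each call
const square = makeTiered(
  (x) => { tiers.push("interpreter"); return x * x; },
  () => (x) => { tiers.push("compiled"); return x * x; }
);

const results = [1, 2, 3, 4, 5].map((n) => square(n));
console.log(results); // [ 1, 4, 9, 16, 25 ]
console.log(tiers);   // three "interpreter" entries, then two "compiled" entries
```

The first three calls go through the slow path; from the fourth call on, the "compiled" version is used, while the results stay the same.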
How does V8 find variables
A scope chain links scopes together to form the lookup path for a variable. When a variable is used inside a function, V8 first looks it up in the current function scope; if it is not found there, V8 continues outward through the enclosing scopes until it reaches the global scope.
The global scope is created when V8 starts and stays in memory until V8 exits. A function scope is created when the function is executed and destroyed when the function finishes executing.
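A small sketch of the lookup path, using made-up names (color, outer, inner): each variable is resolved in the nearest scope that declares it, and a variable declared inside a function is not visible outside it.

```javascript
const color = "global red"; // lives in the outermost scope

function outer() {
  const size = "outer large"; // lives in outer's function scope

  function inner() {
    const shape = "inner circle"; // found in inner's own scope
    // size is not declared in inner, so the lookup moves to outer;
    // color is not declared in outer either, so it moves further out.
    return [shape, size, color].join(", ");
  }

  return inner();
}

console.log(outer());     // "inner circle, outer large, global red"
console.log(typeof size); // "undefined": outer's scope is not reachable from here
```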
Note that functions are special objects: they can be assigned and passed as arguments like any other object, but they can also be called. To support this, V8 adds two internal attributes to every function, name and code.
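The object-like behaviour of functions is easy to observe from script code. In the sketch below, greet, alias and callWith are made-up names; the public name property and toString() are only the visible counterparts of a function's name and source, and how they map onto V8's internal attributes is an assumption here.

```javascript
function greet(who) {
  return "hello " + who;
}

const alias = greet; // a function can be assigned like any other value

function callWith(fn, arg) {
  return fn(arg); // ...passed as an argument, and called
}

console.log(greet.name);              // "greet"
console.log(alias.name);              // "greet": same function object
console.log(callWith(alias, "V8"));   // "hello V8"
console.log(greet instanceof Object); // true
console.log(greet.toString().includes("hello")); // true: source text is retrievable
```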
How does V8 compute
In JavaScript, adding a number and a string returns a new string, because the language defines addition between strings and numbers. V8 converts the number to a string and then concatenates the two strings, resulting in a new string.
V8 implements the type system strictly according to the ECMAScript standard. During addition, V8 uses ToPrimitive to convert an object to a primitive string or number. By default, ToPrimitive first calls the object’s valueOf method; if that does not return a primitive value, it calls toString. If neither valueOf nor toString returns a primitive value, a TypeError is raised.
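A few additions illustrate the rule. Since + is evaluated from left to right, the position of the string operand matters.

```javascript
const a = 1 + "2";     // the number 1 becomes "1", then "1" + "2"
const b = "3" + 4 + 5; // "34" first, then "345"
const c = 3 + 4 + "5"; // 7 first, then "75"

console.log(a, typeof a); // 12 string
console.log(b);           // 345
console.log(c);           // 75
```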
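The conversion order can be checked with three plain objects (price, label and broken are made-up names). This sketch covers only ordinary objects in an addition and leaves other conversion paths aside.

```javascript
// valueOf returns a primitive, so toString is never consulted.
const price = {
  valueOf() { return 42; },
  toString() { return "forty-two"; },
};
console.log(price + 1); // 43

// valueOf returns an object, so the conversion falls back to toString.
const label = {
  valueOf() { return {}; },
  toString() { return "label"; },
};
console.log(label + 1); // "label1"

// Neither method returns a primitive, so the addition throws.
const broken = {
  valueOf() { return {}; },
  toString() { return {}; },
};
let errorName = null;
try {
  broken + 1;
} catch (err) {
  errorName = err.name;
}
console.log(errorName); // "TypeError"
```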
How does V8 implement closures
First, let’s look at lazy parsing. Lazy parsing means that when the parser encounters a function declaration, it skips the code inside the function and does not generate an AST or bytecode for it; it generates them only for the top-level code. Parsing and compiling all the code at once would greatly increase the user’s wait time, so lazy parsing speeds up the startup of JavaScript code.
JavaScript naturally supports closures, that is, functions that refer to variables outside their own scope. So when V8 parses a function, it also needs to determine whether any inner function refers to a variable declared in the current function. If so, V8 places that variable on the heap, where it is not released even after the current function has finished executing.
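The annotated snippet below marks which parts fall under which treatment according to the description above. The comments restate that description rather than anything observable from script code, and rarelyUsed is a made-up name.

```javascript
// Top-level code: parsed and compiled as soon as the script loads.
const startedAt = "startup";

// Function declaration: per the description above, the body is skipped
// at load time and no AST or bytecode is generated for it yet.
function rarelyUsed(n) {
  let total = 0;
  for (let i = 1; i <= n; i++) total += i;
  return total;
}

// The first call is the point where the body actually has to be compiled.
const sum = rarelyUsed(4);
console.log(startedAt, sum); // startup 10
```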
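A counter is the usual minimal example. In this sketch (makeCounter and increment are made-up names), count is declared in makeCounter but referenced by the inner function, so it has to outlive the call that created it.

```javascript
function makeCounter() {
  let count = 0; // referenced by the inner function, so it must survive the return

  return function increment() {
    count += 1;
    return count;
  };
}

const counterA = makeCounter();
const counterB = makeCounter(); // each call gets its own captured variable

counterA();
const second = counterA();
const first = counterB();

console.log(second); // 2: count was kept alive between calls
console.log(first);  // 1: counterB has a separate count
```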
V8 engine garbage collection mechanism
Scavenge (new generation)
- The heap memory is divided into two parts: the one in use is called the from space, and the idle one is called the to space.
- During garbage collection, the live objects in the from space are identified and copied to the to space, and the space occupied by dead objects is freed. During this phase an object is promoted to the old generation instead if it has already survived a Scavenge collection or if the to space is more than 25% used.
- After copying is complete, the from space and the to space swap roles.
The Scavenge algorithm trades space for time. Its drawback is that only half of that heap memory can be used, but because it copies only live objects, and short-lived objects leave few survivors, it is very time-efficient.
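The copy-and-swap cycle can be sketched with a toy collector. Everything here is hypothetical: objects are plain records with a refs array and a survived flag, the spaces are arrays, and the 25% rule is left out. It shows the idea only, not V8's implementation.

```javascript
// Toy semispace collector. All names and data structures are hypothetical.
function scavenge(heap, roots) {
  const toSpace = [];
  const visited = new Set();
  const pending = [...roots];

  while (pending.length > 0) {
    const obj = pending.pop();
    if (visited.has(obj)) continue;
    visited.add(obj);

    if (heap.fromSpace.includes(obj)) {
      if (obj.survived) {
        heap.oldSpace.push(obj); // survived an earlier collection: promote
      } else {
        obj.survived = true;
        toSpace.push(obj); // first survival: copy into the to space
      }
    }
    pending.push(...obj.refs); // follow references to find more live objects
  }

  // Whatever was not copied is garbage. The two spaces now swap roles.
  heap.fromSpace = toSpace;
  heap.toSpace = [];
}

const a = { name: "a", refs: [], survived: false };
const b = { name: "b", refs: [a], survived: false };
const garbage = { name: "garbage", refs: [], survived: false };

const heap = { fromSpace: [a, b, garbage], toSpace: [], oldSpace: [] };
const roots = [b];

scavenge(heap, roots);
const afterFirst = heap.fromSpace.map((o) => o.name).sort();
console.log(afterFirst); // [ 'a', 'b' ]: "garbage" was not copied

scavenge(heap, roots);
const promoted = heap.oldSpace.map((o) => o.name).sort();
console.log(promoted); // [ 'a', 'b' ]: second survival means promotion
```

Only reachable objects are touched, so the cost depends on the number of survivors rather than on the size of the space.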
Mark-Sweep (old generation)
- The mark phase traverses all objects in the heap and marks the live ones.
- The sweep phase clears only the unmarked objects.
The problem with the Mark-Sweep algorithm is that after a sweep the free memory is fragmented. When a large object needs to be allocated later, no single free block may be big enough, so a garbage collection is triggered early even though enough total memory is free.
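Both phases, and the resulting fragmentation, can be sketched on a toy heap. The model is hypothetical: the heap is an array of fixed slots, objects are plain records with a refs array, and mark and sweep are made-up helper names.

```javascript
// Mark phase: collect every object reachable from the roots.
function mark(roots) {
  const marked = new Set();
  const pending = [...roots];
  while (pending.length > 0) {
    const obj = pending.pop();
    if (marked.has(obj)) continue;
    marked.add(obj);
    pending.push(...obj.refs);
  }
  return marked;
}

// Sweep phase: clear unmarked slots in place; survivors keep their position.
function sweep(slots, marked) {
  return slots.map((obj) => (obj !== null && marked.has(obj) ? obj : null));
}

const a = { name: "a", refs: [] };
const b = { name: "b", refs: [] };
const c = { name: "c", refs: [a] };
const d = { name: "d", refs: [] };

const slots = [a, b, c, d]; // slot index stands in for a memory address
const swept = sweep(slots, mark([c]));
const layout = swept.map((obj) => (obj ? obj.name : "free"));
console.log(layout); // [ 'a', 'free', 'c', 'free' ]
```

Two slots are free afterwards, but they are not adjacent, so an object that needs two contiguous slots still cannot be placed.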
Mark-Compact (old generation)
- An evolution of the Mark-Sweep algorithm.
- During compaction, live objects are moved toward one end of the memory.
- After the move is complete, the memory beyond the boundary is freed directly.
The problem with the Mark-Compact algorithm is that moving objects is expensive: when the object survival rate is high, many objects have to be moved, which makes it less efficient. A copying collector such as Scavenge would fare even worse in the old generation: unless you are willing to waste 50% of the space, you need extra space to guarantee allocation in the extreme case where all objects in used memory are 100% alive.
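Compaction can be sketched on a toy heap as well. The model is hypothetical: the heap is an array of fixed slots, objects are plain records with a refs array, and markLive and compact are made-up helper names. A real collector would also have to update every reference to a moved object, which this sketch does not need because it stores object references rather than addresses.

```javascript
// Mark phase: collect every object reachable from the roots.
function markLive(roots) {
  const live = new Set();
  const pending = [...roots];
  while (pending.length > 0) {
    const obj = pending.pop();
    if (live.has(obj)) continue;
    live.add(obj);
    pending.push(...obj.refs);
  }
  return live;
}

// Compact phase: slide live objects to one end, free everything past the boundary.
function compact(cells, live) {
  const survivors = cells.filter((obj) => obj !== null && live.has(obj));
  const freeCells = new Array(cells.length - survivors.length).fill(null);
  return survivors.concat(freeCells);
}

const p = { name: "p", refs: [] };
const q = { name: "q", refs: [] };
const r = { name: "r", refs: [p] };
const s = { name: "s", refs: [] };

const cells = [p, q, r, s]; // cell index stands in for a memory address
const compacted = compact(cells, markLive([r]));
const compactedLayout = compacted.map((obj) => (obj ? obj.name : "free"));
console.log(compactedLayout); // [ 'p', 'r', 'free', 'free' ]
```

The free cells now form one contiguous block, at the price of moving r from its original position.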