Preface
This is the fifth article in the V8 engine details series, and it focuses on V8's inline cache. V8 runs efficiently because it implements many optimization strategies internally, and inline caching is one of the most important. This article explores the inline cache (IC), starting with a quick question. Links to the finished articles in the series, which is still being updated, are at the end of this article.
Let’s start with a question
Let’s start with a little example
let length = 10000;
let obj0 = {x: 1, y: 2, z: 3};
function func(o) {
  for (let i in o) {
    o[i].toString();
  }
}
console.time('t0'); // Start the timer
for (let i = 0; i < length; i++) {
  let obj1 = {x: 3, y: 2};
  obj1[i] = 3;
  func(obj0);
}
console.timeEnd('t0'); // Stop the timer
Let’s take a look at the result: t0: 8.047119140625ms
Now change only one thing: func is called with obj1 instead of obj0, as follows:
console.time('t1'); // Start the timer
for (let i = 0; i < length; i++) {
  let obj1 = {x: 3, y: 2};
  obj1[i] = 3;
  func(obj1);
}
console.timeEnd('t1'); // Stop the timer
Let’s look at the result: t1: 14.747314453125ms
We can see the difference in elapsed time, and the main reason for this difference is the mechanism of inline caching.
Inline cache
What is inline caching
Inline caching (IC from here on) is not a V8 invention. The technique is quite old and was already used in Smalltalk virtual machines. The principle is simple: collect some information while the code runs, cache it, and use it directly the next time the same code executes. This avoids the cost of obtaining the information again and so improves performance.
For example, given an object obj = {x: 1, y: 2}, the first time we access obj.x, information about that access is cached. The next time we access obj.x, the cached information is used directly, without looking the property up on obj again.
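The idea can be sketched in plain JavaScript. This is a toy model and not V8's implementation: `shapeOf`, `slowLookup`, and `makeCachedLoad` are hypothetical helpers, and a "shape" here is simply the ordered list of property names, standing in for whatever structural information the engine tracks internally.

```javascript
// Toy model of an inline cache for a single property access site.
// Assumption: an object's "shape" is approximated by its ordered key list.
function shapeOf(obj) {
  return Object.keys(obj).join(',');
}

// Slow path: search the key list to find where the property lives.
// Returns -1 when the property is missing.
function slowLookup(obj, name) {
  return Object.keys(obj).indexOf(name);
}

// Creates a loader for one access site (for example `obj.x`)
// that owns its own cache and hit/miss counters.
function makeCachedLoad(name) {
  let cachedShape = null;
  let cachedIndex = -1;
  const stats = { hits: 0, misses: 0 };

  function load(obj) {
    const shape = shapeOf(obj);
    if (shape === cachedShape) {
      stats.hits++; // same shape as last time: reuse the cached position
    } else {
      stats.misses++; // new shape: do the slow lookup and remember it
      cachedShape = shape;
      cachedIndex = slowLookup(obj, name);
    }
    return cachedIndex === -1 ? undefined : Object.values(obj)[cachedIndex];
  }

  return { load, stats };
}

const site = makeCachedLoad('x');
site.load({ x: 1, y: 2 }); // miss: first time this shape is seen
site.load({ x: 5, y: 6 }); // hit: same shape, cached position reused
site.load({ y: 1, x: 2 }); // miss: different key order, different shape
console.log(site.stats);   // { hits: 1, misses: 2 }
```

In this sketch computing the shape is as expensive as the lookup it avoids, so it only illustrates the control flow of a cache hit versus a cache miss, not the actual performance gain.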
How does inline caching work
We can see how inline caching works by analyzing how a piece of bytecode executes. If you are not familiar with bytecode execution, see V8 engine details (4): How bytecode execution takes place. Let’s start with the following code:
function test(obj) {
obj.y = 4;
obj.x += 2;
return obj.x;
}
test({x: 1, y: 2});
Compiled to bytecode, the function executes as follows:
- On entering the function, a stack check is performed first, and then the small integer 4 is loaded into the accumulator.
- The value in the accumulator is stored to a0[0] (obj.y), and the information about a0[0] (obj.y) is cached in slot 0 of the feedback vector.
- The value of a0[1] (obj.x) is loaded into the accumulator, and the information about a0[1] (obj.x) is cached in slot 2 of the feedback vector.
- The value in the accumulator is incremented by 2, and the feedback for this operation is cached in slot 4 of the feedback vector. The value in the accumulator is then stored to a0[1] (obj.x), and that information is cached in slot 5 of the feedback vector.
- Finally, the value of obj.x is loaded into the accumulator using the information already in the cache, and the value in the accumulator is returned.
The process is not complicated. In essence, V8 marks certain call sites, allocates a slot to each of them for caching, and reads from the cache directly when the site executes again.
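The slot bookkeeping can be modeled roughly as follows. This is a hedged sketch: `FeedbackVector`, `record`, and `instrumentedTest` are hypothetical names, the slot numbers simply mirror the walkthrough above, and the stored strings are placeholders for whatever V8 really records.

```javascript
// Toy feedback vector: one array per function, one slot per marked site.
class FeedbackVector {
  constructor(slotCount) {
    this.slots = new Array(slotCount).fill(null);
  }

  // Records feedback the first time a site runs.
  // Returns true when the slot was already filled (cached info reused).
  record(slot, info) {
    if (this.slots[slot] !== null) return true;
    this.slots[slot] = info;
    return false;
  }
}

const vector = new FeedbackVector(6);

// Hypothetical hand-instrumented version of test(obj) from the article.
function instrumentedTest(obj) {
  const reused = [];
  reused.push(vector.record(0, 'store obj.y')); // obj.y = 4
  obj.y = 4;
  reused.push(vector.record(2, 'load obj.x'));  // read side of obj.x += 2
  reused.push(vector.record(4, 'add'));         // the addition itself
  reused.push(vector.record(5, 'store obj.x')); // write side of obj.x += 2
  obj.x += 2;
  return { value: obj.x, reused };
}

const first = instrumentedTest({ x: 1, y: 2 });
const second = instrumentedTest({ x: 1, y: 2 });
console.log(first.value, first.reused.includes(true));   // 3 false
console.log(second.value, second.reused.every(Boolean)); // 3 true
```

The first call fills the slots; the second call finds every slot already populated, which is the situation in which the engine can skip the work it did the first time.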
Monomorphic versus polymorphic inline caches
As we have seen, caching information makes a function call more efficient, but only on the premise that the structure of its arguments stays fixed. So what happens when the structure of the arguments is not fixed?
This brings us back to the question we started with. Let’s look at the code again:
console.time('t0'); // Start the timer
for (let i = 0; i < length; i++) {
  let obj1 = {x: 3, y: 2};
  obj1[i] = 3;
  func(obj0);
}
console.timeEnd('t0'); // Stop the timer
In this code func is always called with obj0 = {x: 1, y: 2, z: 3}, whose structure is fixed, so the inline cache can speed it up considerably. But in the second piece of code:
console.time('t1'); // Start the timer
for (let i = 0; i < length; i++) {
  let obj1 = {x: 3, y: 2};
  obj1[i] = 3;
  func(obj1);
}
console.timeEnd('t1'); // Stop the timer
The structure of obj1 passed to func changes on every iteration (because the value of i keeps changing), so how does V8 handle this?
This is where the polymorphic inline cache (PIC) comes in. With a PIC, a single slot caches more than one piece of information, as shown in the figure below:
Illustration of V8 engine
When the function executes for the first time, V8 records some information about the object in the slot. When it executes a second time, V8 compares the object it now sees with the information already recorded. If they match, the cached information is used directly; if they differ, the new information is recorded in the same slot alongside the old. In this way one slot comes to hold more than one entry (up to a limited number). This is the polymorphic inline cache (PIC). Because a lookup may need several comparisons, a polymorphic inline cache is not as efficient as a monomorphic one.
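A minimal sketch of a slot that holds several entries might look like this. It is an illustration only: `PolymorphicSlot` and `shapeKey` are hypothetical, shapes are again approximated by the ordered key list, and the limit of 4 entries is an assumption chosen for the example rather than a value taken from this article.

```javascript
// Toy polymorphic inline cache for one slot.
// Assumption: at most MAX_ENTRIES shapes are remembered per slot.
const MAX_ENTRIES = 4;

function shapeKey(obj) {
  return Object.keys(obj).join(',');
}

class PolymorphicSlot {
  constructor(name) {
    this.name = name;     // property this access site reads
    this.entries = [];    // list of { shape, index } records
    this.comparisons = 0; // how many shape checks were performed
  }

  load(obj) {
    const shape = shapeKey(obj);
    // Compare against each recorded shape in turn.
    for (const entry of this.entries) {
      this.comparisons++;
      if (entry.shape === shape) {
        return Object.values(obj)[entry.index]; // hit: reuse the position
      }
    }
    // Miss: do the slow lookup and record it if there is still room.
    const index = Object.keys(obj).indexOf(this.name);
    if (this.entries.length < MAX_ENTRIES) {
      this.entries.push({ shape, index });
    }
    return index === -1 ? undefined : Object.values(obj)[index];
  }
}

const slot = new PolymorphicSlot('x');
slot.load({ x: 1 });       // no entries yet: recorded as the first shape
slot.load({ x: 2, y: 3 }); // one failed comparison, then a second entry
slot.load({ x: 4, y: 5 }); // two comparisons before the matching entry
console.log(slot.entries.length, slot.comparisons); // 2 3
```

The comparison counter makes the cost visible: with one recorded shape a hit takes a single check, while every additional shape in the slot adds checks before a hit can be confirmed.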
Conclusion
This article focused on inline caching in the V8 engine, and it also explained the extra operand (the feedback vector slot) that often appears at the end of instructions when analyzing bytecode. In practice, polymorphic inline caches are unavoidable during development, and V8 has done a lot of optimization for that case. In most situations the difference is not noticeable, so it is enough to be aware of it when writing code; there is no need to optimize specifically for it. If you find any mistakes, please discuss them with the author in the comments. If you found this article helpful, please give it a thumbs up.
Reference article
Time.geekbang.org/column/arti…
Series
V8 engine details (1): Overview
V8 engine details (2): AST
V8 engine details (3): Bytecode evolution
V8 engine details (4): Bytecode execution
V8 engine details (5): Inline caching
V8 engine details (6): Memory structure
V8 engine details (7): Garbage collection mechanism
V8 engine details (8): Message queue
V8 engine details (9): Coroutines & generator functions