On October 28, 2021, Google Chrome release 95.0.4638.69 fixed CVE-2021-38001, a vulnerability submitted by Kunlun Lab at the Tianfu Cup. Because the PoC for this vulnerability is very concise and the author has a strong interest in the V8 engine, this analysis also serves as a study of V8. V8 is Google’s JavaScript and WebAssembly engine, written in C++ and used in both Chrome and Node.js.

Inline cache

The vulnerability relates to inline caching (IC), an optimization technique used in language runtimes. This optimization matters for dynamic languages because they must bind methods at runtime. Here is an example:

def foo(a,b):
  a.func(b)

The Python bytecode for this code looks like this:

Disassembly of <code object foo at 0x000000000356EEA0, file "<dis>", line 2>:
  3           0 LOAD_FAST                0 (a)
              2 LOAD_METHOD              0 (func)
              4 LOAD_FAST                1 (b)
              6 CALL_METHOD              1
              8 POP_TOP
             10 LOAD_CONST               0 (None)
             12 RETURN_VALUE

At execution time, LOAD_METHOD checks the type of a and looks up func using that type.

Without an IC, the same lookup has to be repeated (in the same context) the second time a.func(b) executes. This is logically sound but inefficient, so is there a way to speed it up? In practice, code is very often written in the following form:

def foo(a, b, c, z):
  a.func(b)
  a.func(c)
  # ...
  a.func(z)

In the code above, the type of a never changes. Deutsch and Schiffman observed in their article (web.cs.ucla.edu/~palsberg/c…) that “at a given point in the code, the type of the receiver is usually the same as it was the last time that point was executed”. In the example above the type of a has not changed, so it can be cached for later use.
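The observation can be illustrated with a toy call-site cache. This is only a minimal sketch and not how V8 implements ICs: `lookupMethod`, `makeCallSite`, and the use of the receiver's prototype as a stand-in for its type are assumptions made for illustration.

```javascript
// Toy model of a single-entry inline cache for one call site.
// All names are hypothetical; a real engine caches on hidden classes and
// would also have to handle own properties that shadow the method.

// Slow path: walk the prototype chain to find the method.
function lookupMethod(receiver, name) {
  let obj = receiver;
  while (obj !== null) {
    if (Object.prototype.hasOwnProperty.call(obj, name)) return obj[name];
    obj = Object.getPrototypeOf(obj);
  }
  throw new TypeError(`no method ${name}`);
}

// Each call site remembers the last receiver "type" (its prototype here)
// and the method that was found for it.
function makeCallSite(name) {
  const site = { cachedType: undefined, cachedMethod: null, hits: 0, misses: 0 };
  site.call = function (receiver, ...args) {
    const type = Object.getPrototypeOf(receiver);
    if (type !== site.cachedType) {
      // Miss: do the full lookup and remember the result.
      site.misses++;
      site.cachedMethod = lookupMethod(receiver, name);
      site.cachedType = type;
    } else {
      // Hit: same type as last time, reuse the cached method.
      site.hits++;
    }
    return site.cachedMethod.apply(receiver, args);
  };
  return site;
}

class A { func(x) { return x + 1; } }
const recv = new A();
const site = makeCallSite("func");
const results = [site.call(recv, 1), site.call(recv, 2), site.call(recv, 3)];
console.log(results, site.hits, site.misses);
```

Only the first call pays for the lookup; the later calls see the same type and reuse the cached method.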

V8 uses data-driven ICs, which encode the information needed to load and store properties into data structures. Other functions (such as LoadIC and StoreIC) read these structures and act accordingly. The following shows the differences between V8’s earlier patching ICs and today’s data-driven ICs.

The FeedbackVector on the right of the figure is a data structure that records and manages all execution feedback, and it is critical to JavaScript execution efficiency. The figure also shows the fast-path, slow-path, and miss cases. A miss is easy to understand: the type has to be resolved at runtime. What do slow-path and fast-path correspond to? The following examples explain:

let a = {foo:3}
let b = {foo:4}

Here a and b have the same structure, so there is no need to describe each object separately. V8 splits an object into its shape (called a Map in V8), which describes the layout, and a vector of values. In the example above, V8 first creates a shape Map[a]. This shape has a property foo at offset 0, and the corresponding vector[0] holds the value 3. To create object b, V8 simply points b’s Map to Map[a] and sets its vector[0] = 4. This is the fast-path.

Now suppose the following statement is executed:

a.foo1 = 4

V8 then creates a new Map[a1] and changes a’s Map to Map[a1]. Map[a1] points back to Map[a] and records property foo1 at offset 1, and the corresponding vector[1] is set to 4. This is the slow-path.
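The shape sharing and the transition described above can be sketched as a toy model. `Shape`, `ToyObject`, and their fields are hypothetical names chosen for this illustration and do not correspond to V8’s actual classes or memory layout.

```javascript
// Toy model of object shapes ("Maps") with transitions.
class Shape {
  constructor(parent = null, key = null) {
    this.parent = parent;          // back pointer to the previous shape
    this.offsets = new Map(parent ? parent.offsets : []);
    if (key !== null) this.offsets.set(key, this.offsets.size);
    this.transitions = new Map();  // property name -> next shape
  }
  // Reuse an existing transition if some object already added `key`.
  transition(key) {
    if (!this.transitions.has(key)) {
      this.transitions.set(key, new Shape(this, key));
    }
    return this.transitions.get(key);
  }
}

const EMPTY_SHAPE = new Shape();

class ToyObject {
  constructor() {
    this.shape = EMPTY_SHAPE;
    this.values = [];              // the "vector of values"
  }
  set(key, value) {
    if (!this.shape.offsets.has(key)) this.shape = this.shape.transition(key);
    this.values[this.shape.offsets.get(key)] = value;
  }
  get(key) {
    const offset = this.shape.offsets.get(key);
    return offset === undefined ? undefined : this.values[offset];
  }
}

const objA = new ToyObject(); objA.set("foo", 3);
const objB = new ToyObject(); objB.set("foo", 4);
const sharedBefore = objA.shape === objB.shape; // both use the shape for {foo}
const shapeBefore = objA.shape;
objA.set("foo1", 4);                            // objA moves to a new shape
console.log(sharedBefore, objA.shape === objB.shape, objA.shape.parent === shapeBefore);
```

In this model objB keeps the original shape, while objA’s new shape keeps a back pointer to it, mirroring the Map[a1] to Map[a] link in the text.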

The following example will contain all three cases:

function load(a) {
  return a.key;
}
//IC of load: [{ slot: 0, icType: LOAD, value: UNINIT }]
let first = { key: 'first' } // shape A
let fast = { key: 'fast' }   // the same shape A
let slow = { foo: 'slow' }   // new shape B

load(first) //IC of load: [{ slot: 0, icType: LOAD, value: MONO(A) }] --> Miss
load(fast) //IC of load: [{ slot: 0, icType: LOAD, value: MONO(A) }]  --> Fast
load(slow) //IC of load: [{ slot: 0, icType: LOAD, value: POLY[A,B] }]  --> Slow. Now it needs to check 2 shapes. 
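The state changes in the comments above can be mimicked with a small simulation. This is a sketch under simplifying assumptions: `shapeOf` approximates a shape by the sorted list of own keys, and `makeLoadIC` and the "miss"/"fast"/"slow" labels are illustrative names that follow the comments above, not V8 internals.

```javascript
// Toy model of one load IC slot: UNINIT -> MONO -> POLY.
function shapeOf(obj) {
  return Object.keys(obj).sort().join(",");
}

function makeLoadIC(key) {
  const ic = { state: "UNINIT", shapes: [], log: [] };
  ic.load = function (obj) {
    const shape = shapeOf(obj);
    if (ic.state === "UNINIT") {
      // First execution: nothing cached yet, record the shape.
      ic.shapes.push(shape);
      ic.state = "MONO";
      ic.log.push("miss");
    } else if (ic.shapes.includes(shape)) {
      // Known shape: a single check in MONO, several checks in POLY.
      ic.log.push(ic.state === "MONO" ? "fast" : "slow");
    } else {
      // New shape: widen the IC, which now has to check multiple shapes.
      ic.shapes.push(shape);
      ic.state = "POLY";
      ic.log.push("slow");
    }
    return obj[key];
  };
  return ic;
}

const loadIC = makeLoadIC("key");
loadIC.load({ key: "first" }); // miss, IC becomes MONO
loadIC.load({ key: "fast" });  // fast, same shape
loadIC.load({ foo: "slow" });  // slow, IC becomes POLY
loadIC.load({ key: "again" }); // slow, known shape but 2 shapes to check
console.log(loadIC.state, loadIC.log);
```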

Cause of the vulnerability

The commit that fixes the vulnerability modified two functions, HandleLoadICSmiHandlerLoadNamedCase and ComputeHandler. Tracing the two functions reveals the following call chains:

ComputeHandler
              ^
UpdateCaches
              ^
Load
              ^
Runtime_LoadWithReceiverIC_Miss

and

HandleLoadICSmiHandlerLoadNamedCase 
              ^
HandleLoadICSmiHandlerCase
              ^
HandleLoadICHandlerCase
              ^
GenericPropertyLoad

As the function names suggest, both chains deal with loading properties, which recalls the property loading discussed in the inline cache section. Looking at the bytecode shows that property loads are performed by LdaNamedProperty. Searching the V8 source for it turns up the following code:

// LdaNamedProperty <object> <name_index> <slot>
//
// Calls the LoadIC at FeedBackVector slot <slot> for <object> and the name at
// constant pool entry <name_index>.
IGNITION_HANDLER(LdaNamedProperty, InterpreterAssembler) {
  TNode<HeapObject> feedback_vector = LoadFeedbackVector();

  // Load receiver.
  TNode<Object> recv = LoadRegisterAtOperandIndex(0);

  // Load the name and context lazily.
  LazyNode<TaggedIndex> lazy_slot = [=] {
    return BytecodeOperandIdxTaggedIndex(2);
  };
  LazyNode<Name> lazy_name = [=] {
    return CAST(LoadConstantPoolEntryAtOperandIndex(1));
  };
  LazyNode<Context> lazy_context = [=] { return GetContext(); };

  Label done(this);
  TVARIABLE(Object, var_result);
  ExitPoint exit_point(this, &done, &var_result);

  AccessorAssembler::LazyLoadICParameters params(lazy_context, recv, lazy_name,
                                                 lazy_slot, feedback_vector);
  AccessorAssembler accessor_asm(state());
  accessor_asm.LoadIC_BytecodeHandler(&params, &exit_point);
  // ...
}

Note the last call. Tracing into LoadIC_BytecodeHandler shows that this function handles every case of property access. On the first access there is no FeedbackVector yet, so the LoadIC_NoFeedback function is entered.

void AccessorAssembler::LoadIC_NoFeedback(const LoadICParameters* p,
                                          TNode<Smi> ic_kind) {
  Label miss(this, Label::kDeferred);
  TNode<Object> lookup_start_object = p->receiver_and_lookup_start_object();
  GotoIf(TaggedIsSmi(lookup_start_object), &miss);
  TNode<Map> lookup_start_object_map = LoadMap(CAST(lookup_start_object));
  GotoIf(IsDeprecatedMap(lookup_start_object_map), &miss);

  TNode<Uint16T> instance_type = LoadMapInstanceType(lookup_start_object_map);

  {
    // Special case for Function.prototype load, because it's very common
    // for ICs that are only executed once (MyFunc.prototype.foo = ...).
    Label not_function_prototype(this, Label::kDeferred);
    GotoIfNot(IsJSFunctionInstanceType(instance_type), &not_function_prototype);
    GotoIfNot(IsPrototypeString(p->name()), &not_function_prototype);

    GotoIfPrototypeRequiresRuntimeLookup(CAST(lookup_start_object),
                                         lookup_start_object_map,
                                         &not_function_prototype);
    Return(LoadJSFunctionPrototype(CAST(lookup_start_object), &miss));
    BIND(&not_function_prototype);
  }

  GenericPropertyLoad(CAST(lookup_start_object), lookup_start_object_map,
                      instance_type, p, &miss, kDontUseStubCache);

  BIND(&miss);
  {
    TailCallRuntime(Runtime::kLoadNoFeedbackIC_Miss, p->context(),
                    p->receiver(), p->name(), ic_kind);
  }
}

GenericPropertyLoad appears here. Note also that the function ends at the miss label, which tail-calls Runtime::kLoadNoFeedbackIC_Miss. This function is actually RUNTIME_FUNCTION(Runtime_LoadWithReceiverIC_Miss).

At this point, the complete call chain has been found. Following this chain, LoadIC_NoFeedback is called the first time a property is accessed, because there is no FeedbackVector yet. Assuming lookup_start_object is not a Smi (small integer) and its map is not deprecated, GenericPropertyLoad is called, followed by LoadNoFeedbackIC_Miss. In ComputeHandler, the modified branch checks whether the holder is a JSModuleNamespace (IsJSModuleNamespace), but HandleLoadICSmiHandlerLoadNamedCase loads from the receiver, and that object is not necessarily a JSModuleNamespace. So once feedback has been recorded, the type recorded in the IC may differ from the actual type of the object at the time of the call. When a property of the JSModuleNamespace is then accessed through the IC, V8 follows the fast-path and trusts the type stored in the IC. Because the receiver is not of that type, type confusion can result.

In summary, reproducing this vulnerability requires the following conditions:

  1. Place a property or function in a JSModuleNamespace so that it can be accessed at any time.

This can be done by exporting a function or property from a module, for example in one file:

export let foo = {}
// export function foo() {.... }

In another file:

import * as foo from "file.mjs";
%DebugPrint(foo)  // requires --allow-natives-syntax

This produces the following output:

/*
DebugPrint: 000003BF080496D9: [JSModuleNamespace]
 - map: 0x03bf082077f9 <Map(DICTIONARY_ELEMENTS)> [DictionaryProperties]
 - prototype: 0x03bf08002235 <null>
 - elements: 0x03bf08003295 <NumberDictionary[7]> [DICTIONARY_ELEMENTS]
 - module: 0x03bf081d3229 <Other heap object (SOURCE_TEXT_MODULE_TYPE)>
 - properties: 0x03bf080496ed <NameDictionary[17]>
 - All own properties (excluding elements): {
   0x03bf08005669 <Symbol: Symbol.toStringTag>: 0x03bf080049f5 <String[6]: #Module> (data, dict_index: 1, attrs: [___])
   f: 0x03bf081d3349 <AccessorInfo> (accessor, dict_index: 2, attrs: [WE_])
 }
 - elements: 0x03bf08003295 <NumberDictionary[7]> {
   - max_number_key: 0
 }
000003BF082077F9: [Map]
 - type: JS_MODULE_NAMESPACE_TYPE
 - instance size: 16
 - inobject properties: 0
 - elements kind: DICTIONARY_ELEMENTS
 - unused property fields: 0
 - enum length: invalid
 - dictionary_map
 - may_have_interesting_symbols
 - non-extensible
 - prototype_map
 - prototype info: 0x03bf081d3369 <PrototypeInfo>
 - prototype_validity cell: 0x03bf08142405 <Cell value= 1>
 - instance descriptors (own) #0: 0x03bf080021c1 <Other heap object (STRONG_DESCRIPTOR_ARRAY_TYPE)>
 - prototype: 0x03bf08002235 <null>
 - constructor: 0x03bf081c3bed <JSFunction Object (sfi = 000003BF08144745)>
 - dependent code: 0x03bf080021b9 <Other heap object (WEAK_FIXED_ARRAY_TYPE)>
 - construction counter: 0
 */
  2. Create an object in the other file such that its lookup_start_object and holder differ and the holder is of type JSModuleNamespace. Then access the property exported from the first file to trigger LoadWithReceiverIC_Miss, which in turn calls UpdateCaches.
import * as foo from "file.mjs";
class Test {}
class Test1 {}
let tmp = new Test();
Test.prototype.__proto__ = Test1;          // modify lookup_start_object
Test.prototype.__proto__.__proto__ = foo;  // modify the holder
  3. Repeat the above steps until the IC starts using the information in the FeedbackVector. The handler then executes the following load:

TNode<Module> module =
    LoadObjectField<Module>(CAST(p->receiver()), JSModuleNamespace::kModuleOffset);

This code assumes that p->receiver() is foo, a JSModuleNamespace, but the receiver is actually not foo. Type confusion is triggered at this point.

The fix

To fix this vulnerability, the object used in ic.cc and accessor-assembler.cc must be the same one. V8 chose to use the holder (not the receiver) as the load argument in HandleLoadICSmiHandlerLoadNamedCase. In addition, ComputeHandler gained a separate branch check for the Smi handler case, to ensure that HandleLoadICSmiHandlerLoadNamedCase receives the smi_handler.
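The mismatch and the direction of the fix can be sketched with a toy model. Everything here is an assumption made for illustration: objects are plain records with a `type` tag and numbered `slots`, `MODULE_SLOT` stands in for JSModuleNamespace::kModuleOffset, and `computeHandler`, `loadBeforeFix`, and `loadAfterFix` are hypothetical helpers that do not mirror V8’s real code.

```javascript
// Toy model of the holder/receiver mismatch.
const MODULE_SLOT = 0; // hypothetical stand-in for kModuleOffset

const moduleNamespace = { type: "JSModuleNamespace", slots: ["<module>"] };
const receiver = { type: "JSObject", slots: [0x21212121] };

// Handler selection checks the type of the *holder* only.
function computeHandler(holder) {
  if (holder.type === "JSModuleNamespace") return { kind: "module_export" };
  return { kind: "slow" };
}

// Before the fix: the load trusts the handler but reads from the receiver,
// whose type was never checked.
function loadBeforeFix(handler, recv, holder) {
  if (handler.kind !== "module_export") return undefined;
  return recv.slots[MODULE_SLOT];
}

// After the fix: the load reads from the object that was actually checked.
function loadAfterFix(handler, recv, holder) {
  if (handler.kind !== "module_export") return undefined;
  return holder.slots[MODULE_SLOT];
}

const handler = computeHandler(moduleNamespace);
const confused = loadBeforeFix(handler, receiver, moduleNamespace);
const correct = loadAfterFix(handler, receiver, moduleNamespace);
console.log(confused, correct);
```

In the toy model the pre-fix load returns the receiver’s unrelated slot value as if it were the module, while the post-fix load returns the module stored in the holder.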

The fix commit is as follows:

Google has recently fixed a number of high-risk vulnerabilities, including those disclosed at the Tianfu Cup and those found to be exploited in the wild. Chrome users are advised to upgrade to the latest stable version to avoid being attacked. The latest stable version at the time of writing is 96.0.4664.77.

References

github.com/v8/v8/commi…
github.com/maldiohead/…
web.cs.ucla.edu/~palsberg/c…
www.cnnvd.org.cn/web/xxk/ldx…
