Image source: unsplash.com/photos/N8yo…

Author: Wu Liuyi

Wasm Interpreter project address:

Github.com/mcuking/was…

background

Since the end of last year, I decided to go deeper into WebAssembly (hereafter referred to as Wasm for ease of writing). While reading the book WebAssembly Principles and Core Technologies (which explains how Wasm interpreters and virtual machines work and how to implement them), The idea of implementing a Wasm interpreter was born into this project. Let’s dive right into how to implement a Wasm interpreter.

Wasm background

Before going into the details of the interpreter implementation process, I’ll give you some background on Wasm.

What is the Wasm

Wasm is a low-level assembler like language that can run on a Web platform at speeds close to native applications. Languages such as C, C++, and Rust use Wasm as the compilation target language, which enables the existing code to be transplanted to the Web platform to improve code reuse.

WebAssembly (Wasm for short) is a binary instruction format based on stacked virtual machines. Wasm is designed to be a portable compilation target for a programming language that can be deployed on a Web platform to serve both client and server applications.

Wasm is defined as a Virtual Instruction Set Architecture v-ISA (Virtual Instruction Set Architecture). For interpretation of this aspect, please refer to the following execution phase.

Let’s look at some of the features of Wasm:

  1. The level must be low, as close to machine language as possible, so that the interpreter can easily compile AOT/JIT and run Wasm programs at a speed similar to native applications.
  2. As object code, generated by other high-level language compilers;
  3. The code is safe and controllable, and can not perform arbitrary operations like real assembly language;
  4. The code must be platform-independent (not platform-dependent machine code) so that it can be executed across platforms, so virtual machine/bytecode techniques are used.

For a more detailed introduction to Wasm, see my translation of WebAssembly’s Post-MVP Future: A Cartoon Skill Tree.

What can Wasm do

Wasm has been applied in browser image processing, audio and video processing, games, IDE, visualization, scientific computing, and non-browser Serverless, blockchain, IoT and other fields. If you want to learn more about Wasm applications, check out my other GitHub repository:

Github.com/mcuking/Awe…

Wasm specification

Wasm technology currently has four specifications:

  • Core specification – defines the semantics of Wasm modules independent of specific embedding (that is, platform-independent).
  • JavaScript API – Defines JavaScript classes and objects used to access Wasm from within JavaScript.
  • Web API – Defines JavaScript API extensions that are specifically available in Web browsers.
  • WASI API – defines a modular system interface to run Wasm outside the Web, such as the ability to access files, network links, and so on.

The Wasm interpreter that this article focuses on runs primarily in a non-browser environment, so there is no need to pay attention to the JavaScript API and Web API specifications.

In addition, WASI is not covered in the current implementation version (there are plans to support it later), so we just need to focus on the core specification for now.

Wasm module

The Wasm module has the following four forms:

  • Binary format — The main encoding format for Wasm, ending with the. Wasm suffix.
  • Text format – mainly for developers to understand Wasm modules, or to write small test code that ends with the.wat suffix, equivalent to assembly language programs.
  • Memory format – How modules are loaded into memory. This representation is specific to the Wasm VM implementation. Different Wasm VM implementations have different memory representations.
  • Module instances – Module instances are equivalent to “objects” if memory formats are understood as classes in an object-oriented language.

The following figure shows the factorial function written in C with the corresponding Wasm text format and binary format.

The memory format depends on the implementation of the Wasm interpreter. For example, the memory format of this project is roughly as follows (more on this later in the execution section) :

The associations between the formats are as follows:

  • Binary formats are generated primarily by high-level programming language compilers, but also by text format compilation.
  • The text format can be written directly by the developer or generated by binary decompilation.
  • The Wasm interpreter typically decodes binary modules into an internal form, that is, an in-memory format (such as C/C++ constructs), and then proceeds.

Finally, I recommend a site called The WebAssembly Code Explorer for a more intuitive look at the association between the Wasm binary format and the text format.

Wasdk. Making. IO/wasmcodeexp…

Interpreter implementation principles

From the above introduction, I believe you have a general understanding of Wasm technology. Let’s start by analyzing the execution flow of Wasm binaries and explore the implementation of the interpreter.

Wasm binaries are executed in three main stages: decoding, validation, and execution

  1. Decoding phase: Decode binary format to memory format.
  2. Validation phase: A static analysis of the module to ensure that the structure of the module meets specification requirements and that the function’s bytecode does not behave badly (such as calling nonexistent functions).
  3. Execution phase: Further divided into instantiation and function call phases.

Tip: The interpreter implemented in this project does not have a separate validation phase. Instead, the specific verification is distributed in the decoding stage or the execution stage, for example, in the decoding stage to verify whether there is an illegal segment ID, in the execution stage to verify whether the type or quantity of the parameter or return value of the function matches the function signature, etc.

In addition, the instantiation process is completed in the decoding phase, and only function calls are required in the execution phase. The so-called instantiation, the main content is to apply for space for memory segment, table segment, etc., record the entry address of all functions (self-defined functions and imported functions), and then record all information of the module into a unified data structure module.

Next, we will elaborate on the implementation details of decoding phase and execution phase respectively.

The decoding stage

Wasm binary file structure

Like other binary formats (such as Java class files), the Wasm binary format starts with a magic number and version number, followed by the body content of the module, which is placed in separate sections for different purposes. Twelve segments are defined, and each segment is assigned an ID (from 0 to 11). Except for custom segments, all segments can appear at most once and must appear in the order of increasing IDS. There are 12 ID segments from 0 to 11 as follows:

Custom segment, type segment, import segment, function segment, table segment, memory segment, global segment, export segment, start segment, element segment, code segment, data segment

The main purpose of this is for stream compilation — that is, compiling the Wasm module to machine code as you download it. For more information, see Making WebAssembly Even Faster: Firefox’s New Streaming and Tiering Compiler

In other words, each different segment describes a piece of information about the Wasm module. All the segments in the module together describe all the information about the Wasm module:

  • Memory segment and data segment: Memory segment is used to store the runtime dynamic data of the program. Data segments are used to store static data that initializes memory. Memory can be imported from an external host, and memory objects can also be exported to an external host environment.
  • Table segments and element segments: Table segments are used to store object references. Currently, objects can only be functions, so you can use table segments to function Pointers. The element segment is used to store the data that initializes the segment. Table objects can be imported from an external host, and table objects can also be exported to an external host environment.
  • Start segment: The start segment is used to store the index of the start function, which specifies a function to run automatically at load time. The start function has the following functions: 1. Initialize the module after loading; 2. 2. Turn modules into executable files.
  • Global segments: Global segments are used to store information about global variables (value types, variability, initialization expressions, and so on).
  • Function section, code section, and type section: These three sections are used to store the data expressing the function. Type segment: The type segment is used to store all function signatures in the module (function signatures record the types and quantities of function parameters and return values). If multiple functions have the same function signatures, only one copy is required. Function segment: The function segment is used to store the corresponding function signature index. Note that the index is the function signature index, not the function index. Code snippets: Code snippets are used to store the bytecode and local variables of a function, that is, the bytecode corresponding to the local variables and code in the function body.
  • Import segment and export segment: The export segment stores information about exported items, such as the member name, type, and index in the corresponding segment. The import section is used to store information about imported items (the member name, type, and module from which the imported items are imported, etc.). There are four types of export/import items: functions, tables, memory, and global variables.
  • Custom section: The custom section is mainly used to store information irrelevant to running, such as debugging symbols.

Table segments are more difficult to understand than Wasm binary segments. In the Wasm design, the code segment/stack and other elements associated with the execution process are completely separated from the memory. This is completely different from the common architecture in which the code segment/data segment/heap/stack are all in the unified addressed memory space. Function addresses are not visible to Wasm programs. Let alone pass, modify, and call functions as if they were variables. Tables are the key to this mechanism. Tables are used to store object references. At present, objects can only be functions, that is, tables are only used to store function index values. Wasm programs can only call functions by finding the corresponding function index value through the index in the table, and the stack data is not stored in memory objects at runtime. This completely eliminates the possibility of Wasm code executing out of bounds and, at worst, creating a bunch of bad data in memory objects.

Knowing the purpose of each segment and the specific encoding format for each segment (see the comments in the load_module function in module.c for details), we can decode the Wasm binary and “translate” it into memory format. That is, all information of the module is recorded into a unified data structure — Module, which is shown in the following figure:

In order to save space and make binary files more compact, Wasm binary format uses LEB128(Little Endian Base 128) to encode integer values such as list length and index. LEB128 is a variable-length encoding format, with 32-bit integers accounting for 1 to 5 bytes and 64-bit integers accounting for 1 to 10 bytes. The smaller the integer, the fewer bytes it takes to encode. Since integers such as list length and index are usually small, LEB128 encoding can be a space saver. LEB128 has two characteristics: 1. It adopts little endian order, that is, the low byte is first and the high byte is last; 2. 2. Use the 128-base system, that is, every 7 bits are in a group (the last 7 bits of a byte). The highest empty bit is the identifier bit. LEB128 has two variants for encoding unsigned and signed integers, which can be found at github.com/mcuking/was… Read_LEB function in.

Finally, part of the actual code corresponding to the decoding stage is shown as follows:

For more details, please refer to github.com/mcuking/was… The load_module function in the

Execution phase

After the decoding phase above, we can get the memory format from the Wasm binary that covers all the information needed for the execution phase. Let’s explore how to implement the execution phase based on the memory format above. Before the formal start, first need to introduce the stack virtual machine knowledge as a foundation.

Wasm is a binary instruction format for stacked virtual machines. That is, Wasm is not only a programming language, but also a set of virtual machine architecture specifications. So what is a virtual machine and what is a stack virtual machine?

Vm Concepts

Virtual machine is software to hardware simulation, with the operating system and compiler to provide functions to simulate the work of hardware, mainly refers to the simulation of hardware CPU. Vm command execution consists of the following three steps:

  1. Retrieves instruction from an address in the instruction stream that the program counter PC points to
  2. Decoding – Determine the type of instruction and enter the corresponding processing flow
  3. Execute – perform the corresponding function according to the instructions

Each instruction in the execution instruction stream repeats the above three steps. The loop needs a flag to record which instruction has been executed, that is, the Program counter PC (Program Count), which stores the address of the next instruction to be executed.

Tip: Instead of platform-specific machine code, the Wasm virtual machine is provided with bytecode composed of a set of instructions customized by Wasm. The main purpose of this platform is to simulate the CPU with software and define a custom instruction set that is similar to the CPU instruction set. In this way, all the applications on the VIRTUAL machine need to be adapted to different platforms, and applications running on the virtual machine need not care which platform they are running on.

Wasm instruction set

Wasm directives fall into five main categories:

  1. Control instructions – function calls/jumps/loops etc
  2. Parameter instruction – discard top of stack etc
  3. Variable directives – Read and write global/local variables
  4. Memory instruction – Memory load/store
  5. Numerical instruction – Numerical computation

Each instruction contains two pieces of information: an opcode and an operand.

  • Opcode: Is the ID of an instruction that determines the operation to be performed by the instruction. It is fixed to 1 byte. Therefore, the instruction set can contain up to 256 instructions, which is also called bytecode. The Wasm specification defines 178 instructions in total. Because the opcode is an integer, which is machine-friendly but not human-friendly, the Wasm specification defines a mnemonic for each opcode.

The following is an enumeration of opcode mnemonics for some Wasm instructions. For the complete version, see github.com/mcuking/was… .

In addition, there is a visual table on GitHub that visually shows all the OPcodes of Wasm. If you are interested, you can check it out.

pengowray.github.io/wasm-ops/

Operands are covered in the stack virtual Machine section below.

Stack virtual machine

There are two kinds of virtual machines: register virtual machines and stack virtual machines.

  • Register virtual machine: According to the hardware CPU implementation ideas, virtual machine also simulated the register, operands and command execution results can be stored in the register. The actual example is the V8 / Lua virtual machine. Because the number of registers is limited, how to allocate infinite variables to a finite number of registers without conflict, need register allocation algorithm, such as the classical graph coloring algorithm. So register virtual machine implementation is slightly more difficult, but the optimization potential is greater.
  • Stack virtual machine: The results of instructions are stored in an emulated Operand Stack, which is simpler to implement than a register virtual machine. A practical example would be JVM/QuickJs/Wasmer.

Let’s take a closer look at how stack virtual machines work.

The operand

The main feature of a stack VM is that it has an operand stack. Most Wasm instructions perform certain operations on the operand stack, such as the following instructions:

F32.sub: Pops two 32-bit floating-point numbers from the operand stack, calculates their difference and pushes the result to the top of the operand stack.

The two 32-bit floating point numbers that pop up from the operand stack are operands. Here is the definition:

Operands, also known as dynamic operands, are the numbers that sit at the top of the operand stack at run time and are manipulated by instructions.

The number immediately

Let’s look at another example of a directive:

Const 3: pushes a local variable of type 32 bit integer with index 3 to the top of the operand stack.

The value 3 is the immediate number. Here is the definition:

Immediate numbers, also known as static immediate parameters/static operands, are hard-coded directly into the instruction (that is, in the bytecode), immediately following the opcodes. Most Wasm directives do not have immediate counts. For details about Wasm directives with immediate counts, see github.com/mcuking/was… The skip_immediate function in.

This is just one instruction. Let’s look at how the next function is executed on a stack virtual machine:

  1. The caller pushes the argument onto the operand stack
  2. After entering the function, initialize the parameters
  3. Execute instructions in the function body
  4. Pushes the result of a function’s execution to the top of the operand stack and returns it
  5. The caller gets the return value of the function from the operand stack

As shown below:

Thus, parameter passing and return value fetching in function calls, as well as instruction execution in the function body, are done through the operand stack.

Call stack and stack frame

As you can see from the above description, function calls are often nested, such as function A calling function B and function B calling function C. Therefore, another Stack is needed to maintain the Call relationship information between functions — the Call Stack.

The call stack is composed of separate stack frames, and each time a function is called, a stack frame is pushed into the call stack. (Note: for brevity, only the function case is discussed. Other control blocks, such as If/Loop, are not covered in this article.) At the end of each function execution, the corresponding stack frame is popped from the call stack and destroyed. A series of function calls that create and destroy stack frames. But at any given moment, only the stack frame at the top of the call stack is active, known as the current stack frame.

Each stack frame contains the following:

  1. The function structure variable associated with the stack frame, used to store all information about the function.
  2. Operand stack, used to store parameters, local variables, and operands during the execution of a function body instruction. It should be reminded that the stack frames associated with all functions share a complete operand stack, and each stack frame will occupy a part of this operand stack. Each stack frame only needs a pointer to store the address of its part of the operand stack to distinguish it from other operand stacks. The advantage of this is that the operand stack portion of the calling function and the stack frame associated with the called function are adjacent in the entire operand stack, making it easier for the calling function to pass arguments to the called function and for the called function to pass return values to the calling function when it completes execution.
  3. Function return address, call instruction is used to store the stack frame of the next instruction address, when the frame from the call stack pop-up, will return to the stack frame call instruction of the next instruction to perform, in other words, the current stack frame after the corresponding function performs the exit, return to call this function to execute instructions.

The interpreter currently defines a stack frame that does not have a local variable table like the JVM stack frame. Instead, it puts parameters, local variables, and operands on the operand stack for two main purposes:

  1. Simple implementation, do not need to define additional local variable table, can greatly simplify the code.
  2. By making parameter passing NOP operand free, the operand stacks of the two frames overlap a part of the data, which is the parameter, so that parameters can be passed between different functions naturally.

The actual sample

After the matting above, I believe we have a certain understanding of stack virtual machine. Finally, we use a practical example to illustrate the entire implementation process:

The following Wasm text format has two functions: compute and add, where the Add function essentially takes two numbers (of type 32-bit integer and 32-bit floating-point) and computes the sum of them. Add is called twice in compute, and the result of the last call to add is already stored on the operand stack. So this time you just need to pass in the second parameter.

(module
    (func $compute(result i32) i32.const 13 ;; Push 13 f32 onto the operand stack const 42.0; Push 42.0 call into the operand stack$add;; call$addFunction returns 55 f32. Const 10.0; Push 10.0 call to operand stack$add;; Call again$addThe function is 65 times func$add(param $a i32) (param $b f32) (result i32)
        i32.get_local $a;; Sets a local variable of type 32 - bit integer$aPush into operand stack f32.get_local$b;; Sets a local variable of type 32 bit floating point$bPress into operand stack i32. trunc_F32_s; The 32-bit floating-point number at the top of the current operand stack$bTruncate to 32 signed bit integer (truncate decimal part) i32.add; Pop the top and subtop 32-bit integers from the operand stack, calculate the sum of the two and push the and onto the operand stack.export "compute" (func $compute(a))export "add" (func $add)))Copy the code

The corresponding schematic diagram of its execution process is as follows:

Finally, some actual code screenshots corresponding to the execution stage are shown as follows:

As you can see, the three phases of the virtual machine — fetch, decode, and execute — can be easily implemented using the while loop and switch-case statements. For more details, please refer to github.com/mcuking/was… The Interpreter function in the Interpreter class, which has rich annotation explanations.

conclusion

This is the core of the Wasm interpreter implementation. Of course, this is only the most basic functionality of the Wasm interpreter — it simply parses and executes instructions one by one. There is no JIT function like other professional interpreters, which interprets and executes bytecode to start quickly. It is then JIT compiled into platform-specific machine code to speed up subsequent code execution (note: the exact implementation of the JIT varies from interpreter to interpreter).

So using this project’s interpreter to interpret and execute Wasm files doesn’t have much of a speed advantage. But the simplicity of the implementation makes the source code easier to read, and it is well annotated, making it ideal for anyone interested in Wasm to quickly learn the core principles of the technology.

It should be noted that this article does not cover how to use Wasm technology. I happen to be developing a video player that supports H256 encoding based on Wasm and FFmpeg. The link to the related article is as follows:

Deep Into WebAssembly video Player Applications

After the video player is put into actual production environment, we will gradually improve the content of this article – focusing on how to better use Wasm technology in front-end projects. Please look forward to it

Github.com/mcuking/blo…

The resources

  • WebAssembly Principles and Core Technologies
  • WebAssembly In Action
  • Introduction to the WebAssembly Standard
  • github.com/kanaka/wac
  • github.com/wasm3/wasm3

This article is published from netease Cloud Music big front end team, the article is prohibited to be reproduced in any form without authorization. Grp.music – Fe (at) Corp.Netease.com We recruit front-end, iOS and Android all year long. If you are ready to change your job and you like cloud music, join us!