WebAssembly Series (4) How WebAssembly works

Big beard

Original translation: huziketang.com/blog/posts/… English original: Creating and working with WebAssembly modules

When reposting, please credit the source and keep the original link and author information.

This is the fourth article in the WebAssembly series (there are six articles in total). If you haven’t read the previous articles, it is best to start there. If you have no idea what WebAssembly is, read the introduction first.

WebAssembly is another programming language besides JavaScript that can run in web pages. In the past, if you wanted to run code in a browser to control various elements of a web page, JavaScript was the only option.

So when people talk about WebAssembly, they tend to compare it to JavaScript. But it’s not an “either/or” relationship: you don’t have to choose between WebAssembly and JavaScript.

In fact, we encourage developers to use both together. Even if you never write a WebAssembly module yourself, you can use existing modules and take advantage of their strengths in your own features.

A WebAssembly module defines functions that can be called from JavaScript. So just as you download a module like Lodash from npm today and call the functions in its API, in the future you will be able to download WebAssembly modules and use the functionality they provide.

So let’s take a look at how to develop WebAssembly modules and how to use them through JavaScript.

Where does WebAssembly fit in?

In my last article on assembly, I showed how compilers translate from high-level languages to machine code.

So where is WebAssembly in the figure above? In fact, you can think of it as another “target assembly language.”

Each of the target assembly languages (x86, ARM) corresponds to a particular machine architecture. When you deliver code to be executed on a user’s machine over the web, you don’t know which architecture that machine has.

Unlike other assembly languages, WebAssembly does not depend on a physical machine. It can be thought of as the machine language of a conceptual machine, rather than of an actual physical machine.

For this reason, WebAssembly instructions are sometimes called virtual instructions. They map more directly to machine code than JavaScript source code does, and they represent a kind of common ground for what can be executed efficiently on general-purpose hardware. But they do not map directly to the machine code of any one specific piece of hardware.

The browser downloads the WebAssembly, then translates it into the assembly code of the target machine.

Compile to a .wasm file

Compile to the.wasm file

The compiler toolchain that currently has the most support for WebAssembly is LLVM. A number of different front ends and back ends can be plugged into LLVM.

Tip: Many WebAssembly developers write their code in C or Rust and then compile it to WebAssembly, but there are other ways to create a WebAssembly module. For example, you can develop WebAssembly modules with TypeScript, or write the WebAssembly text format directly.

Suppose we want to go from C to WebAssembly. We need the Clang front end to turn the C code into LLVM intermediate representation (IR). Once the code is in LLVM IR, LLVM understands it and can optimize it automatically.

To generate WebAssembly from LLVM IR, you also need a back end. There is a back end under development in the LLVM project that should be ready soon, but at this point in time it can be tricky to get working.

There is also an easier-to-use tool called Emscripten. Through its own back end it compiles the code to another target (called asm.js) and then converts that to WebAssembly. It also uses LLVM under the hood.

Emscripten also includes many additional tools and libraries that make it possible to port entire C/C++ code bases, so it is more like a software development kit (SDK) than a compiler. For example, systems code often expects a file system that it can read from and write to, and Emscripten can simulate one using IndexedDB.

Don’t worry too much about these toolchains; just know that the end result is a .wasm file. I’ll look at the structure of the .wasm file later, but first let’s look at how to use it in JS.

Load a .wasm module in JavaScript

The .wasm file is a WebAssembly module that can be loaded in JavaScript. At this stage, the loading process is a bit complicated.

function fetchAndInstantiate(url, importObject) {
  // Download the .wasm file, read it as raw bytes, then compile and instantiate it.
  return fetch(url).then(response =>
    response.arrayBuffer()
  ).then(bytes =>
    WebAssembly.instantiate(bytes, importObject)
  ).then(results =>
    results.instance
  );
}

You can learn more about this in the MDN documentation.

We’re working on making this process easier by improving the toolchain, and we expect to integrate with existing module bundlers such as webpack and loaders such as SystemJS. We believe that loading a WebAssembly module can be as simple as loading a JavaScript one.

There is one major difference between WebAssembly modules and JavaScript modules, though. Currently, functions in WebAssembly can only take numbers (integers or floating-point numbers) as arguments or return values.

For any more complex type, such as a string, you have to go through the WebAssembly module’s memory. If you mostly use JavaScript and aren’t used to manipulating memory directly, think of languages like C, C++, and Rust, which manage memory manually. A WebAssembly module’s memory works in a similar way.
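As a minimal sketch of what “numbers in, numbers out” looks like from the JavaScript side, the following hand-assembles a tiny module with one exported function and calls it. The byte array and the export name add42 are illustrative assumptions (they are not the output of any toolchain mentioned in this article), and the synchronous WebAssembly.Module and WebAssembly.Instance constructors are used only to keep the example short; for real modules the asynchronous WebAssembly.instantiate shown above is the usual choice.

```javascript
// A hand-written module (binary version 1) that exports add42(i32) -> i32.
// Illustrative bytes only, not produced by a compiler.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm" magic number + version 1
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7f, 0x01, 0x7f, // Type: one signature, (i32) -> i32
  0x03, 0x02, 0x01, 0x00,                         // Function: one function using signature 0
  0x07, 0x09, 0x01, 0x05,                         // Export: one entry, name length 5
  0x61, 0x64, 0x64, 0x34, 0x32, 0x00, 0x00,       //   "add42", kind = function, index 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                   // Code: one body, 7 bytes, no extra locals
  0x20, 0x00, 0x41, 0x2a, 0x6a, 0x0b,             //   local.get 0, i32.const 42, i32.add, end
]);

const wasmModule = new WebAssembly.Module(wasmBytes);
const instance = new WebAssembly.Instance(wasmModule, {});

// The export is called like an ordinary JavaScript function,
// but both the argument and the result are plain numbers.
console.log(instance.exports.add42(8)); // 50
```
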

To do this, it uses a data structure in JavaScript called an ArrayBuffer. An ArrayBuffer is an array of bytes, so its index acts as a memory address.

If you want to pass a string between JavaScript and WebAssembly, you write the characters into the ArrayBuffer. Because an array index is an integer, it can be passed to the WebAssembly function, so the index of the first character can be used as a pointer.

Anyone developing a WebAssembly module for web developers to use will most likely put a wrapper around it. That way, users of the module don’t have to worry about the details of memory management.

If you want to learn more about memory management, take a look at our article on working with WebAssembly memory.
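The idea can be sketched without a compiled module at all. The example below creates a standalone WebAssembly.Memory, writes a string into its ArrayBuffer, and reads it back using only two integers. The helper names writeString and readString, the UTF-8 encoding, and the address 16 are assumptions made for illustration; a real module would define its own conventions for where strings live and how their length is communicated.

```javascript
// One 64 KiB page of linear memory. A real module would export or import this.
const memory = new WebAssembly.Memory({ initial: 1 });

// Write a string as UTF-8 bytes starting at address `ptr`; return the byte length.
function writeString(mem, str, ptr) {
  const bytes = new TextEncoder().encode(str);
  new Uint8Array(mem.buffer, ptr, bytes.length).set(bytes);
  return bytes.length;
}

// Read `len` bytes starting at address `ptr` back into a JavaScript string.
function readString(mem, ptr, len) {
  return new TextDecoder().decode(new Uint8Array(mem.buffer, ptr, len));
}

const ptr = 16; // arbitrary address chosen for this example
const len = writeString(memory, "hello", ptr);

// A WebAssembly function would be handed the two integers (ptr, len),
// never the string itself.
console.log(readString(memory, ptr, len)); // "hello"
```
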

The structure of a .wasm file

If you write code in a high-level language and compile it to WebAssembly, you don’t need to know how a WebAssembly module is structured. But understanding the structure can help you make sense of the basics.

If you are not familiar with compilers, it is recommended that you read the article “How compilers generate assembly”.

Here is the C code that we will compile to WebAssembly:

int add42(int num) {
    return num + 42;
}

You can compile this function using the WASM Explorer.

If you open the .wasm file (assuming your editor can display it), you will see something like this:

00 61 73 6D 0D 00 00 00 01 86 80 80 80 00 01 60
01 7F 01 7F 03 82 80 80 80 00 01 00 04 84 80 80
80 00 01 70 00 00 05 83 80 80 80 00 01 00 01 06
81 80 80 80 00 00 07 96 80 80 80 00 02 06 6D 65
6D 6F 72 79 02 00 09 5F 5A 35 61 64 64 34 32 69
00 00 0A 8D 80 80 80 00 01 87 80 80 80 00 00 20
00 41 2A 6A 0B

This is the “binary” representation of the module. “Binary” is quoted because it’s actually written in hexadecimal, but it’s easy to turn it into binary or human-readable decimal notation.

For example, here are the various representations of num + 42.
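The last six bytes of the dump above (20 00 41 2A 6A 0B) are the body of the function. As a rough sketch, the following turns them into text-format mnemonics. The table covers only the four opcodes that occur here, and disassembleBody is a hypothetical helper name, not part of any WebAssembly API. Current tools print local.get where older ones printed get_local.

```javascript
// Only the four opcodes that appear in the add42 body; a real decoder covers many more.
const OPCODES = {
  0x20: { name: "local.get", hasImmediate: true },
  0x41: { name: "i32.const", hasImmediate: true },
  0x6a: { name: "i32.add", hasImmediate: false },
  0x0b: { name: "end", hasImmediate: false },
};

function disassembleBody(bytes) {
  const lines = [];
  for (let i = 0; i < bytes.length; i++) {
    const op = OPCODES[bytes[i]];
    if (!op) throw new Error("unknown opcode 0x" + bytes[i].toString(16));
    // Immediates are LEB128-encoded; values below 64 fit in a single byte, as here.
    lines.push(op.hasImmediate ? op.name + " " + bytes[++i] : op.name);
  }
  return lines;
}

const body = [0x20, 0x00, 0x41, 0x2a, 0x6a, 0x0b];
console.log(disassembleBody(body).join("\n"));
// local.get 0
// i32.const 42
// i32.add
// end
```
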

How the code works: a stack-based virtual machine

If you’re curious about how it works, this picture shows you what the instructions do.

Notice in the figure that the add operation does not say which two numbers to add. This is because WebAssembly is an example of something called a stack machine: all of the values an operation needs are pushed onto the stack before the operation is performed.

Every operation, such as add, knows how many values it needs. Add takes two, so it pops two values from the top of the stack. This means the add instruction can be short (a single byte), because it does not need to specify source or destination registers. That makes the .wasm file smaller, which in turn makes it faster to load.

Although WebAssembly is specified in terms of a stack machine, that’s not how it runs on the physical machine. When the browser translates WebAssembly to machine code, it uses registers. Because the WebAssembly code does not specify registers, the browser has maximum freedom to choose the best register allocation for that machine.
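To make the stack behaviour concrete, here is a toy interpreter for the three instructions in the add42 body. It is only a sketch of the stack-machine idea: runStack and the array-based instruction format are made up for this example, and a browser compiles WebAssembly to machine code rather than interpreting it like this.

```javascript
// Run a list of [name, immediate] instructions against a value stack.
function runStack(instructions, locals) {
  const stack = [];
  for (const [name, arg] of instructions) {
    switch (name) {
      case "local.get": // push the value of a local (here, the function argument)
        stack.push(locals[arg]);
        break;
      case "i32.const": // push a constant
        stack.push(arg);
        break;
      case "i32.add": { // pop two values, push their sum
        const b = stack.pop();
        const a = stack.pop();
        stack.push((a + b) | 0); // wrap to 32 bits, as i32.add does
        break;
      }
      default:
        throw new Error("unsupported instruction " + name);
    }
  }
  return stack.pop();
}

// Note that i32.add carries no operands: it takes whatever is on the stack.
const add42Body = [["local.get", 0], ["i32.const", 42], ["i32.add"]];
console.log(runStack(add42Body, [8])); // 50
```
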

Sections of the WebAssembly module

Besides the function body shown above, a .wasm file has other parts, called sections. Some sections are required for any module, and some are optional.

Required sections:

Type. Contains the function signatures for functions defined in this module and for any imported functions.
Function. Gives an index to each function defined in this module.
Code. The actual function bodies for each function in this module.

Optional sections:

Export. Makes functions, memories, tables, and globals available to other WebAssembly modules and to JavaScript, which allows separately compiled modules to be dynamically linked together. This is WebAssembly’s version of a .dll.
Import. Specifies functions, memories, tables, and globals to import from other WebAssembly modules or from JavaScript.
Start. A function that runs automatically when the WebAssembly module is loaded (similar to a main function).
Global. Declares global variables for the module.
Memory. Defines the memory the module will use.
Table. Makes it possible to map to values outside of the WebAssembly module, such as JavaScript objects. This is especially useful for indirect function calls.
Data. Initializes the memory.
Element. Initializes the table.
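One way to see the sections in practice is to walk the hex dump shown earlier. The sketch below assumes the layout used by the binary format: an 8-byte header, followed by sections that each start with a one-byte id and a LEB128-encoded size. readVarUint and listSections are hypothetical helper names, and the walker only lists the sections without decoding their contents.

```javascript
// Section ids as assigned by the WebAssembly binary format.
const SECTION_NAMES = ["Custom", "Type", "Import", "Function", "Table", "Memory",
  "Global", "Export", "Start", "Element", "Code", "Data"];

// Decode an unsigned LEB128 integer starting at `offset`.
function readVarUint(bytes, offset) {
  let value = 0, shift = 0, byte;
  do {
    byte = bytes[offset++];
    value |= (byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  return { value, next: offset };
}

// Return the names of the sections in a module, in file order.
function listSections(bytes) {
  const found = [];
  let offset = 8; // skip the 4-byte magic number and the 4-byte version
  while (offset < bytes.length) {
    const id = bytes[offset];
    const size = readVarUint(bytes, offset + 1);
    found.push(SECTION_NAMES[id] || "Unknown(" + id + ")");
    offset = size.next + size.value; // jump over the section payload
  }
  return found;
}

// The add42 module from the hex dump in this article.
const dump = `
00 61 73 6D 0D 00 00 00 01 86 80 80 80 00 01 60
01 7F 01 7F 03 82 80 80 80 00 01 00 04 84 80 80
80 00 01 70 00 00 05 83 80 80 80 00 01 00 01 06
81 80 80 80 00 00 07 96 80 80 80 00 02 06 6D 65
6D 6F 72 79 02 00 09 5F 5A 35 61 64 64 34 32 69
00 00 0A 8D 80 80 80 00 01 87 80 80 80 00 00 20
00 41 2A 6A 0B`;
const moduleBytes = dump.trim().split(/\s+/).map(h => parseInt(h, 16));

console.log(listSections(moduleBytes).join(", "));
// Type, Function, Table, Memory, Global, Export, Code
```

For this module the compiler emitted the three required sections plus Table, Memory, Global, and Export; there are no Import, Start, Element, or Data sections.
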

If you want to learn more about the sections, there is a more in-depth explanation of how they work.

Coming up next

Now that you know how WebAssembly modules work, the next article looks at why WebAssembly runs faster.

