Hello again, Beanskin readers. This is a great article: we invited “The Little Blacksmith of Milan” from ByteDance to write about WebAssembly theory, and a follow-up is on the way 😘.

Author: The Little Blacksmith of Milan

Source: Original

1. Warm up before reading

WebAssembly (hereinafter WASM) looks like a new technology, but it is not really “new”. The WebAssembly community group was established as early as April 2015. In the same year, React Native from Facebook and Weex from Alibaba were already showing visible results in cross-platform frameworks. By contrast, WASM is a W3C standard whose MVP (minimum viable product) version was not implemented by browsers such as Firefox and Chrome until 2017, which makes it a relatively young technology with a lot of room for imagination.

Here is the official explanation from MDN: WebAssembly is a new type of code that can run in modern web browsers. It is a low-level, assembly-like language with a compact binary format that runs with near-native performance and provides a compilation target for languages such as C/C++ so that they can run on the web. It is also designed to run alongside JavaScript, allowing the two to work together.

Docker founder Solomon Hykes once said:

If WASM+WASI existed in 2008, we wouldn’t have needed to create Docker. That’s how important it is. WebAssembly on the server is the future of computing. A standardized system interface was the missing link. Let’s hope WASI is up to the task!

1. Will WASM replace JavaScript?

As far as we can see, WASM won’t replace JavaScript, but it can help solve many problems that JavaScript cannot:

  • Remove strong dependencies on different browser environments

    Modern front-end development relies heavily on webpack, Babel, and polyfills, and the compiled output can easily reach several megabytes. When the business logic is very complex, it has to be optimized in a variety of ways. In addition, browser security is a big issue.

  • Weak typing versus strong typing

    Even though we have TypeScript, it only validates our types at compile time. At run time, surprising results such as the coercions performed by the + operator, or null == undefined evaluating to true, are still part of JavaScript.

2. About asm.js

As many of you know, asm.js is a highly exploratory technology that Mozilla introduced in 2013. Firefox was the first browser to optimize asm.js at the engine level, and asm.js can be considered a precursor to WASM.

Asm.js is a subset of JavaScript, meaning that the generated code is still 100% pure JavaScript.

asm.js is still a compilation target: C/C++ source code is statically compiled into a specific style of JavaScript that the browser engine can specially optimize.

In the generated JavaScript, an expression such as i | 0 tells the browser engine to treat the variable i as a 32-bit integer. Similarly, +i marks i as a 64-bit floating-point number.
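To make these annotations concrete, here is a small hand-written sketch in the asm.js style. It is not real compiler output: the module name AsmSketch and the functions addInt and half are made up for illustration, and an engine without asm.js optimizations will simply run it as ordinary JavaScript.

```javascript
// Hand-written sketch in the asm.js style (hypothetical names, not compiler output).
function AsmSketch(stdlib) {
  "use asm";

  // a | 0 annotates a and b as 32-bit integers.
  function addInt(a, b) {
    a = a | 0;
    b = b | 0;
    return (a + b) | 0; // the sum is coerced back to a 32-bit integer
  }

  // +x annotates x as a 64-bit floating-point number.
  function half(x) {
    x = +x;
    return +(x / 2.0);
  }

  return { addInt: addInt, half: half };
}

const asmSketch = AsmSketch(globalThis);
console.log(asmSketch.addInt(2, 3));          // 5
console.log(asmSketch.addInt(2147483647, 1)); // -2147483648 (32-bit wrap-around)
console.log(asmSketch.half(3));               // 1.5
```

Because the code is still plain JavaScript, it produces the same results whether or not the engine recognizes the "use asm" directive; only the speed differs.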

If you want a quick introduction to asm.js and related topics, see Ruan Yifeng’s introductory article: www.ruanyifeng.com/blog/2017/0…

3. About WASM

WASM is actually a bit like our old friend Flash.

Flash brought great vitality to the Internet: video, games, plug-ins, and so on. Over time, however, its many security vulnerabilities, poor compatibility, and other problems were steadily exposed. Not only have the four major browser vendors completely dropped support, but Adobe, which later became its owner, has also stopped maintaining it.

That being said, is pure JavaScript 100% capable of replacing ActionScript today?

Google’s Chrome V8 team began investigating how to bring complex cross-platform capabilities to the browser in a more secure and efficient way back in 2011, and that experience has made them a dependable leader in the WASM effort. In December 2019, the W3C officially announced WASM as the fourth language to run natively in browsers.

4. What scenarios apply to WASM?

Here is a summary borrowed directly from an expert in the community:

[Figure: summary of WASM application scenarios]

Most of these applications are covered below, but before describing them, let’s take a look at the fundamentals of WASM.

2. How it works

1. How does WASM work?

Generally speaking, a runnable program consists of two parts: data and instructions. WASM is based on a stack-machine model, in which instructions push operands onto a stack and pop them off again when an operation executes. Its instructions form a virtual instruction set architecture (V-ISA), which can be thought of as a platform-independent set of custom operations (much like JVM bytecode), as opposed to a physical instruction set architecture (ISA) that is tightly bound to the hardware (such as Intel’s x86-64).
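The push-and-pop behaviour can be sketched with a toy interpreter. The instruction names "const", "add", and "mul" below are invented for this example and only loosely mirror real WASM instructions such as i32.const, i32.add, and i32.mul; a real engine compiles the bytecode rather than interpreting it like this.

```javascript
// Toy stack machine with hypothetical opcodes, only meant to show how
// operands are pushed onto a stack and popped off when an operator runs.
function runStackProgram(program) {
  const stack = [];
  for (const [op, arg] of program) {
    switch (op) {
      case "const": // push an immediate value
        stack.push(arg);
        break;
      case "add": { // pop two operands, push their 32-bit sum
        const b = stack.pop();
        const a = stack.pop();
        stack.push((a + b) | 0);
        break;
      }
      case "mul": { // pop two operands, push their 32-bit product
        const b = stack.pop();
        const a = stack.pop();
        stack.push(Math.imul(a, b));
        break;
      }
      default:
        throw new Error(`unknown instruction: ${op}`);
    }
  }
  return stack.pop(); // the value left on the stack is the result
}

// (2 + 3) * 4 written in stack order
const stackResult = runStackProgram([
  ["const", 2],
  ["const", 3],
  ["add"],
  ["const", 4],
  ["mul"],
]);
console.log(stackResult); // 20
```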

If we compare the browser to the JVM and the WASM binary encoding to Java bytecode, it becomes clear why WASM can run in the browser.

2. Analyze the WASM file structure

Here’s a quick overview of what the WASM binary contains and what it does.

First, in terms of organization, WASM groups related functionality or code into Sections, and these Sections together make up the module.

  • TypeSection

    Stores type-related content, mainly function signatures, that is, the parameter types and return types.

  • StartSection

    Identifies the function that is called first after the module is initialized; it can be thought of as roughly equivalent to a main function.

  • GlobalSection

    As the name implies, it holds the module’s global variables, which may be application data or process-related state.

  • CustomSection

    Holds custom content that implementations can use for extension functionality.

  • ImportSection

    From the host environment, we can import various data into the WASM module. In other words, we can share data or code through this Section

  • ExportSection

    Similarly, if we can import, we can export data and methods

  • FunctionSection

    Stores, for each function, a reference to its signature in the TypeSection, one entry per function.

  • CodeSection

    Stores the actual code of each function, with entries corresponding one-to-one to the FunctionSection.

  • TableSection, ElementSection

    The Table section holds the meta information for tables that store function pointers, and the Element section holds the actual contents that go into those tables. For the concept of a Table, see this article: zhuanlan.zhihu.com/p/28222049

  • MemorySection, DataSection

    Similarly, the Memory section describes the basic usage of the module’s memory, and the Data section holds the actual contents.
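The section layout can be observed on a real binary. The following sketch walks the sections of a tiny hand-assembled module that exports an add function. The byte array, the SECTION_NAMES table, and the helper names readVarUint and listSections are assumptions made for this example; the only facts relied on are that a module starts with an 8-byte header and that each section begins with a one-byte id followed by a variable-length size.

```javascript
// Hand-assembled module, roughly equivalent to:
// (module (func (export "add") (param i32 i32) (result i32)
//   local.get 0 local.get 1 i32.add))
const sectionDemoBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,                   // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,             // Type section
  0x03, 0x02, 0x01, 0x00,                                           // Function section
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,             // Export section
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // Code section
]);

// Section ids 0..11 as defined by the binary format.
const SECTION_NAMES = [
  "Custom", "Type", "Import", "Function", "Table", "Memory",
  "Global", "Export", "Start", "Element", "Code", "Data",
];

// Read a variable-length unsigned integer (LEB128) starting at `offset`.
function readVarUint(bytes, offset) {
  let value = 0;
  let shift = 0;
  let byte;
  do {
    byte = bytes[offset++];
    value |= (byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  return { value: value >>> 0, next: offset };
}

// List the name and body size of every section in the module.
function listSections(bytes) {
  const sections = [];
  let offset = 8; // skip the 4-byte magic number and the 4-byte version
  while (offset < bytes.length) {
    const id = bytes[offset++];
    const { value: size, next } = readVarUint(bytes, offset);
    sections.push({ name: SECTION_NAMES[id] ?? `Unknown(${id})`, size });
    offset = next + size; // jump over the section body
  }
  return sections;
}

const sectionSummary = listSections(sectionDemoBytes)
  .map((s) => `${s.name}:${s.size}`)
  .join(" ");
console.log(sectionSummary); // Type:7 Function:2 Export:7 Code:9
```

This small module only needs four of the sections listed above; the others are simply absent from the binary.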

Finally, how do we determine if a binary file is a WASM module?

If you have ever Base64-encoded an image, you will have seen a prefix such as data:image/jpeg;base64, which marks the payload as a JPEG image in Base64 format. WASM works in a similar fashion: the file starts with the binary encoding of the string \0asm (the magic number), followed by the version number.
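A minimal sketch of that check, assuming the file contents are already available as a Uint8Array. The function names isWasmModule and wasmVersion are invented for this example.

```javascript
// The four magic bytes: 0x00 followed by the ASCII codes of "asm".
const WASM_MAGIC = [0x00, 0x61, 0x73, 0x6d];

// True if the buffer starts with the WASM magic number and has room for a version.
function isWasmModule(bytes) {
  if (bytes.length < 8) return false;
  return WASM_MAGIC.every((expected, i) => bytes[i] === expected);
}

// The version is a 4-byte little-endian integer right after the magic number.
function wasmVersion(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return view.getUint32(4, true);
}

const wasmHeader = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
const jpegHeader = new Uint8Array([0xff, 0xd8, 0xff, 0xe0, 0x00, 0x10, 0x4a, 0x46]);

console.log(isWasmModule(wasmHeader)); // true
console.log(wasmVersion(wasmHeader));  // 1
console.log(isWasmModule(jpegHeader)); // false
```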

3. Basic data types

  • Unsigned integers

    WASM supports three fixed-width non-negative integer types: uint8, uint16, and uint32.

  • Variable-length unsigned integers

    WASM supports three variable-length non-negative integer types: varuint1, varuint7, and varuint32. Variable length means that the number of bytes actually used depends on the size of the value.

  • Variable-length signed integers

    As above, but negative numbers are allowed. Three types are supported: varint7, varint32, and varint64.

  • Floating-point numbers

    As in JavaScript, these follow the IEEE 754 standard; single precision is 32 bits.
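The variable-length integers are encoded with LEB128: each byte carries 7 bits of the value, and the high bit says whether another byte follows. The sketch below shows one possible encoder for 32-bit values; the function names are made up for this example.

```javascript
// Encode a non-negative integer below 2^32 as unsigned LEB128 (varuint32).
function encodeVarUint(value) {
  const out = [];
  do {
    let byte = value & 0x7f; // lowest 7 bits
    value >>>= 7;
    if (value !== 0) byte |= 0x80; // high bit set: more bytes follow
    out.push(byte);
  } while (value !== 0);
  return out;
}

// Encode a 32-bit signed integer as signed LEB128 (varint32).
function encodeVarInt(value) {
  const out = [];
  while (true) {
    const byte = value & 0x7f;
    value >>= 7; // arithmetic shift keeps the sign
    const done =
      (value === 0 && (byte & 0x40) === 0) ||
      (value === -1 && (byte & 0x40) !== 0);
    out.push(done ? byte : byte | 0x80);
    if (done) return out;
  }
}

// Decode an unsigned LEB128 byte sequence back into a number.
function decodeVarUint(bytes) {
  let value = 0;
  let shift = 0;
  for (const byte of bytes) {
    value |= (byte & 0x7f) << shift;
    shift += 7;
    if ((byte & 0x80) === 0) break;
  }
  return value >>> 0;
}

const toHex = (bytes) => bytes.map((b) => b.toString(16).padStart(2, "0")).join(" ");

console.log(toHex(encodeVarUint(1)));       // 01
console.log(toHex(encodeVarUint(624485)));  // e5 8e 26
console.log(toHex(encodeVarInt(-123456)));  // c0 bb 78
console.log(decodeVarUint([0xe5, 0x8e, 0x26])); // 624485
```

Small values therefore cost a single byte, which is one reason the binary format stays compact.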

3. Surrounding ecosystem

WAT makes WASM more readable

In general, WASM binary files are not human-readable. WebAssembly Text Format (WAT) is an alternative output format that presents the same module as text. It can be thought of as the assembly-language counterpart of the binary, or as a kind of source map for WASM.

In addition, Flat-WAT is an optimized form of the WAT format. The three outputs can be compared side by side:

[Figure: the same module shown as WASM binary, WAT, and Flat-WAT]

The community already has mature conversion tools, such as wasm2wat and wat2wasm. They can be found in a tool set named WABT (WebAssembly Binary Toolkit).

WASI operating system interface

As Docker’s founder said, there are high hopes for WASI. WASM originally runs inside the browser’s sandbox environment; WASI was designed to take it out of the browser so that it can work directly with the operating system.

As mentioned earlier, WASM is built on a virtual instruction set (V-ISA), and what it can actually do depends entirely on the operations provided by the host environment. To run on a real operating system, a “virtual machine” must therefore translate these instructions and mask the differences between the system calls of specific platforms; in other words, a WASM VM similar to the JVM. WASI is the system-call abstraction layer that defines these interfaces, and it has two key properties:

1. Portability

Obviously, “compile once, run anywhere” is the minimum requirement.

2. Security

The virtual machine is fully responsible for permission control. All system calls are mediated from outside and isolated from the sandbox environment, which avoids security issues introduced by third-party libraries.

So, where can I get one?

Let’s take a look at the Bytecode Alliance, a semi-official organization founded in 2019 to implement the WASM and WASI standards and to explore the world of WASM outside the browser.

Consider a common problem in application development: third-party code shares all the permissions of the main program, such as access to sockets, files, and memory resources. As the dependency graph grows more complex, security vulnerabilities multiply. Anyone who has used FastJson in Java development or npm in Node.js development will have felt this keenly.

A large WASM application is usually composed of multiple WASM modules, each with its own independent data resources, so a sub-module cannot tamper with the data of other modules. In addition, the permissions each module may use are specified by the top-level caller, so a third-party sub-module cannot act without the upper-level module being aware of it. This kind of permission management is similar to Android development, where all required permissions must be declared in advance.

Here are some of the more popular WASM VMs on GitHub:

  • Wasmtime

    Wasmtime is a WASM virtual machine promoted by the Bytecode Alliance. It can be used as a CLI or embedded in other applications, such as IoT or cloud-native systems.

  • WAMR

    This virtual machine also belongs to the Bytecode Alliance and is geared more toward chips. As its name suggests, it is very small, with a startup time of about 100 microseconds and a minimum memory footprint of about 100 KB.

  • Wasmer

    This is the product of a community that is independent of the Bytecode Alliance and strives to build its own ecosystem. It features support for running WASM instances from more programming languages and has its own package management platform, WAPM.

  • SSVM

    This is a relatively niche runtime, optimized for the cloud, AI, and blockchain.

4. Application scenarios

1. How to use a WASM module in a browser?

A large binary WASM module cannot be loaded synchronously in the browser; the module must be fetched asynchronously. Fortunately, webpack’s import already supports importing WASM modules conveniently. We can publish the module to a CDN or put it in the Service Worker cache as needed.

While the WASM module is still being downloaded, the browser may already have started streaming compilation, turning the bytecode into platform-specific machine code. It then instantiates the module and imports the host data that the module needs. Once instantiated, the module can be called from JavaScript.

  • Generating binary data

After fetching the binary encoding of the WASM module, we need to convert it into binary data that JavaScript can handle, such as an ArrayBuffer or a typed array.
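As a rough sketch of this step, the standard Response.arrayBuffer() method yields the raw bytes, which can then be wrapped in a Uint8Array. The helper name responseToBytes and the URL in the comment are assumptions; a hand-built Response stands in for the network request so that the sketch runs without a server.

```javascript
// Turn a fetch() Response into a typed array holding the raw module bytes.
async function responseToBytes(response) {
  const buffer = await response.arrayBuffer();
  return new Uint8Array(buffer);
}

// In a browser this would typically look like (hypothetical URL):
//   const bytes = await fetch("/modules/add.wasm").then(responseToBytes);

// Stand-in for the network: a Response wrapping the 8-byte header of an empty module.
const demoResponse = new Response(
  new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00])
);
const demoBytesPromise = responseToBytes(demoResponse);

demoBytesPromise.then((bytes) => {
  console.log(bytes.length);                // 8
  console.log(WebAssembly.validate(bytes)); // true
});
```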

  • Compilation phase

We can compile as a separate step, using WebAssembly.compile.

Or compile and instantiate in one go, using WebAssembly.instantiate.
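Both routes can be sketched with the standard WebAssembly.compile and WebAssembly.instantiate functions. The byte array is a hand-assembled module exporting an add function, used here only so the example is self-contained; the wrapper function names are invented.

```javascript
// Hand-assembled module exporting add(a, b), for illustration only.
const addModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,
  0x03, 0x02, 0x01, 0x00,
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,
]);

// Route 1: compile first, instantiate later. The compiled WebAssembly.Module
// can be kept around and instantiated more than once.
async function compileThenInstantiate(bytes) {
  const compiled = await WebAssembly.compile(bytes);
  return WebAssembly.instantiate(compiled); // resolves to an Instance
}

// Route 2: compile and instantiate in one call.
// When given raw bytes, instantiate() resolves to { module, instance }.
async function instantiateAtOnce(bytes) {
  const { instance } = await WebAssembly.instantiate(bytes);
  return instance;
}

const stepwiseInstance = compileThenInstantiate(addModuleBytes);
const oneShotInstance = instantiateAtOnce(addModuleBytes);

stepwiseInstance.then((instance) => console.log(instance.exports.add(2, 3)));  // 5
oneShotInstance.then((instance) => console.log(instance.exports.add(40, 2))); // 42
```

Compiling separately is mainly useful when the same module is instantiated several times or cached; otherwise the one-call form is shorter.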

  • Instantiation

As you may have noticed in the previous step, instantiation may require passing in an importObject. What is it?

It corresponds to the ImportSection: through it, data and functions from the outside can be imported into the module.

If we need to instantiate as a separate step, we can pass an already compiled module together with the importObject.
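A minimal sketch of separate instantiation with an importObject follows. The hand-assembled module imports a function log from a namespace env and exports a function run that calls log(42); those names, like the bytes themselves, are assumptions chosen for this example.

```javascript
// Hand-assembled module, roughly equivalent to:
// (module
//   (import "env" "log" (func (param i32)))
//   (func (export "run") i32.const 42 call 0))
const importDemoBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x00, // Type section
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76,                   // Import section: "env"
  0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00,                         //   "log", function, type 0
  0x03, 0x02, 0x01, 0x01,                                     // Function section
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01,       // Export section: "run"
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x41, 0x2a, 0x10, 0x00, 0x0b, // Code section
]);

// Values handed to the module; keys must match the names in its ImportSection.
const receivedValues = [];
const demoImportObject = {
  env: {
    log: (value) => {
      receivedValues.push(value);
    },
  },
};

// Compile once, then instantiate separately with the imports.
const compiledImportDemo = new WebAssembly.Module(importDemoBytes);
const importDemoInstance = new WebAssembly.Instance(compiledImportDemo, demoImportObject);

importDemoInstance.exports.run();
console.log(receivedValues); // [ 42 ]
```

The synchronous constructors are used here only because the module is tiny; for anything larger the asynchronous WebAssembly.instantiate(module, importObject) form is the safer choice in a browser.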

  • Streaming compilation

In fact, WASM provides streaming methods for both compilation and instantiation: WebAssembly.compileStreaming and WebAssembly.instantiateStreaming.

Note that for streaming compilation to work, the source must be a Response or a Promise that resolves to one; typically we pass the still-pending Promise returned by the fetch request itself.
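A hedged sketch of the streaming variant, using the standard WebAssembly.instantiateStreaming. The wrapper name loadStreaming and the URL in the comment are assumptions, and a hand-built Response replaces the real fetch so that the example runs without a server. Note that the response needs the application/wasm content type.

```javascript
// Compile and instantiate while the bytes are still arriving.
// `source` is a Response or a Promise for one, e.g. the Promise returned by fetch().
function loadStreaming(source, importObject = {}) {
  return WebAssembly.instantiateStreaming(source, importObject);
}

// Browser usage (hypothetical URL; the server must send Content-Type: application/wasm):
//   const { instance } = await loadStreaming(fetch("/modules/add.wasm"));

// Hand-assembled module exporting add(a, b), standing in for a downloaded file.
const streamedBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,
  0x03, 0x02, 0x01, 0x00,
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,
]);

// Stand-in for fetch(): a Promise that resolves to a Response.
const fakeFetch = Promise.resolve(
  new Response(streamedBytes, { headers: { "Content-Type": "application/wasm" } })
);

const streamedResult = loadStreaming(fakeFetch);
streamedResult.then(({ instance }) => console.log(instance.exports.add(20, 22))); // 42
```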

  • Error object

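Throughout the WASM loading process, different error objects carry different meanings: WebAssembly.CompileError indicates a failure while decoding or validating the module, WebAssembly.LinkError indicates a failure while instantiating it (for example, a missing import), and WebAssembly.RuntimeError indicates a trap while the code is running.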
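The three standard error classes can be told apart with instanceof. The sketch below triggers each of them on purpose; the helper names and the hand-assembled byte arrays are assumptions made for this example.

```javascript
// Map an error thrown by the WebAssembly API to the stage it belongs to.
function classifyWasmError(error) {
  if (error instanceof WebAssembly.CompileError) return "CompileError";
  if (error instanceof WebAssembly.LinkError) return "LinkError";
  if (error instanceof WebAssembly.RuntimeError) return "RuntimeError";
  return "Other";
}

function tryStep(step) {
  try {
    step();
    return "ok";
  } catch (error) {
    return classifyWasmError(error);
  }
}

// Module that imports env.log and exports run (hand-assembled).
const needsImportBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  0x01, 0x08, 0x02, 0x60, 0x01, 0x7f, 0x00, 0x60, 0x00, 0x00,
  0x02, 0x0b, 0x01, 0x03, 0x65, 0x6e, 0x76, 0x03, 0x6c, 0x6f, 0x67, 0x00, 0x00,
  0x03, 0x02, 0x01, 0x01,
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01,
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x41, 0x2a, 0x10, 0x00, 0x0b,
]);

// Module exporting trap(), whose body is a single `unreachable` instruction.
const trapBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  0x01, 0x04, 0x01, 0x60, 0x00, 0x00,
  0x03, 0x02, 0x01, 0x00,
  0x07, 0x08, 0x01, 0x04, 0x74, 0x72, 0x61, 0x70, 0x00, 0x00,
  0x0a, 0x05, 0x01, 0x03, 0x00, 0x00, 0x0b,
]);

// 1. Bytes that are not a WASM module fail at compile time.
const compileOutcome = tryStep(() => new WebAssembly.Module(new Uint8Array([1, 2, 3, 4])));

// 2. A missing function import fails at link (instantiation) time.
const linkOutcome = tryStep(
  () => new WebAssembly.Instance(new WebAssembly.Module(needsImportBytes), { env: {} })
);

// 3. Executing `unreachable` traps at run time.
const trapInstance = new WebAssembly.Instance(new WebAssembly.Module(trapBytes));
const runtimeOutcome = tryStep(() => trapInstance.exports.trap());

console.log(compileOutcome, linkOutcome, runtimeOutcome);
// CompileError LinkError RuntimeError
```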

  • Memory management

One of the biggest features of WASM is security. Within the browser sandbox, the memory of a WASM module is completely isolated from the rest of the browser’s memory and managed separately. If the two sides need to exchange data (that is, through the importObject), there are two cases:

Simple data types are imported directly into the module via WebAssembly.Global.

Complex data types need to be written into the module’s linear memory via WebAssembly.Memory.
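Both cases can be sketched together. The hand-assembled module below imports a memory env.mem and an immutable global env.offset, and exports read(), which returns the byte stored at that offset; these names and bytes are assumptions for illustration, while WebAssembly.Memory and WebAssembly.Global are the standard constructors.

```javascript
// Hand-assembled module, roughly equivalent to:
// (module
//   (import "env" "mem" (memory 1))
//   (import "env" "offset" (global i32))
//   (func (export "read") (result i32) global.get 0 i32.load8_u))
const memoryDemoBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f,                          // Type section
  0x02, 0x1a, 0x02,                                                  // Import section, 2 imports
  0x03, 0x65, 0x6e, 0x76, 0x03, 0x6d, 0x65, 0x6d, 0x02, 0x00, 0x01,  //   env.mem: memory, min 1 page
  0x03, 0x65, 0x6e, 0x76, 0x06, 0x6f, 0x66, 0x66, 0x73, 0x65, 0x74,  //   env.offset:
  0x03, 0x7f, 0x00,                                                  //   global, i32, immutable
  0x03, 0x02, 0x01, 0x00,                                            // Function section
  0x07, 0x08, 0x01, 0x04, 0x72, 0x65, 0x61, 0x64, 0x00, 0x00,        // Export section: "read"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x23, 0x00, 0x2d, 0x00, 0x00, 0x0b,  // Code section
]);

// Simple value: passed as a WebAssembly.Global.
const sharedOffset = new WebAssembly.Global({ value: "i32", mutable: false }, 16);

// Complex data: written into linear memory that both sides can see.
const sharedMemory = new WebAssembly.Memory({ initial: 1 }); // 1 page = 64 KiB
const memoryView = new Uint8Array(sharedMemory.buffer);
memoryView.set([0x57, 0x41, 0x53, 0x4d], 16); // ASCII codes of "WASM" at offset 16

const memoryDemoInstance = new WebAssembly.Instance(
  new WebAssembly.Module(memoryDemoBytes),
  { env: { mem: sharedMemory, offset: sharedOffset } }
);

const firstByte = memoryDemoInstance.exports.read();
console.log(String.fromCharCode(firstByte)); // W
```

Because JavaScript and the module share the same buffer, a later write through memoryView is visible to the next call of read() without any copying.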

  • Cons: No direct access to the DOM

If WASM wants to operate on the DOM, it cannot do so directly from inside a module; it must rely on JavaScript functions that are passed in. This is expensive at run time and complex to develop, so in this respect WASM is not DOM-friendly.

2. Applications on the front end

When it comes to front-end use, the questions are: how much faster is WASM than pure JavaScript, and how should we choose between them?

As mentioned above, under the current MVP standard WASM cannot manipulate the DOM directly. In fact, most APIs have to be bridged through JavaScript (the so-called glue code). For example, suppose we call createDiv in C++ to create a div. The glue code will then contain a mapping such as createDiv: document.createElement('div'). An API that goes through glue code like this brings no obvious performance gain; on the contrary, it increases the bundle size. In addition, frequent context switching between WASM and the JavaScript APIs can be costly, depending on how well the browser vendor optimizes it.

Given these drawbacks, where can WASM be explored on the front end?

  • Rewrite the core logic of the framework

In today’s popular front-end frameworks, such as React and Vue, performance bottlenecks often arise from frequent virtual DOM computations. Since virtual DOM diffing is computation-intensive, is it feasible to implement that logic in WASM? The React team did in fact consider it, but Fiber alone took years to complete, let alone a WASM rewrite.

  • Create a new framework

With Yew, we can use Rust to do the same things we would otherwise do in JavaScript. Yew’s design is very similar to React’s, and it is a radical example of computing the virtual DOM in WASM. Unfortunately, because glue code is still required, it does not significantly reduce the performance cost.

  • Multimedia

The role of WASM in multimedia is very intuitive, since audio and video encoding and decoding are computation-intensive. ogv.js is a player developed by the Wikipedia team for playing Ogg, WebM, AV1, and other formats online. WXInlinePlayer is more lightweight and consumes less CPU and memory.

  • Migrate mature tools

We have seen many examples of this, such as Google Earth, Unity3D, and CAD, as well as lightweight games. Such migrations are generally carried out by professional teams, and they greatly expand the application prospects of WASM. Space does not permit a detailed discussion here.

3. Applications at the system level

Because the WASM runtime is itself lean, it also enables a lot of interesting exploration in new areas.

  • Embedded devices (IoT)

If a device needs to run a WASM application, all it needs is a virtual machine that can parse and run the bytecode, regardless of the source language. Such a virtual machine is usually very lightweight, which opens up great possibilities for extending embedded devices, and it is friendly to both developers and operators.

  • Microkernel

Suppose we run Linux but really use only 50 percent of its modules; the other half may never be used for the entire life of the machine. Similarly, a system focused on one specific application does not need every function. A microkernel that serves as the execution layer of a WASM virtual machine, with a complete graphical interface, multithreading, networking, and the C standard library integrated, is only 468 KB, and its cold-start time and resource utilization figures are very impressive.

  • Cloud

Krustlet positions itself as a Kubernetes Kubelet. Based on the specified Kubernetes tolerations, the Kubernetes API can schedule specific pods onto Krustlet, which then runs them on a WASI-based WASM runtime.

Embly is a WASM-based serverless framework. With a single configuration file, it lets you execute WASM bytecode (functions) generated from Rust on your server and access the network and system resources needed to complete your tasks.

5. Looking to the future

As mentioned above, current development is based on the WASM MVP standard. So what post-MVP features are in progress?

  • Multithreading and atomic operations

This proposal introduces a new shared-memory model that allows linear memory to be used by multiple threads at the same time, each with its own WASM instance and stack. Atomic operations ensure that shared memory can be accessed without data races.

  • Single instruction, multiple data (SIMD)

A “single instruction” can be thought of as, say, one multiplication, and “multiple data” as applying that multiplication to several numbers at the same time. This is an excellent feature for complex matrix calculations, but at the time of writing no browser had implemented it.

  • 64-bit WASM

Currently, all of WASM’s memory operations use 32-bit offset addresses, so a module can access at most 4 GB of memory. Extending to 64-bit addressing is therefore an important requirement.

  • WASM modules

Ideally, we could import and export WASM modules in our code in the same way as ES6 modules.

Or use a script tag to tell the browser declaratively to load and run one.

Additional proposals can be seen here.

That concludes this introduction to WebAssembly. Thanks for reading.

References

Systematic Learning of WebAssembly — Theory

[1] WABT (WebAssembly Binary Toolkit): github.com/WebAssembly…
[2] Wasmtime: github.com/bytecodeall…
[3] WAMR: github.com/bytecodeall…
[4] Wasmer: github.com/wasmerio/wa…
[5] SSVM: github.com/second-stat…
[6] Yew: github.com/yewstack/ye…
[7] ogv.js: github.com/brion/ogv.j…
[8] WXInlinePlayer: github.com/ErosZy/WXIn…
[9] Google Earth:
[10] Unity3D: blogs.unity3d.com/2018/08/15/…
[11] CAD: web.autocad.com/
[12] Krustlet: github.com/deislabs/kr…
[13] Embly: www.embly.run/
[14] Here: github.com/WebAssembly

The End