WebAssembly status and practice
Why is WebAssembly needed
Since its inception, JavaScript has become one of the most popular programming languages, with the Web as the driving force behind it. As Web applications grow more and more complex, however, JavaScript's problems are gradually being exposed:
- The syntax is so flexible that large Web projects become hard to develop and maintain;
- The performance cannot meet the requirements of some scenarios.
To address these two shortcomings, several alternatives to JS have emerged in recent years, for example:
- Microsoft's TypeScript tightens JS's loose syntax by adding static type checking, which makes code more robust;
- Google's Dart introduces a new virtual machine into the browser that runs Dart programs directly to improve performance;
- Mozilla's asm.js is a subset of JS for which JS engines can apply dedicated optimizations.
Each of these attempts has its own strengths and weaknesses:
- TypeScript only addresses JS's loose syntax; it still has to be compiled to JS to run, so it brings no performance gain;
- Dart only ran in a preview build of Chrome and is not supported by the major browsers;
- asm.js syntax is very minimal and restrictive, which makes development inefficient.
The three browser giants each came up with their own, mutually incompatible solution, which runs against the spirit of the Web. The Web got where it is today precisely because its technologies were standardized, so a new specification was urgently needed to solve the problems facing JS.
This is how WebAssembly was born: a new bytecode format that is already supported by the major browsers. Unlike JS, which is shipped as source text and must be parsed before it can be interpreted, WebAssembly bytecode is close to machine code and can be loaded and run quickly, so it can perform considerably better than interpreted JS. In other words, WebAssembly is not a programming language but a bytecode standard: it is produced by compiling a high-level language and is executed by a WebAssembly virtual machine, which browser vendors implement according to the WebAssembly specification.
WebAssembly principle
To understand how WebAssembly works, you first need to understand how a computer works. An electronic computer is built from electronic components that, for ease of processing, have only two states, corresponding to 0 and 1. In other words, a computer only understands 0 and 1, so both data and logic must be represented in 0s and 1s: this is machine code, which the computer can load and run directly. Machine code is very hard to read, so programs are written in high-level languages such as C, C++, Rust, and Go and then compiled into machine code.
Because CPU architectures differ, so do their machine code formats. Common CPU architectures include x86, AMD64, and ARM, so the target architecture must be specified when a high-level language is compiled into machine code.
WebAssembly bytecode is a kind of machine code that smooths over the differences between CPU architectures. It cannot run directly on any CPU architecture, but because it is so close to machine code it can be translated into the machine code of the target architecture very quickly. As a result WebAssembly runs at close to native speed, which makes it sound a lot like Java bytecode.
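To make "bytecode" concrete, here is a minimal sketch of a complete WebAssembly module written out byte by byte and run from JS. The module is hand-assembled for illustration only (it is not the output of any compiler discussed in this article), and the export name add is an assumption chosen for the example.

```javascript
// Hand-assembled WebAssembly module (illustrative; not produced by any compiler).
// It exports one function, add(a: i32, b: i32): i32.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic number: "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  // Type section: one signature, (i32, i32) -> i32
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,
  // Function section: one function using signature 0
  0x03, 0x02, 0x01, 0x00,
  // Export section: export function 0 under the name "add"
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,
  // Code section: local.get 0, local.get 1, i32.add, end
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,
]);

// Synchronous compile + instantiate is acceptable for a module this small.
const wasmModule = new WebAssembly.Module(bytes);
const instance = new WebAssembly.Instance(wasmModule);
const { add } = instance.exports;

console.log(add(2, 3)); // 5
```

The whole module is 41 bytes, and the function body is just four stack-machine instructions, which gives a feel for why the engine can translate it to native code quickly.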
WebAssembly has the following advantages over JS:
- Small size: the browser loads only compiled bytecode at runtime, which is much smaller than a JS file that expresses the same logic as text;
- Fast loading: because the file is small and needs no source-level parsing or interpretation, WebAssembly loads and instantiates faster, reducing the wait before code starts running;
- Fewer compatibility issues: WebAssembly is a very low-level bytecode specification that rarely changes once settled; if it does change, only the compilation from the high-level language to bytecode needs to adapt. The place where compatibility problems can arise is the JS interface that bridges JS and WebAssembly.
If every high-level language had to implement its own translation from source code to machine code for each platform, the effort would be enormous. Instead, a high-level language only needs to generate the intermediate representation (LLVM IR) understood by the Low Level Virtual Machine (LLVM). LLVM then handles:
- Generating machine code for different CPU architectures from LLVM IR;
- Compile-time optimization of the machine code for performance and size.
In addition, LLVM can compile LLVM IR to WebAssembly bytecode, so any high-level language that can be lowered to LLVM IR can be compiled to WebAssembly. High-level languages that can currently be compiled to WebAssembly bytecode include:
- AssemblyScript: its syntax matches TypeScript, so the learning cost for front-end developers is low; the best choice for front-end developers writing WebAssembly;
- C++: the officially recommended route; see the documentation for details;
- Rust: the syntax is complex and the learning cost is high, which can be hard going for front-end developers; see the documentation for details;
- Kotlin: the syntax resembles Java and JS, so the learning cost is low; see the documentation for details;
- Golang: the syntax is simple and the learning cost is low, but WebAssembly support has not been officially released yet; see the documentation for details.
The part of a compiler that translates a high-level language into LLVM IR is usually called the compiler front end, and the part that compiles LLVM IR into machine code for each CPU architecture is called the compiler back end. LLVM is now the back end of choice for more and more high-level languages, which then only need to focus on offering developers a more productive syntax while preserving execution performance when translating to LLVM IR.
Writing WebAssembly
A first look at AssemblyScript
Next, let's use AssemblyScript to write a WebAssembly module that computes Fibonacci numbers. Written in TypeScript syntax, the Fibonacci module f.ts looks like this:
export function f(x: i32): i32 {
  if (x === 1 || x === 2) {
    return 1;
  }
  return f(x - 1) + f(x - 2);
}
After installing AssemblyScript by following its installation tutorial, run
asc f.ts -o f.wasm
to compile the code above into a runnable WebAssembly module.
To load and execute the compiled f.wasm module, you load it from JS and call the f function on the module. This takes the following JS code:
fetch('f.wasm') // Load the f.wasm file over the network
  .then(res => res.arrayBuffer()) // Convert the response to an ArrayBuffer
  .then(bytes => WebAssembly.instantiate(bytes)) // Compile and instantiate
  .then(mod => {
    // Call the f function exported by the module instance
    console.log(mod.instance.exports.f(50));
  });
Note the built-in type i32 in the AssemblyScript code above: it is one of the types that AssemblyScript adds on top of TypeScript. AssemblyScript is a subset of TypeScript that imposes stricter type restrictions so that the code can be compiled to WebAssembly. Compared with TypeScript:
- It has finer-grained built-in types, which helps optimize performance and memory footprint; see the documentation for details;
- You cannot use any or undefined, nor enumerated types;
- Nullable variables must be of a reference type, not a primitive type such as string, number, or boolean;
- Optional function parameters must have default values, every function must declare its return type, and a function that returns nothing must declare void;
- You cannot use the built-in functions of the JS environment, only those provided by AssemblyScript.
Overall, AssemblyScript has many more restrictions than TypeScript, so writing it feels constrained. When writing WebAssembly in AssemblyScript, you will often find that code passes tsc compilation but fails when the WebAssembly runs, most likely because one of these restrictions was violated. That said, by adjusting the TypeScript compiler's default configuration, AssemblyScript can catch most such errors at compile time.
Under the hood, AssemblyScript parses the TS source into an AST with the TypeScript compiler, translates the AST into IR, and compiles the IR into WebAssembly bytecode. Its back end is Binaryen (described in the tools section below) rather than LLVM itself. The restrictions listed above exist to make the translation from AST to IR easier.
Why AssemblyScript as the WebAssembly development language
AssemblyScript has an advantage over C, Rust, and other languages for writing WebAssembly: besides sparing front-end developers the cost of learning a new language, it also helps with browsers that do not support WebAssembly. For those browsers, the same AssemblyScript source can be compiled into ordinary, working JS with the TypeScript compiler, which allows a smooth migration from JS to WebAssembly.
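The migration path described above could be wired up roughly as follows. This is only a sketch under stated assumptions: fJs stands in for the JS that tsc would emit from f.ts, and resolveF and loadWasmExports are hypothetical names introduced for this example, not part of AssemblyScript or any loader.

```javascript
// Plain-JS build of the Fibonacci function, i.e. roughly what the TypeScript
// compiler would emit for f.ts. Used when WebAssembly is unavailable.
function fJs(x) {
  if (x === 1 || x === 2) {
    return 1;
  }
  return fJs(x - 1) + fJs(x - 2);
}

// Feature detection: the WebAssembly namespace object exists in supporting engines.
function supportsWebAssembly() {
  return typeof WebAssembly === 'object' &&
    typeof WebAssembly.instantiate === 'function';
}

// loadWasmExports is a hypothetical caller-supplied async function that
// resolves to the exports of the compiled module (for example via fetch).
async function resolveF(loadWasmExports) {
  if (supportsWebAssembly()) {
    try {
      const wasmExports = await loadWasmExports();
      return wasmExports.f;
    } catch (err) {
      // Loading or compiling failed: fall back to the JS build below.
    }
  }
  return fJs;
}

// Usage: here the loader always fails, so the JS fallback is chosen.
resolveF(() => Promise.reject(new Error('no wasm build available')))
  .then(f => console.log(f(10))); // 55
```

Because both builds come from the same source, callers only ever see a function f and do not need to know which implementation was selected.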
Integrating with the Webpack build
Every new Web development technology needs a build pipeline. To make WebAssembly development go smoothly, the following steps integrate it with Webpack.
1. Install the following dependencies so that TS source code can be compiled to WebAssembly by AssemblyScript:
{
  "devDependencies": {
    "assemblyscript": "github:AssemblyScript/assemblyscript",
    "assemblyscript-typescript-loader": "^1.3.2",
    "typescript": "^2.8.1",
    "webpack": "^3.10.0",
    "webpack-dev-server": "^2.10.1"
  }
}
2. Add the loader to webpack.config.js:
module.exports = {
  module: {
    rules: [
      {
        test: /\.ts$/,
        loader: 'assemblyscript-typescript-loader',
        options: {
          sourceMap: true,
        }
      }
    ]
  },
};
3. Modify the TypeScript compiler configuration, tsconfig.json, so that the TypeScript compiler recognizes the built-in types and functions that AssemblyScript introduces:
{
  "extends": "../../node_modules/assemblyscript/std/portable.json",
  "include": [
    "./**/*.ts"
  ]
}
This configuration simply extends the configuration file that ships with AssemblyScript.
WebAssembly-related file formats
wasm is WebAssembly's binary file format, which is not human-readable. To make the logic of a WebAssembly file readable, there is also a text format called wast. For the Fibonacci module described earlier, the corresponding wast file is as follows:
func $src/asm/module/f (param f64) (result f64)
  (local i32)
  get_local 0
  f64.const 1
  f64.eq
  tee_local 1
  if i32
    get_local 1
  else
    get_local 0
    f64.const 2
    f64.eq
  end
  i32.const 1
  i32.and
  if
    f64.const 1
    return
  end
  get_local 0
  f64.const 1
  f64.sub
  call 0
  get_local 0
  f64.const 2
  f64.sub
  call 0
  f64.add
end
This looks a lot like assembly language: f64 is a data type, and f64.eq, f64.sub, and f64.add are instructions, much like CPU instructions.
To convert a binary wasm file into human-readable wast text, install the WebAssembly Binary Toolkit (WABT); on a Mac it can be installed with brew install wabt. After installation, run wasm2wast f.wasm to obtain the wast. Conversely, wast2wasm f.wast -o f.wasm converts the wast back into wasm.
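On the binary side, every wasm file begins with a fixed 8-byte header: the magic bytes "\0asm" followed by the format version. The following sketch checks for that header; the helper name looksLikeWasm is hypothetical and only illustrates the layout, it is not a substitute for real validation.

```javascript
// Returns true if the buffer starts with the WebAssembly magic number
// ("\0asm") followed by binary format version 1 (little-endian u32).
function looksLikeWasm(bytes) {
  const header = [0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00];
  if (bytes.length < header.length) {
    return false;
  }
  return header.every((value, i) => bytes[i] === value);
}

// The smallest valid module is just the header with no sections.
const emptyModule = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
// Text-format source is not a wasm binary.
const plainText = new TextEncoder().encode('(module)');

console.log(looksLikeWasm(emptyModule)); // true
console.log(looksLikeWasm(plainText));   // false

// The engine's own check goes further and validates the whole module.
console.log(WebAssembly.validate(emptyModule)); // true
```

A header check like this is a cheap way to reject a file that is obviously not wasm (for example a wast file served by mistake) before handing it to the engine.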
WebAssembly-related tools
Besides the WebAssembly Binary Toolkit mentioned above, the WebAssembly community offers the following commonly used tools:
- Emscripten: compiles C and C++ code to wasm or asm.js;
- Binaryen: provides a more concise IR, compiles that IR to wasm, and offers compile-time optimization for wasm, a wasm virtual machine, wasm compression, and more. The AssemblyScript compiler mentioned earlier is built on it.
WebAssembly JS API
At present WebAssembly can only be loaded and executed through JS, though in the future it should be possible to load and execute WebAssembly in the browser the same way JS is loaded. The rest of this section looks at calling WebAssembly from JS in more detail.
Calling WebAssembly from JS takes three steps: load the bytecode > compile the bytecode > instantiate the module. Once you have a WebAssembly instance, it can be called from JS. In detail, the three steps are:
- Load: in a browser the bytecode file can be fetched with a network request; in Node.js it can be read with the fs module;
- Compile: the bytecode must first be converted into an ArrayBuffer before it can be compiled. The JS API WebAssembly.compile returns a Promise that resolves to a WebAssembly.Module; this Module cannot be called directly and must be instantiated first;
- Instantiate: the WebAssembly.Instance API instantiates the Module obtained in the previous step. Once you have the Instance, you can call it much like a JS module.
Steps 2 and 3 can be merged into one; the WebAssembly.instantiate function used earlier does both.
WebAssembly.instantiate(bytes).then(mod => {
  mod.instance.exports.f(50);
});
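For comparison, the three steps can also be written out separately. This is a minimal sketch: the byte array is a hand-assembled stand-in for a file loaded over the network or from disk (it exports a function f that always returns 42), and loadInThreeSteps is a hypothetical helper name.

```javascript
// Stand-in for bytes loaded over the network or from disk: a hand-assembled
// module (illustrative only) exporting f(): i32, which always returns 42.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // header
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f,       // type: () -> i32
  0x03, 0x02, 0x01, 0x00,                         // one function of type 0
  0x07, 0x05, 0x01, 0x01, 0x66, 0x00, 0x00,       // export it as "f"
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x41, 0x2a, 0x0b, // body: i32.const 42, end
]);

async function loadInThreeSteps(source, imports = {}) {
  // Step 1: obtain the bytecode (already in memory here).
  const buffer = source;
  // Step 2: compile the bytecode into a WebAssembly.Module.
  const compiled = await WebAssembly.compile(buffer);
  // Step 3: instantiate. Given a Module, instantiate resolves to an Instance.
  const instance = await WebAssembly.instantiate(compiled, imports);
  return instance.exports;
}

loadInThreeSteps(bytes).then(wasmExports => {
  console.log(wasmExports.f()); // 42
});
```

Keeping the compiled Module around is useful when the same code has to be instantiated more than once, since compilation is the expensive part. Note that WebAssembly.instantiate resolves to an object with module and instance properties when given bytes, but to the Instance itself when given a Module.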
Calling JS from WebAssembly
The previous examples all called a WebAssembly module from JS, but in some scenarios you may need to call browser APIs from within the WebAssembly module. This section shows how to call JS from a WebAssembly module.
WebAssembly.instantiate accepts a second parameter: WebAssembly.instantiate(bytes, importObject). The importObject argument is how JS passes in the functions that the WebAssembly module needs to call. As an example, let's call window.alert from WebAssembly to display the result of the calculation. To do so, change the JS code that loads the WebAssembly module to:
WebAssembly.instantiate(bytes, {
  window: {
    alert: window.alert
  }
}).then(mod => {
  mod.instance.exports.f(50);
});
You also need to modify the AssemblyScript source:
// Declare the window.alert function that is imported from JS
declare namespace window {
  export function alert(v: number): void;
}

function _f(x: number): number {
  if (x == 1 || x == 2) {
    return 1;
  }
  return _f(x - 1) + _f(x - 2);
}

export function f(x: number): void {
  window.alert(_f(x));
}
After recompiling the modified AssemblyScript, the resulting wast file has a few more lines than before:
(import "window" "alert" (func $src/asm/module/window.alert (type 0)))
(func $src/asm/module/f (type 0) (param f64)
  get_local 0
  call $src/asm/module/_f
  call $src/asm/module/window.alert)
The extra wast code is the logic by which the AssemblyScript code calls the function passed in from JS.
Besides the commonly used APIs described above, WebAssembly provides several others; the full WebAssembly JS API is described in detail in this d.ts file.
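The import mechanism can be tried without a compiler. The sketch below uses a hand-assembled module (illustrative only, not the output of AssemblyScript) that imports window.alert and exports a function f which simply forwards its argument to it. A stub replaces the real window.alert so that the example also runs outside a browser.

```javascript
// Hand-assembled module (illustrative only). It imports window.alert as a
// function taking one f64, and exports f(x: f64), which forwards x to it.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // header
  0x01, 0x05, 0x01, 0x60, 0x01, 0x7c, 0x00,       // type 0: (f64) -> ()
  0x02, 0x10, 0x01,                               // import section, 1 entry
  0x06, 0x77, 0x69, 0x6e, 0x64, 0x6f, 0x77,       //   module name "window"
  0x05, 0x61, 0x6c, 0x65, 0x72, 0x74,             //   field name "alert"
  0x00, 0x00,                                     //   kind: function, type 0
  0x03, 0x02, 0x01, 0x00,                         // one local function, type 0
  0x07, 0x05, 0x01, 0x01, 0x66, 0x00, 0x01,       // export function 1 as "f"
  0x0a, 0x08, 0x01, 0x06, 0x00,                   // code section, one body
  0x20, 0x00, 0x10, 0x00, 0x0b,                   //   local.get 0, call 0, end
]);

// Stub standing in for window.alert so the sketch also runs outside a browser.
const alerts = [];
const importObject = {
  window: {
    alert: value => alerts.push(value),
  },
};

const instance = new WebAssembly.Instance(new WebAssembly.Module(bytes), importObject);
instance.exports.f(7);

console.log(alerts); // [ 7 ]
```

The two levels of importObject mirror the two strings of the wast import declaration: the outer key is the module name and the inner key is the field name. Imported functions occupy the lowest function indices, which is why the exported function here is function 1 and the call targets function 0.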
More than just browsers
As a kind of low-level bytecode, WebAssembly can run in other environments besides browsers.
Running a wasm binary directly
On macOS, install Binaryen with brew install binaryen. Once it is installed, you can run a wasm file directly with wasm-shell f.wasm.
Run it in Node.js
WebAssembly support has been added to the V8 JS engine. Since both Chrome and Node.js use V8, WebAssembly can also run in Node.js.
When V8 runs WebAssembly, the WebAssembly and the JS execute in the same virtual machine rather than the WebAssembly running in a separate one, which makes calls between JS and WebAssembly convenient.
To get the above example to run in Node.js, use the following code:
const fs = require('fs');

function toUint8Array(buf) {
  var u = new Uint8Array(buf.length);
  for (var i = 0; i < buf.length; ++i) {
    u[i] = buf[i];
  }
  return u;
}

function loadWebAssembly(filename, imports) {
  // Read the wasm file and convert it to a Uint8Array
  const buffer = toUint8Array(fs.readFileSync(filename));
  // Compile the wasm bytecode
  return WebAssembly.compile(buffer).then(module => {
    // Instantiate the compiled module
    return new WebAssembly.Instance(module, imports);
  });
}

loadWebAssembly('../temp/assembly/module.wasm').then(instance => {
  // Call the f function to compute the result
  console.log(instance.exports.f(10));
});
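In Node.js, the Buffer returned by fs.readFileSync is itself a Uint8Array, so the manual copy is probably unnecessary on reasonably recent versions. The following sketch assumes that, and makes itself self-contained by first writing a small hand-assembled module to a temporary file. The file name double.wasm, the helper loadWebAssemblySync, and the module itself (f(x) returns x + x) are illustrative assumptions, not part of the original project.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hand-assembled module (illustrative only) exporting f(x: i32): i32 = x + x.
// It is written to a temporary file to stand in for a compiled module.wasm.
const bytes = Uint8Array.from([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // header
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7f, 0x01, 0x7f, // type: (i32) -> i32
  0x03, 0x02, 0x01, 0x00,                         // one function of type 0
  0x07, 0x05, 0x01, 0x01, 0x66, 0x00, 0x00,       // export it as "f"
  0x0a, 0x09, 0x01, 0x07, 0x00,                   // code section, one body
  0x20, 0x00, 0x20, 0x00, 0x6a, 0x0b,             //   local.get 0 twice, i32.add
]);
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'wasm-demo-'));
const wasmPath = path.join(dir, 'double.wasm');
fs.writeFileSync(wasmPath, bytes);

function loadWebAssemblySync(filename, imports = {}) {
  // fs.readFileSync returns a Buffer, which is a Uint8Array subclass and can
  // be handed to the WebAssembly API without copying.
  const buffer = fs.readFileSync(filename);
  return new WebAssembly.Instance(new WebAssembly.Module(buffer), imports);
}

const instance = loadWebAssemblySync(wasmPath);
console.log(instance.exports.f(10)); // 20

// Clean up the temporary file and directory.
fs.unlinkSync(wasmPath);
fs.rmdirSync(dir);
```

The synchronous constructors keep the sketch short; for large modules the Promise-based WebAssembly.compile and WebAssembly.instantiate are the better fit because they do not block the event loop.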
That said, running WebAssembly in Node.js does not make much sense, because Node.js can run native modules, which perform better than WebAssembly. If you are writing WebAssembly in C or Rust, you can compile directly to a native module that Node.js can call.
WebAssembly outlook
As the above shows, WebAssembly was designed mainly to address JS's performance bottlenecks, so it suits scenarios that require heavy computation, for example:
- Audio and video processing in the browser: rewriting flv.js with WebAssembly could improve its performance considerably;
- React's DOM diff involves a lot of computation, so rewriting React's core modules in WebAssembly could improve performance. Safari's JS engine, JavaScriptCore, also supports WebAssembly, so React Native applications could benefit as well;
- Breaking through the performance bottleneck of large 3D Web games: Egret Engine has already begun exploring WebAssembly.
Conclusion
Although the WebAssembly standard has been finalized and implemented by the major browsers, several problems remain:
- Browser compatibility is poor: only the latest browser versions support it, and browsers differ in how well they support the JS-WebAssembly interop APIs;
- The tooling ecosystem is immature: there is not yet a language in which writing WebAssembly feels smooth, and everything is still at an early stage;
- Learning material is scarce: more people are needed to explore and work through the pitfalls.
In short, WebAssembly is not yet mature. Unless your team faces performance problems it simply cannot tolerate, now is not the time to bring WebAssembly into production: it may hurt your team's productivity, or block development on a pitfall that cannot be resolved easily.