Web technology is advancing by leaps and bounds, but there is one area it has not been able to break into: gaming.

The performance requirements of games are so high that some of the biggest titles are hard to run even on a PC, let alone in a browser sandbox. Despite the difficulty, many developers have never given up on getting 3D games to run in browsers.

In 2012, Mozilla engineer Alon Zakai was working on the LLVM compiler when he had an idea: many 3D games are written in C/C++, so if C/C++ could be compiled into JavaScript, wouldn't those games run in a browser? After all, the basic syntax of JavaScript closely resembles that of C.

So he began to research how to achieve this goal and created a compiler project called Emscripten. This compiler turns C/C++ code into JS code, though not ordinary JS: it emits a variant of JavaScript called asm.js.

This article introduces the basic usage of asm.js and Emscripten and shows how to convert C/C++ to JS.

1. asm.js

1.1 Principle

Compiling C/C++ into JS has two major difficulties.

  • C/C++ is statically typed, while JS is dynamically typed.
  • C/C++ uses manual memory management, while JS relies on garbage collection.

asm.js is designed to solve both of these problems: all of its variables are statically typed, and it does away with garbage collection. Apart from these two points it is no different from JavaScript. In other words, asm.js is a strict subset of JavaScript that uses only part of the latter's syntax.

Once a JavaScript engine recognizes that it is running asm.js, it knows the code is already optimized and statically typed, so it can skip run-time type analysis and compile the code ahead of time to machine code. In addition, programs compiled to asm.js can call WebGL to hand graphics work to the GPU, while the asm.js code itself still executes on the CPU, through an execution path that differs from that of ordinary JavaScript. These are the reasons asm.js runs faster. asm.js is said to run in browsers at about 50 percent of the speed of native code.

The two syntax features of asm.js are described in turn below.

1.2 Statically typed variables

asm.js provides only two data types.

  • 32-bit signed integer
  • 64-bit signed floating-point number

asm.js does not provide other data types, such as strings, booleans, or objects. They are all stored in memory as numeric values and accessed through TypedArray.

In JavaScript, a variable's type is determined only at run time. asm.js requires the type to be declared in advance and never changed, which saves the time spent on type checks.

asm.js type declarations use a fixed notation: variable | 0 denotes an integer, and +variable denotes a floating-point number.

var a = 1;

var x = a | 0; // x is a 32-bit integer
var y = +a;    // y is a 64-bit floating-point number

In the code above, the variable x is declared as an integer and y as a floating-point number. When an engine that supports asm.js sees x = a | 0, it knows that x is an integer and handles it with its asm.js machinery. If the engine does not support asm.js, it doesn't matter: the code still runs and produces the same result.
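Because | 0 and unary + are ordinary JavaScript operators, the effect of these annotations can be checked in any engine. The following is a minimal sketch; the variable names are illustrative and not part of the original article.

```javascript
// `| 0` applies JavaScript's ToInt32 conversion: the fraction is dropped
// and the value wraps to the signed 32-bit range.
var truncated = 3.7 | 0;        // 3
var negative = -3.7 | 0;        // -3 (truncates toward zero)
var wrapped = 4294967297 | 0;   // 1, because 2^32 + 1 wraps around

// Unary `+` applies ToNumber, yielding a 64-bit floating-point value.
var asDouble = +"2.5";          // 2.5

console.log(truncated, negative, wrapped, asDouble);
```

This is why the same source gives identical results with or without asm.js support: the annotations are real conversions, not comments.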

Look at the following example.

// Version 1
var first = 5;
var second = first;

// Version 2
var first = 5;
var second = first | 0;

In the above code, version 1 is plain JavaScript: the type of the variable second is known only at run time, which makes it slow. Version 2 is asm.js: second is declared as an integer, which makes it faster.

Function parameters and return values are typed in the same way.

function add(x, y) {
  x = x | 0;
  y = y | 0;
  return (x + y) | 0;
}

In the above code, in addition to the arguments x and y, the return value of the function also needs to be typed.
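For context, typed functions like this normally live inside a module function marked with the "use asm" directive, which is how an engine recognizes asm.js. The sketch below is a minimal illustration under that assumption; the names TypedMath and half are made up for this example.

```javascript
// A minimal asm.js-style module. The "use asm" directive asks the engine to
// validate the body as asm.js; engines without asm.js support simply run it
// as ordinary JavaScript.
function TypedMath(stdlib, foreign, heap) {
  "use asm";

  // Integer addition: parameters and return value are annotated with `| 0`.
  function add(x, y) {
    x = x | 0;
    y = y | 0;
    return (x + y) | 0;
  }

  // Floating-point halving: parameter and return value are annotated with `+`.
  function half(x) {
    x = +x;
    return +(x / 2.0);
  }

  return { add: add, half: half };
}

var typedMath = TypedMath(globalThis, {}, new ArrayBuffer(0x10000));
console.log(typedMath.add(2, 3));           // 5
console.log(typedMath.add(2147483647, 1));  // -2147483648 (wraps at 32 bits)
console.log(typedMath.half(5));             // 2.5
```

The second call shows the practical consequence of the integer annotation: results wrap at 32 bits, just as a C int would.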

1.3 Garbage collection mechanism

asm.js has no garbage collection mechanism; all memory operations are controlled by the programmer. asm.js reads and writes memory directly through TypedArray.

The following is an example of reading and writing directly to memory.

var buffer = new ArrayBuffer(32768);
var HEAP8 = new Int8Array(buffer);
function compiledCode(ptr) {
  HEAP8[ptr] = 12;
  return HEAP8[ptr + 4];
}
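To make the idea of one buffer acting as the whole memory more concrete, here is a small sketch with two views over the same ArrayBuffer. The helper names storeInt32 and loadInt32 are illustrative assumptions, not functions from the article or from Emscripten.

```javascript
// One ArrayBuffer acts as the program's whole memory; typed-array views
// reinterpret the same bytes at different widths.
var memory = new ArrayBuffer(65536);
var HEAP8 = new Int8Array(memory);
var HEAP32 = new Int32Array(memory);

// `ptr` is a byte address; a 32-bit view is indexed in 4-byte units,
// hence the shift by 2. The address is assumed to be 4-byte aligned.
function storeInt32(ptr, value) {
  HEAP32[ptr >> 2] = value | 0;
}

function loadInt32(ptr) {
  return HEAP32[ptr >> 2] | 0;
}

storeInt32(16, 258);            // 258 is 0x00000102
console.log(loadInt32(16));     // 258
// The same four bytes are visible through the 8-bit view
// (their order depends on the platform's endianness).
console.log(HEAP8[16] + HEAP8[17] + HEAP8[18] + HEAP8[19]); // 3
```

A pointer here is nothing more than an integer index into the buffer, which is what lets C pointer arithmetic map onto plain number arithmetic.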

Pointers are handled the same way.

size_t strlen(char *ptr) {
  char *curr = ptr;
  while (*curr != 0) {
    curr++;
  }
  return (curr - ptr);
}

The above code compiles to asm.js as follows.

function strlen(ptr) {
  ptr = ptr|0;
  var curr = 0;
  curr = ptr;
  while ((MEM8[curr]|0) != 0) {
    curr = (curr + 1)|0;
  }
  return (curr - ptr)|0;
}
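The compiled function can be tried out in plain JavaScript once a heap exists. The sketch below supplies a stand-in MEM8 and a hypothetical helper, writeCString, that plays the role of the C program placing a string in memory; it assumes ASCII-only text.

```javascript
// Stand-in for the heap view that compiled code would receive.
var MEM8 = new Uint8Array(new ArrayBuffer(4096));

// Hypothetical helper: copies an ASCII string into the heap at `ptr`
// and appends the terminating NUL byte that a C string carries.
function writeCString(ptr, text) {
  for (var i = 0; i < text.length; i++) {
    MEM8[ptr + i] = text.charCodeAt(i) & 0xff;
  }
  MEM8[ptr + text.length] = 0;
}

function strlen(ptr) {
  ptr = ptr | 0;
  var curr = 0;
  curr = ptr;
  while ((MEM8[curr] | 0) != 0) {
    curr = (curr + 1) | 0;
  }
  return (curr - ptr) | 0;
}

writeCString(100, "hello");
console.log(strlen(100)); // 5
```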

1.4 Similarities and differences between asm.js and WebAssembly

If you are familiar with JS, you probably know of another technology, WebAssembly, which also compiles C/C++ into code that a JS engine can run. So how does it differ from asm.js?

The answer is that the two do essentially the same thing; only the code they produce differs. asm.js is text, while WebAssembly is binary bytecode, so WebAssembly is faster and smaller. In the long run, WebAssembly has the brighter future.

However, that doesn't mean asm.js is certain to be phased out, because it has two advantages. First, it is text, so it is human-readable and fairly intuitive. Second, every browser supports asm.js, so there are no compatibility issues.
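To give a feel for the difference between text and binary bytecode, the sketch below hand-encodes a tiny WebAssembly module that exports a single add function and runs it through the standard WebAssembly JavaScript API. The byte sequence is written out by hand for illustration; real modules come from a compiler.

```javascript
// The binary encoding of a WebAssembly module that exports one function,
// add(a, b), operating on 32-bit integers.
var wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic "\0asm", version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // one function of that type
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export it as "add"
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section, one body
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b                    // local.get 0, local.get 1, i32.add, end
]);

var wasmInstance = new WebAssembly.Instance(new WebAssembly.Module(wasmBytes));
console.log(wasmBytes.length);                // 41
console.log(wasmInstance.exports.add(2, 3));  // 5
```

The asm.js counterpart of this module is the readable add function shown in section 1.2; the bytes above are compact but not something a person would want to read or edit.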

2. Emscripten compiler

2.1 Introduction to Emscripten

Although asm.js can be written by hand, it has always been intended as a compiler target. Currently, the main tool for generating asm.js is Emscripten.

Underneath Emscripten is the LLVM compiler. In theory, any language that can be turned into LLVM IR (Intermediate Representation) can be compiled to asm.js. In practice, however, Emscripten is used almost exclusively to compile C/C++ code into asm.js.

C/C++ ⇒ LLVM ⇒ LLVM IR ⇒ Emscripten ⇒ asm.js

2.2 Installation of Emscripten

Emscripten can be installed by following the official documentation. I found installing the SDK more convenient.

You can follow the steps below.

$ git clone https://github.com/juj/emsdk.git
$ cd emsdk
$ ./emsdk install --build=Release sdk-incoming-64bit binaryen-master-64bit
$ ./emsdk activate --build=Release sdk-incoming-64bit binaryen-master-64bit
$ source ./emsdk_env.sh

Note that the last line is very important: run source ./emsdk_env.sh every time you log in again or open a new shell window.

2.3 Hello World

First, create a simple C++ program called hello.cc.

#include <iostream>

int main() {
  std::cout << "Hello World!" << std::endl;
}

Then, convert the program to asm.js.

$ emcc hello.cc
$ node a.out.js
Hello World!

In the above code, the emcc command compiles the source code, generating a.out.js by default. Running a.out.js with Node prints Hello World! on the command line.

Note that the generated asm.js code runs the main function automatically by default.

emcc is Emscripten's compile command. It is very simple to use.

# Generate a.out.js
$ emcc hello.c

# Generate hello.js
$ emcc hello.c -o hello.js

# Generate hello.html and hello.js
$ emcc hello.c -o hello.html

3. Emscripten syntax

3.1 C/C++ calls JavaScript

Emscripten allows C/C++ code to call JavaScript directly.

Create a new file example1.cc and write the following code.

#include <emscripten.h>

int main() {
  EM_ASM({ alert('Hello World!'); });
}

EM_ASM is a macro that calls embedded JavaScript code. Note that the JavaScript code is written inside curly braces.

Then, compile this program into asm.js.

$ emcc example1.cc -o example1.html

Open example1.html in a browser and a Hello World! dialog box pops up.

3.2 C/C++ and JavaScript communication

Emscripten allows C/C++ code to communicate with JavaScript.

Create a new file example2.cc and write the following code.

#include <emscripten.h>
#include <iostream>

int main() {
  int val1 = 21;
  int val2 = EM_ASM_INT({ return $0 * 2; }, val1);

  std::cout << "val2 == " << val2 << std::endl;
}

In the above code, EM_ASM_INT means that the JavaScript code returns an integer. Inside the JavaScript code, $0 stands for the first argument, $1 for the second, and so on. The remaining arguments to EM_ASM_INT are passed into the JavaScript expression in order.
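One way to picture the $0, $1 placeholders is to read the JavaScript block as the body of a function whose parameters are named $0, $1, and so on. The sketch below is a plain-JavaScript analogy with a hypothetical function name; it is not Emscripten's actual generated code.

```javascript
// Analogy for EM_ASM_INT({ return $0 * 2; }, val1):
// `$0` is a legal JavaScript identifier, so the block can be read as a
// function body with `$0` as its first parameter.
function doubleFirstArgument($0) {
  return $0 * 2;
}

var val1 = 21;
// EM_ASM_INT delivers the result to C as an int; `| 0` mimics that here.
var val2 = doubleFirstArgument(val1) | 0;
console.log("val2 == " + val2); // val2 == 42
```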

Then, compile this program into asm.js.

$ emcc example2.cc -o example2.html

Open example2.html in a browser and it displays val2 == 42.

3.3 EM_ASM macro series

Emscripten provides the following macros.

  • EM_ASM: calls JS code with no arguments and no return value.
  • EM_ASM_ARGS: calls JS code with any number of arguments but no return value.
  • EM_ASM_INT: calls JS code with any number of arguments and returns an integer.
  • EM_ASM_DOUBLE: calls JS code with any number of arguments and returns a double-precision floating-point number.
  • EM_ASM_INT_V: calls JS code with no arguments and returns an integer.
  • EM_ASM_DOUBLE_V: calls JS code with no arguments and returns a double-precision floating-point number.

Here is an example of EM_ASM_ARGS. Create a new file example3.cc and write the following code.

#include <emscripten.h>
#include <string>

void Alert(const std::string & msg) {
  EM_ASM_ARGS({
    var msg = Pointer_stringify($0);
    alert(msg);
  }, msg.c_str());
}

int main() {
  Alert("Hello from C++!");
}

In the above code, we pass a string into the JS code. Since no value is returned, EM_ASM_ARGS is used. In C/C++ a string is an array of characters, and what actually gets passed is a pointer to it, so we call the Pointer_stringify() method to turn that array into a JS string.
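What a pointer-to-string conversion has to do can be sketched in plain JavaScript. The heap and both function names below are hypothetical stand-ins, and the sketch assumes ASCII-only text; it is not Emscripten's implementation of Pointer_stringify().

```javascript
// Stand-in heap; in real Emscripten output the heap view is provided for you.
var heapBytes = new Uint8Array(1024);

// Hypothetical helper that plays the role of C++ placing the message in memory.
function placeCString(ptr, text) {
  for (var i = 0; i < text.length; i++) {
    heapBytes[ptr + i] = text.charCodeAt(i);
  }
  heapBytes[ptr + text.length] = 0; // NUL terminator
  return ptr;
}

// Reads bytes starting at `ptr` until the NUL terminator and builds a JS string.
function pointerToString(ptr) {
  var result = "";
  for (var addr = ptr; heapBytes[addr] !== 0; addr++) {
    result += String.fromCharCode(heapBytes[addr]);
  }
  return result;
}

var msgPtr = placeCString(64, "Hello from C++!");
console.log(pointerToString(msgPtr)); // Hello from C++!
```

The JS side only ever receives the number 64; the text has to be reconstructed by walking memory from that address.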

Next, convert the program to asm.js.


$ emcc example3.cc -o example3.html

Open example3.html in a browser and a dialog box saying "Hello from C++!" pops up.

3.4 JavaScript calls C/C++ code

JS code can also call C/C++ code. Create a new file example4.cc and write the following code.


#include <emscripten.h>

extern "C" {
  double SquareVal(double val) {
    return val * val;
  }
}

int main() {
  EM_ASM({
    SquareVal = Module.cwrap('SquareVal', 'number', ['number']);
    var x = 12.5;
    alert('Computing: ' + x + ' * ' + x + ' = ' + SquareVal(x));
  });
}

In the above code, EM_ASM executes JS code that calls a C function, SquareVal. This function must be defined inside an extern "C" block, and the JS code must bring it in with the Module.cwrap() method.

Module.cwrap() takes three arguments, with the following meanings.

  • The name of the C function, in quotes.
  • The return type of the C function. If there is no return value, the type can be written as null.
  • An array of function parameter types.

In addition to Module.cwrap(), there is a Module.ccall() method that calls a C function directly from JS code.

var result = Module.ccall('int_sqrt', // C function name
  'number', // return type
  ['number'], // array of parameter types
  [28] // array of arguments
);
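The relationship between the two methods can be sketched with a stand-in object. Everything below is hypothetical: MockModule is not the Module object that emcc generates, the body of _int_sqrt is assumed, and only numeric arguments are handled. The point is the calling convention: ccall performs a single call, while cwrap returns a reusable JS function.

```javascript
// Hypothetical stand-in for the Module object that emcc generates.
// It only mimics the calling convention for numeric arguments.
var MockModule = {
  // Exported C functions appear on the module with an underscore prefix.
  _int_sqrt: function (n) {
    return Math.floor(Math.sqrt(n));
  },

  // ccall: look the function up by name and call it once.
  ccall: function (name, returnType, argTypes, args) {
    var result = MockModule["_" + name].apply(null, args || []);
    return returnType === null ? undefined : result;
  },

  // cwrap: return a reusable JS function that forwards to ccall.
  cwrap: function (name, returnType, argTypes) {
    return function () {
      var args = Array.prototype.slice.call(arguments);
      return MockModule.ccall(name, returnType, argTypes, args);
    };
  }
};

var once = MockModule.ccall("int_sqrt", "number", ["number"], [28]);
var intSqrt = MockModule.cwrap("int_sqrt", "number", ["number"]);
console.log(once, intSqrt(28), intSqrt(81)); // 5 5 9
```

When a function is called repeatedly, wrapping it once with cwrap reads more naturally than repeating the full ccall argument list.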

Returning to the previous example, now compile example4.cc to asm.js.


$  emcc -s EXPORTED_FUNCTIONS="['_SquareVal', '_main']" example4.cc -o example4.html

Note that at compile time the -s EXPORTED_FUNCTIONS parameter supplies an array of the function names to export, each prefixed with an underscore. This example exports only two C functions, so write ['_SquareVal', '_main'].

Open example4.html in a browser and a dialog box pops up displaying the following.


Computing: 12.5 * 12.5 = 156.25 

3.5 Exporting C functions as a JavaScript module

Another case is exporting C functions so that JavaScript on a web page can call them. Create a new file example5.cc and write the following code.

extern "C" {
  double SquareVal(double val) {
    return val * val;
  }
}

In the code above, SquareVal is a C function; placing it inside an extern "C" block allows it to be exported.

Then, compile the function.


$ emcc -s EXPORTED_FUNCTIONS="['_SquareVal']" example5.cc -o example5.js

In the above code, the -s EXPORTED_FUNCTIONS parameter tells the compiler the names of the functions that need to be exported in the code. The function name is preceded by an underscore.

Next, write a web page that loads the example5.js you just generated.

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<body>
<h1>Test File</h1>
<script type="text/javascript" src="example5.js"></script>
<script>
  SquareVal = Module.cwrap('SquareVal', 'number', ['number']);
  document.write("result == " + SquareVal(10));
</script>
</body>

Open the page in a browser and it shows result == 100.

3.6 Node calls C functions

If the execution environment is not a browser, but Node, it is much easier to call C functions. Create a new file example6.c and write the following code.

#include <stdio.h>
#include <emscripten.h>

void sayHi() {
  printf("Hi!\n");
}

int daysInWeek() {
  return 7;
}

Then, compile the program into asm.js.


$ emcc -s EXPORTED_FUNCTIONS="['_sayHi', '_daysInWeek']" example6.c -o example6.js

Next, write a Node script called test.js.


var em_module = require('./example6.js');

em_module._sayHi();
em_module.ccall("sayHi");
console.log(em_module._daysInWeek());

In the code above, the Node script calls the C functions in two ways: through the underscore-prefixed function name, em_module._sayHi(), and through the ccall method, em_module.ccall("sayHi").

Run the script and you can see the output from the command line.


$ node test.js
Hi!
Hi!
7

4. Uses

asm.js lets the browser run not only 3D games but also various server-side software, such as Lua, Ruby, and SQLite. This means that many tools and algorithms can reuse existing code instead of being rewritten from scratch.

In addition, because asm.js is faster, some computation-intensive operations (such as calculating hashes) can be implemented in C/C++ and then called from JS.
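As an illustration of the kind of integer-heavy work meant here, the sketch below computes a 32-bit FNV-1a hash in plain JavaScript, written in the integer-only style that asm.js relies on. FNV-1a is chosen for this example only because it is short; the article does not name a specific hash, and a real project would compile an existing C implementation instead.

```javascript
// 32-bit FNV-1a over the character codes of an ASCII string.
// Math.imul performs the 32-bit integer multiplication that C would do.
function fnv1a32(text) {
  var hash = 0x811c9dc5 | 0;            // FNV offset basis
  for (var i = 0; i < text.length; i++) {
    hash = hash ^ (text.charCodeAt(i) & 0xff);
    hash = Math.imul(hash, 0x01000193); // FNV prime
  }
  return hash >>> 0;                    // report as unsigned
}

console.log(fnv1a32("").toString(16));  // 811c9dc5
console.log(fnv1a32("a").toString(16)); // e40c292c
```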

For a real-world compilation example, see how gzlib is built and how its Makefile is written.

Reprinted from www.ruanyifeng.com/blog/2017/0…