Build a more secure sandbox environment for Node.js applications

What are the scenarios for dynamic script execution?

In some applications, we want to give users the ability to insert custom logic, such as VBA in Microsoft Office, Lua scripts in some games, and Greasemonkey user scripts in Firefox. This lets users do fun and useful things with their imagination, within a controlled scope and set of permissions, and extends the application to meet their personalized needs.

Most of these are client-side applications, and similar requirements are often found in online systems and products. In fact, there are many online applications that offer the ability to customize scripts, such as Apps Script in Google Docs. It lets you do some very useful things with JavaScript, such as running code in response to document opening events or cell change events, making custom spreadsheet functions for formulas, and so on.

Unlike client applications that run on the user's own computer, where a custom script usually affects only that user, online applications and services make some concerns far more important, security above all. A user's custom script must be strictly restricted and isolated: it must not affect the host program, and it must not affect other users.

Safeify is a module for Node.js applications that securely executes untrusted, user-defined scripts.

How to execute dynamic scripts safely?

Let’s start by looking at how a piece of code can be executed dynamically in a JavaScript program in general. The first thing that comes to mind is eval:

eval('1 + 2')

The code above runs without any problem. eval is a function property of the global object, and the code it executes has the same permissions as the rest of the application's code. It can access local variables in the current execution context as well as all global variables, which makes it a very dangerous function in this scenario.
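To make the risk concrete, here is a minimal sketch of what "access to local variables" means in practice. The names `runUserCode` and `secret` are made up for illustration; the point is only that a direct eval call can both read and overwrite a local variable of the function that calls it.

```javascript
function runUserCode(userCode) {
  // A local variable that a user script should never see (hypothetical name)
  let secret = 'db-password';
  // A direct eval call runs in the current scope, so it can read and write `secret`
  const leaked = eval(userCode);
  return { leaked, secret };
}

// The "user script" copies the secret, then overwrites it
const result = runUserCode('const stolen = secret; secret = "overwritten"; stolen');
console.log(result.leaked); // db-password
console.log(result.secret); // overwritten
```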

Using the Function constructor, we can create a function dynamically and then execute it:

const sum = new Function('m', 'n', 'return m + n');
console.log(sum(1, 2));

This works just as well. Functions generated with the Function constructor do not create closures over the context in which they are created; they are always created in the global scope. When such a function runs, it can access only its own local variables and global variables, not the scope in which the Function constructor was called. Compared with eval, it is like one person standing on the ground and another standing on a thin sheet of paper: in this scenario, neither is meaningfully better than the other.
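The scoping rule can be checked with a small sketch. The names `appConfig` and `localToken` are hypothetical; `typeof` is used as the probe because it never throws, even for names that do not exist.

```javascript
// A global that generated functions can still reach (hypothetical name)
globalThis.appConfig = { debug: true };

function makeFunctions() {
  const localToken = 'only-in-this-scope';
  // Both functions are created in the global scope, not in makeFunctions
  const seesLocal = new Function('return typeof localToken');
  const seesGlobal = new Function('return typeof appConfig');
  return { local: seesLocal(), global: seesGlobal() };
}

const visibility = makeFunctions();
console.log(visibility); // { local: 'undefined', global: 'object' }
```

The local variable is invisible, but anything reachable from the global scope is still exposed, which is why this alone is not a meaningful improvement over eval.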

Combining this with Proxy, a new ES6 feature, makes it somewhat safer:

function evalute(code, sandbox) {
  sandbox = sandbox || Object.create(null);
  const fn = new Function('sandbox', `with(sandbox){return (${code})}`);
  const proxy = new Proxy(sandbox, {
    has(target, key) {
      // Make the dynamically executed code think every property already exists
      return true;
    }
  });
  return fn(proxy);
}
evalute('1 + 2'); // 3
evalute('console.log(1)'); // TypeError: cannot read property 'log' of undefined

We know that code run through eval or Function resolves names by walking up the scope chain level by level, all the way to the global scope if nothing is found earlier. The idea behind the Proxy is to make the executing code believe that every name can be found in the sandbox, so that lookups never escape beyond it.
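As an illustration of that lookup behavior, the sketch below is a variation on the same idea with an explicit `get` trap added. The function name `evaluateInSandbox` is an assumption for this example, not part of any library. Names supplied in the sandbox resolve normally, and names that are not supplied resolve to undefined instead of reaching the real globals.

```javascript
function evaluateInSandbox(code, sandbox) {
  // Functions built with the Function constructor are non-strict, so `with` is allowed
  const fn = new Function('sandbox', `with (sandbox) { return (${code}); }`);
  const proxy = new Proxy(sandbox, {
    // Claim every name exists so that lookups never continue to the global scope
    has() {
      return true;
    },
    get(target, key) {
      // `with` asks for Symbol.unscopables; report that nothing is excluded
      if (key === Symbol.unscopables) return undefined;
      return target[key];
    },
  });
  return fn(proxy);
}

const sum = evaluateInSandbox('a + b', { a: 1, b: 2 });
const processType = evaluateInSandbox('typeof process', {});
console.log(sum, processType); // 3 'undefined'
```

This only blocks lookups by name. Literals inside the script, such as `({}).constructor.constructor`, still lead to the real Function constructor, so the approach is an obstacle rather than a boundary.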

In the browser, you can also use iframe to create a more secure isolation environment. This article focuses on Node.js and won’t discuss it too much here.

In Node.js, are there other options?

vm is a built-in module that ships with Node.js. It provides a set of APIs for compiling and running code in a V8 virtual machine context. JavaScript code can be compiled and run immediately, or compiled once, saved, and run later.

const vm = require('vm');
const script = new vm.Script('m + n');
const sandbox = { m: 1, n: 2 };
const context = vm.createContext(sandbox);
script.runInContext(context);

Running this code yields 3. vm.Script can also be given a maximum number of milliseconds for execution; when that limit is exceeded, execution is terminated and an exception is thrown:

try {
  const script = new vm.Script('while(true){}', { timeout: 50 });
  ...
} catch (err) {
  // Prints the timeout log
  console.log(err.message);
}

Executing the script above fails: the timeout is detected, an exception is thrown, and it is caught by the try/catch and logged. Note, however, that the vm.Script timeout option is only effective for synchronous code, not for asynchronous calls, such as:

  const script = new vm.Script('setTimeout(()=>{},2000)', { timeout: 50 });
  ...

The code above does not throw an exception after 50ms, because its synchronous part finishes well within 50ms and the time spent waiting on setTimeout is not counted. In other words, the vm module cannot directly limit the execution time of asynchronous code. We also cannot detect the timeout with an extra timer, because even after detecting it there is no way to abort code that is running in the VM.
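The difference between the two cases can be reproduced with a runnable sketch. It assumes a current Node.js release, where the timeout option is passed to the run call (here `vm.runInNewContext`), and it passes the host's `setTimeout` into the sandbox explicitly, since a fresh context does not have one. The helper name `runWithTimeout` is made up for this example.

```javascript
const vm = require('vm');

// Returns 'completed' if the code finished, or 'timed out or failed' if it threw
function runWithTimeout(code, sandbox, ms) {
  try {
    vm.runInNewContext(code, sandbox, { timeout: ms });
    return 'completed';
  } catch (err) {
    return 'timed out or failed';
  }
}

// Synchronous infinite loop: interrupted once the 50ms budget is used up
const syncOutcome = runWithTimeout('while (true) {}', {}, 50);

// Asynchronous work: the script itself returns immediately, so the timeout
// never fires even though the timer outlives the 50ms budget
const asyncOutcome = runWithTimeout('setTimeout(() => {}, 200)', { setTimeout }, 50);

console.log(syncOutcome);  // timed out or failed
console.log(asyncOutcome); // completed
```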

In addition, vm.runInContext in Node.js may appear to isolate the code execution environment, but it is actually very easy to escape from it:

const vm = require('vm');
const sandbox = {};
const script = new vm.Script('this.constructor.constructor("return process")().exit()');
const context = vm.createContext(sandbox);
script.runInContext(context);

The sandbox object is created outside the VM, and this in the VM code points to that sandbox, so the code can do the following:

// this.constructor is the outer Object constructor
const ObjConstructor = this.constructor;
// ObjConstructor's constructor is the outer Function constructor
const Function = ObjConstructor.constructor;
// Create a function and execute it, returning the global process object
const process = (new Function('return process'))();
// Exit the current process
process.exit();

No one wants users to be able to kill the application with a script. Beyond exiting the process, such a script can actually do much more.
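For anyone who wants to observe the escape without killing their own process, here is a non-destructive variant. It follows the same constructor chain, but only reports whether the host's process object is reachable instead of calling process.exit().

```javascript
const vm = require('vm');

// Same chain as the escape above, but it only inspects what it can reach
const probe = 'this.constructor.constructor("return typeof process")()';
const reachable = vm.runInNewContext(probe, {});

// 'object' means the script reached the host's process object
console.log(reachable);
```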

There is a simple way to prevent the script from obtaining process through this.constructor:

const vm = require('vm');
// Create a blank object without proto as a sandbox
const sandbox = Object.create(null);
const script = new vm.Script('... ');
const context = vm.createContext(sandbox);
script.runInContext(context);

Risks remain, however. Because of the dynamic nature of JavaScript, all kinds of black magic are hard to guard against. The official Node.js documentation also warns against treating the vm module as a secure sandbox for executing arbitrary untrusted code.
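One example of such a remaining risk: a prototype-less sandbox stops helping as soon as the host places any ordinary object into it. In this sketch the `utils` helper is hypothetical; any object created by the host carries a path back to the host's Function constructor.

```javascript
const vm = require('vm');

const sandbox = Object.create(null);
// Hypothetical helper object that the host hands to user scripts
sandbox.utils = { version: '1.0.0' };

// utils was created by the host, so its constructor chain leads outside
const probe = 'utils.constructor.constructor("return typeof process")()';
const reachable = vm.runInNewContext(probe, sandbox);

console.log(reachable); // 'object': the host process object is reachable again
```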

What further work have community modules done?

There are open source modules in the community for running untrusted code, such as sandbox and vm2, among others. By comparison, vm2 has done more security work in every respect and is relatively safer.

As vm2's official README explains, it builds a basic sandbox environment on top of the vm module built into Node.js, and then uses the ES6 Proxy technique described above to prevent sandboxed scripts from escaping.

Use the same test code to try vm2

const { VM } = require('vm2');
new VM().run('this.constructor.constructor("return process")().exit()');

The code above fails to terminate the host program. The official vm2 README states that vm2 is a sandbox that can securely run untrusted code in Node.js.

However, there are some “bad” things we can do, such as:

const { VM } = require('vm2');
const vm = new VM({ timeout: 1000, sandbox: {} });
vm.run('new Promise(()=>{})');

The code above never finishes executing. Just as with the built-in vm module, the vm2 timeout has no effect on asynchronous operations. vm2 also cannot use an extra timer to check for timeouts, because it has no way to terminate a VM that is executing. Scripts like this gradually drain the server's resources until the application hangs.

So you might be thinking: can we disable promises by putting a fake Promise into the sandbox? We can indeed provide a fake Promise, but there is no way to ban promises completely, for example:

const { VM } = require('vm2');
const vm = new VM({
  timeout: 1000,
  sandbox: { Promise: function(){} }
});
vm.run('Promise = (async function(){})().constructor; new Promise(()=>{});');

As you can see, a single line, Promise = (async function(){})().constructor, is enough to get the real Promise back. Looking at it from another angle, there may also be times when we want custom scripts to support asynchronous processing.
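The same trick can be checked without installing anything. The sketch below uses the built-in vm module rather than vm2 (a substitution made here so the example has no dependencies), but the language-level behavior it relies on is the same: an async function always returns a genuine promise, whatever the global Promise name points to.

```javascript
const vm = require('vm');

// The host tries to disable promises by shadowing the global with a stub
const sandbox = { Promise: function FakePromise() {} };

const code = `
  Promise = (async function () {})().constructor;
  typeof Promise.resolve;
`;
const recovered = vm.runInNewContext(code, sandbox);

console.log(recovered); // 'function': a working Promise constructor is back
```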

How to build a safer sandbox?

The exploration above did not turn up a perfect solution for building a secure, isolated sandbox in Node.js. vm2 does a lot of hardening and is a relatively safer option, but its problems are also obvious: it cannot detect timeouts in asynchronous code, and it runs in the same process as the host program.

Without process isolation, a sandbox created with the vm module looks something like this

So, can we put the untrusted code, wrapped in vm2, into a separate process for execution? Then, when execution times out, we simply kill the isolated process. There are several issues to consider here, though.
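The core idea, that a separate process can always be killed no matter what the code inside it is doing, can be sketched with the standard child_process module. The example uses `spawnSync` and its `timeout` option purely to stay short and synchronous; a real server would not block like this and would use long-lived worker processes instead. The helper name `runIsolated` is made up for illustration.

```javascript
const { spawnSync } = require('child_process');

function runIsolated(code, timeoutMs) {
  // Run the code in a separate Node.js process; it is killed on timeout
  const result = spawnSync(process.execPath, ['-e', code], {
    timeout: timeoutMs,
    encoding: 'utf8',
  });
  const timedOut =
    Boolean(result.error && result.error.code === 'ETIMEDOUT') ||
    result.signal === 'SIGTERM';
  return { timedOut, stdout: result.stdout };
}

// A pending timer keeps this child alive forever unless the host kills it
const hung = runIsolated('setInterval(() => {}, 1000)', 500);
// Well-behaved code finishes long before its budget runs out
const quick = runIsolated('console.log(1 + 2)', 5000);

console.log(hung.timedOut, quick.stdout.trim()); // true '3'
```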

Manage sandbox processes through unified process pool scheduling

If a new process were created for every task and destroyed afterwards, the overhead of creating and destroying processes alone would be too high, and new processes cannot be spawned without limit, since they would compete with the host application for resources. We therefore need a process pool. When a task arrives, a Script instance is created and first placed in a pending queue, and the deferred promise of that Script instance is returned immediately so that the caller can await the result. The sandbox master then schedules execution according to how idle the worker processes are: it sends the script's execution information, including the all-important ScriptId, to an idle worker. When the worker finishes, it sends "result + script information" back to the master, which uses the ScriptId to identify which script has finished and then resolves or rejects its result.

In this way, the process pool reduces the overhead of creating and destroying processes and ensures that host resources are not over-consumed. It also makes it possible to kill a worker process when an asynchronous operation times out. When the master detects that a worker process has died, it immediately creates a replacement.
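The scheduling flow described above can be sketched as follows. This is an illustrative model and not Safeify's actual implementation: the class name `SandboxPool`, the message shape, and the fake workers are all assumptions. Real workers would be child processes, `send` would go over IPC, and timeout handling and worker replacement are omitted.

```javascript
class SandboxPool {
  constructor(workers) {
    this.idle = [...workers]; // workers that are currently free
    this.pending = [];        // scripts waiting for a worker
    this.running = new Map(); // scriptId -> { worker, resolve, reject }
    this.nextId = 1;
  }

  // Queue a script and return a promise that the caller can await
  run(code) {
    return new Promise((resolve, reject) => {
      this.pending.push({ scriptId: this.nextId++, code, resolve, reject });
      this.dispatch();
    });
  }

  // Hand queued scripts to idle workers
  dispatch() {
    while (this.idle.length > 0 && this.pending.length > 0) {
      const worker = this.idle.shift();
      const { scriptId, code, resolve, reject } = this.pending.shift();
      this.running.set(scriptId, { worker, resolve, reject });
      worker.send({ scriptId, code });
    }
  }

  // Called when a worker reports back with { scriptId, result } or { scriptId, error }
  onMessage({ scriptId, result, error }) {
    const task = this.running.get(scriptId);
    if (!task) return;
    this.running.delete(scriptId);
    if (error) task.reject(new Error(error));
    else task.resolve(result);
    this.idle.push(task.worker);
    this.dispatch();
  }
}

// Fake workers that only record what they were sent
const makeWorker = () => ({ sent: [], send(msg) { this.sent.push(msg); } });
const workers = [makeWorker(), makeWorker()];
const pool = new SandboxPool(workers);

const results = [pool.run('1 + 1'), pool.run('2 + 2'), pool.run('3 + 3')];
console.log(pool.pending.length); // 1: both workers are busy, one script waits

pool.onMessage({ scriptId: 1, result: 2 });
console.log(pool.pending.length); // 0: the freed worker took the third script
```

Matching replies by scriptId rather than by worker is what allows the master to resolve the right promise even when workers finish out of order.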

Process data and results, and methods exposed to the sandbox

Processes need a way to communicate. The data that the dynamic code needs to process must be directly serializable so that it can be sent to the isolated sandbox process over IPC. Execution results are likewise serialized and transmitted over IPC.

Exposing a method to the sandbox is harder: because the sandbox is not in the same process, a function reference cannot simply be passed to it. Instead, we can convert the host method into a "description object" that contains information about the method the sandbox is allowed to call, and send it to the worker process like any other data. When the worker receives the data, it recognizes the "method description object" and creates a proxy method on the sandbox object in the worker process. The proxy method in turn communicates with the master over IPC.
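A minimal sketch of the description-object idea follows. The function names `describeContext` and `buildSandbox`, the `{ type: 'method', name }` shape, and the `callHost` callback are all assumptions made for illustration, and the IPC round trip is replaced by a direct call so that the example stays self-contained.

```javascript
// Host side: split a context into plain data and method descriptions
function describeContext(context) {
  const data = {};
  const methods = [];
  for (const [key, value] of Object.entries(context)) {
    if (typeof value === 'function') methods.push({ type: 'method', name: key });
    else data[key] = value;
  }
  // Round-trip through JSON to mimic what serialization for IPC would do
  return JSON.parse(JSON.stringify({ data, methods }));
}

// Worker side: rebuild the sandbox, replacing each description with a proxy
// function that forwards the call to the host through `callHost`
function buildSandbox(description, callHost) {
  const sandbox = Object.assign(Object.create(null), description.data);
  for (const { name } of description.methods) {
    sandbox[name] = (...args) => callHost(name, args);
  }
  return sandbox;
}

const context = { a: 1, b: 2, add: (x, y) => x + y };
const description = describeContext(context);

// Stand-in for an IPC round trip: here it simply calls the host method
const callHost = (name, args) => Promise.resolve(context[name](...args));
const sandbox = buildSandbox(description, callHost);

sandbox.add(sandbox.a, sandbox.b).then((sum) => console.log(sum)); // 3
```

Because the real call crosses a process boundary, the proxy method is naturally asynchronous, which is one reason the sandboxed script is awaited by the caller.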

Limit CPU and memory quotas for sandbox processes

On Linux, the overall CPU and memory quota of the sandbox processes is limited with cgroups. Cgroups, short for control groups, is a mechanism provided by the Linux kernel for limiting, accounting for, and isolating the physical resources (such as CPU, memory, and I/O) used by groups of processes. It was originally developed by Google engineers and later merged into the Linux kernel. Cgroups is also the resource management mechanism that LXC uses to implement virtualization; it is fair to say that without cgroups there would be no LXC.
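To give a feel for what such a quota looks like in practice, the sketch below computes the control-file values for a cgroup v1 layout. Everything specific here is an assumption: the mount point `/sys/fs/cgroup`, the group name `sandbox`, and the helper `cgroupSettings` are illustrative, and this is not necessarily how Safeify applies its limits. The function only computes values; actually writing them, and adding worker PIDs to the group's `cgroup.procs` file, requires appropriate privileges.

```javascript
const path = require('path');

// Compute cgroup v1 control-file values for a sandbox group
function cgroupSettings({ group, cpuQuota, memoryQuotaMB, root = '/sys/fs/cgroup' }) {
  const periodUs = 100000; // scheduler period of 100ms, expressed in microseconds
  return [
    {
      file: path.posix.join(root, 'cpu', group, 'cpu.cfs_period_us'),
      value: String(periodUs),
    },
    {
      // CPU time allowed per period; 0.5 means half of one core
      file: path.posix.join(root, 'cpu', group, 'cpu.cfs_quota_us'),
      value: String(Math.round(periodUs * cpuQuota)),
    },
    {
      file: path.posix.join(root, 'memory', group, 'memory.limit_in_bytes'),
      value: String(memoryQuotaMB * 1024 * 1024),
    },
  ];
}

const settings = cgroupSettings({ group: 'sandbox', cpuQuota: 0.5, memoryQuotaMB: 500 });
for (const { file, value } of settings) console.log(`${file} = ${value}`);
```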

Eventually, we set up a sandbox environment that looks something like this

Does this all feel like a hassle? In return, though, we get a much safer sandbox environment. The author has implemented all of this in TypeScript and packaged it as a standalone module, Safeify.

Compared with the built-in vm module and several common sandbox modules, Safeify has the following features:

Creates a dedicated process pool for the dynamic code to be executed, so that it runs in separate processes, isolated from the host application
Supports configuring the maximum number of processes in the sandbox process pool
Supports limiting the maximum execution time of synchronous code, and also the execution time of asynchronous code
Supports limiting the overall CPU quota of the sandbox process pool (as a decimal fraction)
Supports limiting the overall maximum memory of the sandbox process pool (in MB)

GitHub: https://github.com/Houfeng/safeify, Stars & Issues welcome

Finally, a brief introduction to using Safeify. Install it with the following command:

npm i safeify --save

Using it in an application is simple, as follows (in TypeScript):

import { Safeify } from './Safeify';

const safeVm = new Safeify({
  timeout: 50,       // timeout in ms, 50ms by default
  asyncTimeout: 500, // timeout for asynchronous operations in ms, 500ms by default
  quantity: 4,       // number of sandbox processes, defaults to the number of CPU cores
  memoryQuota: 500,  // maximum memory available to the sandbox (in MB), 500MB by default
  cpuQuota: 0.5,     // CPU quota of the sandbox (as a fraction), 50% by default
});

const context = {
  a: 1,
  b: 2,
  add(a, b) {
    return a + b;
  }
};

const rs = await safeVm.run(`return add(a,b)`, context);
console.log('result', rs);

When it comes to security, there is no "most secure", only "more secure". Safeify is already in use in one project, although there the custom script feature is available only to internal network users. Many scenarios that call for dynamically executing code can in fact be avoided. For the cases where it cannot be avoided, or where this capability really must be provided, I hope this article or Safeify can be of some help.
