Some background on this “rehash”
This article does not cover the basics of Web Workers or how to use their basic APIs from scratch. If you are not familiar with Web Workers, please refer to the W3C Web Workers documentation.
Since the HTML5 recommendation was released in 2014, HTML5 has gained more and more powerful features. Among them, Web Workers drew attention for a while but never really took off; they were drowned out by the engineering “revolution” brought by frameworks such as Angular, Vue, and React. Of course, some of us have come across articles about them, tried them in exercises for learning purposes, or even used them in real projects involving heavy data computation. But I believe many people are, like me, at a loss as to where this seemingly lofty technology can play a broad role in actual projects.
The reason is that, because a Web Worker runs independently of the UI main thread, it is mostly considered for performance optimization (for example image analysis or 3D computation and drawing), so that the page stays responsive to users while performing heavy calculations. On the one hand, such performance requirements come up rarely on the front end; on the other hand, they can often be handled through microtasks or server-side processing. Unlike WebSocket, which brought a qualitative change to polling scenarios on front-end pages, Web Workers have not brought a comparable change.
It was not until micro-front-end architecture took off in 2019, with its need for JavaScript sandbox isolation between micro-applications, that Web Workers moved from the margins to the center of my attention. From what I knew, a Web Worker runs in an independent child thread (although this thread is weaker than threads in Java and other compiled languages; for example, it cannot use locks), and threads are isolated from each other. Can JavaScript runtime isolation be built on this “physical” isolation?
The rest of this article covers the material I gathered, what I understood, the pitfalls I hit, and my thinking while exploring a Web Worker-based JavaScript sandbox isolation solution. Although the whole article may be a “rehash”, I still hope that my exploration process is helpful to you, the reader.
The JavaScript sandbox
Before exploring a Web Worker-based solution, we need to understand the problem at hand: the JavaScript sandbox.
When I think of sandboxes, I first think of the sandbox games I have played for fun. The JavaScript sandbox we are going to explore is different, though. Sandbox games focus on abstracting and combining the basic elements of the world and simulating its physical forces, whereas a JavaScript sandbox focuses on isolating running state when data is shared.
In the real world of JavaScript, the browser itself is a sandbox: JavaScript code running in it has no direct access to the file system, display, or any other hardware. Each tab in Chrome is also a sandbox: data in different tabs cannot interact directly, and each page runs in a separate context. So what more specific scenarios call for a sandbox when running an HTML page within a single browser tab?
Anyone who has been a front-end developer for a while can easily think of many scenarios that need a sandbox within a single page, such as:
- Executing third-party JavaScript code obtained from untrusted sources (such as loading plug-ins or processing data returned from JSONP requests).
- Online code editors (such as the well-known CodeSandbox).
- Server-side rendering schemes.
- Evaluating expressions in template strings.
- ...
Let’s go back to the beginning and assume that I am designing a micro-front-end architecture. In such an architecture, one of the most critical parts is scheduling the sub-applications and maintaining their running state. Common needs such as registering global event listeners at run time or applying global CSS styles become polluting side effects when multiple sub-applications are switched, and many later micro-front-end frameworks (such as qiankun) have various implementations to address these side effects. For CSS isolation, examples include namespace prefixes, Shadow DOM, and dynamically adding and removing CSS at run time. But the most troublesome part is JavaScript sandbox isolation between micro-applications.
In a micro-front-end architecture, JavaScript sandbox isolation needs to address the following issues:
- Global methods and variables attached to window (such as setTimeout, scroll and other global event listeners) must be cleaned up and restored when sub-applications switch.
- Read/write security policies for Cookie, LocalStorage, and similar storage.
- Independent routing for each sub-application.
- Independent operation of multiple micro-applications when they coexist.
In qiankun, the sandbox-related entry files worth looking at are proxySandbox.ts and snapshotSandbox.ts. The former proxies the constants and methods commonly used on window based on Proxy; the latter falls back to backup and restoration through snapshots when Proxy is not supported. Combining the source with the articles its authors have shared, its implementation ideas can be summarized as follows. The original version used the snapshot sandbox concept, simulating the ES6 Proxy API to hijack window: when a sub-application modifies or uses properties or methods on window, the corresponding operation is recorded, and a snapshot is generated every time the sub-application is mounted or unmounted. When the host switches back to that sub-application, its state is restored from the recorded snapshot. Later, in order to support multiple sub-applications coexisting, all global constants and method interfaces were proxied based on Proxy, and an independent running environment was constructed for each sub-application.
Another idea worth learning from is the Browser VM of the Alibaba Cloud development platform, whose core entry logic is in the context.js file. Its implementation works as follows:
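The two ideas summarized above can be sketched in a few lines. This is only an illustration under stated assumptions, not qiankun's actual code: `SnapshotSandbox` and `createProxySandbox` are hypothetical names, and a plain object stands in for `window` so the sketch also runs outside a browser.

```javascript
// Snapshot idea: record the host's values at mount time, restore them at unmount,
// and replay the child application's own changes the next time it is mounted.
class SnapshotSandbox {
  constructor(globalObject) {
    this.global = globalObject; // stand-in for `window` (assumption)
    this.snapshot = {};         // host values captured at mount time
    this.modified = {};         // values written by the child application
  }

  mount() {
    this.snapshot = { ...this.global };
    Object.assign(this.global, this.modified);
  }

  unmount() {
    const keys = new Set([...Object.keys(this.global), ...Object.keys(this.snapshot)]);
    this.modified = {};
    for (const key of keys) {
      // Remember what the child changed (deletions are not replayed in this sketch).
      if (key in this.global && this.global[key] !== this.snapshot[key]) {
        this.modified[key] = this.global[key];
      }
      // Put the host's value back, or drop keys the host never had.
      if (key in this.snapshot) this.global[key] = this.snapshot[key];
      else delete this.global[key];
    }
  }
}

// Proxy idea: each child application writes to its own store and only
// falls back to the shared global object for reads.
function createProxySandbox(globalObject) {
  const own = Object.create(null);
  return new Proxy(own, {
    get: (target, key) => (key in target ? target[key] : globalObject[key]),
    set: (target, key, value) => {
      target[key] = value;
      return true;
    },
    has: (target, key) => key in target || key in globalObject,
  });
}
```

The snapshot variant only works for one active application at a time because it mutates the shared object, while the Proxy variant gives every application a separate write target, which is why it suits coexisting sub-applications.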
- Borrowing the effect of with, wrap each child application’s code in an extra layer during the Webpack compilation and packaging phase (see its plug-in package breezr-plugin-os), creating a closure and passing in its own simulated global objects such as window, document, location, and history (see the relevant files in the root directory).
- In the simulated Context, create a new iframe object, providing an about:blank URL in the same domain as the host application as the initial loading URL of the iframe (however, the history associated with the iframe then cannot be manipulated, and route changes only support hash mode), and then take the native browser objects under it out through contentWindow. Because iframe objects are naturally isolated, this eliminates the cost of implementing all of the mock APIs yourself.
- After fetching the native objects from the corresponding iframe, generate a Proxy for each specific object that needs to be isolated, and give the getting and setting of some attributes a specific implementation (for example, window.document needs to return a specific sandbox document instead of the current browser’s document).
- In order for document content to be loaded into the same DOM tree, most DOM manipulation properties and methods of document are still handled directly by the properties and methods of the document in the host browser.
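The closure-plus-Proxy idea in the steps above can be sketched as follows. This is a minimal illustration, not the Browser VM source: `createSandboxContext` and `runInSandbox` are hypothetical names, and the `nativeWindow` argument is assumed to be an iframe's `contentWindow` in a real page (a plain object is enough to try the sketch elsewhere).

```javascript
// Build a simulated global object: writes stay local, selected objects are
// overridden (for example `document`), everything else comes from nativeWindow.
function createSandboxContext(nativeWindow, overrides) {
  const locals = Object.create(null);
  return new Proxy(locals, {
    // Claim every identifier so free variables never reach the real global scope.
    has() {
      return true;
    },
    get(target, key) {
      if (key === Symbol.unscopables) return undefined;
      if (key in target) return target[key];
      if (key in overrides) return overrides[key];
      const value = nativeWindow[key];
      // Native functions usually need their original receiver.
      return typeof value === 'function' ? value.bind(nativeWindow) : value;
    },
    set(target, key, value) {
      target[key] = value;
      return true;
    },
  });
}

// Wrap the application code in a closure whose scope chain starts at the context.
// `new Function` produces sloppy-mode code, which is what allows `with` here.
function runInSandbox(code, context) {
  const run = new Function('context', `with (context) { ${code} }`);
  return run(context);
}
```

For example, `runInSandbox('appState = document.title;', context)` stores `appState` on the context instead of the real global object. In a build setup, the wrapping done by `runInSandbox` would be generated by the bundler plug-in rather than at run time.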
In general, part of the Browser VM implementation follows the same ideas as qiankun and other micro-front-end frameworks, such as proxying and intercepting common global objects. With the help of Proxy, security policies can also be applied to reads and writes of Cookie and LocalStorage. When the iframe created for each child application is removed, the variables written to its window, along with its setTimeout timers, global event listeners, and so on, are removed as well. In addition, DOM events are recorded in the sandbox through Proxy and then removed in the host’s life cycle, so the entire JavaScript sandbox isolation mechanism can be implemented at a small development cost.
In addition to the solutions above that are currently popular in the community, I recently learned from an article exploring plug-in architecture for large Web applications that Figma, a product in the UI design field, also built an isolation solution for its plug-in system. At first Figma also executed plug-in code in an iframe and communicated with the main thread via postMessage, but because of ease-of-use concerns and the performance cost of postMessage serialization, Figma chose to execute plug-ins on the main thread. Figma’s approach is based on the Realm API, which is still in draft form: it compiles Duktape, a JavaScript interpreter written in C, into WebAssembly and then embeds it in the Realm context, so that third-party plug-ins run independently within its product. This approach may work even better combined with the Web Worker-based implementation explored here, and I will keep watching it.
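The "record side effects, then remove them in the host's life cycle" idea can be illustrated with a small recorder. This is a sketch under assumptions, not the Browser VM code: `trackSideEffects` is a hypothetical helper, and `win` is any window-like object exposing `setTimeout`, `clearTimeout`, `addEventListener`, and `removeEventListener`.

```javascript
// Patch a window-like object so that timers and listeners registered by a
// child application are recorded and can be undone in one call.
function trackSideEffects(win) {
  const timers = new Set();
  const listeners = [];
  const native = {
    setTimeout: win.setTimeout,
    addEventListener: win.addEventListener,
  };

  win.setTimeout = (callback, delay, ...args) => {
    const id = native.setTimeout.call(win, callback, delay, ...args);
    timers.add(id);
    return id;
  };

  win.addEventListener = (type, handler, options) => {
    listeners.push({ type, handler, options });
    return native.addEventListener.call(win, type, handler, options);
  };

  // Call this when the child application is unmounted.
  return function dispose() {
    timers.forEach((id) => win.clearTimeout(id));
    listeners.forEach(({ type, handler, options }) => {
      win.removeEventListener(type, handler, options);
    });
    timers.clear();
    listeners.length = 0;
    win.setTimeout = native.setTimeout;
    win.addEventListener = native.addEventListener;
  };
}
```

With an iframe per child application, much of this cleanup comes for free when the iframe is removed; a recorder like this is mainly useful for side effects that land on the host window.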
Web workers and DOM rendering
After understanding the “past and present” of JavaScript sandbox, we turn our attention back to the protagonist of this article – Web Worker.
As mentioned at the beginning of this article, a Web Worker child thread is itself a natural form of sandbox isolation. Ideally, borrowing the Browser VM idea described earlier, a Webpack plug-in would wrap each child application’s code at compile time with a layer that creates a Worker object, so that each child application runs in its own Worker instance, for example:
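// Wrapper injected around each child application's bundle at build time
__WRAP_WORKER__('/* Package code */');

function __WRAP_WORKER__(appCode) {
  // Turn the bundle source into a Blob URL and start a dedicated Worker from it
  var blob = new Blob([appCode], { type: 'application/javascript' });
  var appWorker = new Worker(window.URL.createObjectURL(blob));
  return appWorker;
}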
However, having understood how JavaScript sandboxes are implemented under micro front ends, it is not hard to see several difficulties that a Web Worker-based JavaScript sandbox will inevitably run into in a micro-front-end scenario:
- For thread-safety reasons, Web Workers do not support DOM operations; these must be carried out by notifying the UI main thread through postMessage.
- Web Workers cannot access browser global objects such as window and document.
Other problems, such as a Web Worker’s inability to access the page’s global variables and functions or to call BOM APIs such as alert and confirm, are minor compared with the inability to access the window and document global objects. Fortunately, timer functions such as setTimeout and setInterval work normally in Web Workers, and Ajax requests can still be sent.
Therefore, the first problem to solve is DOM operations in a single Web Worker instance. We start from one major premise: the DOM cannot be rendered in a Web Worker, so DOM operations need to be split according to the actual application scenario.
React Worker DOM
Since the sub-applications in our micro front-end architecture are limited to the React technology stack, I first focused on solutions based on the React framework.
In React, rendering is basically divided into two stages: diffing changes to the DOM tree, and actually applying those changes to the page DOM. Could the diff process run in a Web Worker, with the rendering changes sent to the main thread via postMessage and applied there? A quick search showed, somewhat to my embarrassment, that someone had already tried this five or six years ago. We can refer to the open source code of react-worker-dom.
The implementation in react-worker-dom is clear. In common/channel.js, it uniformly encapsulates the interface through which the child thread and the main thread communicate, along with the interface for serializing the communicated data. The general entry point for DOM logic under the Worker is in the worker directory. Tracing from that entry file, we find WorkerBridge.js, which uses postMessage to tell the main thread to render once the DOM has been computed, as well as the DOM construction, diff operations, and life-cycle mock interfaces implemented on top of the React library. The entry file that receives the rendering messages is in the page directory. After receiving a node operation event, it performs the actual DOM update on the main thread together with the interface code in WorkerDomNodeImpl.js.
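To make the role of such a channel layer concrete, here is a minimal sketch of a typed, serialized message channel. It does not reproduce the react-worker-dom API: `Channel` and `createLoopbackPair` are hypothetical names, and the loopback pair is an in-memory stand-in for the two ends of a Worker connection (in a real page, the port would be the `Worker` instance on one side and `self` on the other).

```javascript
// Wraps any port-like object (postMessage + onmessage) with typed messages
// and a single place where serialization happens.
class Channel {
  constructor(port) {
    this.port = port;
    this.handlers = new Map();
    port.onmessage = (event) => {
      const { type, payload } = JSON.parse(event.data);
      const handler = this.handlers.get(type);
      if (handler) handler(payload);
    };
  }

  on(type, handler) {
    this.handlers.set(type, handler);
  }

  send(type, payload) {
    this.port.postMessage(JSON.stringify({ type, payload }));
  }
}

// Hypothetical in-memory port pair: what one side posts, the other receives.
function createLoopbackPair() {
  const a = { onmessage: null, postMessage: (data) => b.onmessage && b.onmessage({ data }) };
  const b = { onmessage: null, postMessage: (data) => a.onmessage && a.onmessage({ data }) };
  return [a, b];
}

// Usage: the "worker" side computes node operations, the "page" side applies them.
const [workerPort, pagePort] = createLoopbackPair();
const workerChannel = new Channel(workerPort);
const pageChannel = new Channel(pagePort);

const appliedOperations = [];
pageChannel.on('render', (operations) => appliedOperations.push(...operations));
workerChannel.send('render', [{ op: 'setText', nodeId: 1, text: 'hello' }]);
```

Keeping serialization inside the channel means the diff logic in the Worker and the rendering logic on the page only ever deal with plain operation objects.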
A quick summary. On the React stack, we can achieve a certain degree of DOM sandboxing by separating the diff and render stages under a Web Worker, but this is not the JavaScript sandbox we want in a micro-front-end architecture. Leaving aside the cost-benefit ratio of splitting the diff stage from the render stage: first, much of the work relies on the specifics of one framework and becomes hard to maintain as the framework itself is upgraded; second, if sub-applications use different frameworks, suitable interfaces have to be encapsulated for each of them, which makes the approach weak in scalability and generality; finally, and most importantly, this approach does not solve the problem of sharing resources on window, or at best only begins to.
Next, we continue with another way to implement DOM operations under a Worker. Resource sharing on window will be discussed later.
AMP WorkerDOM
While I was thinking about how to fill the many “gaps” that an approach like react-worker-dom leaves in real development, another project I had come across while browsing plug-in-related DOM frameworks came to mind: Google’s AMP.
Among the AMP open source projects, besides the general Web Component framework amphtml, there are many other projects that adopt new technologies such as Shadow DOM and Web Components. After a brief look through them, I was delighted to find the worker-dom project.
A cursory look at the worker-dom source code shows two directories under src: main-thread and worker-thread. Opening them, we find that the way DOM logic is split from DOM rendering is basically similar to react-worker-dom, but worker-dom sits closer to the DOM itself because it is independent of any upper-layer framework.
First look at the code of the worker-thread DOM logic layer. All the node elements, attribute interfaces, document objects, and other code based on the DOM standard are implemented in the dom directory below it. Global properties and methods such as Canvas, CSS, events, and Storage are implemented in the directory above.
The key role of main-thread is to provide the interface for loading the worker file and rendering the page from the main thread; it is best understood through the code in worker.ts and nodes.ts.
In worker.ts, as I had originally envisioned, a layer of wrapper code automatically generates the Worker object and proxies all DOM operations in the application code onto the simulated WorkerDOM object:
const code = `
  'use strict';
  (function(){
    ${workerDOMScript}
    self['window'] = self;
    var workerDOM = WorkerThread.workerDOM;
    WorkerThread.hydrate(
      workerDOM.document,
      ${JSON.stringify(strings)},
      ${JSON.stringify(skeleton)},
      ${JSON.stringify(cssKeys)},
      ${JSON.stringify(globalEventHandlerKeys)},
      [${window.innerWidth}, ${window.innerHeight}],
      ${JSON.stringify(localStorageInit)},
      ${JSON.stringify(sessionStorageInit)}
    );
    workerDOM.document[${TransferrableKeys.observe}](this);
    Object.keys(workerDOM).forEach(function(k){self[k]=workerDOM[k]});
  }).call(self);
  ${authorScript}
  //# sourceURL=${encodeURI(config.authorURL)}`;
this[TransferrableKeys.worker] = new Worker(URL.createObjectURL(new Blob([code])));
In nodes.ts, the construction and storage of the real element nodes is implemented (whether and how the storage data structure is optimized for the rendering phase requires further study of the source code).
At the same time, the source code under the transfer directory defines the message communication specification between the logic layer and the UI rendering layer.
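The division of labor described above (a DOM-like logic layer in the Worker, real nodes stored on the main thread, and a message format between them) can be sketched roughly as follows. This is a simplified illustration, not the worker-dom implementation: `createWorkerDocument`, `applyMutations`, and the record shapes are all assumptions made for the example.

```javascript
// Worker side: a tiny stand-in for `document` that records mutations
// instead of rendering anything. `emit` would ultimately call postMessage.
function createWorkerDocument(emit) {
  let nextId = 0;
  const makeNode = (tagName) => {
    const node = { id: nextId++, tagName, children: [] };
    node.setAttribute = (name, value) => {
      emit({ type: 'attr', target: node.id, name, value: String(value) });
    };
    node.appendChild = (child) => {
      node.children.push(child);
      emit({ type: 'append', target: node.id, child: child.id });
      return child;
    };
    return node;
  };
  const body = makeNode('body'); // id 0 is mapped to the real <body> on the page
  return {
    body,
    createElement(tagName) {
      const node = makeNode(tagName);
      emit({ type: 'create', id: node.id, tagName });
      return node;
    },
  };
}

// Main thread side: replay the records against the real document,
// keeping a map from numeric ids to the real nodes.
function applyMutations(realDocument, nodesById, mutations) {
  for (const mutation of mutations) {
    if (mutation.type === 'create') {
      nodesById.set(mutation.id, realDocument.createElement(mutation.tagName));
    } else if (mutation.type === 'attr') {
      nodesById.get(mutation.target).setAttribute(mutation.name, mutation.value);
    } else if (mutation.type === 'append') {
      nodesById.get(mutation.target).appendChild(nodesById.get(mutation.child));
    }
  }
}
```

On a page, the id map would start as `new Map([[0, document.body]])`, and `applyMutations` would be called from the Worker's message handler. Because nodes are referred to by numeric id, the records stay plain data and survive postMessage serialization unchanged.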
In general, AMP WorkerDOM’s approach abandons the constraints of the upper-layer framework and is truly stack-independent, because it builds all the relevant DOM APIs from the bottom up. On the one hand, it can serve as the underlying layer for upper-layer frameworks and support their secondary encapsulation and migration (as in the amp-react-prototype project). On the other hand, it incorporates the current mainstream JavaScript sandbox ideas: some JavaScript sandbox isolation is achieved by emulating the window and document global methods and proxying them to the main thread.
Of course, from my personal point of view, AMP WorkerDOM has some limitations in its current implementation. One is the migration cost for current mainstream upper-layer frameworks such as Vue and React, together with the adaptation cost for the community ecosystem. The other is that it has no implementation for single-page applications, and I could not find a good way for it to support large PC micro-front-end applications.
Building on AMP WorkerDOM, one possible direction is the following. For the rendering communication, the Browser VM implementation could be combined with it: generate an iframe object at the same time as the Worker object, send all DOM operations to the main thread through postMessage, and execute them against the iframe bound to that Worker. The actual DOM update would then be carried out by forwarding the concrete rendering work, via a proxy, to the original WorkerDomNodeImpl.js logic.
Summary and some personal foresight
Let’s start with a personal summary. Implementing a JavaScript sandbox for micro-front-end architecture on top of Web Workers began as a flash of personal inspiration. After a deeper investigation, I eventually gave up on it as a general community solution, because the various problems in implementing it meant that no optimal solution could be found. However, I am still optimistic about Web Worker technology for plug-in sandbox applications. Plug-in mechanisms have always been a favored design in the front-end field, from build tools such as Webpack to IDEs, and from concrete plug-ins at the Web application level to plug-in extension points in application architecture. Combined with WebAssembly, Web Workers will undoubtedly play an important role in plug-in design.
Next, some personal forward-looking thoughts. The investigation of DOM rendering under Web Workers shows that, based on the idea of separating logic from UI, front-end architecture design has a good chance of changing. At present, whether we look at Vue or React, and whether the design is MVVM or Flux combined with Redux, the prevailing frameworks are in essence still driven by the View layer (in my humble opinion). This brings flexibility, but it also leads to problems such as difficult performance optimization and difficult collaborative development once projects grow large. Separating logic from UI on the basis of Web Workers will push further business layering across the whole process of acquiring, processing, and consuming data, and may thus solidify a complete set of MVX design ideas.
Of course, I am still at the stage of preliminary investigation, so my thinking is not yet mature. I will keep listening and put it into practice later.
Author: ES2049 / JIN Zhikai
The article can be reproduced at will, but please keep this link to the original text.