This article is translated

The State Of Web Workers In 2021

Originally written by Surma

The original address: www.smashingmagazine.com/2021/06/web…

The Web is single-threaded, which makes it increasingly difficult to write smooth and responsive applications. Web Workers get a bad rap, but they are a very important tool for Web developers who need to fix sluggish applications. Let’s take a look at Web Workers.

The Web is constantly compared with so-called “native” platforms like Android and iOS. The Web is streamed: when you first open an application, none of its resources are available locally. This is a fundamental difference that prevents many architectures that work on native platforms from being easily applied to the Web.

However, no matter which platform you focus on, you have probably used, or at least come across, multithreading. iOS lets developers easily parallelize code with Grand Central Dispatch, Android does the same with its new unified task scheduler, WorkManager, and the game engine Unity has its job system. These platforms not only support multithreading, they also make multithreaded programming as easy as possible.

In this article, I’ll outline why I think multithreading is important in the Web space, and then introduce multithreading primitives that we as developers can use. In addition, I’ll talk a little bit about architecture to make it easier for you to implement multithreaded programming (even incrementally).

Unpredictable performance issues

Our goal is to keep the application smooth and responsive. Smooth means a stable and sufficiently high frame rate. Responsive means that the UI reacts to user interactions with minimal latency. Both are key to making your application feel polished and high-quality.

According to the RAIL model, being responsive means reacting to a user’s action in under 100ms, while being smooth means that anything moving on the screen does so at a steady 60 frames per second (fps). So we as developers have 1000ms/60 = 16.6ms to produce each frame, which is also known as the “frame budget”.

I just said “we”, but it is really the browser that has 16.6ms to do all the work behind rendering a frame. We developers are directly responsible for only part of that work. The browser’s work includes (but is not limited to):

  • Detecting the element the user is interacting with;
  • Firing the corresponding events;
  • Running the associated JavaScript event handlers;
  • Calculating styles;
  • Doing layout;
  • Painting layers;
  • Compositing those layers into the final image the user sees on the screen;
  • (and more…)

That’s a lot of work.

On the other hand, the “performance gap” is widening. Flagship phones get faster with every generation, while low-end phones get cheaper, making the mobile Internet accessible to people who previously could not afford a phone. In terms of performance, these low-end phones are roughly on par with a 2012 iPhone.

Applications built for the Web run on a wide range of devices with vastly different performance. How long a piece of JavaScript takes to finish depends on how fast the device running it is. The same holds for the browser’s other tasks, such as layout and paint. A task that takes 0.5ms on a modern iPhone might take 10ms on a Nokia 2. The performance of the user’s device is completely unpredictable.

Note: RAIL has been a guidance framework for six years now. One thing to note is that 60fps is actually just a placeholder for the native refresh rate of the user’s display device. For example, the new Pixel phone has a 90Hz screen while the iPad Pro has a 120Hz screen, which reduces the frame budget to 11.1ms and 8.3ms, respectively.

To complicate matters further, there is no good way to determine the refresh rate of the device your app is running on, other than measuring the time between requestAnimationFrame() callbacks.
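As a rough illustration of the frame-budget arithmetic and of the measuring approach just mentioned, here is a minimal sketch. The function names (frameBudgetMs, estimateRefreshRate, sampleFrames) and the sample size of 30 frames are assumptions made for this example, not part of any API; only requestAnimationFrame() itself is a real browser function.

```javascript
// Frame budget in milliseconds for a given refresh rate in Hz.
function frameBudgetMs(refreshRateHz) {
  return 1000 / refreshRateHz;
}

// Estimate the refresh rate from at least two frame timestamps (in ms).
// The median gap is used so that a single dropped frame does not skew
// the estimate.
function estimateRefreshRate(timestamps) {
  const deltas = [];
  for (let i = 1; i < timestamps.length; i++) {
    deltas.push(timestamps[i] - timestamps[i - 1]);
  }
  deltas.sort((a, b) => a - b);
  const median = deltas[Math.floor(deltas.length / 2)];
  return 1000 / median;
}

// Browser-only: collect `frameCount` requestAnimationFrame timestamps.
function sampleFrames(frameCount = 30) {
  return new Promise((resolve) => {
    const timestamps = [];
    function onFrame(now) {
      timestamps.push(now);
      if (timestamps.length < frameCount) {
        requestAnimationFrame(onFrame);
      } else {
        resolve(timestamps);
      }
    }
    requestAnimationFrame(onFrame);
  });
}

// Only runs where requestAnimationFrame exists (i.e. in a browser page).
if (typeof requestAnimationFrame === "function") {
  sampleFrames().then((timestamps) => {
    const hz = estimateRefreshRate(timestamps);
    console.log(`~${hz.toFixed(0)} Hz, budget ${frameBudgetMs(hz).toFixed(1)} ms`);
  });
}
```

The measurement only works while the page is visible and not already dropping most of its frames, so treat the result as an estimate.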

JavaScript

JavaScript was designed to run in lock-step with the browser’s main rendering loop, and almost all Web applications follow this pattern. The downside of this design is that slow JavaScript code blocks the browser’s rendering loop. Running in lock-step means that if one of the two is not finished, the other cannot continue. To let long-running tasks coexist with this design, JavaScript gained an asynchronous model based on callbacks and, later, promises.

To keep your application smooth, you need to make sure that your JavaScript code, together with all the other tasks the browser has to do (styles, layout, paint…), does not add up to more than the device’s frame budget. To keep your application responsive, you need to make sure that no event handler takes longer than 100ms, so that the resulting change shows up on the device’s screen in time. Achieving this on your own device during development is hard enough, and achieving it on all devices is almost impossible.

The common advice is to “chunk your code,” also known as “yielding to the browser.” The principle is the same: to give the browser a chance to ship the next frame, you split your work into chunks of similar size and hand control back to the browser for rendering between chunks.

There are many ways to yield to the browser, but none of them is particularly elegant. The recently proposed task scheduling API aims to expose this capability directly. However, even if we had an API like await yieldToBrowser() (or something like it), the technique itself would still be flawed: to stay within your frame budget, you need to do your work in chunks that are small enough, and your code needs to yield at least once per frame.
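To make the chunking technique concrete, here is a minimal sketch. It assumes a hand-rolled yieldToBrowser() built on setTimeout (one of several possible ways to yield, and not the proposed scheduling API), and a hypothetical processInChunks() helper with an arbitrary default time budget of 5ms per chunk.

```javascript
// Hypothetical helper: resolves in a later task, which gives the browser
// a chance to render in between. setTimeout is only one possible choice.
function yieldToBrowser() {
  return new Promise((resolve) => setTimeout(resolve, 0));
}

// Run `handleItem` over `items`, yielding whenever the current chunk has
// used up `chunkBudgetMs` of time. Returns the results and how often the
// loop yielded.
async function processInChunks(items, handleItem, chunkBudgetMs = 5) {
  const results = [];
  let yields = 0;
  let chunkStart = performance.now();
  for (const item of items) {
    results.push(handleItem(item));
    if (performance.now() - chunkStart >= chunkBudgetMs) {
      await yieldToBrowser();
      yields++;
      chunkStart = performance.now();
    }
  }
  return { results, yields };
}

// Usage: square a large list without blocking for the whole duration.
const work = processInChunks([1, 2, 3, 4], (n) => n * n);
work.then(({ results, yields }) => console.log(results, yields));
```

Sizing chunks by elapsed time rather than by item count softens the device-dependence somewhat, but a single slow handleItem call can still blow the budget, and every yield adds scheduling overhead.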

Code that yields too often can make the scheduling overhead so expensive that it hurts overall application performance. Combined with the unpredictable device performance I mentioned earlier, we have to conclude that no single chunk size fits all devices. This is especially problematic when trying to chunk UI work, because rendering a complete UI step by step while yielding to the browser increases the overall cost of layout and paint.

Web Workers

There is a way to stop running code in lock-step with the browser’s rendering thread: we can move some of the code to a different thread. Once it runs there, long-running JavaScript can simply block that thread, without the complexity and cost of chunking and yielding. The renderer will not even notice that another thread is blocked. The API for doing this on the Web is the Web Worker. A Web Worker is created by passing in the path to a separate JavaScript file, which is then loaded and run in the newly created thread.

const worker = new Worker("./worker.js");

Before we go any further, it’s important to note that while Web Workers, Service Workers, and worklets are similar, they are not the same thing at all, and their purposes are different:

  • In this article, I only discuss Web Workers (often referred to simply as “Workers”). A Worker is a JavaScript scope that runs in a separate thread. Workers are generated (and owned) by a page.
  • A Service Worker is a short-lived, isolated JavaScript scope that runs in a separate thread and acts as a proxy for all network requests made by pages of the same origin. Most importantly, you can implement arbitrarily complex caching logic with a Service Worker. In addition, you can use it to implement background fetches, push notifications, and other features that don’t need to be tied to a particular page. It is similar to a Web Worker, except that a Service Worker has a specific purpose and additional constraints.
  • A Worklet is a heavily restricted, isolated JavaScript scope that may or may not run on a separate thread. The key point of worklets is that browsers can move them between threads. AudioWorklet, the CSS Painting API, and Animation Worklet are all examples of worklets.
  • A SharedWorker is a special Web Worker: multiple tabs and windows of the same origin can reference the same SharedWorker. This API is almost impossible to polyfill and is currently only implemented by Blink¹, so I won’t go into it in this article.

JavaScript was designed to run in lock-step with the browser, meaning there is no concurrency to deal with, and as a result many of the APIs exposed to JavaScript are not thread-safe. For a data structure, being thread-safe means that it can be accessed and manipulated by multiple threads in parallel without its state being corrupted.

This is typically achieved with mutexes: while one thread performs an operation, the mutex locks out the other threads. Because browsers and JavaScript engines don’t have to deal with locking, they can make more optimizations and run code faster. On the other hand, the lack of locking means that workers have to run in a completely isolated JavaScript scope, since any form of data sharing would be unsafe.

While the Worker is the Web’s “thread” primitive, “thread” here means something very different from what it means in C++, Java, and other languages. The biggest difference is that the required isolation means a Worker has no access to the variables and code of the page that created it, and vice versa. The only way to exchange data is postMessage, which makes a copy of the message and fires a message event on the receiving side. Isolation also means that workers have no access to the DOM and cannot update the UI, at least not without significant effort (as in AMP’s worker-dom).
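The copy-on-send behavior can be observed without a separate worker file. The sketch below uses a MessageChannel as a stand-in, on the assumption that its ports expose the same postMessage()/message-event pair and the same copying semantics as a Worker; the state object is made up for the example.

```javascript
// A MessageChannel gives us two connected ports. Here they stand in for
// "page" (port1) and "worker" (port2).
const { port1, port2 } = new MessageChannel();

const state = { count: 1, items: ["a", "b"] };

// Receiving side: the message arrives as event.data of a message event.
const received = new Promise((resolve) => {
  port2.onmessage = (event) => resolve(event.data);
});

// Sending side: the copy is made at the time postMessage() is called...
port1.postMessage(state);
// ...so this later mutation never reaches the receiver.
state.count = 2;

received.then((copy) => {
  // The receiver holds an independent deep copy, not a reference.
  console.log(copy.count, copy === state, copy.items === state.items);
  port1.close(); // closing one port shuts down the whole channel
});
```

With a real Worker, the sending line would be worker.postMessage(state) on the page, and the receiving handler would be registered inside the worker file.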

Browser support for Web Workers is universal, going back as far as IE10. Their usage, however, is still low, which I think is largely due to the unusual design of the Worker API.

JavaScript’s concurrency models

If you want to adopt Workers, you need to adjust the architecture of your application. JavaScript actually supports two different concurrency models, which are often grouped under the term “off-main-thread architecture.” Both models use Workers, but in very different ways, and each comes with its own trade-offs. The two models mark the two ends of a spectrum, and any given application will find its best fit somewhere between them.

Concurrency model #1: Actors

Personally, I like to think of a Worker as an actor in the Actor model. Erlang’s implementation of the Actor model is arguably the most popular one. Each actor may or may not run on a separate thread, and it fully owns the data it operates on. No other thread can access that data, which makes synchronization mechanisms like mutexes unnecessary. Actors simply send messages to other actors and react to the messages they receive.

For example, I think of the main thread as the actor that owns and manages the DOM, and with it the entire UI. It is responsible for updating the UI and capturing input events. Another actor is responsible for managing the application’s state. The DOM actor translates low-level input events into application-level semantic events and passes them to the state actor. The state actor modifies the state object according to the events it receives, possibly using a state machine or even involving other actors. Once the state object is updated, the state actor sends a copy of it to the DOM actor, which then updates the DOM accordingly. Paul Lewis and I explored actor-centric application architectures at Chrome Dev Summit 2018.

Of course, this model is not without its problems. For example, every message you send has to be copied. How long that takes depends not only on the size of the message, but also on the device the application runs on. In my experience, postMessage is usually “fast enough,” but in some scenarios it really is not. Another issue is balance: moving code to a Worker frees up the main thread, but you pay for it with communication overhead, and the Worker may be busy running other code before it can respond to your message. If you’re not careful, workers can negatively affect UI responsiveness.

Very complex messages can be sent via postMessage. The underlying algorithm (called “structured clone”) can handle cyclic data structures and even Maps and Sets. It cannot handle functions or classes, however, because code cannot be shared across scopes in JavaScript. Somewhat annoyingly, passing a function to postMessage throws an error, whereas an instance of a class is silently converted to a plain JavaScript object, losing all its methods in the process (the details behind this are interesting, but beyond the scope of this article).
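These rules can be checked directly with the global structuredClone() function, which exposes the same algorithm that postMessage relies on (it is available in current browsers and in Node.js 17 and later). The objects and the Point class below are made up for this sketch.

```javascript
// 1. Cycles, Maps and Sets survive the copy.
const original = {
  name: "root",
  tags: new Set(["a", "b"]),
  lookup: new Map([["answer", 42]]),
};
original.self = original; // cyclic reference
const copy = structuredClone(original);
console.log(copy.self === copy, copy.tags.has("b"), copy.lookup.get("answer"));

// 2. Functions cannot be cloned: the call throws a DataCloneError.
let functionError = null;
try {
  structuredClone({ callback() {} });
} catch (error) {
  functionError = error;
}
console.log(functionError.name);

// 3. Class instances are silently turned into plain objects: the data
//    is kept, the prototype (and with it every method) is lost.
class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }
  length() {
    return Math.hypot(this.x, this.y);
  }
}
const plain = structuredClone(new Point(3, 4));
console.log(plain.x, plain.y, plain instanceof Point, typeof plain.length);
```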

In addition, postMessage is a “fire-and-forget”² messaging mechanism with no built-in concept of request and response. If you want a request/response mechanism (and in my experience most application architectures eventually force you to), you have to build it yourself. That’s why I wrote Comlink, a library that uses an RPC protocol under the hood to let the main thread and the Worker access each other’s objects. With Comlink, you don’t have to deal with postMessage at all. The only thing to keep in mind is that, because postMessage is asynchronous, functions don’t return their result directly but a promise for it. In my opinion, Comlink combines the best of the Actor model and the shared-memory concurrency model.
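To show what building request/response on top of postMessage involves, here is a deliberately small sketch. It is not how Comlink is implemented; the message shape ({ id, method, args } and { id, result }) and the helpers serve() and createClient() are assumptions invented for this example, and a MessageChannel again stands in for the page/worker pair so the snippet runs without a separate file. Error handling is left out.

```javascript
// "Worker" side: answer every request with a response carrying the same id.
function serve(port, handlers) {
  port.onmessage = ({ data }) => {
    const { id, method, args } = data;
    port.postMessage({ id, result: handlers[method](...args) });
  };
}

// "Main thread" side: returns a function that sends a request and
// resolves the returned promise when the matching response arrives.
function createClient(port) {
  let nextId = 0;
  const pending = new Map(); // request id -> resolve function
  port.onmessage = ({ data }) => {
    const resolve = pending.get(data.id);
    pending.delete(data.id);
    resolve(data.result);
  };
  return (method, ...args) =>
    new Promise((resolve) => {
      const id = nextId++;
      pending.set(id, resolve);
      port.postMessage({ id, method, args });
    });
}

const { port1, port2 } = new MessageChannel();
serve(port2, { add: (a, b) => a + b });
const call = createClient(port1);

// The caller gets a promise, never the plain result.
const sum = call("add", 2, 3);
sum.then((value) => {
  console.log(value);
  port1.close(); // closing one port shuts down the whole channel
});
```

The id is what turns two independent fire-and-forget messages into a request and its response; without it, concurrent calls could not be told apart.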

Comlink is not magic: to provide RPC, it still has to use postMessage. If you end up in one of the rare situations where postMessage is the bottleneck, you can take advantage of the fact that ArrayBuffers can be transferred. Transferring an ArrayBuffer is near-instant and involves a proper transfer of ownership: the sending JavaScript scope loses access to the data in the process. I used this trick when I experimented with running the physics simulation of a WebVR application off the main thread.
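The ownership transfer is easy to observe. The sketch below uses structuredClone() with its transfer option as a single-threaded stand-in for the transfer list of postMessage; the positions array is made-up sample data.

```javascript
// Some numeric data, e.g. positions produced by a simulation.
const positions = new Float32Array([1.5, 2.5, 3.5]);
const buffer = positions.buffer;

// Transfer instead of copy. With a worker, the equivalent call would be
// worker.postMessage(buffer, [buffer]).
const moved = structuredClone(buffer, { transfer: [buffer] });

// The sender's buffer is now detached: it has no bytes left...
const sizeAfterTransfer = buffer.byteLength;
// ...while the receiver owns the original memory.
const receivedPositions = new Float32Array(moved);

console.log(sizeAfterTransfer, moved.byteLength, receivedPositions[1]);
```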

Concurrency model #2: Shared memory

As I mentioned earlier, traditional threading is based on shared memory. This approach is not feasible in JavaScript, because almost all JavaScript APIs were designed on the assumption that objects are never accessed concurrently. Changing that now would either break the Web or cause a significant performance cost because of the synchronization that would become necessary. Instead, the concept of shared memory is currently limited to a single dedicated type: SharedArrayBuffer (or SAB for short).

A SAB, like an ArrayBuffer, is a linear chunk of memory that can be manipulated through typed arrays or a DataView. If a SAB is sent via postMessage, the other end does not receive a copy of the data, but a handle to the exact same chunk of memory. Every change made by one thread is visible to all other threads. To let you build your own mutexes and other concurrent data structures, Atomics provides utilities for atomic operations and thread-safe waiting mechanisms.
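As a small sketch of these building blocks, the example below keeps everything on one thread: the second view plays the role of the view a worker would create from the handle it receives. The slot layout and the tryLock()/unlock() helpers are assumptions for this example. Note that browsers only expose SharedArrayBuffer on cross-origin isolated pages; in Node.js it is always available.

```javascript
// Four 32-bit slots of shared memory.
const shared = new SharedArrayBuffer(4 * Int32Array.BYTES_PER_ELEMENT);

// Two views over the SAME memory. In a real application, `workerView`
// would be created inside the worker from the handle it received.
const mainView = new Int32Array(shared);
const workerView = new Int32Array(shared);

// Atomic writes and read-modify-write operations on slot 0.
Atomics.store(mainView, 0, 41);
Atomics.add(workerView, 0, 1);
const counter = Atomics.load(mainView, 0); // change is visible via both views

// A very small non-blocking lock in slot 1: 0 = free, 1 = taken.
// A blocking lock would use Atomics.wait() and Atomics.notify() instead
// of failing immediately.
const LOCK_INDEX = 1;
function tryLock(view) {
  // Atomically set the slot to 1, but only if it is currently 0.
  return Atomics.compareExchange(view, LOCK_INDEX, 0, 1) === 0;
}
function unlock(view) {
  Atomics.store(view, LOCK_INDEX, 0);
}

const firstAttempt = tryLock(mainView); // lock was free
const secondAttempt = tryLock(workerView); // lock already taken
unlock(mainView);
const thirdAttempt = tryLock(workerView); // free again
console.log(counter, firstAttempt, secondAttempt, thirdAttempt);
```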

SABs have several drawbacks. First and foremost, a SAB is just a chunk of memory. It is a very low-level primitive that offers great flexibility and power at the cost of increased engineering and maintenance effort. Also, you cannot work with JavaScript objects and arrays the way you are used to. It is just a series of bytes.

To make this more ergonomic, I wrote an experimental library called buffer-backed-object. It synthesizes JavaScript objects that persist their values to an underlying buffer. In addition, WebAssembly uses Workers and SharedArrayBuffers to support the threading model of C++ and other languages. WebAssembly currently offers the best experience for shared-memory concurrency, but it also requires you to give up many of the benefits (and the comfort) of JavaScript and switch to another language, which often produces larger binaries.

Case study: PROXX

In 2019, my team and I released PROXX, a Web-based Minesweeper clone built specifically for feature phones. Feature phones have low-resolution screens, usually no touch interface, an underpowered CPU, and no GPU to speak of. Despite these limitations, they are popular because they are incredibly cheap and ship with a full-featured Web browser. Thanks to the popularity of feature phones, the mobile Internet has become available to people who could not afford it before.

To make sure the game runs smoothly on these phones, we used an Actor-like architecture. The main thread is responsible for rendering the DOM (via Preact and, where available, WebGL) and capturing UI events. The entire application state and the game logic run in a Worker, which determines whether you have hit a mine and, if not, what the game board should reveal. The game logic even sends intermediate results to the UI thread so the user gets continuous visual updates.

Other benefits

I have talked about the importance of smoothness and responsiveness, and how Workers make these goals easier to achieve. Another, less obvious benefit is that Web Workers can help your application consume less device power. By using more CPU cores in parallel, the CPU spends less time in its “high performance” mode, which lowers power consumption overall. David Rousset from Microsoft has explored the power consumption of Web applications.

Using Web Workers

If you’ve read this far, you hopefully have a better understanding of why Workers are so useful. The next obvious question is how to use them.

Workers are not widely used yet, so there are few established practices and architectures around them. It can be difficult to tell in advance which parts of your code are worth moving into a Worker. I don’t advocate one particular architecture over others, but I’d like to share the approach that has worked for me when adopting Workers incrementally:

Most people already build their applications out of modules, because most bundlers rely on modules for bundling and code splitting. The main technique for building applications with Web Workers is to strictly separate UI-related code from purely computational code. This reduces the number of modules that have to live on the main thread (such as those that call DOM APIs), and everything else can run in a Worker instead.

Also, rely on synchronous code as little as possible, so that you can adopt asynchronous patterns such as callbacks and async/await later. Once that is in place, you can try using Comlink to move modules from the main thread into a Worker and measure whether it improves performance.

Adopting Workers in an existing project can be a bit trickier. Take the time to look closely at which parts of your code rely on DOM manipulation or on APIs that can only be called from the main thread. If possible, refactor these dependencies away and incrementally adopt the model I described above.

In either case, the key is to make the impact of an off-main-thread architecture measurable. Don’t assume (or guess) that using a Worker will be faster or slower. Browsers sometimes behave in puzzling ways, and many optimizations can backfire. It’s important to have numbers so you can make an informed decision!

Web Workers and Bundlers

Most modern Web development setups use a bundler to significantly improve loading performance. A bundler packs multiple JavaScript modules into a single file. For Workers, however, the file has to remain separate because of what the constructor requires. I see many people inline their Worker code as a Data URL or Blob URL instead of making their bundler handle it. Both Data URLs and Blob URLs cause major problems: Data URLs don’t work at all in Safari, and Blob URLs do work but have no concept of origin or path, which means path resolution and fetching will not work. This is another barrier to adopting Workers, but the major bundlers have recently improved their handling of Workers:

  • Webpack: For Webpack v4, worker-loader teaches Webpack about Workers. Starting with Webpack v5, Webpack understands the Worker constructor natively and can even share modules between the main thread and a Worker without loading them twice.
  • Rollup: For Rollup, I wrote rollup-plugin-off-main-thread, a plugin that makes Workers work out of the box.
  • Parcel: Parcel deserves a special mention, as both v1 and v2 support Workers out of the box with no additional configuration.

It is common to use ES modules when developing applications with these bundlers. This, however, creates new problems.

Web Workers and ES Modules

All modern browsers support running JavaScript modules with

In the future

I like the Actor model, but concurrency in JavaScript is not designed very well. We have built a lot of tools and libraries to compensate, but ultimately this is something JavaScript should solve at the language level. Some TC39 engineers are interested in this topic and are trying to figure out how JavaScript can better support both models. Several related proposals are currently being evaluated, such as allowing code to be sent via postMessage, or sharing objects between threads through a high-level, scheduler-like API (of the kind common on native platforms).

None of these proposals has progressed very far in the standardization process, so I won’t spend time on them here. If you’re curious, keep an eye on the TC39 proposals to see what the next generation of JavaScript will include.

Conclusion

Workers are a key tool for keeping the main thread responsive and smooth, because they prevent long-running code from blocking the browser’s rendering. Because communication with a Worker is inherently asynchronous, adopting Workers requires some adjustments to your application’s architecture, but in return you can more easily support devices with vastly different performance.

Make sure you adopt an architecture that makes it easy to move code around, so that you can measure the performance impact of an off-main-thread architecture. The design of Web Workers comes with a learning curve, but the most complicated parts can be abstracted away by libraries like Comlink.


FAQ

There are always some common questions and ideas, so I wanted to pre-empt them and record my answers here.

Isn’t postMessage slow?

My core advice for all performance questions is: measure first! Nothing is fast or slow until you have measured it. In my experience, though, postMessage is usually “fast enough.” Here’s my rule of thumb: if JSON.stringify(messagePayload) is under 10KB, you are at virtually no risk of causing a long frame, even on the slowest phones. If postMessage does become a bottleneck in your application, consider the following techniques:

  • Break up your tasks so you can send smaller messages
  • If the message is a state object and only a small part of it changes, send only the changed part and not the whole object
  • If you send a lot of messages, you can try to combine multiple messages into one
  • As a last resort, you can try to switch to a numerical representation of your message and transfer ArrayBuffers instead of sending object-based messages

Which of these techniques to use depends on the scenario and can only be found by measuring and isolating bottlenecks.
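The rule of thumb and the “send only what changed” technique can be sketched as follows. The helper names (payloadSizeBytes, isSmallEnough, shallowDiff) and the sample state are assumptions for this example; the 10KB threshold is the rule of thumb from above, not a hard limit.

```javascript
// Rule-of-thumb threshold from the text: 10KB of JSON.
const SIZE_THRESHOLD_BYTES = 10 * 1024;

// Approximate size of a message: byte length of its JSON representation.
function payloadSizeBytes(payload) {
  return new TextEncoder().encode(JSON.stringify(payload)).length;
}

function isSmallEnough(payload) {
  return payloadSizeBytes(payload) < SIZE_THRESHOLD_BYTES;
}

// Shallow diff: returns only the top-level keys whose values changed.
// Removed keys and nested changes are not handled in this sketch.
function shallowDiff(previous, next) {
  const patch = {};
  for (const key of Object.keys(next)) {
    if (previous[key] !== next[key]) {
      patch[key] = next[key];
    }
  }
  return patch;
}

const previousState = { score: 10, level: 2, board: "x".repeat(20000) };
const nextState = { ...previousState, score: 11 };

// Sending the whole state would exceed the threshold; the patch does not.
const patch = shallowDiff(previousState, nextState);
console.log(isSmallEnough(nextState), isSmallEnough(patch), patch);
```

The receiving side would apply the patch with something like Object.assign(state, patch). Whether the bookkeeping is worth it is, again, a question only measurement can answer.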

I want to access the DOM from the Worker

I get this request a lot. In most cases, however, it would just move the problem. You would effectively create a second main thread and run into the same problems, only on a different thread. For the DOM to be safely accessible from multiple threads, locks would have to be added, which would slow down DOM operations and potentially break many existing Web applications.

The lock-step model also has advantages. It gives the browser a clear signal when the DOM is in a valid state and can be rendered to the screen. In a multi-threaded DOM world, that signal would be lost and we would have to deal with partially rendered states or similar problems ourselves.

I really don’t like splitting my code into separate files to use Worker

I agree with you. There are proposals under review at TC39 that would allow inlining one module into another without all the little problems that Data URLs and Blob URLs have. There is no satisfactory solution yet, but a future iteration of JavaScript will certainly address this.


  1. Blink is the rendering engine used by Chrome and other Chromium-based browsers ↩
  2. Fire-and-forget: the sender sends the message without waiting for, or expecting, a reply ↩