Idle Until Urgent
Translator’s note: Familiar optimization strategies have been discussed for years, and plenty of articles show how to use Chrome’s performance tools to find bottlenecks and optimize them, but very few think about optimization in terms of runtime scheduling. Until now, the only scheduling most of us did was using setTimeout for throttling and debouncing. So when I came across this article, I had an “aha” moment: we can schedule work at a much finer granularity.
Original text: philipwalton.com/articles/id…
A few weeks ago, I was looking at some performance metrics for my site. Specifically, I wanted to see how it performed on our newest performance metric, First Input Delay (FID). Since my site is just a blog (and doesn’t run a lot of JavaScript), I expected to see pretty good results.
Input delays of less than 100ms are generally seen by users as immediate responses, so our recommended performance goal (and the number I’d like to see in my analysis) is FID <100ms for 99% of page loads.
To my surprise, my site’s FID was 254 milliseconds at the 99th percentile. That isn’t terrible, but the perfectionist in me couldn’t let it go. I had to fix it!
Long story short, I was able to get my FID under 100 milliseconds at the 99th percentile without removing any of my site’s features. But I’m sure you readers will be more interested in the following:
- How did I diagnose the problem?
- What specific strategies and techniques did I use to solve it?
For the second point above, while trying to solve my problem, I stumbled upon a very interesting performance strategy that I want to share (it’s the main reason I’m writing this article).
I call this strategy idle-until-urgent.
I have a performance problem
First Input Delay (FID) is a metric that measures the time between when a user first interacts with your site (for a blog like mine, most likely by clicking a link) and when the browser is able to respond to that interaction (for my blog, by making the request to load the next page).
The reason there might be a delay is that the browser’s main thread is busy doing something else (usually executing JavaScript code). So, to diagnose a higher-than-expected FID, the first thing you should do is record a performance trace of your site while the page loads (with CPU and network throttling enabled) and look for individual tasks on the main thread that take a long time to run. Once you’ve identified these long tasks, you can try breaking them up into smaller tasks.
Here’s what I found when I tracked the site’s performance:
A performance trace of my site’s JavaScript while loading (with network/CPU throttling enabled).
Note that when the main script bundle is executed, it takes 233 milliseconds to run as a single task.
It takes 233 milliseconds to execute my site’s main bundle.
Some of this code is webpack boilerplate and Babel polyfills, but most of it comes from my script’s main() entry function, which itself takes 183 milliseconds to complete:
It takes 183 milliseconds to execute my site’s main() entry function.
However, I’m not doing anything unusual in my main() function. I just initialize my UI components and then run my analytics:
const main = () => {
  drawer.init();
  contentLoader.init();
  breakpoints.init();
  alerts.init();
  analytics.init();
};
So what took so long to run?
Well, if you look at the bottom of this flame chart, you won’t see any single function that clearly accounts for most of the execution time. Most individual functions run in less than 1 millisecond, but added together in a single synchronous call stack, they take more than 100 milliseconds.
This is the JavaScript equivalent of death by a thousand cuts.
The problem is that all of these functions are running as part of a single task, so the browser must wait until the task is complete before it can respond to the user’s interaction. The obvious solution is to break up the code into multiple tasks. But this is easier said than done.
At first glance, it might seem obvious that the solution is to prioritize each component in the main() function (they are actually already in priority order), initialize the highest-priority component as soon as possible, and then defer initialization of the others to a later task.
While this approach may help some people, it is not a universal solution that everyone can implement and does not scale well to a very large site. Here’s why:
- Delaying UI component initialization is only useful if the component has not yet been rendered. If it is already rendered, then delaying the initialization of the component runs the risk of the component not being ready when the user interacts with it.
- In many cases, all UI components are equally important or depend on each other, so they all need to be initialized at the same time.
- Sometimes individual components take long enough to initialize that they block the main thread even when they run in their own task.
The reality is that initializing each component in its own task is often not sufficient and often not even possible. We usually need to break up tasks within each component being initialized.
Greedy components
A perfect example of a component that really needs to break up its initialization code can be seen by zooming further into this performance trace. In the middle of main(), you’ll see that one of my components uses the Intl.DateTimeFormat API:
Creating an Intl.DateTimeFormat instance took 13.47 milliseconds!
Creating this object took 13.47 milliseconds!
The problem is that the Intl.DateTimeFormat instance, although created in the component’s constructor, is not actually referenced until other components use it to format dates. However, this component doesn’t know when it will be referenced, so it plays it safe and instantiates the Intl.DateTimeFormat object immediately.
But is this really the right code evaluation strategy? If not, what is the right one?
Code evaluation strategy
When choosing an evaluation strategy for potentially costly code execution, most developers choose one of the following:
- Eager (early) evaluation: You run your expensive code right away.
- Lazy evaluation: You wait until another part of the program needs the results of this expensive code before you run it.
These are probably the two most popular evaluation strategies, but after refactoring my site, I now think these are probably your worst options.
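To make the two strategies concrete, here is a minimal sketch. The names `expensiveComputation` and `lazy` are hypothetical and used only for illustration; they are not part of the article’s code.

```javascript
// Hypothetical stand-in for any costly, synchronous computation.
let runCount = 0;
const expensiveComputation = () => {
  runCount++;
  return 42;
};

// Eager evaluation: the cost is paid immediately, whether or not
// the value is ever used.
const eagerValue = expensiveComputation();

// Lazy evaluation: the cost is paid on first access (and only then),
// which may be in the middle of handling user input.
const lazy = (init) => {
  let computed = false;
  let value;
  return () => {
    if (!computed) {
      value = init();
      computed = true;
    }
    return value;
  };
};

const getLazyValue = lazy(expensiveComputation);
// Nothing has run for the lazy version yet; the first call triggers it.
const lazyValue = getLazyValue();
```

In both cases the expensive work runs synchronously in one go; the only difference is when.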
Disadvantages of eager evaluation
The performance problem on my site is a good example of a disadvantage of eager (early) evaluation: if a user tries to interact with your page while the code is being evaluated, the browser must wait until the code finishes before it can respond.
This is especially problematic if your page appears ready to respond to user input but is not. Users will think your page is sluggish or even completely broken.
The more code you evaluate up front, the longer it takes for your page to become interactive.
Disadvantages of lazy evaluation
If running all the code at once is bad, the obvious solution is to wait until you actually need it. This way you don’t run the code unnecessarily, especially if the user never actually needs it.
Of course, the problem with waiting until the user needs the result of the code is that your expensive code is then guaranteed to block user input.
For some things, such as loading other content from the network, it makes sense to defer execution until the user requests it. But for most of the code you’re running (such as reading data from localStorage, processing large data sets, and so on), you want to start executing it before the user interaction that needs it begins.
Other options
You can also choose an evaluation strategy that falls somewhere between eager and lazy evaluation. I’m not sure whether the following two strategies have official names, but I’ll call them deferred evaluation and idle evaluation:
- Deferred evaluation: You schedule your code to run in a future task, using a function like setTimeout.
- Idle evaluation: A type of deferred evaluation in which you use an API like requestIdleCallback to schedule your code to run.
Both options are generally better than eager or lazy evaluation because they are far less likely to produce a single long task that blocks input. This is because, while the browser cannot interrupt a task that is already running in order to respond to user input (doing so would very likely break sites), it can run a task in between scheduled tasks, and most browsers do so when that task is caused by user input. This is known as input prioritization.
In other words: if you make sure all your code runs in short, distinct tasks (preferably less than 50 milliseconds each), your code should never block user input for long.
Important! While browsers can run input callbacks ahead of queued tasks, they cannot run input callbacks ahead of queued microtasks. Because promises and async functions run as microtasks, converting your synchronous code to promise-based code will not prevent it from blocking user input!
If you’re not familiar with the difference between tasks and microtasks, I strongly encourage you to watch my colleague Jake’s excellent talk on the event loop.
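The ordering difference between tasks and microtasks can be observed with a small sketch like the following. The `order` array is only there for illustration; the point is that every queued microtask runs before the next task gets a turn.

```javascript
const order = [];

// A task: queued to run in a later turn of the event loop.
setTimeout(() => order.push('task (setTimeout)'), 0);

// Microtasks: run as soon as the current script finishes, before any
// other task (including an input callback) gets a chance to run.
Promise.resolve()
  .then(() => order.push('microtask 1'))
  .then(() => order.push('microtask 2'));

order.push('synchronous code');

// Inspect the order once everything above has run.
setTimeout(() => console.log(order.join(' -> ')), 20);
// Logs: synchronous code -> microtask 1 -> microtask 2 -> task (setTimeout)
```

This is why wrapping long synchronous work in promises alone does not give the browser an opening to handle input.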
Given what I just said, I can refactor my main() function to break my initialization code into separate tasks using setTimeout() and requestIdleCallback() :
const main = () => {
  setTimeout(() => drawer.init(), 0);
  setTimeout(() => contentLoader.init(), 0);
  setTimeout(() => breakpoints.init(), 0);
  setTimeout(() => alerts.init(), 0);
  requestIdleCallback(() => analytics.init());
};
main();
However, while this is better than before (many small tasks vs. one long task), as I explained above, it may still not be good enough. For example, if I defer the initialization of my UI components (specifically contentLoader and drawer), they will be less likely to block user input, but they also run the risk of not being ready when the user tries to interact with them!
And while deferring my analytics with requestIdleCallback() is probably a good idea, any interactions I care about that happen before the next idle period will be missed. And if there is no idle period before the user leaves the page, the callback may never run at all!
So if all evaluation strategies have drawbacks, which one should you choose?
Idle Until Urgent
After spending a lot of time thinking about this, I realized that the evaluation strategy I really wanted was for my code to initially be deferred to idle time, but then run immediately as soon as it’s needed. In other words: idle-until-urgent.
The idle-until-urgent strategy sidesteps most of the drawbacks I described in the previous section. In the worst case, it has exactly the same performance characteristics as lazy evaluation (the code runs at the moment it is first needed), and in the best case it doesn’t block interactivity at all because the code runs during idle periods.
I should also mention that this strategy works both for a single task (computing a value while idle) and for multiple tasks (an ordered queue of tasks to run while idle). I’ll explain the single-task (idle value) form first, because it’s a bit easier to understand.
Idle value
Above, I showed that initializing an Intl.DateTimeFormat object can be quite expensive, so if the instance isn’t needed right away, it’s better to initialize it during idle time. Of course, you want it to be there as soon as it is needed, so this is a perfect candidate for the idle-until-urgent evaluation strategy.
Consider the following simplified component that we want to refactor to use this new strategy:
class MyComponent {
  constructor() {
    addEventListener('click', () => this.handleUserClick());
    this.formatter = new Intl.DateTimeFormat('en-US', {
      timeZone: 'America/Los_Angeles',
    });
  }
  handleUserClick() {
    console.log(this.formatter.format(new Date()));
  }
}
The MyComponent instance above does two things in its constructor:
- Adds an event listener for user interactions.
- Creates an Intl.DateTimeFormat object.
This component perfectly illustrates why you often need to split tasks within a single component (not just at the component level).
In this case, it’s important that the event listener is added right away, but it’s not important that the Intl.DateTimeFormat instance is created before the event handler needs it. Of course, we don’t want to create the Intl.DateTimeFormat object inside the event handler either, because its slowness would delay the handling of the event.
So here is how we can update this code to use the idle-until-urgent strategy. Note that I’m using an IdleValue helper class, which I’ll explain below:
import {IdleValue} from './path/to/IdleValue.mjs';

class MyComponent {
  constructor() {
    addEventListener('click', () => this.handleUserClick());
    this.formatter = new IdleValue(() => {
      return new Intl.DateTimeFormat('en-US', {
        timeZone: 'America/Los_Angeles',
      });
    });
  }
  handleUserClick() {
    console.log(this.formatter.getValue().format(new Date()));
  }
}
As you can see, this code doesn’t look much different from the previous version, but instead of assigning this.formatter to a new Intl.DateTimeFormat object, I assign this.formatter to an IdleValue object, to which I pass an initialization function.
The IdleValue class works by scheduling the initialization function to run during the next idle period. If the idle period occurs before the IdleValue instance is referenced, no blocking occurs and the value can be returned immediately when requested. If, on the other hand, the value is referenced before the next idle period, the scheduled idle callback is canceled and the initialization function runs immediately.
Here’s the gist of how the IdleValue class is implemented (note: I’ve also published this code as part of the idlize package, which includes all the helpers shown in this article):
export class IdleValue {
  constructor(init) {
    this._init = init;
    this._value;
    this._idleHandle = requestIdleCallback(() => {
      this._value = this._init();
    });
  }
  getValue() {
    if (this._value === undefined) {
      cancelIdleCallback(this._idleHandle);
      this._value = this._init();
    }
    return this._value;
  }
  // ...
}
Although the introduction of the IdleValue class in the above example does not require many changes, it technically alters the public API (this.formatter versus this.formatter.getValue()).
If you are in a situation where you want to use the IdleValue class but cannot change the public API, you can use the IdleValue class with the ES2015 getter:
class MyComponent {
  constructor() {
    addEventListener('click', () => this.handleUserClick());
    this._formatter = new IdleValue(() => {
      return new Intl.DateTimeFormat('en-US', {
        timeZone: 'America/Los_Angeles',
      });
    });
  }
  get formatter() {
    return this._formatter.getValue();
  }
  // ...
}
Or, if you don’t mind a bit of abstraction, you can use the defineIdleProperty() helper function (which uses Object.defineProperty() under the hood):
import {defineIdleProperty} from './path/to/defineIdleProperty.mjs';

class MyComponent {
  constructor() {
    addEventListener('click', () => this.handleUserClick());
    defineIdleProperty(this, 'formatter', () => {
      return new Intl.DateTimeFormat('en-US', {
        timeZone: 'America/Los_Angeles',
      });
    });
  }
  // ...
}
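To illustrate how such a helper could be built on Object.defineProperty(), here is a minimal sketch. It is not the idlize source: `defineIdlePropertySketch`, `scheduleIdle`, and `cancelIdle` are hypothetical names, the scheduler falls back to setTimeout where requestIdleCallback is unavailable, and a boolean flag is assumed (instead of comparing against undefined) so that an initializer returning undefined is not run twice.

```javascript
// Use requestIdleCallback where available; otherwise fall back to setTimeout.
const hasIdleCallback = typeof requestIdleCallback === 'function';
const scheduleIdle = (cb) =>
  hasIdleCallback ? requestIdleCallback(cb) : setTimeout(cb, 0);
const cancelIdle = (handle) =>
  hasIdleCallback ? cancelIdleCallback(handle) : clearTimeout(handle);

// Hypothetical helper: defines `obj[prop]` as a getter whose value is
// computed during idle time, or immediately if it is read before then.
const defineIdlePropertySketch = (obj, prop, init) => {
  let computed = false;
  let value;
  const compute = () => {
    if (!computed) {
      value = init();
      computed = true;
    }
  };
  const handle = scheduleIdle(compute);

  Object.defineProperty(obj, prop, {
    configurable: true,
    get() {
      if (!computed) {
        // Urgent: the value is needed before the idle callback ran.
        cancelIdle(handle);
        compute();
      }
      return value;
    },
  });
};

// Usage: the formatter is created at idle time, or on first access.
const component = {};
let initCalls = 0;
defineIdlePropertySketch(component, 'formatter', () => {
  initCalls++;
  return new Intl.DateTimeFormat('en-US', {timeZone: 'America/Los_Angeles'});
});
const formatted = component.formatter.format(new Date(0));
```

Because the property is read right after it is defined, this usage exercises the urgent path: the scheduled callback is canceled and the initializer runs exactly once.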
There’s really no reason not to use this strategy for individual property values that may be expensive to compute, especially since you can adopt it without changing your API!
Although this example uses the Intl.DateTimeFormat object, the strategy is probably also a good fit for any of the following:
- Processing a large set of values.
- Getting a value from localStorage (or a cookie).
- Running getComputedStyle(), getBoundingClientRect(), or any other API that may require recalculating style or layout on the main thread.
Idle task queue
The technique above works well for individual properties whose values can be computed with a single function, but in some cases your logic doesn’t fit into a single function, or, even if it technically could, you’d still want to break it up into smaller functions, because otherwise you’d risk blocking the main thread for a long time.
In these cases, what you really need is a queue in which you can schedule multiple tasks (functions) to run when the browser is idle. The queue runs tasks when it can, and it pauses when it needs to yield to the browser (for example, when the user is interacting).
To solve this problem, I built an IdleQueue class, which you can use like this:
import {IdleQueue} from './path/to/IdleQueue.mjs';

const queue = new IdleQueue();

queue.pushTask(() => {
  // Some expensive function that can run idly...
});

queue.pushTask(() => {
  // Some other task that depends on the above
  // expensive function having already run...
});
Note: Splitting synchronous JavaScript code into separate tasks that can be run asynchronously as part of a task queue is different from code splitting, which is splitting large JavaScript bundles into smaller files (which is also important for improving performance).
As with the idly initialized property strategy shown above, an idle task queue can also be run immediately when the results of its tasks are needed right away (that is, in the “urgent” case).
Again, this last point is very important: you sometimes need to compute something as soon as possible, but more often you are integrating with a synchronous third-party API, and you need to be able to run your tasks synchronously as well in order to be compatible.
In a perfect world, all JavaScript APIs would be non-blocking, asynchronous, and made up of small chunks of code that can yield to the main thread at will. But in the real world, because of legacy codebases or integrations with third-party libraries we don’t control, we often have no choice but to stay synchronous.
As I said earlier, this is one of the great strengths of the idle-until-urgent pattern. It can be applied to most programs without a large-scale rewrite of the architecture.
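The idea of a queue that runs idly but can be flushed synchronously might be sketched as follows. This is a simplified illustration, not the idlize implementation: `IdleQueueSketch` and `runTasksImmediately` are names chosen for this sketch, and the setTimeout fallback simply assumes a 50-millisecond budget per turn.

```javascript
const hasIdleCallback = typeof requestIdleCallback === 'function';
// Fallback: run in a future task and pretend each turn has a 50 ms budget.
const scheduleIdle = (cb) => {
  if (hasIdleCallback) return requestIdleCallback(cb);
  return setTimeout(() => {
    const start = Date.now();
    cb({timeRemaining: () => Math.max(0, 50 - (Date.now() - start))});
  }, 0);
};
const cancelIdle = (handle) =>
  hasIdleCallback ? cancelIdleCallback(handle) : clearTimeout(handle);

class IdleQueueSketch {
  constructor() {
    this._tasks = [];
    this._handle = null;
  }

  pushTask(task) {
    this._tasks.push(task);
    this._schedule();
  }

  clearPendingTasks() {
    this._tasks = [];
    this._cancel();
  }

  // The "urgent" path: run everything that is still queued, right now.
  runTasksImmediately() {
    this._cancel();
    while (this._tasks.length > 0) this._tasks.shift()();
  }

  _schedule() {
    if (this._handle === null && this._tasks.length > 0) {
      this._handle = scheduleIdle((deadline) => this._runIdle(deadline));
    }
  }

  _cancel() {
    if (this._handle !== null) {
      cancelIdle(this._handle);
      this._handle = null;
    }
  }

  // The "idle" path: run tasks in order until the idle time runs out,
  // then yield and reschedule whatever is left.
  _runIdle(deadline) {
    this._handle = null;
    while (this._tasks.length > 0 && deadline.timeRemaining() > 0) {
      this._tasks.shift()();
    }
    this._schedule();
  }
}

const log = [];
const queue = new IdleQueueSketch();
queue.pushTask(() => log.push('first'));
queue.pushTask(() => log.push('second'));
queue.runTasksImmediately(); // log is now ['first', 'second']
```

Tasks always run in the order they were pushed, so a later task can rely on an earlier one having finished, whether the queue was drained idly or urgently.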
Guaranteeing the urgent
I mentioned above that requestIdleCallback() comes with no guarantee that the callback will ever run. When I talk to developers about requestIdleCallback(), this is the main reason I hear for not using it. In many cases, the possibility that the code might not run is enough of a reason to avoid it, so they play it safe and keep their code synchronous (and therefore blocking).
A perfect example of this is analytics code. The problem with analytics code is that in many cases it needs to run as the page is unloading (for example, to track outbound link clicks), and in those cases requestIdleCallback() is simply not an option because the callback would never run. And because analytics libraries don’t know when in the page lifecycle their users will call their APIs, they also tend to play it safe and run all their code synchronously (which is unfortunate, since analytics code is not critical to the user experience).
But with the idle-until-urgent pattern, there is a simple solution. All we have to do is make sure the queue is run immediately whenever the page enters a state from which it might soon be unloaded.
If you’re familiar with the advice I gave in my recent article on the Page Lifecycle API, you’ll know that the last reliable callback developers have before a page is terminated or discarded is the visibilitychange event (as the page’s visibilityState changes to hidden). And since the user cannot be interacting with the page while it is hidden, it’s the perfect time to run any queued idle tasks.
In fact, if we use the IdleQueue class, we can use a simple configuration item passed to the constructor to enable this functionality.
const queue = new IdleQueue({ensureTasksRun: true});
For tasks like rendering, there’s no need to ensure they run before the page unloads, but for tasks like saving user state and sending end-of-session analytics, you’ll likely want to set this option to true.
Note: Listening for the visibilitychange event should be sufficient to ensure tasks run before the page is unloaded, but because of a Safari bug in which the pagehide and visibilitychange events don’t always fire when the user closes a tab, you have to implement a small workaround just for Safari. This workaround is implemented for you in the IdleQueue class, but you need to be aware of it if you’re implementing this yourself.
Warning! Do not listen for the unload event as a way to run the queue before the page is unloaded. The unload event is not reliable, and it can hurt performance in some cases. For more details, see my Page Lifecycle API article.
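If you are wiring this up yourself, the core of the idea might look like the sketch below. `runWhenHidden` and `runPending` are hypothetical names, the function accepts any EventTarget with a visibilityState property (normally `document`), and the Safari workaround mentioned above is deliberately not included.

```javascript
// `doc` is any EventTarget exposing `visibilityState` (normally `document`).
// `runPending` is whatever synchronously flushes your queued tasks.
const runWhenHidden = (doc, runPending) => {
  const onVisibilityChange = () => {
    if (doc.visibilityState === 'hidden') runPending();
  };
  doc.addEventListener('visibilitychange', onVisibilityChange);
  // Return a function that removes the listener again.
  return () => doc.removeEventListener('visibilitychange', onVisibilityChange);
};

// Usage in a browser, with a plain array standing in for a task queue:
if (typeof document !== 'undefined') {
  const pending = [() => console.log('state saved')];
  runWhenHidden(document, () => {
    while (pending.length > 0) pending.shift()();
  });
}
```

Passing the document in as a parameter keeps the sketch testable outside a browser; in real code you would typically hand it `document` and your queue’s run-now method.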
Use cases for idle-until-urgent
Any time you have potentially expensive code to run, you should try to break it up into smaller tasks. If you don’t need that code right now but might need it at some point in the future, it’s a perfect use case for idle-until-urgent.
In your own code, the first thing I’d suggest is to look at all your constructors, and if any of them run potentially expensive operations, refactor them to use an IdleValue object instead.
For other logic that is essential but not critical to immediate user interaction, consider adding it to an IdleQueue. Don’t worry: if you ever need to run that code right away, you can.
Two specific examples that are particularly well suited to this technique (and relevant to most websites) are persisting application state (for example, with something like Redux) and website analytics.
Note: In both of these use cases, the intent is for the tasks to run during idle periods, so it’s not a problem if they don’t run right away. If you need to handle high-priority tasks that are meant to run as soon as possible (while still yielding to input), requestIdleCallback() may not solve your problem.
Fortunately, several of my colleagues have proposed new web platform APIs (shouldYield() and a native Scheduling API) that should help.
Persisting application state
Consider a Redux application that stores application state in memory but also needs to store it in persistent storage (such as localStorage) so it can be reloaded the next time the user visits the page.
Most Redux applications that store state in localStorage use a debounce technique that looks something like this:
let debounceTimeout;

// Persist state changes to localStorage using a 1000ms debounce.
store.subscribe(() => {
  // Clear pending writes since there are new changes to save.
  clearTimeout(debounceTimeout);

  // Schedule the save with a 1000ms timeout (debounce),
  // so frequent changes aren't saved unnecessarily.
  debounceTimeout = setTimeout(() => {
    const jsonData = JSON.stringify(store.getState());
    localStorage.setItem('redux-data', jsonData);
  }, 1000);
});
While using a debounce is certainly better than nothing, it’s not a perfect solution. The problem is that there’s no guarantee that, when the debounced function does run, it won’t block the main thread at a moment that matters to the user.
It’s much better to schedule the localStorage write for idle time. You can convert the code above from a debounce strategy to an idle-until-urgent strategy as follows:
const queue = new IdleQueue({ensureTasksRun: true});

// Persist state changes when the browser is idle, and
// only persist the most recent changes to avoid extra work.
store.subscribe(() => {
  // Clear pending writes since there are new changes to save.
  queue.clearPendingTasks();

  // Schedule the save to run when idle.
  queue.pushTask(() => {
    const jsonData = JSON.stringify(store.getState());
    localStorage.setItem('redux-data', jsonData);
  });
});
Note that this strategy is definitely better than using a debounce because it guarantees the state gets saved even if the user is leaving the page. With the debounce example, the write would likely fail in that situation.
Web analytics
Another perfect use case for idle-until-urgent is analytics code. Here is an example of how to use the IdleQueue class to schedule sending analytics data in a way that ensures it will be sent even if the user closes the tab or navigates away before the next idle period.
const queue = new IdleQueue({ensureTasksRun: true});
const signupBtn = document.getElementById('signup');
signupBtn.addEventListener('click', () => {
  // Instead of sending the event immediately, add it to the idle queue.
  // The idle queue will ensure the event is sent even if the user
  // closes the tab or navigates away.
  queue.pushTask(() => {
    ga('send', 'event', {
      eventCategory: 'Signup Button',
      eventAction: 'click',
    });
  });
});
In addition to ensuring the task is run when it becomes urgent, adding it to the idle queue also ensures it won’t block any other code that needs to run in response to the user’s click.
In fact, it’s generally a good idea to run all your analytics code idly, including your initialization code. And for libraries like analytics.js, whose API is already effectively a command queue, it’s easy to just add those commands to an IdleQueue instance.
For example, you can convert the last part of the default analytics.js installation snippet from this:
ga('create', 'UA-XXXXX-Y', 'auto');
ga('send', 'pageview');
To this:
const queue = new IdleQueue({ensureTasksRun: true});
queue.pushTask(() => ga('create', 'UA-XXXXX-Y', 'auto'));
queue.pushTask(() => ga('send', 'pageview'));
(You can also create a wrapper for the ga() function that automatically queues commands, which is what I did).
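Such a wrapper could be as small as the sketch below. This is an assumption about its general shape, not the author’s actual code: `queueCalls` is a hypothetical helper, and `standInQueue` and `fakeGa` are stand-ins so the example runs without analytics.js or an IdleQueue.

```javascript
// Returns a function with the same call signature as `fn` that, instead of
// calling `fn` right away, pushes the call onto `queue`.
const queueCalls = (queue, fn) => (...args) => {
  queue.pushTask(() => fn(...args));
};

// Stand-ins so the sketch runs on its own: a queue that only collects
// tasks, and a fake `ga` that records the commands it receives.
const pendingTasks = [];
const standInQueue = {pushTask: (task) => pendingTasks.push(task)};
const sentCommands = [];
const fakeGa = (...command) => sentCommands.push(command);

const queuedGa = queueCalls(standInQueue, fakeGa);
queuedGa('create', 'UA-XXXXX-Y', 'auto');
queuedGa('send', 'pageview');
// Nothing has been sent yet; two tasks are waiting in the queue.

// Later (at idle time, or when urgent), the queue runs its tasks in order.
pendingTasks.forEach((task) => task());
```

With a real idle queue in place of the stand-in, calling code keeps using the familiar ga(...) signature while every command is deferred to idle time.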
Browser support for requestIdleCallback
As of this writing, only Chrome and Firefox support requestIdleCallback(). And while a true polyfill isn’t really possible (only the browser knows when it is idle), it’s quite easy to write a fallback to setTimeout (all the helper classes and methods mentioned in this article use this fallback).
And even in browsers that don’t support requestIdleCallback() natively, falling back to setTimeout is definitely still better than not using this strategy, because the browser can still prioritize input ahead of tasks queued via setTimeout().
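A setTimeout-based fallback along these lines might look like the following sketch. The names `requestIdleCallbackShim`, `cancelIdleCallbackShim`, `rIC`, and `cIC` are illustrative, and the fixed 50-millisecond budget is an assumption of this sketch, since a fallback cannot know how much idle time is really available.

```javascript
const supportsIdleCallback = typeof requestIdleCallback === 'function';

// Fallback: run the callback in a future task and hand it a deadline
// object shaped like the real IdleDeadline, with a fixed 50 ms budget.
const requestIdleCallbackShim = (callback) => {
  return setTimeout(() => {
    const start = Date.now();
    callback({
      didTimeout: false,
      timeRemaining: () => Math.max(0, 50 - (Date.now() - start)),
    });
  }, 0);
};
const cancelIdleCallbackShim = (handle) => clearTimeout(handle);

// Use the native functions when they exist, the shims otherwise.
const rIC = supportsIdleCallback
  ? (cb) => requestIdleCallback(cb)
  : requestIdleCallbackShim;
const cIC = supportsIdleCallback
  ? (handle) => cancelIdleCallback(handle)
  : cancelIdleCallbackShim;

// Usage: one callback runs in a later task, the other is cancelled first.
let remainingAtStart;
rIC((deadline) => {
  remainingAtStart = deadline.timeRemaining();
});
const cancelled = rIC(() => {
  throw new Error('never runs');
});
cIC(cancelled);
```

The fallback does not wait for real idle time, but it still moves the work into its own task, which is what lets the browser handle input first.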
How much performance does this actually improve?
I mentioned at the beginning of this article that I came up with this strategy while trying to improve my site’s FID. I was trying to break up all the code that ran as soon as my main bundle loaded, but I also needed to make sure my site kept working with some third-party libraries that only have synchronous APIs (such as analytics.js).
The trace I recorded before implementing idle-until-urgent showed a single 233-millisecond task containing all my initialization code. After implementing the techniques I’ve described here, you can see that I have several much shorter tasks. In fact, the longest one is now just 37 milliseconds!
A performance trace of my site’s JavaScript showing many short tasks.
A really important point to emphasize here is that the same work is being done as before; it’s just now spread out over multiple tasks and run during idle periods.
Since none of the tasks exceed 50 milliseconds, none of them affect my Time to Interactive (TTI), which is great for my Lighthouse score:
My Lighthouse report after implementing idle-until-urgent.
Finally, since the focus of all this work was on improving my FID, after publishing these changes to production and looking at the results, I was pleased to find that my FID value in the 99th percentile was reduced by 67%!
Code version | FID (p99) | FID (p95) | FID (p50) |
---|---|---|---|
Before idle-until-urgent | 254ms | 20ms | 3ms |
After idle-until-urgent | 85ms | 16ms | 3ms |
Conclusion
In a perfect world, none of our sites would ever block the main thread unnecessarily. We’d all be using web workers for our non-UI work, and we’d have shouldYield() and a native Scheduling API built into the browser.
But in our current world, we web developers often have no choice but to run non-UI code on the main thread, which leads to unresponsiveness.
Hopefully this article has convinced you of the need to break up long-running JavaScript tasks. And since idle-until-urgent can turn an API that appears synchronous into something that actually runs code during idle periods, it’s a great solution that works with the libraries we all know and use today.
If you like this article and think others should read it, please share it on Twitter.
The article can be reproduced at will, but please keep this link to the original text.