Code caching guidelines for JavaScript developers

Original link:
v8.dev/blog/code-c…

Code caching (also known as bytecode caching) is an important optimization in browsers. By caching the results of parsing and compiling, it reduces the startup time of frequently visited websites. Most major browsers implement code caching in some form, and Chrome is no exception. We have written articles and given talks on how Chrome and V8 cache compiled code, which are worth checking out for the details.

Leszek Swirski offers several tips for JS developers who want to make the most of code caching to improve their website’s startup time. The tips focus on the code caching implementation in Chrome/V8, but most of the same principles apply to other browsers’ implementations as well. They make a useful reference and, hopefully, a source of ideas.

Code Cache Overview

While many blog posts and talks have covered the implementation of code caching in detail, it is worth briefly recapping how it works. Chrome provides two levels of caching for V8-compiled code (both classic scripts and module scripts): a low-cost, best-effort in-memory cache maintained by V8, known as the Isolate cache, and a fully serialized on-disk cache.

The Isolate cache operates on scripts compiled in the same V8 Isolate (that is, the same process; roughly, “navigating between pages of the same site in the same tab”). It is “best effort” in the sense that it tries to be as fast and as small as possible, using data that is already available, at the cost of a potentially lower hit rate and no caching across processes.

When V8 compiles a script, the compiled bytecode is stored in a hashtable (on V8’s heap), keyed by the script’s source code.
When Chrome asks V8 to compile another script, V8 first checks whether that script’s source code matches an entry in the hashtable, and returns the existing bytecode if a match is found.

The Isolate cache is fast and effectively free, with real-world hit rates measured at up to 80%.
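The source-keyed lookup described above can be sketched as a toy model. This is only an illustration of the idea, not V8's implementation: `fakeCompile` and `SourceKeyedCache` are hypothetical names, and a plain `Map` stands in for the hashtable on V8's heap.

```javascript
// Toy model of an in-memory compilation cache keyed by source text.
// `fakeCompile` is a hypothetical stand-in for a real parse + compile step.
let compileCount = 0;
function fakeCompile(source) {
  compileCount += 1;
  return { compiledFrom: source };
}

class SourceKeyedCache {
  constructor() {
    this.table = new Map(); // key: full source text, value: compiled result
  }

  compile(source) {
    const hit = this.table.get(source);
    if (hit !== undefined) return hit; // identical source: reuse the result
    const compiled = fakeCompile(source);
    this.table.set(source, compiled);
    return compiled;
  }
}

const isolateCache = new SourceKeyedCache();
const first = isolateCache.compile('function f() { return 1; }');
const second = isolateCache.compile('function f() { return 1; }');
const other = isolateCache.compile('function g() { return 2; }');
```

Here `first` and `second` are the same object because the source text matched, while `other` required a fresh compile.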

The on-disk cache is managed by Chrome (by Blink, to be exact), and it fills the gap that the Isolate cache cannot: sharing code caches between processes and across multiple Chrome sessions. It builds on the existing HTTP resource cache, which manages caching and expiring data received from the web.

When a JS file is requested for the first time (a cold run), Chrome downloads it and hands it to V8 for compilation. It also stores the file in the browser’s on-disk cache.
When the JS file is requested a second time (a warm run), Chrome takes the file from the browser cache and again hands it to V8 for compilation. This time, however, the compiled code is serialized and attached to the cached script file as metadata.
When the JS file is requested a third time (a hot run), Chrome takes both the file and its metadata from the browser cache and hands both to V8. V8 deserializes the metadata and can skip compilation.

In summary: code caching is split into cold, warm, and hot runs, with the in-memory cache used on warm runs and the on-disk cache used on hot runs.
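The cold, warm, and hot sequence can be sketched as a small state model. This is a simplification for illustration only: `DiskCacheModel` and its fields are hypothetical, and a string stands in for the serialized code attached as metadata.

```javascript
// Toy model of the on-disk flow: the first load stores the source,
// the second attaches serialized code, later loads consume it.
class DiskCacheModel {
  constructor() {
    this.entries = new Map(); // url -> { source, codeCache }
  }

  load(url, fetchSource) {
    const entry = this.entries.get(url);
    if (entry === undefined) {
      // Cold run: download and compile, store only the source.
      this.entries.set(url, { source: fetchSource(url), codeCache: null });
      return 'cold';
    }
    if (entry.codeCache === null) {
      // Warm run: compile again, then attach the serialized code as metadata.
      entry.codeCache = `serialized code for ${url}`;
      return 'warm';
    }
    // Hot run: the metadata is deserialized and compilation is skipped.
    return 'hot';
  }
}

const disk = new DiskCacheModel();
const fetchSource = () => 'console.log("hi");';
const runs = [1, 2, 3, 4].map(() => disk.load('/app.js', fetchSource));
console.log(runs.join(', '));
```

In this model every load after the third stays hot until the entry is replaced.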

With that in mind, we can offer a few suggestions to improve your site’s utilization of code caching.

Tip 1: Do nothing

Ideally, the best thing you can do as a JS developer to improve code caching is to do “nothing”. This actually means two things: passively doing nothing, and actively doing nothing.

Code caching is ultimately a browser implementation detail: a heuristic-based trade-off between data and space, whose implementation and heuristics can (and do) change regularly. As V8 engineers, we do our best to make these heuristics work for every developer on the evolving web, and over-optimizing for the details of the current code caching implementation may lead to disappointment a few releases later, when those details change. In addition, other JavaScript engines are likely to use different heuristics in their code caching implementations. So, in many ways, our best advice for getting code cached is the same as our best advice for writing JS: write clean, idiomatic code, and we will do our best to optimize how it is cached.

In addition to passively doing nothing, you should also do your best to actively do nothing. Any form of caching inherently depends on things not changing, so doing nothing is the best way to let cached data stay cached. Here are some ways you can actively do nothing.

Don’t change the code

This may seem obvious, but it is worth making explicit: whenever you ship new code, that code is not yet cached. Each time the browser makes an HTTP request for a script URL, it can include validation information from the previous response (such as the date of the last fetch), and if the server knows that the file has not changed, it can return a 304 Not Modified response, which keeps the code cache hot. Otherwise, a 200 OK response updates the cached resource, clears the code cache, and reverts it to a cold run.
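The 304 versus 200 decision can be sketched as a pure function. This is a minimal sketch under assumptions: it uses the standard `ETag` / `If-None-Match` header pair as the validator, and `respondToScriptRequest` and `currentEtag` are hypothetical names rather than any particular server's API.

```javascript
// Decide how a server might answer a (possibly conditional) script request.
// `currentEtag` is a hypothetical version identifier for the file being served.
function respondToScriptRequest(requestHeaders, currentEtag) {
  if (requestHeaders['if-none-match'] === currentEtag) {
    // Unchanged: no body is sent and the browser keeps its cached copy.
    return { status: 304, headers: { etag: currentEtag }, body: null };
  }
  // New or changed: the full script is sent and replaces the cached resource.
  return { status: 200, headers: { etag: currentEtag }, body: '/* script source */' };
}

const firstVisit = respondToScriptRequest({}, '"v1"');
const revisit = respondToScriptRequest({ 'if-none-match': '"v1"' }, '"v1"');
const afterDeploy = respondToScriptRequest({ 'if-none-match': '"v1"' }, '"v2"');
console.log(firstVisit.status, revisit.status, afterDeploy.status);
```

Only the `revisit` case leaves the cached resource, and with it the code cache, untouched.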

It is tempting to always push your latest code changes immediately, especially when you want to measure the impact of a change. For caching, however, the best strategy is to leave the code alone, or at least to update it as rarely as possible. Consider limiting yourself to at most x deployments per week, where x is the dial you adjust to trade off caching against freshness.

Don’t change the URL

The code cache is (currently) associated with the URL of a script, which makes it easy to look up without having to read the actual script contents. This means that changing the URL of a script (including any query parameters) creates a new resource entry in the resource cache, and with it a new cold cache entry.

This can of course also be used to force the cache to be cleared, but that too is an implementation detail: the trick may stop working if we one day decide to associate caches with the source text rather than the source URL.
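To make the effect of a changed query parameter concrete, here is a toy lookup keyed by the full URL string. It is a sketch, not Chrome's resource cache: `codeCacheByUrl` and `lookup` are hypothetical, and `https://example.com` is only a placeholder base for resolving relative URLs.

```javascript
// Toy model: cache entries keyed by the full URL string, query included.
const codeCacheByUrl = new Map();

function lookup(url) {
  const key = new URL(url, 'https://example.com').href;
  if (codeCacheByUrl.has(key)) return 'reused entry';
  codeCacheByUrl.set(key, { state: 'cold' });
  return 'new cold entry';
}

const results = [
  lookup('/app.js?v=1'),
  lookup('/app.js?v=1'),
  lookup('/app.js?v=2'), // same file, different query string
];
console.log(results);
```

The third lookup starts cold again even though only the query string differs.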

Don’t change execution behavior

One of the more recent optimizations to our code caching implementation is to serialize the compiled code only after it has finished executing. This is an attempt to catch lazily compiled functions, which are compiled only during execution rather than during the initial compile.

This optimization works best when each execution of the script runs the same code, or at least the same functions. Problems can arise when behavior depends on a run-time decision, as with A/B tests:

// The branch taken (and therefore the function compiled) varies per load.
if (Math.random() > 0.5) {
  A();
} else {
  B();
}

In the example above, only one of A() and B() is compiled and executed on the warm run and ends up in the code cache, yet either one could be executed on subsequent runs. It is therefore better to keep execution as deterministic as possible, so that it stays on the cached path.
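One possible way to keep an A/B test deterministic per visitor is to derive the variant from a stable identifier instead of `Math.random()`. The sketch below is an assumption layered on the advice above, not something the original article prescribes: `hashString`, `pickVariant`, and the visitor id are all hypothetical.

```javascript
// Simple deterministic string hash, kept within unsigned 32-bit range.
function hashString(text) {
  let hash = 0;
  for (let i = 0; i < text.length; i += 1) {
    hash = (hash * 31 + text.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// The same visitor id always maps to the same variant, so the same
// branch (and the same functions) run on every load for that visitor.
function pickVariant(visitorId) {
  return hashString(visitorId) % 2 === 0 ? 'A' : 'B';
}

const visitorId = 'visitor-123'; // hypothetical id, e.g. read from storage
const firstLoad = pickVariant(visitorId);
const secondLoad = pickVariant(visitorId);
console.log(firstLoad === secondLoad);
```

The split between variants is still roughly even across visitors, but a given visitor no longer flips between branches from one load to the next.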

Tip 2: Do something

Of course, the advice to do “nothing”, whether passively or actively, is not very satisfying. Beyond that, given our current heuristics and implementation, there are some things you can do. Bear in mind, however, that heuristics and implementations change, so this advice may change as well, and there is no substitute for profiling.

Split libraries from the code that uses them

Code caching is coarse-grained and works per script, which means that a change to any part of a script invalidates the cache for the entire script. If a single script contains both stable code and frequently changing code (such as libraries and business logic), changes to the business logic invalidate the code cache of the library code.

Instead, you can split the stable library code into a separate script and include it independently. That way the library code is cached once and stays cached when the business logic changes.

This has an additional benefit if the library is shared between pages of your site: because the code cache is attached to the script, the library’s code cache is shared between pages too.

Merge library files into the code that uses them

Code caching happens after each script has executed, which means that a script’s code cache contains exactly those functions in that script that had been compiled by the time the script finished executing. This has two important consequences for library code:

The code cache will not contain functions from earlier scripts.
The code cache will not contain lazily compiled functions that are called by later scripts.

In particular, if a library consists entirely of lazily compiled functions, those functions will not be cached even if they are used later.

One solution is to merge libraries and the code that uses them into a single script, so that the code cache can “see” which parts of the library are used. Unfortunately, this is the exact opposite of the previous tip, because there is no silver bullet.

In general, we don’t recommend merging all of your JS into one giant bundle. Splitting it into several smaller scripts is usually more beneficial overall, for reasons other than code caching (multiple network requests, streaming compilation, page interactivity, and so on).

Take advantage of IIFE heuristics

Only functions that have been compiled by the time the script finishes executing are added to the code cache, so many kinds of functions are not cached even though they execute at some later point. Event handlers (even onload), promise chains, unused library functions, and anything else that is lazily compiled and not called by the time the closing </script> tag is reached all stay lazy and are not cached.

One way to get these functions into the cache is to force them to be compiled, and a common way to do that is to rely on IIFE heuristics. An IIFE (immediately invoked function expression) is a pattern in which a function is called as soon as it is created:

(function foo() {
  // ...
})();

Because IIFEs are called immediately, most JavaScript engines try to detect them and compile them right away, to avoid paying for a lazy compile followed by a full compile. There are various heuristics for detecting IIFEs early, before the function has to be parsed; the most common is an opening parenthesis before the function keyword.

Because this heuristic is applied early, it triggers compilation even if the function is not actually invoked immediately:

const foo = function() {
  // Lazily skipped
};
const bar = (function() {
  // Eagerly compiled
});

This means that you can force a function into the code cache by wrapping it in parentheses. However, if the hint is applied incorrectly it can hurt startup time, and in general this is somewhat of an abuse of the heuristic. We therefore advise against doing this unless it is really necessary.

Group small files together

Chrome has a minimum size for code caching, currently 1 KB. Scripts smaller than this are not cached at all, because we consider the overhead of caching them to outweigh the benefit.

If your site contains many such small scripts, that overhead calculation may no longer apply in the same way. Consider merging them into files that exceed the minimum size, which also brings the usual benefits of reduced per-script overhead.
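A quick way to spot candidates for merging is to measure each script's encoded size against the threshold. The sketch below assumes the limit is measured on source bytes and treats “1 KB” as 1024 bytes; both are assumptions, and `findTooSmallScripts` and the sample script list are hypothetical.

```javascript
// Assumption: "1 KB" is taken as 1024 bytes of UTF-8 encoded source.
const MIN_CACHEABLE_BYTES = 1024;

// Return the URLs of scripts whose source falls below the threshold.
function findTooSmallScripts(scripts) {
  const encoder = new TextEncoder();
  return scripts
    .filter(({ source }) => encoder.encode(source).length < MIN_CACHEABLE_BYTES)
    .map(({ url }) => url);
}

const scripts = [
  { url: '/tiny.js', source: 'console.log("hi");' },
  { url: '/big.js', source: '// padding\n'.repeat(200) },
];
const tooSmall = findTooSmallScripts(scripts);
console.log(tooSmall);
```

Scripts reported by a check like this are the ones worth considering for merging; the actual threshold may change, so it is kept as a named constant.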

Avoid using inline scripts

Scripts that are inlined in HTML have no external source file associated with them, so they cannot be cached by the mechanism described above. Chrome does try to cache them by attaching their cache to the HTML document’s resource, but such caches depend on the entire HTML document staying unchanged and are not shared between pages.

Therefore, for non-trivial scripts that would benefit from code caching, avoid inlining them in the HTML and include them as external files instead.

Use Service Worker caching

A Service Worker is a mechanism that lets your code intercept network requests for resources in your page. In particular, it lets you build a local cache of resources and serve them from that cache whenever they are requested. This is especially useful for pages that need to keep working offline, such as PWAs.

A typical example is a website that uses a Service Worker and registers it in the main script:

// main.mjs
navigator.serviceWorker.register('/sw.js');

The Service Worker then adds handlers for the install event (to create the cache) and the fetch event (to serve resources, potentially from the cache):

// sw.js
// Hypothetical cache name; pick any stable string.
const cacheName = 'static-cache-v1';

self.addEventListener('install', (event) => {
  // Pre-populate the cache with the resources the page needs.
  async function buildCache() {
    const cache = await caches.open(cacheName);
    return cache.addAll([
      '/main.css',
      '/main.mjs',
      '/offline.html',
    ]);
  }
  event.waitUntil(buildCache());
});

self.addEventListener('fetch', (event) => {
  // Serve from the cache when possible; otherwise fetch and store a copy.
  async function cachedFetch(event) {
    const cache = await caches.open(cacheName);
    let response = await cache.match(event.request);
    if (response) return response;
    response = await fetch(event.request);
    cache.put(event.request, response.clone());
    return response;
  }
  event.respondWith(cachedFetch(event));
});

These caches can contain cached JS resources. However, because we expect Service Worker caches to be used mainly for PWAs, they use slightly different heuristics from Chrome’s “automatic” caching. First, the code cache is created immediately when a JS resource is added to the cache, which means it is already available on the second load (rather than only on the third load, as with the normal cache). Second, we generate a “full” code cache for these scripts: functions are no longer compiled lazily; instead, everything is compiled and placed in the cache. This has the advantage of fast and predictable performance with no dependence on execution order, at the cost of increased memory use. Note that these heuristics apply only to Service Worker caches and not to other uses of the Cache API. In fact, the Cache API currently does not perform code caching at all when used outside a Service Worker.

Gathering trace information

None of the suggestions above is guaranteed to speed up your web app. Unfortunately, code caching information is not currently exposed in DevTools, so the most reliable way to find out which of your web app’s scripts are code cached is to use the slightly lower-level chrome://tracing.

chrome://tracing records instrumented traces of Chrome over a period of time and presents the result as a trace visualization.

chrome://tracing records the behavior of the entire browser, including other tabs, windows, and extensions. It therefore works best with a clean user profile, with extensions disabled, and with no other tabs open:

# Start a new Chrome browser session with a clean user profile and extensions disabled
google-chrome --user-data-dir="$(mktemp -d)" --disable-extensions

When collecting a trace, you need to select the categories to trace. In most cases you can simply select the “Web developer” set of categories, but you can also pick categories manually. The important category for code caching is v8.

After recording a trace with the v8 category, look for v8.compile slices in the trace (or enter v8.compile in the search box of the tracing UI). These list the file being compiled, along with some metadata about the compilation.

When a script runs cold, there is no code cache information, which means that the script does not participate in generating or consuming cached data.

On a warm run, each script has two v8.compile entries: one for the actual compilation and one (after execution) for producing the cache. You can recognize the latter by its two metadata fields, cacheProduceOptions and producedCacheSize.

On a hot run, you will see a v8.compile entry for consuming the cache, with two metadata fields, cacheConsumeOptions and consumedCacheSize. All sizes are expressed in bytes.
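The three cases can be told apart mechanically from the metadata fields named above. The sketch below assumes each v8.compile entry's metadata has already been extracted into a flat object; that shape, the function name `classifyCompileEntry`, and the sample values are hypothetical, and the real trace format may nest these fields differently.

```javascript
// Classify a v8.compile entry by which code cache metadata fields it carries.
function classifyCompileEntry(metadata) {
  if ('cacheConsumeOptions' in metadata || 'consumedCacheSize' in metadata) {
    return 'hot (cache consumed)';
  }
  if ('cacheProduceOptions' in metadata || 'producedCacheSize' in metadata) {
    return 'warm (cache produced)';
  }
  return 'no code cache information';
}

// Placeholder values for illustration; sizes are in bytes.
const samples = [
  { fileName: '/app.js' },
  { fileName: '/app.js', cacheProduceOptions: 'placeholder', producedCacheSize: 2048 },
  { fileName: '/app.js', cacheConsumeOptions: 'placeholder', consumedCacheSize: 2048 },
];
const labels = samples.map(classifyCompileEntry);
console.log(labels);
```

An entry with no code cache fields is what a cold run produces, and it is also what the plain compilation entry of a warm run looks like, so the label for that case is deliberately neutral.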

Conclusion

For most developers, code caching should “just work”: don’t worry about it, and the cache will take care of itself. Like any other cache, it works best when nothing changes, and it relies on heuristics that can change between versions. Even so, code caching does have behaviors you can take advantage of and limitations you can avoid, and careful analysis with chrome://tracing can help you tune and optimize your web app’s use of caches.