Preface
I was reading the Little Red Book recently when I came across the chapter on Web Workers and realized how naturally it links several cache-related concepts together. I also wrote a complete offline application Demo for debugging.
Browser cache
Traditionally, browser caching is divided into strong caching and negotiated caching, both of which are implemented by setting HTTP headers. The similarities and differences between the two have been discussed at length, so I will not repeat them here; two references are attached.
- Browser caching mechanism, By Aitter
- HTTP negotiated cache vs. strong cache, By Wonyun
This browser cache (which I call the Header cache) has two common drawbacks:
- Without a network connection the application cannot be accessed, because the HTML page still has to be requested from the server.
- The cache is not programmable: you cannot use JS to add, delete, update, or query cache entries with fine-grained control.
Application Cache
To make applications accessible without a network, the HTML5 specification introduced a new concept called Application Cache, which made offline applications possible. However, the API had so many design flaws that it was widely ridiculed and eventually deprecated. The reasons for its abandonment can be seen in these discussions:
- Why isn’t App Cache widely used? Does it have serious flaws?
- Application Cache is a Douchebag, By Jake Archibald
PS: I used this technology around the time I graduated, yet it was abandoned within just a few years. Technology iterates so fast!
CacheStorage
CacheStorage was designed to provide fine-grained, programmable control over caching. With it, you can add, delete, update, and query caches from JS, and you can inspect them visually in Chrome DevTools. With the traditional Header cache, you cannot see what has been cached and cannot operate on it directly; you can only change the URL so that the browser ignores the old cached copy and fetches the new resource.
PS: CacheStorage is not limited to Service Workers. It is a global API, and you can also access the global caches variable from the console.
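To make "add, delete, update, and query" concrete, here is a minimal sketch that can be pasted into the DevTools console of a page served over HTTPS or localhost. The cache name demo-cache and the URL /hello.txt are made up for illustration, and the storage parameter exists only so the function can also be exercised outside a browser.

```javascript
// Minimal sketch of CRUD on CacheStorage. `storage` defaults to the global
// `caches` object; the cache name and URL are illustrative assumptions.
async function cacheCrudDemo(storage = globalThis.caches) {
  // Create the cache if it does not exist yet, otherwise open it.
  const cache = await storage.open('demo-cache');

  // Add: store a hand-made Response under a URL key.
  await cache.put('/hello.txt', new Response('hello v1'));
  // Update: putting the same key again overwrites the entry.
  await cache.put('/hello.txt', new Response('hello v2'));

  // Query: match() resolves to undefined when nothing is cached.
  const hit = await cache.match('/hello.txt');
  const text = hit ? await hit.text() : null;

  // Delete: first one entry, then the whole cache.
  const removedEntry = await cache.delete('/hello.txt');
  const removedCache = await storage.delete('demo-cache');

  return { text, removedEntry, removedCache };
}
```

In the console, `cacheCrudDemo().then(console.log)` runs the whole sequence; pausing after the put calls lets you watch the entry appear under Application > Cache Storage in Chrome DevTools.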
Web Worker
Traditionally, a web page is described as having two threads: the GUI rendering thread and the JS engine thread. However well your JS is written, it runs on a single thread. Moreover, the JS engine thread and the GUI rendering thread are mutually exclusive, so the UI blocks while JS executes. To keep the UI usable during time-consuming JS operations, that code must run on a separate JS thread, and this is what a Web Worker provides.
Web Workers have two characteristics:
- A Web Worker only serves the page that created it; different pages cannot share the same Web Worker.
- When the page is closed, the Web Workers it created are closed as well and do not stay resident in the browser.
PS: There is also a related concept, Shared Worker. It is more complicated and I have not studied it in depth; interested readers can look into it and into how Shared Worker differs from Service Worker.
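As a rough illustration of moving slow JS off the main thread with a dedicated Web Worker, the sketch below builds the worker from an inline Blob so that it fits in one file. The function slowSum and the Blob approach are my own choices for the example; a real project would more likely keep the worker in its own file.

```javascript
// Sketch: run a slow computation in a Web Worker so the UI is not blocked.
// slowSum and the inline-Blob worker are illustrative, not part of the Demo.
function slowSum(n) {
  let total = 0;
  for (let i = 1; i <= n; i++) total += i;
  return total;
}

// Worker source: the function above plus a message handler that calls it.
const workerSource = `
  ${slowSum.toString()}
  self.onmessage = (event) => self.postMessage(slowSum(event.data));
`;

// Only wire things up inside a page, where Worker and document exist.
if (typeof Worker !== 'undefined' && typeof document !== 'undefined') {
  const blob = new Blob([workerSource], { type: 'text/javascript' });
  const url = URL.createObjectURL(blob);
  const worker = new Worker(url);

  worker.onmessage = (event) => {
    console.log('sum computed in worker:', event.data);
    worker.terminate();          // the worker also dies when the page closes
    URL.revokeObjectURL(url);
  };

  // The loop runs on the worker thread; the page stays responsive meanwhile.
  worker.postMessage(1e8);
}
```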
Service Worker
Finally, the main character of this article. Service Workers have something in common with Web Workers: both run JS on a new thread separate from the regular JS engine thread. The main differences are:
- A Service Worker does not serve one specific page but multiple pages (subject to the same-origin policy).
- A Service Worker is resident in the browser and does not stop when the page that registered it is closed. It is essentially a background thread that lasts until you terminate it or the browser reclaims it.
- The lifecycle, the available APIs, and so on are also quite different.
In short, Service Workers are a further development of Web Workers. For information on how to use a Service Worker, see the resources below.
- Caching and offline development with Service Worker and cacheStorage, By Zhang Xinxu
- Use Service Worker to create a PWA offline web application
- Understanding Service workers By Adnan Chowdhury
I also wrote an offline application Demo based on a Service Worker for you to debug and observe. Below I discuss some easily overlooked aspects of Service Workers, using my Demo as an example.
A Service Worker is just a Service Worker
At first I thought Service Workers existed only for offline applications, but I later found that this is not the case. → A Service Worker is just a JS thread resident in the browser and does nothing by itself. What it can do depends on which APIs it is combined with.
- Combined with Fetch, it can intercept requests and mock data at the browser level.
- Combined with Fetch and CacheStorage, it enables offline applications.
- Combined with Push and Notification, it can deliver push messages like a native app. For this, see Villainhr’s article: Web Push Technology
- …
Put these technologies together, add a Manifest and so on, and you essentially have a PWA. In short, the Service Worker is a key technology that brings us closer to the internals of the browser and lets us do more.
The idea is that we, as browser developers, acknowledge that we are not better at web development than web developers. And as such, we shouldn’t provide narrow high-level APIs that solve a particular problem using patterns we like, and instead give you access to the guts of the browser and let you do it how you want, in a way that works best for your users.
Reference: developers.google.com/web/fundame…
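For the first item in the list above (intercepting requests and mocking data), a sketch of what such a sw.js might look like follows. The path /api/user and the payload are hypothetical; the guard at the bottom only exists so the file can also be loaded outside a Service Worker.

```javascript
// Sketch: answer a request from the Service Worker instead of the network.
// The path /api/user and the JSON payload are made-up examples.
function mockResponseFor(url) {
  const { pathname } = new URL(url);
  if (pathname !== '/api/user') return null; // not mocked: leave it alone
  return new Response(JSON.stringify({ id: 1, name: 'mock user' }), {
    headers: { 'Content-Type': 'application/json' },
  });
}

const inServiceWorker =
  typeof ServiceWorkerGlobalScope !== 'undefined' &&
  self instanceof ServiceWorkerGlobalScope;

if (inServiceWorker) {
  self.addEventListener('fetch', (event) => {
    const mocked = mockResponseFor(event.request.url);
    // If respondWith() is never called, the browser performs its default fetch.
    if (mocked) event.respondWith(mocked);
  });
}
```

Keeping the decision in a plain function makes it easy to check which URLs are mocked without spinning up a browser.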
The fetch event is not triggered on the first visit
In the official Demo, the Service Worker registration code is placed at the end of the HTML. When I moved the registration code to the top and logged timestamps to the console, I noticed something: even if the Service Worker registers successfully before the page requests further resources, those requests do not trigger the fetch event; the fetch event is triggered only when the page is visited again. Why is that? I later found the answer in the official documentation: if your page loads without a Service Worker, the requests for the subresources it depends on will not trigger fetch events either.
The first time you load the demo, even though dog.svg is requested long after the service worker activates, it doesn’t handle the request, and you still see the image of the dog. The default is consistency, if your page loads without a service worker, neither will its subresources. If you load the demo a second time (in other words, refresh the page), it’ll be controlled. Both the page and the image will go through fetch events, and you’ll see a cat instead.
Reference: developers.google.com/web/fundame…
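One way to observe this from the page itself is to check navigator.serviceWorker.controller right after registering. The sketch below assumes the worker lives at ./sw.js; the nav parameter is only there so the function can be called with a stand-in object.

```javascript
// Sketch: register sw.js and report whether THIS page load is controlled.
// The path './sw.js' is an assumption; `nav` defaults to the real navigator.
async function registerAndReport(nav = globalThis.navigator) {
  if (!nav || !('serviceWorker' in nav)) return 'unsupported';

  await nav.serviceWorker.register('./sw.js');

  // controller is null when the page was loaded without a Service Worker,
  // in which case its subresource requests do not reach the fetch handler.
  return nav.serviceWorker.controller ? 'controlled' : 'not-controlled-yet';
}
```

Calling `registerAndReport().then(console.log)` on a first visit and again after a refresh should show the difference described in the quote above.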
cache.add vs. cache.put
cache.addAll is used in install, and cache.put is used in fetch. What is the difference between add and put? → cache.add = fetch + cache.put
The add() method of the Cache interface takes a URL, retrieves it, and adds the resulting response object to the given cache. The add() method is functionally equivalent to the following:
    fetch(url).then(function(response) {
      if (!response.ok) {
        throw new TypeError('bad response status');
      }
      return cache.put(url, response);
    })
Reference: developer.mozilla.org/en-US/docs/…
event.waitUntil and event.respondWith
First, event.waitUntil:
- It is most commonly used in the install and activate events of a Service Worker (the method is defined on ExtendableEvent, which these events are based on).
- It looks like a callback, and the program may still work without it. If you pass it a Promise, the Service Worker completes install only when that Promise resolves; if the Promise rejects, the entire Service Worker is discarded. As a result, if even one resource in cache.addAll fails to download, the entire Service Worker is invalidated.
Next, event.respondWith:
- It can only be used in the fetch event of a Service Worker.
- It takes a Response or a Promise; when the Promise resolves, the resulting Response is returned to the browser as the answer to the request.
In summary, although the event objects that provide event.waitUntil and event.respondWith inherit from the Event class, they are quite different from common event objects, and these methods exist only on the corresponding Service Worker events.
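Putting the two methods side by side, a cache-first sw.js could be sketched roughly as follows. The cache name and file list are hypothetical, and storage and fetchFn are passed in as parameters only so the two helpers can be tried outside a Service Worker.

```javascript
// Sketch of install + fetch handlers; names and URLs are hypothetical.
const DEMO_CACHE = 'demo-v1';
const PRECACHE_URLS = ['./', 'js/index.js'];

// install: resolves only when every URL is cached; one failure rejects it all.
async function precache(storage, urls) {
  const cache = await storage.open(DEMO_CACHE);
  await cache.addAll(urls);
}

// fetch: serve from cache, else go to the network and keep a copy.
async function cacheFirst(storage, request, fetchFn) {
  const cache = await storage.open(DEMO_CACHE);
  const cached = await cache.match(request);
  if (cached) return cached;

  const response = await fetchFn(request);
  // A body can be read only once, so cache a clone and return the original.
  if (response.ok) await cache.put(request, response.clone());
  return response;
}

const isServiceWorker =
  typeof ServiceWorkerGlobalScope !== 'undefined' &&
  self instanceof ServiceWorkerGlobalScope;

if (isServiceWorker) {
  self.addEventListener('install', (event) => {
    // A rejected promise here means this Service Worker is never installed.
    event.waitUntil(precache(caches, PRECACHE_URLS));
  });

  self.addEventListener('fetch', (event) => {
    if (event.request.method !== 'GET') return; // cache.put rejects non-GET requests
    event.respondWith(cacheFirst(caches, event.request, fetch));
  });
}
```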
Updating resources
With strong caching, when a resource needed to be updated we could simply change its URL, for example by replacing the MD5 hash in the file name. How do you handle resource updates in an offline application built on Service Worker + CacheStorage + Fetch?
- sw.js must be changed whenever any resource (HTML, JS, images, or sw.js itself) needs to be updated. With a Service Worker, the entry point of the whole application becomes sw.js rather than the HTML. Whenever a user visits a page, whether or not the cache is hit, the browser requests sw.js and compares the old and new sw.js byte by byte. If they differ, an update is triggered. That is why, as you can see in the Demo, there is a VERSION field that represents not only the version of sw.js itself but also the version of the entire application.
- Do not try to rename sw.js (for example, to sw_v2.js) to trigger an update, because the HTML itself is cached by sw.js, and the cached HTML will always point to sw.js, so the browser will never learn about sw_v2.js. Although approaches such as the one described in the article mentioned above, Use Service Worker to create a PWA offline web application, can detect HTML updates, they are more complicated and not recommended.
you may consider giving each version of your service worker a unique URL. Don’t do this! This is usually bad practice for service workers, just update the script at its current location.
Reference: developers.google.com/web/fundame…
- Each sw.js update creates a new cache space named after the VERSION field and caches the new resources into it. Once the pages controlled by the old sw.js are closed, the new sw.js is activated and deletes the old cache space in its activate event. This way sw.js can be updated without errors even when several pages are open at the same time, and stale caches are cleaned up promptly.
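The cleanup step in the last item might be sketched like this. VERSION comes from the Demo as described above, but the offline- prefix for cache names is my own assumption, and the helper takes storage as a parameter so it can be tried with a stand-in.

```javascript
// Sketch: one cache per VERSION; the activate handler drops all the others.
// The 'offline-' naming scheme is an assumption, not taken from the Demo.
const VERSION = 'v2';
const CURRENT_CACHE = `offline-${VERSION}`;

async function deleteOldCaches(storage, keep) {
  const names = await storage.keys();
  const stale = names.filter((name) => name !== keep);
  await Promise.all(stale.map((name) => storage.delete(name)));
  return stale; // returned only to make the effect easy to inspect
}

const runningInServiceWorker =
  typeof ServiceWorkerGlobalScope !== 'undefined' &&
  self instanceof ServiceWorkerGlobalScope;

if (runningInServiceWorker) {
  self.addEventListener('activate', (event) => {
    // activate fires only after pages using the old sw.js are gone,
    // so nothing is still reading from the caches deleted here.
    event.waitUntil(deleteOldCaches(caches, CURRENT_CACHE));
  });
}
```

Note that this version deletes every cache on the origin except the current one; if other code on the same origin also uses CacheStorage, filtering by the name prefix would be safer.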
Double cache
As mentioned above, every time a new sw.js is installed, all the resources listed in addAll are fetched again, whether or not they actually changed. This clearly violates the principle of incremental downloads on the Web. → Combine strong caching with the Service Worker to form a double cache. The Service Worker comes after the strong cache. For example, suppose a_v1.js and b_v1.js are both strongly cached, a is unchanged, and b changes to b_v2.js; update the addAll list and VERSION in sw.js accordingly. When the new sw.js is installed, addAll requests a_v1.js, but the browser finds it in the strong cache and issues no network request at all; only b_v2.js goes to the network. You can debug my Demo to observe this.
Two things should be said about this method.
- The version number of each resource must be specified in cache.addAll, just as it is in HTML. Once a Service Worker is used, HTML is only the entry point that loads resources; the job of deciding whether a resource has changed moves to sw.js.
    return cache.addAll([
      './',
      'getList',
      'img/avatar_v1.jpg',
      'js/index_v2.js',
      'js/jquery_v1.js'
    ]);
- The article mentioned above, Use Service Worker to create a PWA offline web application, also discusses multi-level caching, but its author believes that the browser reads the Service Worker first and falls back to the strong cache only when there is no Service Worker, which is not consistent with what I observed in my Demo.
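The double cache only works if the server actually sends strong-cache headers for the versioned files. As a hedged sketch of that server-side decision, the function below picks a Cache-Control value from the path. The _v<number> pattern follows the file names in the addAll list above; the concrete header values are my assumptions, not something taken from the Demo.

```javascript
// Sketch: choose a Cache-Control header per path for the double-cache setup.
// Header values are assumptions; adjust them to your own release process.
function cacheControlFor(pathname) {
  // Versioned files such as js/index_v2.js never change under the same URL,
  // so they can be strongly cached and reused when addAll requests them again.
  if (/_v\d+\.\w+$/.test(pathname)) {
    return 'public, max-age=31536000, immutable';
  }
  // sw.js, HTML and API responses must be revalidated so updates are noticed.
  return 'no-cache';
}
```

With this in place, a new sw.js whose addAll list still contains js/jquery_v1.js gets that file from the HTTP cache, while a newly introduced js/index_v2.js is downloaded once.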
Conclusion
That is about all I have to say. There is still a lot I don’t know about Service Workers. The emerging APIs built around Service Workers represent a better Web experience and the future of the Web, so they deserve more attention and study going forward.