While discussing tasks and microtasks in my earlier piece on Vue's nextTick and MutationObserver, I happened upon the rendering work browsers do after processing tasks and microtasks. I was excited when I found this material, because I had never known when and how a style I change in JS actually gets rendered to the screen, and that excitement drove me to write this article.

Digging into this area reveals a whole new world: there is far more going on beneath the familiar concepts of reflow, repaint, and compositing. This article walks through the browser's rendering process in detail.

The Event Loop specification and processing

I touched on this in the nextTick article, but only in passing; the focus there was tasks and microtasks. This part is the key to the rendering process and to the timing of browser rendering, so pay attention:

See the official HTML5 specification: HTML5 Event Loop Processing Model

  1. In steps 1 to 5, select one of the task queues (browsers often maintain multiple task queues so they can prioritize certain kinds of tasks), take the oldest task in it, execute it, and then remove it from the queue.
  2. Perform a microtask checkpoint. This checkpoint contains several sub-steps: as long as the microtask queue is not empty, keep executing microtasks. If new microtasks are enqueued while microtasks are running, they are executed in the same checkpoint; the queue is drained completely before moving on (a small example follows this list).
  3. Step 7 Update the rendering. It’s time to update the interface!
    1. Steps 7.1 to 7.4 determine whether the current document needs to be rendered. Per the specification, the browser judges whether the document would benefit from a UI render: since only a 60Hz refresh rate needs to be maintained and each turn of the event loop is very fast, it isn't necessary to render the UI on every turn, only roughly every 16ms. Conversely, for sluggish pages that can no longer sustain 60Hz, rendering at every opportunity would make things worse, so the browser may downgrade the frequency at which the document benefits to, say, 30Hz.
    2. Run the resize steps. If the browser was resized, this fires the 'resize' event on the Window.
    3. Run the scroll steps. Whenever we scroll some target (a scrollable element or the document), the browser stores that target in the pending scroll event targets of the document it belongs to. The scroll steps now take each target out of that list and fire the scroll event on it.
    4. Evaluate whether any media queries have been triggered
    5. Steps 7.8 and 7.9: run CSS animations and fire animation events such as 'animationstart'. If a fullscreen API such as requestFullscreen() was called in a previous task or microtask, the fullscreen operation is performed here.
    6. Run the animation frame callbacks: requestAnimationFrame callbacks are executed here!
    7. Run the IntersectionObserver callbacks. You may have used this API for image lazy loading.
    8. Update and render the user interface
  4. Continue back to step 1
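For instance, here is a minimal example of the checkpoint-draining behavior in step 2 (my own illustration, not from the spec):

setTimeout(() => console.log('next task'), 0)          // queues a new task

Promise.resolve().then(() => {                         // queues a microtask
  console.log('microtask 1')
  // Queued during the checkpoint, yet it still runs in this same checkpoint:
  Promise.resolve().then(() => console.log('microtask 2'))
})

// Output: microtask 1, microtask 2, next task
// The microtask queue is fully drained before the next task (and before rendering).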

Ok, that's the whole process. I won't go over the first two steps, the task and microtask handling, again. There are four things to note in the important third step:

  1. Not every turn of the event loop updates the rendering, only the turns in which the browser decides the document would benefit. This means the minimum interval between two UI renders is about 16ms: even if you use setInterval to update a style every 1ms, the UI is still updated at most once per 16ms.
  2. The resize and scroll events fire inside the rendering flow. Isn't that remarkable? It means that if you bind a scroll callback to drive an animation, you don't need requestAnimationFrame to throttle it at all: the scroll event itself runs before each frame is rendered and comes pre-throttled! Business logic in scroll handlers, such as lazy loading images or infinite loading of content, as opposed to animation logic, should of course still be throttled (see the sketch after this list).
  3. As MDN, W3C, and others describe, requestAnimationFrame callbacks run before the repaint; step 7.9 is what guarantees that behavior.
  4. The UI repaint is performed at the end of the event loop turn. Page repainting is tightly coupled to the event loop and, unexpectedly, defined precisely inside it. I had never understood when the DOM would be rendered after my JS changed its style, until now.
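To make the scroll point concrete, here is a minimal sketch (the header element and the maybeLoadMoreContent helper are hypothetical):

window.addEventListener('scroll', () => {
  // Animation logic: scroll already fires at most once per frame, right
  // before rendering, so no requestAnimationFrame throttling is needed.
  header.style.opacity = Math.max(0, 1 - window.scrollY / 300)
})

let throttled = false
window.addEventListener('scroll', () => {
  // Business logic: once per frame (~60 times/second) is still far too
  // often for infinite loading, so throttle it.
  if (throttled) return
  throttled = true
  setTimeout(() => {
    throttled = false
    maybeLoadMoreContent() // hypothetical helper
  }, 200)
})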

That covers when the page renders; now let's talk about the rendering process itself. Note that everything above comes from the specification, while what follows describes the detailed process inside browsers, i.e. the implementation.

Rendering

Modify the DOM structure -> calculate styles -> layout -> paint -> composite

But there’s more to it than that, and we’ll elaborate on the process at a lower level, mainly in the following two images:



Above: from GPU Accelerated Compositing in Chrome



Above: from The Anatomy of a Frame

This section is based on the Blink and WebKit engines, but reflow, repaint, compositing, and composite layer promotion behave consistently across browsers.

Let’s start with some concepts

  1. The bitmap



    It's the bitmap you know from data structures. To draw an image, you obviously first need to represent it as a data structure a computer can understand: a two-dimensional array in which each element records the color of one pixel. The browser can then record what it wants to draw in a given area by filling in the pixels at specific indices of such an array.

  2. Texture A texture is essentially a bitmap stored in GPU video RAM. With an ordinary bitmap you can decide what each element stores, e.g. three bytes for 256-level RGB or a single bit for black and white. But textures are dedicated to the GPU, and since the GPU and CPU are separate, a fixed format is needed for compatibility and processing. So on the one hand texture formats are relatively fixed, such as the R5G6B5 or A4R4G4B4 pixel formats; on the other hand the GPU restricts texture sizes: width and height must be powers of 2, no larger than 2048 or 4096, etc.

  3. Rasterize





    Filling a texture with pixels is not as simple as walking through every element in the bitmap and filling in that pixel's color, as the previous two pictures show. The essence of rasterization is coordinate transformation and geometric discretization, followed by filling.

    Meanwhile, rasterization has largely evolved from the early full-screen rasterization to today's tile-based rasterization: instead of rasterizing the whole image at once, the image is divided into tiles and each tile is rasterized separately. Once a tile is rasterized, its pixels are filled into a texture and the texture is uploaded to the GPU (see the tiling sketch after this list).

    On the one hand, as mentioned above, texture size is limited; even if you rasterized the whole screen, you would still have to fill textures piece by piece, so it is better to rasterize in blocks matching the texture size in the first place. On the other hand, tiling reduces memory footprint (full-screen rasterization needs much more buffer space) and reduces overall latency (tiles can be rasterized in parallel across threads).

    See those cyan-bordered rectangles in the picture below? They are tiles.
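As a rough illustration of tile-based rasterization (a toy sketch, not browser code; the 256px tile size is an assumption):

const TILE = 256 // assumed tile size in pixels

function tilesFor(layerWidth, layerHeight) {
  const tiles = []
  for (let y = 0; y < layerHeight; y += TILE) {
    for (let x = 0; x < layerWidth; x += TILE) {
      tiles.push({
        x, y,
        w: Math.min(TILE, layerWidth - x),
        h: Math.min(TILE, layerHeight - y),
      })
    }
  }
  return tiles
}

// Each tile can then be rasterized independently (in parallel on worker
// threads) and uploaded to the GPU as its own texture.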

As you can imagine, one round of browser drawing means rasterizing the content to be drawn, such as text, backgrounds, and borders, tile by tile into many textures, uploading those textures to GPU memory, and letting the GPU draw the textures to the screen.

The specific process of drawing

We'll set aside style calculation, reflow, and so on for now and look at how the browser draws.

Let’s start with this classic picture:



Some of the names in the picture have since changed; see taobaofed's article: Wireless performance optimization: Composite

Render Object

First we have a DOM tree, but the DOM nodes in it exist for JS/HTML/CSS and cannot be drawn directly onto a page or into a bitmap. So internally the browser implements the Render Object:

Each Render Object corresponds to a DOM node. A Render Object implements the method for drawing its DOM node into a bitmap: it is responsible for drawing the visible content of that node, such as background, border, and text. Render Objects are also stored in a tree structure.

Since every DOM node now has a drawing method, can we just allocate a bitmap, DFS-traverse the Render Object tree, and execute each Render Object's drawing method to draw the DOM into the bitmap? Like stamping, pressing each Render Object's content onto the paper (the bitmap, in this analogy) one after another.

No, because browsers also have stacking contexts. These determine how elements overlap (z-index and friends), and they make it possible for an element earlier in the document flow to be painted on top of a later one, whereas the DFS above can only blindly paint later elements over earlier ones.

Hence the Render Layer.

Render Layer

Of course, Render Layers don't appear solely because of stacking contexts. Cases like opacity less than 1 or masks, where the content must be drawn first and then a unified CSS effect applied to the drawn result, also call for them.

Elements with stacking contexts, translucency, and so on (see Wireless performance optimization: Composite for details) are promoted from Render Object to Render Layer. A Render Object that is not promoted belongs to the nearest Render Layer among its ancestors. The root html element is of course promoted to a Render Layer itself.

So the Render Object tree now becomes a Render Layer tree, with each Render Layer containing the Render Objects of its own layer.

In addition:

The children of each RenderLayer are kept into two sorted lists both sorted in ascending order, the negZOrderList containing child layers with negative z-indices (and hence layers that go below the current layer) and the posZOrderList contain child layers with positive z-indices (layers that go above the current layer). — From GPU Accelerated Compositing in Chrome

Now the rendering engine traverses the Layer tree: for each Render Layer it first recursively draws the layers in its negZOrderList, then draws its own Render Objects, then recursively draws the layers in its posZOrderList. That's how the whole Layer tree gets drawn.
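In pseudo-JS, the traversal looks roughly like this (a sketch only; the two list names mirror the quote above, while renderObjects and paint are invented for illustration):

function paintLayer(layer, bitmap) {
  // Layers below the current one (negative z-index) first...
  for (const child of layer.negZOrderList) paintLayer(child, bitmap)
  // ...then the layer's own content...
  for (const obj of layer.renderObjects) obj.paint(bitmap) // invented field/method
  // ...then the layers above it (positive z-index).
  for (const child of layer.posZOrderList) paintLayer(child, bitmap)
}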

The Layer tree determines the order in which the page is drawn, and the Render Objects belonging to each Render Layer determine that layer's content; together, all of the Render Layers and Render Objects determine what the page renders to the screen.

Stacking contexts, translucency, masks, and so on are handled by the Render Layer. So now: allocate a bitmap -> paint each Render Layer in turn, overwriting the lower layers -> hand the result to the GPU. Is that everything?

Not quite. There are also Graphics Layers and Graphics Contexts.

Graphics Layer (also called Compositing Layer) and Graphics Context

The above procedure takes care of drawing. But browsers also have animation, video, canvas, and 3D CSS transforms. With these, the page display changes constantly, which means the bitmap changes constantly. At 60 frames per second, redrawing the whole bitmap on every change is a terrible performance cost.

So browsers optimize this process with Graphics Layers (compositing layers) and Graphics Contexts.

Elements with CSS3 3D transforms, elements animating on opacity or transform, hardware-accelerated canvas and video, and so on, were promoted to Render Layers in the previous step. Now they are promoted further, to Graphics Layers, the compositing layers (if you read the link I gave earlier, you may have wondered why these cases were promoted to Render Layers at all; now you can see it was so they could become Graphics Layers). Each Render Layer belongs to the nearest Graphics Layer among its ancestors. The root html element is of course promoted to a Graphics Layer itself.

Situations in which a Render Layer is promoted to a Graphics Layer:

  • 3D or Perspective transform CSS properties
  • Video elements that use accelerated video decoding
  • Elements that have either a 3D (WebGL) context or an accelerated 2D context
  • Plug-ins (e.g. Flash)
  • An animation or transition applied to opacity, transform, filter, or backdrop-filter (the animation or transition must be active: the promotion takes effect only between its start and end, and the layer is demoted before it starts and after it ends)
  • will-change set to opacity, transform, top, left, bottom, or right (top, left, and the like additionally require an explicit positioning property such as position: relative)
  • An element that has accelerated CSS filters
  • The element has a sibling with a lower z-index that is a composite layer (in other words, the element is rendered on top of a composite layer)
  • … For the detailed list of all cases, see the Taobao FED article: Wireless Performance Optimization: Composite

3D transforms, will-change set to opacity or transform, and CSS transitions or animations on opacity or transform are the three common ways a compositing layer gets promoted.
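A common pattern that follows from the list above, sketched here under the assumption that the element's CSS already defines a starting transform and a transition on transform, is to promote an element only for the duration of its animation:

function slideIn(el) {
  el.style.willChange = 'transform' // promote to a composite layer
  el.addEventListener('transitionend', () => {
    el.style.willChange = 'auto'    // demote once the animation is done
  }, { once: true })
  // Write the new transform on the next frame so the promotion applies first.
  requestAnimationFrame(() => {
    el.style.transform = 'translateX(0)'
  })
}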

Besides the cases above, which directly promote a Render Layer to a Graphics Layer, there is also implicit promotion, where A is promoted because B was. See this article: GPU Animation: Doing It Right

Each Graphics Layer has a Graphics Context, and the Graphics Context creates a bitmap for the layer, which means every Graphics Layer has its own bitmap. The Graphics Layer is responsible for drawing the Render Objects contained in its own Render Layer and its descendant Render Layers into that bitmap. The bitmap is then handed to the GPU as a texture. So the GPU now receives the texture of the html element's Graphics Layer, and possibly the textures of elements promoted to Graphics Layers by things like 3D transforms.

Now the GPU needs to composite multiple layers of textures. During compositing it can apply different parameters to each texture layer, so transform, mask, opacity, and similar operations can be applied to a texture before the merge, and the whole process is hardware-accelerated, so performance is excellent. Finally the textures are composited into one image and drawn onto the screen.

This is why animation is so efficient when elements have CSS animations or transitions on properties like transform and opacity: these properties need no repainting during the animation, only recompositing.
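For example, with the Web Animations API (box is a hypothetical element, assumed to be promoted to a composite layer), the compositor can drive the whole animation without a per-frame repaint:

box.animate(
  [
    { transform: 'translateX(0)',     opacity: 1 },
    { transform: 'translateX(200px)', opacity: 0.5 },
  ],
  { duration: 1000, iterations: Infinity, direction: 'alternate' }
)
// Only transform/opacity change, so each frame is recomposited, not repainted.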

The process of layering and merging above can be described in a diagram:

The concrete implementation of drawing

The system structure

Processes

Both the Blink and WebKit engines use two processes to handle core tasks such as JS execution and page rendering.

  • The Renderer process, one per tab. Responsible for executing JS and rendering the page. It has three threads, described below: the Compositor Thread, the Compositor Tile Worker(s), and the Main Thread.
  • The GPU process, shared by the entire browser. It is mainly responsible for uploading the tile bitmaps drawn in the Renderer process to the GPU as textures, and for calling the GPU's drawing APIs to draw those textures to the screen (strictly speaking this process runs on the CPU; "GPU process" really means "the process that deals with the GPU"). The GPU process has only one thread: the GPU Thread.

Three threads of the Renderer process

  • Compositor Thread. This thread receives both Vsync signals from the browser (horizontal sync means drawing one scanline; vertical sync means drawing from the top of the screen to the bottom, marking the end of one frame and the start of the next) and user input events from the OS, such as scrolling, typing, clicking, and mouse movement. If possible, the Compositor Thread handles these inputs directly, translating them into layer displacements, and commits the new frame straight to the GPU Thread to output the new page. Otherwise, if you have registered callbacks for scroll or input events, or the current page has animations, the Compositor Thread wakes the Main Thread to run JS, repaint, reflow, etc. and produce new textures; the Compositor Thread then commits the relevant textures to the GPU Thread to complete the output.
  • Main Thread



    The Main row in the Chrome DevTools Timeline shows the tasks the Main Thread completes: executing a piece of JS, Recalculate Style, Update Layer Tree, Paint, Composite Layers, and so on.
  • The Compositor Tile Worker(s), one or more threads, e.g. two or four for Chrome on desktop and one or two on Android and Safari, created by the Compositor Thread to handle tile rasterization.

As you can see, the Compositor Thread is the core: the other two threads are largely driven by it, and user input arrives at the Compositor Thread first. When no JS execution, CSS animation, repaint, or the like is required, it can process and respond to input directly, bypassing the Main Thread and its very complex task flow. This lets the browser respond quickly to scrolling, typing, and so on without entering the main thread at all. There is another very important point here, which I'll come back to later. Furthermore, even if you have registered UI callbacks, or the main thread is busy or stuck, the Compositor Thread still handles the next frame of output, because it sits outside the blockage. So even if your main thread is running heavy tasks that exceed 16ms per frame, the browser still responds when you scroll the page (barring special cases such as synchronous AJAX). In other words, if an animation is janky (its per-frame computation or repaint exceeds 16ms), you can still scroll the page smoothly: the animation janks but scrolling doesn't (a demo sketch follows).
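Here is a minimal sketch of such a demo (box is a hypothetical element, and no scroll listeners are registered): the main thread burns well over 16ms per frame, so the rAF-driven animation janks, yet scrolling, handled by the Compositor Thread, stays smooth:

function burn(ms) {
  const end = performance.now() + ms
  while (performance.now() < end) {} // busy-wait: blocks the main thread
}

function tick(t) {
  burn(50) // far over the 16ms frame budget
  box.style.transform = 'translateX(' + Math.sin(t / 300) * 100 + 'px)'
  requestAnimationFrame(tick)
}
requestAnimationFrame(tick)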

The specific process

In the DevTools Timeline, you can see the following process:



That is, JS execution triggers repaint and reflow operations. Here's a look at the process behind this, shown below:



The Compositor Thread informs the GPU Thread that the textures can be drawn to the screen with the specified parameters.

Overall process:

  1. Vsync: the Vsync signal is received and the frame starts
  2. Input event handlers: the user input received earlier by the Compositor Thread is passed to the Main Thread at this point, triggering the callbacks of the related events.

    All input event handlers (touchmove, scroll, click) should fire first, once per frame, but that’s not necessarily the case; a scheduler makes best-effort attempts, the success of which varies between Operating Systems.

    This means that although the Compositor Thread can receive multiple inputs from the OS within 16ms, the events are triggered and passed to the main thread, where JS perceives them, only once per frame, or even less often. So events like touchmove and mousemove execute at most once per frame, which gives them a throttling effect relative to animation! If your main thread is also running an animation or similar work, the event firing interval can easily exceed 16ms. When I discussed rendering timing at the beginning I only mentioned scroll and resize, which run in the same event-loop turn as rendering and therefore fire at most once per frame; but it isn't just scroll and resize. Thanks to the Compositor Thread mechanism, the same holds for touchmove, mousemove, and friends! See this jsfiddle for details: the mousemove callback and the requestAnimationFrame callback fire at exactly the same frequency, with identical call counts. You will never see mousemove execute twice while rAF hasn't executed once. Also, both of their intervals fluctuate between roughly 14ms and 20ms, mainly because frame intervals are not an exact 16.666ms; open the Timeline to observe this. One odd thing: every time the mouse moves from DevTools back into the page area, mousemove fires twice in very quick succession (sometimes under 5ms apart), though each mousemove is still followed by a rAF, meaning two frames were produced very quickly. (A counting sketch appears after this list.)

  3. requestAnimationFrame: the red line inside requestAnimationFrame means your JS may trigger a Force Layout by reading scrollWidth or clientHeight, calling getComputedStyle, and the like, pulling Recalc Styles and Layout forward into code execution.
  4. Parse HTML: if the DOM changed, it is parsed here.
  5. Recalc Styles Recalculates the Styles of the specified element and its children if you modify the style or change the DOM during JS execution.
  6. Layout (reflow): if a DOM change or a style change affects element geometry, the browser recalculates the position and size of all elements. Simply changing color, background, and the like does not trigger reflow; see CSS Triggers.
  7. Update Layer Tree: this step updates the stacking order of the Render Layers, the structure introduced earlier to handle stacking contexts, since new style information and reflow can change how layers stack.
  8. Paint: painting has two steps. The first records which paint calls to make; the second executes them. The first step simply serializes the required operations into a data structure called SkPicture:

    The SkPicture is a serializable data structure that can capture and then later replay commands, similar to a display list.

    So the SkPicture is just a list of commands to perform. The second step replays the SkPicture and actually executes those operations: rasterizing and filling the bitmap. The Paint we see on the main thread in the Timeline is only the first step; the second is the subsequent Rasterize step (see below).

  9. Composite: in this step the main thread computes the data needed to composite each Graphics Layer, including the parameters for translation, scale, rotation, alpha blending, and so on, and passes them to the Compositor Thread. This is the first commit in the figure: the Main Thread tells the Compositor Thread, I'm done, you take over. The main thread then runs requestIdleCallback callbacks. This step does not actually composite the Graphics Layers' bitmaps.
  10. Raster Scheduled and Rasterize: the SkPicture records generated in step 8 are executed at this stage.

    SkPicture records on the compositor thread get turned into bitmaps on the GPU in one of two ways: either painted by Skia’s software rasterizer into a bitmap and uploaded to the GPU as a texture, or painted by Skia’s OpenGL backend (Ganesh) directly into textures on the GPU.

    It can be seen that Rasterization actually has two forms:

    • One is CPU-based Software Rasterization using the Skia library: bitmaps are drawn first and then uploaded to the GPU as textures. The Compositor Thread spawns one or more Compositor Tile Worker threads, which execute the drawing operations recorded in the SkPictures in parallel, drawing the Render Objects within each Graphics Layer, layer by layer as described earlier. In the process, each layer is split into many small tiles, each tile is rasterized, and the result is written into that tile's bitmap.
    • The other is GPU-based Hardware Rasterization, also driven by the Compositor Tile Worker threads and done tile by tile. But instead of drawing bitmaps on the CPU and uploading them as textures, Skia's OpenGL backend (Ganesh) draws, rasterizes, and fills pixels directly into textures on the GPU. This is also known as GPU Raster.

    The latest versions of the major browsers now basically all use Hardware Rasterization, though Software Rasterization is still common on some mobile devices. Open Chrome and visit chrome://gpu/ to see your Chrome's GPU acceleration status. Here is mine:



    The advantage of Hardware Rasterization is that, because upload bandwidth between the CPU and GPU is limited, uploading bitmaps from RAM into GPU VRAM carries a non-negligible cost; if the rasterized area is large, Software Rasterization is quite likely to stall there. The following example compares Chrome 32 and Chrome 41, the newer of which uses Hardware Rasterization.



    However, I haven't found out exactly how images and canvas are processed; I suspect there must still be a CPU-to-GPU upload step, so some cases are not pure Hardware Rasterization and the two are likely used together. Also, whether rasterization is done in hardware or software is mainly determined by the device, so there is no room for manual optimization here, but since later content touches on it, this brief introduction is given.

  11. Commit: with Software Rasterization, once all tiles are rasterized the Compositor Thread commits to notify the GPU Thread, and the GPU Thread uploads all the tile bitmaps to the GPU as textures. With Hardware Rasterization the textures are already on the GPU. Next, the GPU Thread calls the platform's 3D API (D3D on Windows, GL elsewhere) and draws all the textures into the final bitmap, completing the texture merge. Thanks to the compositing parameters of the 3D API, transforms and opacity can be applied to the textures very cheaply before the merge. Once the merge is complete, the content can be drawn to the screen.
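A rough re-creation of the jsfiddle comparison from step 2 (my own sketch): count mousemove and rAF callbacks and watch them stay in lockstep while the mouse moves:

let moves = 0
let frames = 0

window.addEventListener('mousemove', () => { moves++ })

requestAnimationFrame(function onFrame() {
  frames++
  requestAnimationFrame(onFrame)
})

setInterval(() => {
  // While the mouse keeps moving, moves grows no faster than frames:
  // mousemove is delivered to the main thread at most once per frame.
  console.log('mousemove: ' + moves + ', frames: ' + frames)
}, 1000)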

In a given frame, Layout, Paint, Rasterize, and Commit may not happen at all, while Layout may also happen more than once. This is also the basis of techniques like promoting compositing layers to get GPU-accelerated animation. A more detailed analysis of the steps above follows.

Layout (reflow) and Force Layout

Reflows and forced reflows are old friends, but they're worth revisiting in the light of the browser mechanics above.

First, if you change a CSS property that affects an element's layout information, such as width, height, left, or top (transform excluded), the browser marks the current Layout dirty, which makes it execute Layout during the 11 steps above in the next frame. Because one element's position change can shift other elements across the whole page, Layout globally recomputes every element's position.

Note that the browser does not reflow until the next frame, the next render. JS does not reflow immediately after each style-changing line: you can write 100 style changes in one JS block and still pay for only one reflow in the next frame, as the snippet below shows.
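A trivial illustration (box is a hypothetical element):

// All of these writes are batched; no layout happens during the loop,
// the Layout is merely marked dirty and runs once in the next frame.
for (let i = 0; i < 100; i++) {
  box.style.width = (100 + i) + 'px'
}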

However, if you access a property such as offsetTop or scrollHeight while the current Layout is marked dirty, the browser reflows immediately to compute the correct element positions, ensuring that the offsetTop, scrollHeight, etc. you read in JS are correct.

Properties and methods that trigger reflow: see "What forces layout / reflow" in the references.

This is called Force Layout, a forced reflow: it forces the Layout that would have run during the rendering phase into the middle of JS execution. The forcing itself is not the problem; the problem is that every access to such a property while the Layout is dirty triggers another forced reflow, which drastically slows JS execution.

domA.style.width = (domA.offsetWidth + 1) + 'px'
domB.style.width = (domB.offsetWidth + 1) + 'px' // Force Layout
domC.style.width = (domC.offsetWidth + 1) + 'px' // Force Layout

The last two of these three lines each cause a Force Layout, and depending on the size of the DOM a reflow can take anywhere from tens of microseconds to tens of milliseconds. Compared to the sub-microsecond cost of an ordinary line of JS, that overhead is unacceptable. Hence techniques such as read/write separation and caching values in plain variables to avoid Force Layout (a sketch follows below); otherwise you will see repeated Recalculate Style and Layout entries in your Timeline.
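A minimal sketch of read/write separation, rewriting the snippet above (domA/domB/domC as before):

// Reads first, while the Layout is clean: no forced reflow.
const wA = domA.offsetWidth
const wB = domB.offsetWidth
const wC = domC.offsetWidth

// Writes afterwards: the Layout is dirtied but never read again,
// so Layout runs once in the next frame instead of being forced twice.
domA.style.width = (wA + 1) + 'px'
domB.style.width = (wB + 1) + 'px'
domC.style.width = (wC + 1) + 'px'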

In addition, after each reflow or forced reflow the current Layout is no longer dirty, so accessing a property like offsetWidth will not trigger another reflow.

console.log(domA.offsetWidth) // Layout is not dirty, no reflow
console.log(domB.offsetWidth) // Layout is not dirty, no reflow
domA.style.width = (domA.offsetWidth + 1) + 'px' // marks Layout dirty
console.log(domC.offsetWidth) // Force Layout
console.log(domA.offsetWidth) // Layout no longer dirty, no reflow
console.log(domB.offsetWidth) // Layout no longer dirty, no reflow

Repaint (Paint)

Repaint is similar: as soon as you change an element's style in a way that triggers repaint, the browser repaints it in the next frame's rendering step. As repaint write-ups describe it, the JS style change invalidates a region, and the invalidated region is repainted in the next frame.

However, there is one very important behavior: repainting is done per composite layer. That is, what gets invalidated is neither the whole document nor the single element, but the composite layer the element belongs to. This is, of course, one of the reasons the rendering process is split into Paint and Compositing:

Since painting of the layers is decoupled from compositing, invalidating one of these layers only results in repainting the contents of that layer alone and recompositing.

Here are two demos: Demo1 and Demo2

The two demos are almost identical, except that the second adds one line to the .ab-right style: will-change: transform;. As emphasized earlier, will-change: transform forces an element to be promoted to a composite layer.

.ab-right {
    will-change: transform;
    position: absolute;
    right: 0;
}

So in the second demo, there are two compositing layers: the compositing layer for the HTML root element and the compositing layer for the.ab-right element.

Then we change the style of the #target element in JS, so #target is repainted inside the composite layer of the html root element. In Demo1, the .ab-right element is not promoted to a composite layer, so .ab-right is repainted as well; in Demo2, the .ab-right element is not repainted. Demo1 first:



You can clearly see that .ab-right was repainted.



Obviously, Demo2 only repaints the content of the html root element's composite layer.

By the way, you can also click the Raster row to watch rasterization in detail. As mentioned earlier, this is where the work recorded in Paint is actually carried out, drawing the content into bitmaps or textures, tile by tile.

Reflow, repaint, and compositing

First, how to inspect composite layers:

Modifying CSS properties such as width, float, border, position, font-size, text-align, overflow-y, and so on triggers reflow, repaint, and compositing. Modifying properties such as color, background-color, visibility, and text-decoration does not trigger reflow, only repaint and compositing. Google for the full property lists.

Many articles then state that modifying opacity and transform triggers only compositing, neither reflow nor repaint, so you should always animate with these two properties: no repaint, no reflow, very efficient.

However, this is not the case.

It is true only when the element has been promoted to a composite layer.

Back to step 11 of the rendering process we talked about earlier:

Thanks to the compositing parameters of the 3D API, transforms and opacity can be applied to the textures very cheaply before the merge. Once the merge is complete, the content can be drawn to the screen.

When multiple composite layers are composited, the 3D API's parameters can directly realize each layer's transform and opacity. So if you promote an element to a composite layer and then modify its transform or opacity with JS, or apply a CSS transition or animation on transform or opacity, the CPU's Paint step really is skipped: the transform and opacity are realized directly from the GPU's compositing parameters.

However, this only holds when the composite layer as a whole carries the transform or opacity. For an element that has not been promoted, its transform and opacity are merely part of some composite layer's content; generating that content and writing the bitmap or texture happens in the Paint and Rasterize phases, so the element's transform and opacity are implemented there too. Repaint still happens, and no GPU-accelerated animation kicks in.

For example, in this demo there is a promoted composite layer div#father and an unpromoted div#child; 3 seconds in, JS changes the transform of both child and father. What does the next render look like?

  1. Recalc Styles: recalculate styles
  2. Paint draws the changed composite layer, div#father's layer
    1. Paint draws the father element's background and text node
    2. Paint draws the child element, div#child
      1. Paint applies the child's translate
      2. Paint draws the child element's background and text node at the moved position
  3. Rasterize
  4. Composite merges the composite layers, with the 3D API parameters applying each layer's translation and rotation; #father's translate is realized here

So, as we can see, for elements not promoted to a composite layer, transform and opacity are implemented on the main thread via Paint and Rasterize, just like any other property that requires repainting, and they still trigger repaint; changing these two properties directly with JS gains nothing. But if the element has been promoted, transform, opacity, and similar styles are handled by the GPU Thread and realized during compositing on the GPU; the main thread's Composite step only computes the compositing parameters, which takes very little time. Hence the rule of thumb: animate with transform and opacity whenever possible.

To borrow an example from this article:

div {
    height: 100px;
    transition: height 1s linear;
}

div:hover {
    height: 200px;
}

The transition looks like this:

And if the code looks like this:

div {
    transform: scale(0.5);
    transition: transform 1s linear;
}

div:hover {
    transform: scale(1.0);
}



Strictly speaking, the work labeled Compositor here is split between the Compositor Thread and the GPU Thread; the original author did not mention the GPU Thread directly, in order to simplify the concepts and the exposition, so don't nitpick the details here.

In addition, the reason the div in the second example is promoted to a composite layer was explained in the earlier introduction of composite layers:

An animation or transition applied to opacity, transform, filter, or backdrop-filter (the animation or transition must be active: the promotion takes effect only between its start and end, and the layer is demoted before it starts and after it ends).

The part in parentheses matters. When animating properties like opacity or transform, the element is not promoted to a composite layer ahead of time; it is promoted when the animation or transition starts, and the promotion lapses when it ends. Moreover, repaints occur both when an element is promoted and when the promotion lapses. That is why the element is laid out and painted into a bitmap once more at the start of the animation: when the transition begins, the div is promoted to a composite layer, which immediately forces the layer it originally belonged to to repaint (the div's content must be removed from it), and the div is painted into its new layer. The two repainted layers are then rasterized and uploaded to the GPU.

The demo is here. This is the frame before the animation begins:

The frame at the end of the animation looks like this:

In the demo above there are only two DOM nodes, so the Paint overhead is negligible, but with more DOM it can easily look like this:

And this doesn't only happen with animations and transitions: whenever an element is promoted to a composite layer, repaints occur before the promotion and again when the promotion lapses. So besides the drawing cost of the repaint itself, there is the upload cost across the CPU-to-GPU bandwidth when textures are uploaded (Hardware Raster avoids the upload, but it cannot always be used, and drawing directly into textures has its own cost). Handled poorly, this produces a dropped frame right before and after the animation starts.

Finally, an important point, and one that is often covered in articles about performance tuning, is this:

Composite layer promotion is not a silver bullet.

On the one hand, promoting a composite layer can introduce the overhead of texture generation, upload, and repaint; on the other, every composite layer occupies GPU VRAM, which is far from abundant. Both problems are especially acute on mobile devices. And as the introduction of composite layers showed, promotion can also happen implicitly. So use it wisely.

This article mainly covers the principles, so how to achieve 16ms animation, how to improve rendering performance, how to optimize the number of compositing layers and avoid layer explosions, and what situations will improve compositing layers and trigger redrawing are detailed in the appendix at the end of this article.

Conclusion

This article is a fairly detailed walkthrough of the browser rendering process. It may require a prior understanding of repaint, reflow, and compositing, and it uses several demos to dig into points I had previously gotten wrong.

To recap, here are the things that surprised me most:

  • Per the HTML5 spec, the scroll event fires at most once per frame, giving it a built-in requestAnimationFrame-like throttling effect
  • In the Blink and WebKit implementations, UI inputs such as touchmove and mousemove are received by the Compositor Thread but passed to the main thread only once per frame, again with a requestAnimationFrame-like throttling effect
  • Repainting is done per composite layer
  • Paint steps occur before and after a composite layer is promoted

The article was started three weeks ago and finally finished over the May Day holiday. Phew…

References

Chromium official information

  • GPU Accelerated Compositing in Chrome
  • Compositor Thread Architecture
  • Multithreaded Rasterization
  • How to get GPU Rasterization

Rendering mechanisms

  • Jing Jin & Matthew Delaney: The Web’s Black Magic
  • (English) How Rendering Works (in WebKit and Blink)
  • (Chinese) Improving page rendering performance
  • GPU Animation: Doing It Right
  • Accelerated Rendering in Chrome
  • Google Developers articles, easy to follow and worth reading as a series: Rendering Performance
  • The Anatomy of a Frame by The Chrome Developer Relations Team

Practical performance optimization

  • Website Jank-Busting
  • Taobao FED wireless performance optimization
  • How (not) to trigger a layout in WebKit
  • An Introduction to Hardware Acceleration with CSS Animations
  • What forces layout / reflow
  • csstriggers.com/
  • jankfree.org/
  • PPT Rendering at 60fps
  • Optimising for 60fps everywhere
  • Optimizing CSS3 for GPU Compositing