Introduction
As front-end engineers, we should stay curious and keep asking why and how.
So I asked myself: how do browsers turn "dull" source code into "colorful" web pages?
The figure below gives an overview of the rendering process.
Reference: docs.google.com/presentatio…
Content
First, let's be clear about what "web content" means and which area it covers. Architecturally, the "content" namespace in the Chromium C++ code base is responsible for everything inside the red box in the picture. The tab bar, the address bar, the navigation buttons, and the menus are not part of the web page's content.
Content is a generic term in Chromium for all the code inside a web page or the front end of a web application. Its basic building blocks include:
- Others: there are of course many other kinds of rendered content, such as canvas, video, WebGL…
Open any web page, press F12, and a new world opens up. What do we find? A real web page is just thousands of lines of HTML, CSS, and JavaScript. This source code is the input to the renderer. There is no tedious compile-and-package step: we simply hand over the source. That simplicity was key to the early success of the web.
In the Chrome security model, rendering takes place in a sandboxed renderer process. Blink is a subset of the code in the renderer process and sits below the content layer. Blink implements the semantics of the web platform APIs and web specifications.
The renderer process also runs a component called the compositor (cc), which I'll describe later.
PS: I mentioned Chromium earlier, so what is it? As developers, you all know Chrome well; after all, "Chrome is unbeatable" is something we say all the time. Chromium is an open-source web browser project led by Google. Development may have started as early as 2006, with a design built around simplicity, speed, stability, and security. Architecturally it drew on Apple's WebKit engine, parts of Safari's source code, and work from Firefox, and Google developed its own V8 engine to speed up JavaScript execution. (In early Chrome, only HTML rendering used WebKit code, while JS ran on the in-house V8 engine. WebKit and V8 made a strong team, though there was surely plenty of "love and hate" along the way, which I won't go into. Blink was later forked from WebKit.) Chromium is the project Google started in order to build its own browser, Google Chrome, so Chromium is effectively the engineering or experimental version of Chrome (even though Chrome itself has a beta channel). New features land in Chromium first and reach Chrome after they have been verified, so Chrome's features lag slightly behind but are more stable, while Chromium updates quickly.
Pixels
At the other end of the rendering pipeline, we have to use the graphics libraries provided by the underlying operating system to put pixels on the screen. On most platforms today, that means a standardized API called OpenGL. On Windows, there is an additional translation to DirectX. In the future, newer APIs such as Vulkan may be supported. (Vulkan was proposed by the Khronos Group and presented at the 2015 Game Developers Conference. It was initially called the "Next Generation OpenGL Initiative", or "glNext", but those names were dropped at the official announcement. Vulkan aims for high performance and low CPU overhead.) These libraries provide low-level graphics primitives such as "textures" and "shaders", and let you do things like "draw a triangle at these coordinates into a virtual pixel buffer."
Goals
The goal of rendering is to convert HTML/CSS/JS into the right OpenGL calls to display the pixels. There is a second goal as well: we need the right intermediate data structures so that the rendering can be updated efficiently after it has been produced.
Updates can be triggered for a number of reasons:
- A JS script triggers the update
- User input
- Asynchronous loading
- Animation
- Scrolling
- And so on
We divide the pipeline into a number of "lifecycle stages". I will first describe each stage of the working pipeline and then return to the question of efficient updates.
DOM
HTML tags impose a semantically meaningful hierarchy on the document. For example, a div can contain two paragraphs, each with text. So the first step is to parse these tags and build an object model that reflects this structure.
<div>
<p> hello </p>
<p> world </p>
</div>
In the code above, HTML tags are nested: the div contains two p tags, and each p tag contains text. You end up with a tree like this. (I also recommend that every developer, front-end developers included, study compiler principles.)
If you look at the picture above, anyone with a bit of computer science background will recognize a tree structure. It is called the Document Object Model (DOM).
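To make the tree-building step concrete, here is a minimal sketch in Python using the standard library's `html.parser`. The `Node`, `TreeBuilder`, `build_tree`, and `dump` names are made up for illustration and are not Blink's classes. The sketch assumes well-formed markup without void elements such as `<br>`; a real HTML parser handles those and recovers from errors.

```python
from html.parser import HTMLParser


class Node:
    """A tiny stand-in for a DOM node: a tag name (or "#text") plus children."""

    def __init__(self, name, text=None):
        self.name = name
        self.text = text
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a Node tree by keeping a stack of the currently open elements."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)  # child of the innermost open tag
        self.stack.append(node)

    def handle_endtag(self, tag):
        # Assumption: input is well formed, so the end tag closes the top of the stack.
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        if data.strip():
            self.stack[-1].children.append(Node("#text", data.strip()))


def build_tree(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def dump(node, depth=0):
    """Returns one indented line per node, parents before children."""
    label = node.name if node.text is None else f'{node.name} "{node.text}"'
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


tree = build_tree("<div><p> hello </p><p> world </p></div>")
print("\n".join(dump(tree)))
```

The stack mirrors the nesting of the tags: every start tag becomes a child of whatever element is currently open, which is exactly why nested markup turns into a tree.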
The DOM serves two purposes:
- It is the internal representation of the page
- It is also the API exposed to scripts for querying and modifying the page during rendering
The JavaScript engine (V8) exposes the DOM web APIs as thin wrappers around the real DOM tree through a system called "bindings". A document can contain multiple DOM trees: custom elements can have a shadow tree. Children of the shadow host in the main tree are assigned to slots in the shadow tree.
The FlatTreeTraversal algorithm walks the combined view of these trees: it goes from a host to its shadow root, and from a slot to its assigned nodes.
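The two hops described above (host to shadow root, slot to assigned nodes) can be sketched as follows. This is only an illustration of the idea, not Blink's FlatTreeTraversal code: `Node`, `attach_shadow`, `flat_children`, and `flat_walk` are hypothetical names, and slot assignment is reduced to matching by name.

```python
class Node:
    def __init__(self, name, children=None, slot_name=None):
        self.name = name
        self.children = children or []
        # For a <slot>: the slot's own name. For other nodes: the slot they ask for.
        self.slot_name = slot_name
        self.shadow_root = None   # set when this node becomes a shadow host
        self.assigned = []        # filled in for <slot> nodes


def attach_shadow(host, shadow_root):
    """Attaches a shadow tree and assigns the host's children to matching slots."""
    host.shadow_root = shadow_root
    slots = {}

    def collect(node):
        if node.name == "slot":
            slots[node.slot_name] = node
        for child in node.children:
            collect(child)

    collect(shadow_root)
    for child in host.children:
        target = slots.get(child.slot_name)
        if target is not None:
            target.assigned.append(child)


def flat_children(node):
    if node.shadow_root is not None:
        return [node.shadow_root]          # host -> shadow root
    if node.name == "slot" and node.assigned:
        return node.assigned               # slot -> assigned nodes
    return node.children                   # ordinary node


def flat_walk(node, depth=0):
    lines = ["  " * depth + node.name]
    for child in flat_children(node):
        lines.extend(flat_walk(child, depth + 1))
    return lines


host = Node("my-card", [Node("h1", slot_name="title"), Node("p")])
shadow = Node("#shadow-root", [Node("slot", slot_name="title"), Node("slot")])
attach_shadow(host, shadow)
print("\n".join(flat_walk(host)))
```

Note that the host's own children are never visited directly: they only show up where a slot pulls them in, which is why unslotted children of a shadow host are not rendered.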
Style
After building the DOM tree, the next step is to process the CSS styles. The property declarations in a CSS rule are applied to the DOM elements that its selector matches.
Style properties are the means by which web authors influence the rendering of DOM elements. There are hundreds of style properties.
In addition, working out which elements a style rule selects is not trivial. An element may be matched by more than one rule, and those rules may contain conflicting declarations for the same property, as in the code below.
div:not(.foo) > p:nth-of-type(2n) { color: red !important; }
p { color: blue; }
The CSS parser builds a model of the style rules from each active style sheet. Style sheets can live in <style> elements, can be separately loaded resources (styles.css), or can be supplied by the browser itself.
The style resolver takes all the parsed style rules from the active style sheets and computes the final value of every style property for every DOM element. These values are stored in an object called ComputedStyle, which is essentially a large map from style properties to values.
Chrome DevTools shows the computed style of any DOM element. The computed style is also exposed to JavaScript (through getComputedStyle). Both are built on Blink's ComputedStyle object, although some properties are augmented with layout data.
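The conflict between the two `color` declarations in the CSS example above can be resolved with a simplified cascade. The sketch below is an assumption-laden illustration rather than Blink's style resolver: `Declaration` and `cascade` are hypothetical names, selector matching is skipped (the matched declarations are listed by hand), the specificity tuples are computed by hand, and origins, layers, and inheritance are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Declaration:
    prop: str
    value: str
    specificity: tuple   # (ids, classes, types), worked out by hand here
    order: int           # position in the style sheet
    important: bool = False


def cascade(declarations):
    """Picks one winning value per property.

    Simplified ordering: !important beats normal declarations, then higher
    specificity wins, then later source order wins.
    """
    winners = {}
    for decl in declarations:
        key = (decl.important, decl.specificity, decl.order)
        best = winners.get(decl.prop)
        if best is None or key > best[0]:
            winners[decl.prop] = (key, decl.value)
    return {prop: value for prop, (_, value) in winners.items()}


# Assumed to match the second <p> inside a <div> that lacks class "foo".
# div:not(.foo) > p:nth-of-type(2n) has specificity (0, 2, 2); p has (0, 0, 1).
matched = [
    Declaration("color", "red", (0, 2, 2), order=0, important=True),
    Declaration("color", "blue", (0, 0, 1), order=1),
]
computed_style = cascade(matched)
print(computed_style)
```

Here red wins twice over: the declaration is marked `!important`, and its selector is also more specific. Only when both of those tie does the later declaration win.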
Layout
After building the DOM and computing all the styles, the next step is to determine the visual geometry of every element. For a block-level element, we compute the coordinates of a rectangle that corresponds to the geometric area the element occupies in the content area.
In the simplest case, layout places blocks one after another in DOM order, moving vertically down the page. This is called "block flow" (see the figure).
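Block flow in its simplest form can be sketched in a few lines. This is a toy model under stated assumptions: each block already knows its height, fills the full container width, and has no margins, padding, or floats. `Rect` and `layout_block_flow` are names invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    x: float
    y: float
    width: float
    height: float


def layout_block_flow(heights, container_width, origin_y=0.0):
    """Stacks blocks top to bottom in DOM order; each one fills the container width."""
    rects = []
    cursor_y = origin_y
    for height in heights:
        rects.append(Rect(0.0, cursor_y, container_width, height))
        cursor_y += height  # the next block starts where this one ends
    return rects


for rect in layout_block_flow([20, 30, 10], container_width=100):
    print(rect)
```

The only state carried from one block to the next is the vertical cursor, which is what makes plain block flow cheap compared with the more complex layout modes.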
Text and inline elements such as span generate inline boxes. Inline boxes usually flow from left to right within a line; this is called inline flow. RTL languages such as Arabic and Hebrew reverse the inline flow so that it runs from right to left.
Layout needs to measure text using the font from the computed style. It uses a text shaping library called HarfBuzz to compute the size and position of each glyph, which determines the overall width of a run of text. Text shaping must take font features such as kerning and ligatures into account.
Layout can compute several kinds of bounding rectangles for one element. For example, when there is overflow, layout computes both the border box rectangle and the layout overflow rectangle. If the node's overflow is scrollable, layout also computes the scroll boundaries and reserves space for the scrollbars. The most common scrollable DOM node is the document itself (the root of the DOM tree).
Some cases need more complex layout: table elements, content split into columns, floating objects on one side with content flowing around them, or text in some East Asian languages that runs vertically instead of horizontally. Notice that the DOM structure and the ComputedStyle values are the inputs to the layout algorithm. Each pipeline stage uses the results of the previous stages.
Layout operates on a separate layout tree that is associated with the DOM. The nodes of the layout tree implement the layout algorithms. LayoutObject has different subclasses (block, text…) depending on the desired layout behavior. The style update phase also builds the layout tree. The layout phase walks the layout tree, performs layout for each LayoutObject, and computes its visual geometry.
In general, one DOM node corresponds to one LayoutObject. But sometimes a DOM node has no LayoutObject, sometimes a LayoutObject has no DOM node, and sometimes one DOM node corresponds to several LayoutObjects. (If a container box contains a block box, it may contain only block boxes.)
- In the figure above, a container contains a div and a span. The span is an inline element and the div is a block-level element. To ensure that the container holds only block-level boxes, the span's layout object is wrapped in an anonymous block-level box. This is a case where a LayoutObject has no DOM node.
- If the display property in an element's computed style is none, the element has no corresponding LayoutObject.
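Both cases above, the anonymous block wrapper and `display: none`, can be sketched with a small tree builder. It is a simplification under assumptions: `DomNode`, `LayoutObject`, `build_layout_tree`, and `describe` are illustrative names, and `display` is limited to "block", "inline", and "none".

```python
class DomNode:
    def __init__(self, tag, display="block", children=None):
        self.tag = tag
        self.display = display          # "block", "inline" or "none"
        self.children = children or []


class LayoutObject:
    def __init__(self, kind, dom_node=None):
        self.kind = kind                # "block" or "inline"
        self.dom_node = dom_node        # None marks an anonymous box
        self.children = []


def build_layout_tree(dom_node):
    if dom_node.display == "none":
        return None                     # DOM node without a LayoutObject
    obj = LayoutObject(dom_node.display, dom_node)
    kids = [k for k in map(build_layout_tree, dom_node.children) if k is not None]
    if any(k.kind == "block" for k in kids):
        # A box with a block child may only hold blocks:
        # wrap each run of inline children in an anonymous block.
        wrapped = []
        for kid in kids:
            if kid.kind == "inline":
                if not (wrapped and wrapped[-1].dom_node is None):
                    wrapped.append(LayoutObject("block"))
                wrapped[-1].children.append(kid)
            else:
                wrapped.append(kid)
        kids = wrapped
    obj.children = kids
    return obj


def describe(obj):
    name = obj.dom_node.tag if obj.dom_node else "anonymous"
    return (name, [describe(child) for child in obj.children])


dom = DomNode("div", children=[
    DomNode("div"),
    DomNode("span", display="inline"),
    DomNode("p", display="none"),
])
layout_root = build_layout_tree(dom)
print(describe(layout_root))
```

In the result, the hidden `p` has disappeared, and the `span` sits inside an anonymous block that exists only in the layout tree.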
The layout tree is built by walking the DOM with the FlatTreeTraversal algorithm. (PS: this is worth studying further.)
The layout engine is being rewritten, so the layout tree currently contains a mix of legacy layout objects and next-generation (NG) ones. Eventually every layout object will be an NG object. Legacy layout objects hold the layout inputs and outputs together with the layout algorithm, so any object can see the state of the whole tree. In NG, inputs and outputs are cleanly separated, and the output is an immutable, cacheable layout result.
The layout result points to a fragment tree that describes the physical geometry.
Example
Here's an example. The code is in the upper left corner and the rendered result is in the lower right corner.
So what DOM does this code correspond to?
<div style="max-width: 100px">
<div style="float: left; padding: 1ex">F</div>
<br>The <b>quick brown</b> fox
<div style="margin: -60px 0 0 80px">jumps</div>
</div>
With the DOM tree on the left, what layout tree does it correspond to?
The resulting layout tree needs some explanation. The container div has a block-level div inside it, so to ensure that the container holds only block-level boxes, the inline content is wrapped in an anonymous block-level box. (Note that the first div has the float property, so it is taken out of the normal block flow and is placed alongside the inline content.)
What's next? Layout computes and generates a fragment tree, which carries the information describing the physical geometry.
Box (block-flow) at 0,0 100x12
  Box (block-flow children-inline) at 0,0 100x54
    LineBox at 24.9,0 0x18
      Box (floating block-flow children-inline) at 0,0 24.9x34
        LineBox at 8,8 8.9x18
          Text 'F' at 8,8 8.9x17
      Text '\n' at 24.9,0 0x17
    LineBox at 24.9,18 67.1x18
      Text 'The ' at 24.9,18 28.9x17
      Text 'quick' at 53.8,18 38.25x17
    LineBox at 0,36 69.5x18
      Text 'brown' at 0,36 44.2x17
      Text 'fox' at 44.2,36 25.3x17
  Box (block-flow children-inline) at 80,-6 20x18
    LineBox at 0,0 39.125x18
      Text 'jumps' at 0,0 39.125x17
The fragment tree looks something like this: every entry carries its coordinates along with its width and height.
Paint
Now that we know the geometry of our layout objects, it's time to paint them. The paint stage records paint operations into a list of display items. A paint operation might be something like "draw a rectangle with this color at these coordinates." Each layout object may have several display items that correspond to different parts of its visual appearance, such as the background, the foreground, and the outline.
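A display item list can be pictured as plain recorded data. The sketch below is an illustration only: `DisplayItem` and `paint_box` are hypothetical names, and the operation strings are made up for readability rather than taken from Chromium's paint ops.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DisplayItem:
    client: str   # the layout object this item belongs to
    kind: str     # "background", "foreground" or "outline"
    op: str       # human-readable description of the paint operation


def paint_box(name, rect, background=None, text=None, outline=None):
    """Records, rather than executes, the paint operations for one box."""
    x, y, width, height = rect
    items = []
    if background is not None:
        items.append(DisplayItem(
            name, "background", f"drawRect {x},{y} {width}x{height} color={background}"))
    if text is not None:
        items.append(DisplayItem(
            name, "foreground", f"drawText {x},{y} text={text!r}"))
    if outline is not None:
        items.append(DisplayItem(
            name, "outline", f"drawOutline {x},{y} {width}x{height} color={outline}"))
    return items


display_list = paint_box("div", (0, 0, 100, 50), background="yellow", text="hello")
for item in display_list:
    print(item)
```

Because nothing is drawn at this point, the list can be stored, compared with the previous frame, and replayed later, which is the reason paint records operations instead of touching pixels directly.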
It is important to draw the elements in the correct order so that they stack correctly when overlapped. Order can be controlled by style (z-index), as shown in the figure below. (Note: z-index only works on positioned elements.)
In the code above, the DOM tree looks like this, but paint does not simply paint the yellow box first and the green box second as the DOM order would suggest. Paint follows the stacking order instead.
One element can even be partially in front of and partially behind another element. This is because paint runs in multiple phases, each of which makes its own traversal of the subtree. Each paint phase is a separate traversal of a stacking context.
For instance, in the example above the blue box comes after the green one, but because foregrounds are painted after backgrounds, the text still ends up on top.
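The effect of paint phases can be shown with a short sketch. It assumes just three phases taken from the parts named earlier (background, foreground, outline) and a single stacking context; `PHASES` and `paint_stacking_context` are illustrative names, and the box names are placeholders rather than a reconstruction of the figure.

```python
PHASES = ("background", "foreground", "outline")


def paint_stacking_context(boxes):
    """Walks all boxes once per phase and returns the operations in paint order.

    boxes: list of (name, parts) pairs in tree order, where parts is the set
    of phases in which that box has something to paint.
    """
    ops = []
    for phase in PHASES:              # one full traversal per phase
        for name, parts in boxes:
            if phase in parts:
                ops.append(f"{name}:{phase}")
    return ops


boxes = [
    ("first", {"background", "foreground"}),   # a box with text
    ("second", {"background"}),                # a later box that overlaps it
]
print(paint_stacking_context(boxes))
```

The first box's background is painted before the second box's background, yet its text is painted after it. Where the two overlap, the first box is therefore partly behind and partly in front of the second.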
Paint – example
That's the code and the rendered result. So what do its display items look like during the paint stage?
First the root document is painted, then the box, then the foreground. The paint operation for a run of text contains a blob that holds the identifier and offset of each glyph.
Conclusion
That wraps up the first half of the browser rendering mechanism. The next article will start with rasterization. There will be two or three articles covering the whole rendering mechanism, plus some extensions.
Chicken soup (here’s the key)
Whatever doesn't defeat you makes you stronger. Let's go, 2021! This is for myself and for all the engineers out there.