Author | Tali Garsiel
Reprint | front-end time and space
Source | www.html5rocks.com/zh/tutorial…
Preface
This comprehensive primer on the internal operations of WebKit and Gecko is the result of extensive research by Israeli developer Tali Garsiel. Over the past few years, she reviewed all the publicly available data about browser internals and spent a lot of time reading the source code of web browsers. She wrote:
“In the days when IE had 90% of the market, there was nothing we could do but treat the browser as a black box. But now that open source browsers hold more than half of the market share, it is time to demystify the web browser and take a look inside. Well, there are only millions of lines of C++ code in there…”
Tali published her research on her website, but we think it deserves to be read by more people, so we are republishing it here.
As a web developer, learning the inner workings of browsers will help you make more informed decisions and understand the reasoning behind development best practices. Although this is quite a long document, we recommend that you take the time to read it carefully; after reading it, you will surely feel it was well worth the effort.
Introduction
The web browser is probably the most widely used piece of software. In this introductory article, I will look behind the scenes at how browsers work. We will see what happens from the moment you type google.com into the address bar until you see the Google home page on your browser screen.
Table of contents
- Introduction
- The browsers we will talk about
- The main functions of the browser
- The high-level structure of the browser
- The rendering engine
- Rendering engines
- Main process
- Main process example
- Parsing and DOM tree building
- Parsing: overview
- Grammar
- A combination of parser and lexical analyzer
- Translation
- Parsing example
- Formal definitions of vocabulary and grammar
- Parser types
- Automatically generating parsers
- HTML parser
- HTML syntax definition
- Non-context-free syntax
- HTML DTD
- DOM
- Parsing algorithm
- Tokenization algorithm
- Tree building algorithm
- Operations after parsing
- Fault tolerance in browsers
- CSS parsing
- WebKit CSS parser
- The order in which scripts and stylesheets are processed
- Scripts
- Speculative parsing
- Style sheets
- Render tree construction
- Rendering tree and DOM tree relationship
- The process of building a rendering tree
- Style calculation
- Sharing style data
- Firefox rule tree
- Its structure
- Evaluate the style context using a rule tree
- Rules are processed to simplify matching
- Apply the rules in the correct cascading order
- Style sheet cascading order
- Specificity
- Rule ordering
- Progressive processing
- Layout
- The dirty bit system
- Global layout and incremental layout
- Asynchronous and synchronous layouts
- Optimizations
- Layout processing
- Width calculation
- Line breaking
- Painting
- Global and incremental drawing
- Drawing order
- Firefox display list
- WebKit rectangle storage
- Dynamic change
- Render engine threads
- Event loop
- CSS2 visual model
- The canvas
- CSS box model
- Positioning scheme
- Box type
- Positioning
- Relative positioning
- Floating positioning
- Absolute and fixed positioning
- Layered representation
- Resources
The browsers we will talk about
There are five major browsers in use today: Internet Explorer, Firefox, Safari, Chrome and Opera. This article draws its examples from the open source browsers: Firefox, Chrome, and Safari (which is partly open source). According to StatCounter, as of August 2011 Firefox, Safari, and Chrome have a combined market share of nearly 60%. Open source browsers are now a very substantial part of the browser market.
The main functions of the browser
The main function of a browser is to present the web resource you choose by requesting it from the server and displaying it in the browser window. The resource is usually an HTML document, but it may also be a PDF, an image, or some other type of content. The location of the resource is specified by the user using a URI (Uniform Resource Identifier).
How browsers interpret and display HTML files is specified in the HTML and CSS specifications. These specifications are maintained by the Web standardization group W3C (World Wide Web Consortium). For years, browsers failed to fully comply with these specifications while developing their own extensions, which created serious compatibility problems for web developers. Today, most browsers are more or less compliant.
The browser user interface has many elements in common with each other, including:
- The address bar used to enter the URI
- Forward and back buttons
- Bookmarking options
- Refresh and stop buttons that refresh and stop loading the current document
- The home button to return to the home page
Oddly enough, the browser’s user interface is not defined in any formal specification; it is simply the result of best practices shaped over years of experience and of browsers imitating one another. The HTML5 specification does not define the user interface elements a browser must have either, but it lists some common ones, such as the address bar, status bar, and toolbar. Of course, each browser can have its own unique features, such as Firefox’s download manager.
The high-level structure of the browser
The main components of the browser are (1.1):
- User interface – includes address bar, forward/back buttons, bookmark menu, etc. All parts of the display belong to the user interface, except for the page you requested displayed in the browser’s main window.
- Browser engine – passes instructions between the user interface and the rendering engine.
- Rendering engine – Responsible for displaying the requested content. If the requested content is HTML, it is responsible for parsing the HTML and CSS content and displaying the parsed content on the screen.
- Network – Used for network calls, such as HTTP requests. Its interfaces are platform independent and provide an underlying implementation for all platforms.
- User interface back end – Used to draw basic widgets, such as combo boxes and windows. It exposes a generic, platform-independent interface, while underneath it uses the user interface methods of the operating system.
- JavaScript interpreter – Used to parse and execute JavaScript code.
- Data storage – This is the persistence layer. The browser needs to save all kinds of data, such as cookies, on the hard disk. The new HTML specification (HTML5) defines a “web database,” which is a complete (although lightweight) database in the browser.
It is worth noting that, unlike most browsers, Chrome holds multiple instances of the rendering engine, one for each tab. Each tab is a separate process.
The rendering engine
The responsibility of the rendering engine is, well, rendering: displaying the requested content on the browser screen.
By default, the rendering engine can display HTML and XML documents and images. It can display other types of content through plug-ins (or browser extensions); for example, PDF documents can be displayed using a PDF viewer plug-in. In this chapter, however, we will focus on its main use case: displaying HTML and images that are formatted using CSS.
Rendering engines
The browsers discussed in this article (Firefox, Chrome, and Safari) are built on two rendering engines. Firefox uses Gecko, Mozilla’s “homemade” rendering engine. Safari and Chrome both use WebKit.
WebKit is an open source rendering engine originally developed for Linux and later modified by Apple to support Macs and Windows. For more information, see webkit.org.
Main process
The rendering engine starts by getting the contents of the requested document from the networking layer. This is usually done in 8K chunks.
After that, the basic flow of the rendering engine is as follows:
The rendering engine starts parsing the HTML document and turns the tags into DOM nodes in a tree called the “content tree.” It also parses the style data, both in external CSS files and in style elements. The style information, together with the visual instructions in the HTML, is used to create another tree: the render tree.
The render tree contains rectangles with visual attributes such as color and dimensions. The rectangles are in the order in which they will be displayed on the screen.
Once the render tree is built, it goes through a “layout” process, in which each node is given the exact coordinates where it should appear on the screen. The next stage is painting: the render tree is traversed and each node is painted using the UI backend layer.
It is important to understand that this is a gradual process. For a better user experience, the rendering engine tries to display content on the screen as soon as possible. It does not wait until all of the HTML is parsed before it starts building and laying out the render tree. Parts of the content are parsed and displayed while the engine keeps processing the rest of the content that arrives from the network.
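The gradual flow described above can be imitated with Python's standard `html.parser` module, which accepts input in pieces. This is only a sketch of incremental parsing, not of how any browser is implemented: `TagLogger` and `feed_in_chunks` are hypothetical names, and the default chunk size of 8192 characters simply mirrors the 8K figure mentioned above.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Records the start tags seen so far (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append(tag)


def feed_in_chunks(parser, markup, chunk_size=8192):
    """Feed markup piece by piece, yielding the tags known after each piece."""
    for start in range(0, len(markup), chunk_size):
        # The parser keeps an incomplete tag in its buffer until more input arrives.
        parser.feed(markup[start:start + chunk_size])
        yield list(parser.seen)
    parser.close()
```

Each yielded snapshot grows as more input arrives, which is the point of the exercise: useful structure is available long before the last chunk has been received.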
Main process Example
As you can see from Figures 3 and 4, although WebKit and Gecko use slightly different terms, the overall process is basically the same.
Gecko calls the tree of visually formatted elements a “frame tree.” Each element is a frame. WebKit uses the term “render tree,” which consists of “render objects.” WebKit uses the term “layout” for the placing of elements, while Gecko calls it “reflow.” “Attachment” is the term WebKit uses for connecting DOM nodes and visual information to create the render tree. A minor non-semantic difference is that Gecko has an extra layer between the HTML and the DOM tree, called the “content sink,” which is a factory for making DOM elements. We will go through each part of the flow:
Parsing: overview
Parsing is a very significant process within the rendering engine, so we will go into it a little more deeply. Let’s begin with a short introduction to parsing.
Parsing a document means translating it into a structure that code can understand and use. The result of parsing is usually a tree of nodes that represents the structure of the document, called a parse tree or a syntax tree.
Example – parsing the expression 2 + 3 - 1 returns a parse tree (Figure 5).
grammar
Parsing is based on the syntax rules the document obeys, that is, the language or format it was written in. Every format that can be parsed must have a deterministic grammar consisting of vocabulary and syntax rules. This is called a context-free grammar. Human languages are not such languages and therefore cannot be parsed with conventional parsing techniques.
A combination of parser and lexical analyzer
The process of parsing can be divided into two sub-processes: lexical analysis and syntax analysis.
Lexical analysis is the process of breaking the input into tokens. Tokens are the vocabulary of the language: the collection of its valid building blocks. In human language, they correspond to all the words that appear in the dictionary for that language.
Syntax analysis is the application of the syntax rules of the language.
Parsers usually divide the work between two components: the lexer (sometimes called the tokenizer), which breaks the input into valid tokens, and the parser, which builds the parse tree by analyzing the document structure according to the syntax rules of the language. The lexer knows how to strip irrelevant characters such as white space and line breaks.
The parsing process is iterative. The parser usually asks the lexer for a new token and tries to match it against one of the syntax rules. If a rule is matched, a node corresponding to the token is added to the parse tree and the parser asks for another token.
If no rule matches, the parser stores the token internally and keeps asking for tokens until it finds a rule that matches all of the internally stored tokens. If no rule is found, the parser raises an exception. This means the document is not valid and contains syntax errors.
translation
In many cases, parsing trees are not the final product. Parsing is usually used in translation, which is to convert the input document into another format. Compilation is one such example. The compiler compiles source code into machine code by first parsing the source code into a parse tree, and then translating the parse tree into a machine code document.
Parsing example
In Figure 5, we build the parse tree through a mathematical expression. Now, let’s try to define a simple mathematical language that demonstrates the parsing process.
Vocabulary: The language we use can include integers, plus and minus signs.
Grammar:
1. The syntax building blocks of the language are expressions, terms, and operations.
2. The language can include any number of expressions.
3. An expression is defined as a “term” followed by an “operation” followed by another “term”.
4. An operation is a plus token or a minus token.
5. A term is an integer token or an expression.
Let’s analyze the input 2 + 3 - 1.
The first substring that matches a rule is 2: according to rule 5 it is a term. The second match is 2 + 3: according to rule 3 (a term followed by an operation followed by another term) it is an expression. The next match is only reached at the end of the input. 2 + 3 - 1 is an expression because we already know that 2 + 3 is a term, so we again have a term followed by an operation followed by another term. An input that matches no rule, such as 2 + +, is invalid.
Formal definitions of vocabulary and grammar
Vocabulary is usually expressed by regular expressions.
For example, our sample language could be defined as follows:
INTEGER: 0|[1-9][0-9]*
PLUS: +
MINUS: -
As you can see, integers are defined using regular expressions.
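As a rough illustration, the vocabulary above can be turned into a small lexer with Python's `re` module. The function name `tokenize` and the extra `SKIP` pattern for white space are assumptions made for this sketch; the sample language itself does not say how white space is treated.

```python
import re

# Token patterns taken from the sample vocabulary; SKIP is an added assumption.
TOKEN_SPEC = [
    ("INTEGER", r"0|[1-9][0-9]*"),
    ("PLUS", r"\+"),
    ("MINUS", r"-"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Split text into (kind, lexeme) pairs, raising on unknown characters."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":  # white space is stripped, not emitted
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling `tokenize("2 + 3 - 1")` yields five tokens: three INTEGER tokens with a PLUS and a MINUS between them. A character outside the vocabulary, such as `*`, raises `SyntaxError`.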
Syntax is usually defined using a format called BNF. Our sample language can be defined as follows:
expression := term operation term
operation := PLUS | MINUS
term := INTEGER | expression
As we said earlier, a language can be parsed by conventional parsers if its grammar is a context-free grammar. An intuitive definition of a context-free grammar is one that can be expressed entirely in BNF. For a formal definition, see the Wikipedia article on context-free grammars.
Parser type
There are two basic types of parsers: top-down and bottom-up parsers. Intuitively, a top-down parser starts with the high-level structure of the syntax and tries to find a matching structure. Bottom-up parsers, on the other hand, start from low-level rules and gradually convert input into syntax rules until high-level rules are met.
Let’s take a look at how both parsers parse our example:
A top-down parser starts from the higher-level rule: it first identifies 2 + 3 as an expression, and then identifies 2 + 3 - 1 as an expression (the process of identifying an expression involves matching the other rules, but the starting point is the highest-level rule).
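A top-down parser for the sample language might look roughly like the sketch below. Two assumptions are worth stating: the function name `parse_expression` is hypothetical, and the grammar is rewritten as "an integer followed by any number of (operation, integer) pairs", because the rule `term := INTEGER | expression` is left-recursive and cannot be followed literally by a recursive-descent parser.

```python
import re


def parse_expression(text):
    """Parse INTEGER (operation INTEGER)* top-down into a nested tuple tree."""
    # Simplified tokenization: characters outside the vocabulary are skipped.
    tokens = re.findall(r"\d+|[+-]", text)
    pos = 0

    def term():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError(f"expected an integer at token {pos}")
        value = int(tokens[pos])
        pos += 1
        return value

    # Start from the highest-level rule: an expression begins with a term.
    tree = term()
    while pos < len(tokens):
        operation = tokens[pos]
        if operation not in ("+", "-"):
            raise SyntaxError(f"expected an operation at token {pos}")
        pos += 1
        # Everything parsed so far becomes the left operand of the new node.
        tree = (operation, tree, term())
    return tree
```

For the input `2 + 3 - 1` the result is `("-", ("+", 2, 3), 1)`: the parser recognizes 2 + 3 as an expression first and then the whole input, in the same order as the description above.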
The bottom-up parser scans the input, finds a matching rule, and replaces the matching input with a rule. Continue substitution until the end of the input. Partially matched expressions are stored on the parser stack.
Stack | Input |
---|---|
(empty) | 2 + 3 - 1 |
term | + 3 - 1 |
term operation | 3 - 1 |
expression | - 1 |
expression operation | 1 |
expression | (empty) |
This type of bottom-up parser is called a shift-reduce parser, because the input is shifted to the right (imagine a pointer that starts at the beginning of the input and moves toward the end) and is gradually reduced to syntax rules.
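The shift and reduce steps can be traced with a few lines of Python. This is a sketch that hard-codes the single reduction the sample grammar needs, not a general shift-reduce parser; the function name `shift_reduce` is hypothetical.

```python
import re


def shift_reduce(text):
    """Return the (stack, remaining input) pairs of a bottom-up parse."""
    remaining = re.findall(r"\d+|[+-]", text)
    stack = []
    steps = [(" ".join(stack), " ".join(remaining))]
    while remaining:
        # Shift: move one token from the input onto the stack.
        token = remaining.pop(0)
        stack.append("term" if token.isdigit() else "operation")
        # Reduce: "term operation term" becomes an expression. An expression
        # also counts as a term, so it may appear as the left operand.
        if (len(stack) >= 3 and stack[-1] == "term" and stack[-2] == "operation"
                and stack[-3] in ("term", "expression")):
            del stack[-3:]
            stack.append("expression")
        steps.append((" ".join(stack), " ".join(remaining)))
    if stack != ["expression"]:
        raise SyntaxError("input does not reduce to a single expression")
    return steps
```

For `2 + 3 - 1` the recorded stack goes from empty to term, term operation, expression, expression operation, and finally expression, while the remaining input shrinks one token at a time.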
Automatically generate parsers
There are tools that can generate a parser for you, called parser generators. You feed them the grammar of your language (its vocabulary and syntax rules) and they generate a working parser. Creating a parser requires a deep understanding of parsing, and creating and optimizing one by hand is not easy, so parser generators can be very useful.
WebKit uses two well-known parser generators: Flex for creating a lexer and Bison for creating a parser (you may also run into them under the names Lex and Yacc). The input to Flex is a file containing regular expression definitions of the tokens. The input to Bison is the syntax rules of the language in BNF format.
HTML parser
The job of the HTML parser is to parse the HTML markup into a parse tree.
HTML Syntax definition
HTML vocabulary and syntax are defined in specifications created by the W3C organization. The current version is HTML4 and HTML5 is in process.
Non-context-free syntax
As we learned in the introduction to the parsing process, the syntax can be formally defined in formats such as BNF.
Unfortunately, none of the conventional parser topics apply to HTML (I did not bring them up just for fun: they are used for parsing CSS and JavaScript). HTML cannot easily be defined by the context-free grammar that parsers need.
There is a formal format for defining HTML, the DTD (Document Type Definition), but it is not a context-free grammar.
This may seem strange at first: HTML and XML are very similar. There are many XML parsers available. There is an XML variant of HTML (XHTML), so what’s the big difference?
The difference is that HTML is more “forgiving”: it lets you omit certain tags (which are then added implicitly), or sometimes omit start or end tags, and so on. On the whole it is a “soft” syntax, as opposed to the stiff and demanding syntax of XML.
This seemingly small detail makes a world of difference. On the one hand, it is the main reason HTML is so popular: it forgives your mistakes and makes life easy for the web author. On the other hand, it makes it difficult to write a formal grammar. To summarize, HTML cannot be parsed easily by conventional parsers, since its grammar is not context-free, and it cannot be parsed by XML parsers.
HTML DTD
HTML is defined in DTD format. This format is used to define languages of the SGML family. It contains definitions for all allowed elements, their attributes, and their hierarchy. As we saw earlier, the HTML DTD does not form a context-free grammar.
There are a few variations of the DTD. Strict mode conforms solely to the specification, while other modes also support markup used by browsers in the past. The purpose is backward compatibility with older content. The current strict DTD can be found here: www.w3.org/TR/html4/st…
DOM
The parser’s output “parse tree” is a tree structure made up of DOM elements and attribute nodes. DOM stands for Document Object Model. It is an object representation of an HTML document and an interface between external content (such as JavaScript) and HTML elements. The root node of the parse tree is the “Document” object.
There is almost a one-to-one correspondence between DOM and markup. For example:
<html>
<body>
<p>
Hello World
</p>
<div> <img src="example.png"/></div>
</body>
</html>
This can be translated into the following DOM tree:
Like HTML, DOM is specified by the W3C organization. See www.w3.org/DOM/DOMTR. This is a general specification for document manipulation. One particular module describes elements that are specific to HTML. The definition of HTML can be found here: www.w3.org/TR/2003/REC… .
When I say a tree contains DOM nodes, I mean that the tree is made up of elements that implement some DOM interface. The browser will have additional properties in the concrete implementation for internal use.
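To make the correspondence between markup and tree concrete, the sketch below parses the sample markup with Python's `xml.dom.minidom` and prints an outline of the resulting nodes. Note the assumption: `minidom` is an XML parser, and it only works here because this particular sample happens to be well-formed XML. A browser uses a dedicated HTML parser instead. The helper name `outline` is hypothetical.

```python
from xml.dom import minidom

MARKUP = """<html>
<body>
<p>
Hello World
</p>
<div> <img src="example.png"/></div>
</body>
</html>"""


def outline(node, depth=0):
    """Return indented lines naming each element and each non-blank text node."""
    lines = []
    if node.nodeType == node.ELEMENT_NODE:
        lines.append("  " * depth + node.tagName)
    elif node.nodeType == node.TEXT_NODE and node.data.strip():
        lines.append("  " * depth + repr(node.data.strip()))
    for child in node.childNodes:
        lines.extend(outline(child, depth + 1))
    return lines


document = minidom.parseString(MARKUP)  # the root of the tree is a Document object
tree_lines = outline(document.documentElement)
print("\n".join(tree_lines))
```

The outline shows `html` containing `body`, which in turn contains `p` (with its text node) and `div` (with `img`), which is the near one-to-one relation to the markup described above.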
Parsing algorithm
As we said in the previous section, HTML cannot be parsed with a conventional top-down or bottom-up parser.
Here’s why:
- The forgiving nature of the language.
- Browsers have traditionally been tolerant of well-known cases of invalid HTML.
- The parsing process is reentrant. For other languages, the source does not change during parsing, but in HTML a script that calls document.write can add extra tokens, so the parsing process actually modifies the input.
Since normal parsing techniques are not available, browsers create custom parsers to parse HTML.
The HTML5 specification describes parsing algorithms in great detail. The algorithm consists of two phases: tokenization and tree construction.
Tokenization is the lexical analysis, parsing the input into tokens. HTML tokens include start tags, end tags, attribute names, and attribute values.
The tokenizer recognizes a token, passes it to the tree constructor, and consumes the next character to recognize the next token, and so on until the end of the input.
Tokenization algorithm
The output of this algorithm is an HTML token. The algorithm is expressed as a state machine. Each state consumes one or more characters of the input stream and updates the next state according to those characters. The decision is influenced by the current tokenization state and by the tree construction state. This means that the same consumed character can lead to a different next state, depending on the current state. The algorithm is too complex to describe in full, so let’s use a simple example to understand how it works.
Basic example – tokenize the following HTML code:
<html>
<body>
Hello world
</body>
</html>
The initial state is the “data state.” When the < character is encountered, the state changes to the “tag open state.” Consuming an a-z character creates a “start tag token,” and the state changes to the “tag name state.” We stay in this state until the > character is consumed. Each character consumed in the meantime is appended to the new token’s name. In our case, the token created is an html token.
When the > character is reached, the current token is emitted and the state changes back to the “data state.” The <body> tag is handled by the same steps. So far the html and body tags have been emitted, and we are back in the “data state.” Consuming the H character of Hello world creates and emits a character token; this goes on until the < of </body> is reached. We emit a character token for each character of Hello world.
We are now back in the “tag open state.” Consuming the next input character, /, creates an “end tag token” and moves to the “tag name state.” Again we stay in this state until we reach >. Then the new tag token is emitted and we return to the “data state.” The </html> input is handled in the same way.
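The walkthrough above covers only three states, which is enough for a toy state machine. The sketch below is a heavy simplification of the HTML5 tokenizer: it ignores attributes, comments, and character references, and the function name `tokenize_html` and the token names are choices made for this illustration.

```python
def tokenize_html(markup):
    """Emit (kind, value) tokens using the data, tag open, and tag name states."""
    tokens = []
    state = "data"
    kind = name = None
    for char in markup:
        if state == "data":
            if char == "<":
                state = "tag open"
            else:
                tokens.append(("Character", char))
        elif state == "tag open":
            if char == "/":
                kind, name = "EndTag", ""
                state = "tag name"
            elif char.isalpha():
                kind, name = "StartTag", char.lower()
                state = "tag name"
            else:
                # Not a tag after all: the pending "<" was plain text.
                tokens.append(("Character", "<"))
                if char != "<":
                    tokens.append(("Character", char))
                    state = "data"
        elif state == "tag name":
            if char == ">":
                tokens.append((kind, name))  # emit the finished tag token
                state = "data"
            else:
                name += char.lower()
    return tokens
```

Feeding it `<html><body>Hi</body></html>` produces two start tag tokens, one character token per letter, and two end tag tokens, in the order the walkthrough describes.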
Tree building algorithm
When the parser is created, the Document object is created as well. During the tree construction stage, the DOM tree with the Document as its root is modified and elements are added to it. Each token emitted by the tokenizer is processed by the tree constructor. For each token, the specification defines which DOM element is relevant to it and is created when the token is received. The element is added to the DOM tree and also to the stack of open elements. This stack is used to correct nesting mismatches and unclosed tags. The algorithm is also described as a state machine. The states are called “insertion modes.”
Let’s look at the tree building process for the sample input:
<html>
<body>
Hello world
</body>
</html>
The input to the tree construction stage is a sequence of tokens from the tokenization stage. The first mode is the “initial mode.” Receiving the html token causes a move to the “before html” mode and a reprocessing of the token in that mode. This creates the HTMLHtmlElement element, which is appended to the root Document object.
The state then changes to “before head.” The body token is received next. An HTMLHeadElement is created implicitly, even though our example has no head token, and it is added to the tree.
We now move to the “in head” mode and then to “after head.” The body token is reprocessed, an HTMLBodyElement is created and inserted, and the mode changes to “in body.”
The character tokens of the “Hello world” string are now received. The first one causes the creation and insertion of a “Text” node, and the other characters are appended to that node.
Receiving the body end token causes a transfer to the “after body” mode. We then receive the html end tag, which moves us to the “after after body” mode. Receiving the end-of-file token ends the parsing.
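The sequence of insertion modes can be followed in a much reduced sketch. Everything here is an assumption made for illustration: `Node` is a stand-in for a DOM node, `build_tree` is a hypothetical name, tokens are `(kind, value)` pairs, head content and most error handling are left out, and white space outside the body is simply dropped.

```python
class Node:
    """A minimal stand-in for a DOM node (not a real DOM interface)."""

    def __init__(self, name, text=""):
        self.name = name
        self.text = text
        self.children = []


def build_tree(tokens):
    """Build a tree from (kind, value) tokens; return (document, modes visited)."""
    document = Node("#document")
    open_elements = [document]  # the stack of open elements
    modes = ["initial"]

    def insert(name):
        node = Node(name)
        open_elements[-1].children.append(node)
        open_elements.append(node)

    for kind, value in tokens:
        if kind == "Character" and value.isspace() and modes[-1] != "in body":
            continue  # simplification: ignore white space outside the body
        if modes[-1] == "initial":
            modes.append("before html")
        if modes[-1] == "before html":
            insert("html")
            modes.append("before head")
            if (kind, value) == ("StartTag", "html"):
                continue
        if modes[-1] == "before head":
            insert("head")  # created implicitly when the tag is missing
            modes.append("in head")
            if (kind, value) == ("StartTag", "head"):
                continue
        if modes[-1] == "in head":
            open_elements.pop()  # simplification: head content is not handled
            modes.append("after head")
            if (kind, value) == ("EndTag", "head"):
                continue
        if modes[-1] == "after head":
            insert("body")
            modes.append("in body")
            if (kind, value) == ("StartTag", "body"):
                continue
        if modes[-1] == "in body":
            current = open_elements[-1]
            if kind == "Character":
                if current.children and current.children[-1].name == "#text":
                    current.children[-1].text += value  # extend the Text node
                else:
                    current.children.append(Node("#text", value))
            elif kind == "StartTag":
                insert(value)
            elif value == "body":  # only end tags reach this branch
                modes.append("after body")
            elif value in [node.name for node in open_elements]:
                while open_elements.pop().name != value:
                    pass  # close unclosed descendants along the way
        elif modes[-1] == "after body" and (kind, value) == ("EndTag", "html"):
            modes.append("after after body")
    return document, modes
```

For the tokens of the sample input, the returned list of modes runs from "initial" through "before html", "before head", "in head", "after head", "in body", and "after body" to "after after body", and the tree contains an implicitly created head next to the body.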
Operations after parsing
At this stage the browser marks the document as interactive and starts parsing scripts that are in “deferred” mode, that is, scripts that should be executed after the document has been parsed. The document state is then set to “complete” and a “load” event is fired.
You can see the full algorithms for tokenization and tree construction in the HTML5 specification.
Fault tolerance in browsers
You never see an “invalid syntax” error when browsing an HTML web page. This is because the browser corrects any invalid content and continues working.
Take the following HTML code for example:
<html>
<mytag>
</mytag>
<div>
<p>
</div>
Really lousy HTML
</p>
</html>
I have violated a good many rules here (“mytag” is not a standard tag, the “p” and “div” elements are nested incorrectly, and more), but the browser still shows it correctly and does not complain. A lot of the parser code exists to fix the mistakes of HTML authors.
The error-handling mechanism is fairly consistent across browsers, but surprisingly, it is not part of the current HTML specification. Like bookmark management and forward/back buttons, it is a product of browser development over the years. There are known invalid HTML structures prevalent on many web sites, and every browser tries to fix them the same way everyone else does.
The HTML5 specification defines some of these requirements. WebKit gives a good overview of this in the opening comment of the HTML parser class.
The parser parses the tokenized input to build the document tree. If the document is formatted correctly, parse it directly.
Unfortunately, we have to deal with a lot of misformed HTML documents, so the parser must be fault-tolerant.
We should at least be able to handle the following error cases:
- The element being added is explicitly forbidden inside some outer tag. In this case, we should close all tags up to the one that forbids the element, and add the element afterwards.
- We are not allowed to add the element directly. The author of the document may have forgotten a tag in between (or the tag in between is optional). This can be the case with the following tags: HTML HEAD BODY TBODY TR TD LI.
- We want to add a block element inside an inline element. Close all inline elements up to the next higher block element.
- If this does not help, close elements until we are allowed to add the element, or ignore the tag.
Let’s look at some examples of WebKit fault tolerance:
</br> instead of <br>
Some sites use </br> instead of <br>. For compatibility with IE and Firefox, WebKit treats it like <br>. The code is as follows:
if (t->isCloseTag(brTag) && m_document->inCompatMode()) {
     reportError(MalformedBRError);
     t->beginTag = true;
}
Note that the error handling is internal: it is not shown to the user.
A stray table
A stray table is a table that sits inside another table but not inside a table cell. For example:
<table>
<table>
<tr><td>inner table</td></tr>
</table>
<tr><td>outer table</td></tr>
</table>
WebKit changes its hierarchy to two sibling tables:
<table>
<tr><td>outer table</td></tr>
</table>
<table>
<tr><td>inner table</td></tr>
</table>
The code is as follows:
if (m_inStrayTableContent && localName == tableTag)
        popBlock(tableTag);
WebKit uses a stack for the current element contents: it pops the inner table out of the outer table’s stack. The two tables are now siblings.
Nested form elements
If the user places another form within a form element, the second form is ignored. The code is as follows:
if (!m_currentFormElement) {
        m_currentFormElement = new HTMLFormElement(formTag, m_document);
}
Overly complex markup hierarchies
The comment in the code speaks for itself.
The example site, www.liceo.edu.mx, has about 1,500 nested tags, all from a bunch of <b> tags. We only allow up to 20 levels of nesting of tags of the same type, and any more will be ignored altogether.
bool HTMLParser::allowNestedRedundantTag(const AtomicString& tagName)
{
    unsigned i = 0;
    for (HTMLStackElem* curr = m_blockStack;
         i < cMaxRedundantTagDepth && curr && curr->tagName == tagName;
         curr = curr->next, i++) { }
    return i != cMaxRedundantTagDepth;
}
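The same check can be restated in Python to make the counting explicit. This is a sketch, not WebKit code: the open elements are modeled as a plain list of tag names with the innermost element last, and `allow_nested_redundant_tag` and `MAX_REDUNDANT_TAG_DEPTH` are names chosen for the illustration, with the limit of 20 taken from the comment quoted above.

```python
MAX_REDUNDANT_TAG_DEPTH = 20  # the limit quoted in the comment above


def allow_nested_redundant_tag(open_tags, tag_name):
    """Return False once the innermost open tags are 20 consecutive tag_name."""
    depth = 0
    # Walk from the innermost open element outwards, as the C++ loop does.
    for name in reversed(open_tags):
        if depth == MAX_REDUNDANT_TAG_DEPTH or name != tag_name:
            break
        depth += 1
    return depth != MAX_REDUNDANT_TAG_DEPTH
```

Only an unbroken run of the same tag at the top of the stack counts, so twenty nested `b` elements block a twenty-first, while a different element in between resets the count.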
Misplaced HTML or body closing tag
Again, the comment in the code speaks for itself.
Support for really broken HTML. We never close the body tag, because some stupid web pages close it before the actual end of the document. We rely on the end() call to close things.
if (t->tagName == htmlTag || t->tagName == bodyTag )
        return;
So web authors beware: unless you want your site to appear as an example in a WebKit error-tolerance code snippet, write well-formed HTML.
CSS parsing
Remember the parsing concepts from the introduction? Unlike HTML, CSS has a context-free grammar and can be parsed using the types of parsers described in the introduction. In fact, the CSS specification defines the lexical and syntax grammar of CSS.
Let’s look at some examples. The lexical grammar (vocabulary) is defined by regular expressions for each token:
comment \/\*[^*]*\*+([^/*][^*]*\*+)*\/
num [0-9]+|[0-9]*"."[0-9]+
nonascii [\200-\377]
nmstart [_a-z]|{nonascii}|{escape}
nmchar [_a-z0-9-]|{nonascii}|{escape}
name {nmchar}+
ident {nmstart}{nmchar}*
“ident” is short for identifier, such as a class name. “name” is an element id (referenced by “#”).
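These token definitions translate almost directly into Python regular expressions. The sketch below makes two assumptions: the `{escape}` alternative is left out because its definition is not part of the excerpt above, and matching is case-insensitive, which the excerpt does not state. The octal range `[\200-\377]` is written as `[\x80-\xff]`.

```python
import re

NONASCII = r"[\x80-\xff]"
NMSTART = rf"(?:[_a-z]|{NONASCII})"      # {escape} omitted (assumption)
NMCHAR = rf"(?:[_a-z0-9-]|{NONASCII})"   # {escape} omitted (assumption)

IDENT = re.compile(rf"{NMSTART}{NMCHAR}*", re.IGNORECASE)
NAME = re.compile(rf"{NMCHAR}+", re.IGNORECASE)
NUM = re.compile(r"[0-9]+|[0-9]*\.[0-9]+")
COMMENT = re.compile(r"/\*[^*]*\*+([^/*][^*]*\*+)*/")


def is_ident(text):
    """True if the whole text is a single ident token under these patterns."""
    return IDENT.fullmatch(text) is not None
```

The difference between the two name-like tokens shows up with a leading digit: `9lives` is a valid name but not a valid ident, because an ident must begin with an nmstart character.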
The syntax is described in BNF format.
ruleset
  : selector [ ',' S* selector ]*
    '{' S* declaration [ ';' S* declaration ]* '}' S*
  ;
selector
  : simple_selector [ combinator selector | S+ [ combinator? selector ]? ]?
  ;
simple_selector
  : element_name [ HASH | class | attrib | pseudo ]*
  | [ HASH | class | attrib | pseudo ]+
  ;
class
  : '.' IDENT
  ;
element_name
  : IDENT | '*'
  ;
attrib
  : '[' S* IDENT S* [ [ '=' | INCLUDES | DASHMATCH ] S*
    [ IDENT | STRING ] S* ] ']'
  ;
pseudo
  : ':' [ IDENT | FUNCTION S* [IDENT S*] ')' ]
  ;
Explanation: This is the structure of a rule set:
div.error , a.error {
color:red;
font-weight:bold;
}
div.error and a.error are selectors. The part inside the curly braces contains the declarations that are applied by this rule set. The formal definition of this structure looks like this:
ruleset
  : selector [ ',' S* selector ]*
    '{' S* declaration [ ';' S* declaration ]* '}' S*
  ;
This means that a rule set is a selector, or optionally a number of selectors separated by commas and spaces (S stands for white space). A rule set contains curly braces and, inside them, a declaration or optionally a number of declarations separated by semicolons. “declaration” and “selector” are defined in the BNF definitions that follow.
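The shape of a rule set can be demonstrated with a few string operations. This is a deliberately naive sketch, not a parser generated from the grammar: `parse_ruleset` is a hypothetical name, selectors are kept as raw strings, and values containing braces, semicolons, or strings with commas are not handled.

```python
def parse_ruleset(text):
    """Split one rule set into its selector list and a declaration mapping."""
    selector_part, _, rest = text.partition("{")
    body, closing_brace, _ = rest.partition("}")
    if not closing_brace:
        raise SyntaxError("a rule set needs '{' and '}'")
    # selector [ ',' S* selector ]*
    selectors = [selector.strip() for selector in selector_part.split(",")]
    if not all(selectors):
        raise SyntaxError("empty selector")
    # declaration [ ';' S* declaration ]*
    declarations = {}
    for declaration in body.split(";"):
        if not declaration.strip():
            continue  # tolerate a trailing semicolon
        prop, colon, value = declaration.partition(":")
        if not colon:
            raise SyntaxError(f"missing ':' in declaration {declaration.strip()!r}")
        declarations[prop.strip()] = value.strip()
    return selectors, declarations
```

Applied to the example rule set above, it returns the two selectors `div.error` and `a.error` together with the `color` and `font-weight` declarations.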
WebKit CSS parser
WebKit uses the Flex and Bison parser generators to create parsers automatically from the CSS grammar files. As mentioned in the parser introduction, Bison creates a bottom-up shift-reduce parser. Firefox uses a top-down parser written by hand. In both cases, each CSS file is parsed into a StyleSheet object, and each object contains CSS rules. The CSS rule objects contain selector and declaration objects, as well as other objects that correspond to the CSS grammar.
The order in which scripts and stylesheets are processed
The script
The model of the network is synchronous. Web page authors expect the parser to parse and execute the script as soon as it encounters the <script> tag. Parsing of the document will stop until the script completes. If the script is external, the parsing process stops until the synchronous fetching of resources from the network is complete. This model has been in use for many years and is specified in the HTML4 and HTML5 specifications. The author can also annotate the script as “defer” so that it does not stop parsing the document but waits until the parsing is complete. HTML5 adds an option to mark scripts as asynchronous so they can be parsed and executed by other threads.
Preliminary analysis
Both WebKit and Firefox have made this optimization. As the script executes, other threads parse the rest of the document to find and load additional resources that need to be loaded over the network. In this way, resources can be loaded on parallel connections, increasing overall speed. Note that the pre-parser does not modify the DOM tree, but hands that job off to the main parser; The pre-parser only resolves references to external resources, such as external scripts, stylesheets, and images.
Stylesheets
Stylesheets, on the other hand, follow a different model. Conceptually, since stylesheets do not change the DOM tree, there seems to be no reason to wait for them and stop the document parsing. There is a problem, though, with scripts asking for style information during the document parsing stage. If the style has not been loaded and parsed yet, the script gets wrong answers, which can obviously cause a lot of problems. This looks like an edge case but is actually quite common. Firefox blocks all scripts while a stylesheet is still being loaded and parsed. WebKit blocks scripts only when they try to access style properties that may be affected by unloaded stylesheets.
Rendering tree construction
While the DOM tree is being built, the browser builds another tree: the rendering tree. This is a tree of visual elements in the order in which they will be displayed; it is the visual representation of the document. Its purpose is to allow the content to be painted in the correct order.
Firefox calls the elements in the rendering tree “frames”. WebKit uses the term renderer or render object. A renderer knows how to lay out and paint itself and its children. WebKit’s RenderObject class, the base class of all renderers, has the following definition:
class RenderObject {
    virtual void layout();
    virtual void paint(PaintInfo);
    virtual Rect repaintRect();
    Node* node;                    // the DOM node
    RenderStyle* style;            // the computed style
    RenderLayer* containingLayer;  // the containing z-index layer
};
Each renderer represents a rectangular area, usually corresponding to a CSS box for the node in question, as described in the CSS2 specification. It contains geometric information such as width, height, and position. The type of box is affected by the “display” style attribute associated with the node (see the section on style calculation). The following WebKit code describes what type of renderer should be created for the same DOM node, depending on the display attribute.
RenderObject* RenderObject::createObject(Node* node, RenderStyle* style)
{
    Document* doc = node->document();
    RenderArena* arena = doc->renderArena();
    ...
    RenderObject* o = 0;

    switch (style->display()) {
        case NONE:
            break;
        case INLINE:
            o = new (arena) RenderInline(node);
            break;
        case BLOCK:
            o = new (arena) RenderBlock(node);
            break;
        case INLINE_BLOCK:
            o = new (arena) RenderBlock(node);
            break;
        case LIST_ITEM:
            o = new (arena) RenderListItem(node);
            break;
        ...
    }

    return o;
}
The element type is also considered: for example, form controls and tables have special frames.
In WebKit, if an element wants to create a special renderer, it overrides the createRenderer() method. The renderers point to style objects that contain non-geometric information.
Rendering tree and DOM tree relationship
Renderers correspond to DOM elements, but the relation is not one-to-one. Non-visual DOM elements, such as the “head” element, are not inserted into the rendering tree. Elements whose display value is “none” do not appear in the tree either (whereas elements with “hidden” visibility do appear in the tree).
Some DOM elements correspond to several visual objects. These are usually elements with a complex structure that cannot be described by a single rectangle. For example, the “select” element has three renderers: one for the display area, one for the drop-down list box, and one for the button. Also, when text is broken into multiple lines because the width is not sufficient for one line, the new lines are added as extra renderers. Another example of multiple renderers is broken HTML. According to the CSS specification, an inline element must contain either only block elements or only inline elements. In the case of mixed content, anonymous block renderers are created to wrap the inline elements.
Some render objects correspond to a DOM node but not in the same place in the tree. Floats and absolutely positioned elements are out of the normal flow, placed in a different part of the tree, and mapped to the real frame. A placeholder frame marks where they would have been.
The process of building a rendering tree
In Firefox, the presentation is registered as a listener for DOM updates. The presentation delegates frame creation to the FrameConstructor, which resolves the style (see Style calculation) and creates a frame.
In WebKit, the process of resolving the style and creating a renderer is called “attachment”. Every DOM node has an “attach” method. Attachment is synchronous: inserting a node into the DOM tree calls the new node’s “attach” method.
Processing the html and body tags results in the construction of the rendering tree root. The root render object corresponds to what the CSS specification calls the containing block: the topmost block that contains all other blocks. Its dimensions are the viewport, that is, the dimensions of the browser window’s display area. Firefox calls it ViewPortFrame and WebKit calls it RenderView. This is the render object that the document points to. The rest of the tree is constructed as DOM nodes are inserted.
See the CSS2 specification on the processing model.
Style calculation
Building the rendering tree requires calculating the visual properties of each render object. This is done by calculating the style properties of each element.
The style includes stylesheets of various origins, inline style elements, and visual properties in the HTML (such as the “bgcolor” property). The latter are translated to matching CSS style properties.
The origins of stylesheets are the browser’s default stylesheets, the stylesheets provided by the page author, and user stylesheets, which are provided by the browser user (browsers let you define your favorite styles; in Firefox, for instance, this is done by placing a stylesheet in the Firefox Profile folder).
Style calculation brings up a few difficulties:
- Style data is a very large construct holding the numerous style properties, which can cause memory problems.
- Finding the matching rules for each element can cause performance issues if it is not optimized. Traversing the entire rule list for each element to find matches is a heavy task. Selectors can have a complex structure, so the matching process can start on a seemingly promising path that proves to be futile, and another path has to be tried.
For example, take this compound selector:
div div div div{ ... }
It means that the rule applies to a <div> that is the descendant of three divs. Suppose you want to check whether the rule applies to a given <div> element. You choose a certain path up the tree for checking. You may need to traverse the node tree up just to find out that there are only two divs and the rule does not apply. You then need to try other paths in the tree.
- Applying the rules involves quite complex cascade rules that define the hierarchy of the rules.
Let’s take a look at how browsers handle these issues:
Sharing style data
WebKit nodes reference style objects (RenderStyle). These objects can be shared by nodes under some conditions. The nodes are siblings or cousins and:
- The elements must be in the same mouse state (for example, one cannot be in “:hover” while the other is not)
- Neither element should have an ID
- The tag names should match
- The class attributes should match
- The set of mapped attributes must be identical
- The link states must match
- The focus states must match
- Neither element should be affected by attribute selectors, where “affected” means having any selector match that uses an attribute selector in any position within the selector
- There must be no inline style attribute on the elements
- There must be no sibling selectors in use at all. WebCore simply throws a global switch when any sibling selector is encountered and disables style sharing for the entire document when they are present. This includes the + selector and selectors like :first-child and :last-child.
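The conditions above can be read as a single predicate. The sketch below is only an illustration of that checklist: the Element record and the can_share_style function are hypothetical names, the sibling-or-cousin requirement is assumed to have been checked by the caller, and WebKit's real check is considerably more involved.

```python
from dataclasses import dataclass


@dataclass
class Element:
    # A hypothetical, simplified view of a DOM element for this sketch.
    tag: str
    element_id: str = ""
    classes: frozenset = frozenset()
    mapped_attributes: tuple = ()
    hovered: bool = False
    focused: bool = False
    link_state: str = "none"
    has_inline_style: bool = False
    affected_by_attribute_selector: bool = False


def can_share_style(a, b, document_uses_sibling_selectors=False):
    """Return True if two sibling or cousin elements may share a style object."""
    if document_uses_sibling_selectors:
        return False  # the global switch disables sharing for the document
    if a.element_id or b.element_id:
        return False
    if a.has_inline_style or b.has_inline_style:
        return False
    if a.affected_by_attribute_selector or b.affected_by_attribute_selector:
        return False
    return (a.tag == b.tag
            and a.classes == b.classes
            and a.mapped_attributes == b.mapped_attributes
            and a.hovered == b.hovered
            and a.focused == b.focused
            and a.link_state == b.link_state)
```

Two list items with the same class and no ID would pass this check, while hovering over one of them would make it fail until the hover ends.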
Firefox rule tree
To make style computation easier, Firefox uses two extra trees: the rule tree and the style context tree. WebKit also has style objects, but they are not stored in a tree like the style context tree; only the DOM node points to its relevant style.
The style contexts contain end values. The values are computed by applying all the matching rules in the correct order and by converting them from logical to concrete values. For example, if the logical value is a percentage of the screen, it is calculated and transformed to absolute units. The rule tree idea is really clever: it enables sharing these values between nodes to avoid computing them again, and it also saves space.
All the matched rules are stored in a tree. The bottom nodes in a path have higher priority. The tree contains all the paths for rule matches that were found. Storing the rules is done lazily: the tree is not calculated at the beginning for every node, but whenever a node style needs to be computed, the computed paths are added to the tree.
The idea is to treat the tree paths as words in a lexicon. Suppose we have already computed a rule tree that contains the path A-B-E-I-L.
Suppose we now need to match rules for another element in the content tree, and the matched rules turn out to be (in the correct order) B-E-I. This path already exists in the tree, because we have already computed the path A-B-E-I-L. We now have less work to do.
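The lexicon analogy suggests a trie-like structure. The following sketch assumes that A is the root of the tree and that rules are identified by single letters; RuleNode, RuleTree, and path_for are hypothetical names chosen for the illustration and do not mirror the Firefox classes.

```python
class RuleNode:
    """One node of the rule tree; it points to a single matched rule."""

    def __init__(self, rule=None, parent=None):
        self.rule = rule
        self.parent = parent
        self.children = {}


class RuleTree:
    def __init__(self):
        self.root = RuleNode()   # plays the role of node A
        self.nodes_created = 0   # counts how much new work was needed

    def path_for(self, matched_rules):
        """Return the node that ends the path for matched_rules.

        Rules are given from lowest to highest priority. Nodes are created
        lazily, only for the part of the path that does not exist yet.
        """
        node = self.root
        for rule in matched_rules:
            child = node.children.get(rule)
            if child is None:
                child = RuleNode(rule, node)
                node.children[rule] = child
                self.nodes_created += 1
            node = child
        return node
```

After the path B-E-I-L has been inserted once, asking for B-E-I creates no new nodes and simply returns the existing node for I.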
Let’s look at how a rule tree can help us reduce work.
Division into structures
The style contexts are divided into structures. Each structure contains style information for a certain category, such as border or color. All the properties in a structure are either inherited or non-inherited. An inherited property, unless defined by the element, is inherited from its parent. A non-inherited property (called a “reset” property) uses its default value if it is not defined.
The tree helps us by caching entire structures, including the computed end values. The idea is that if the bottom node does not supply a definition for a structure, a cached structure in an upper node can be used.
Computing the style contexts using the rule tree
When computing the style context for a certain element, we first compute a path in the rule tree or use an existing one. We then begin to apply the rules in the path to fill the structures in our new style context. We start at the bottom node of the path, the one with the highest precedence (usually the most specific selector), and traverse the tree up until our structure is full. If there is no specification for the structure in that rule node, we can optimize greatly: we go up the tree until we find a node that specifies it fully and simply point to it. That is the best optimization, because the entire structure is shared. This saves computation of end values and memory. If we find partial definitions, we go up the tree until the structure is filled.
If we do not find any definition for our structure and the structure is an “inherited” type, we point to the structure of our parent in the context tree, so in this case we also succeed in sharing structures. If it is a reset structure, default values are used.
If the most specific node does add values, we need to do some extra calculations to transform them to actual values. We then cache the result in the tree node so it can be used by children.
If an element has a sibling that points to the same tree node, the entire style context can be shared between them.
Let’s take a look at an example. Suppose we have the following HTML code:
<html>
  <body>
    <div class="err" id="div1">
      <p>
        this is a <span class="big"> big error </span>
        this is also a
        <span class="big"> very big error</span> error
      </p>
    </div>
    <div class="err" id="div2">another error</div>
  </body>
</html>
And here are the rules:
1. div {margin:5px; color:black}
2. .err {color:red}
3. .big {margin-top:3px}
4. div span {margin-bottom:4px}
5. #div1 {color:blue}
6. #div2 {color:green}
To simplify things, let’s say we need to fill out only two structures: the color structure and the margin structure. The color structure contains only one member, the color; the margin structure contains the four sides. The resulting rule tree looks like this (nodes are marked with the node name and the number of the rule they point to):
The context tree looks like this (node name: the rule node it points to):
Suppose we parse the HTML and reach the second <div> tag. We need to create a style context for this node and fill in its style structures. Matching the rules shows that the div, .err, and #div2 rules apply to it (the first, second, and sixth rules in the list above). A path for the first two already exists in the tree, so we only need to add one more node for the #div2 rule (node F in the rule tree). The new style context is placed in the context tree and points to node F.
We now need to fill in the style structures, beginning with the margin structure. Since the last rule node (F) does not add to the margin structure, we can go up the tree until we find a cached structure computed in a previous node insertion and use it. We find it on node B, which is the uppermost node that specified margin rules.
We do have a definition for the color structure, so we cannot use a cached structure. Since color has one attribute, we do not need to go up the tree to fill other attributes. We compute the end value (convert the string to RGB, and so on) and cache the computed structure on this node.
The work on the second <span> element is even easier. We match the rules and conclude that it points to rule G, like the previous span. Since we have a sibling that points to the same node, we can share the entire style context and simply point to the context of the previous span.
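The walk up the rule tree can be sketched in a few lines. This is a simplified model and not Firefox code: CachingRuleNode, STRUCT_MEMBERS, and resolve_struct are hypothetical names, values are kept as plain strings instead of being converted to end values, and returning None stands for "fall back to the inherited or default structure".

```python
# Members of each style structure in this simplified model.
STRUCT_MEMBERS = {
    "margin": ("top", "right", "bottom", "left"),
    "color": ("color",),
}


class CachingRuleNode:
    def __init__(self, declarations, parent=None):
        # declarations example: {"margin": {"top": "3px"}}
        self.declarations = declarations
        self.parent = parent
        self.cached_structs = {}


def resolve_struct(node, name):
    """Fill the structure `name` for the rule path that ends at `node`."""
    if node is None:
        return None  # no rule defines it: use inherited or default values
    if name in node.cached_structs:
        return node.cached_structs[name]
    if name not in node.declarations:
        # This node adds nothing, so share whatever the upper nodes provide.
        return resolve_struct(node.parent, name)
    values = {}
    current = node
    while current is not None and len(values) < len(STRUCT_MEMBERS[name]):
        for member, value in current.declarations.get(name, {}).items():
            values.setdefault(member, value)  # lower nodes take precedence
        current = current.parent
    node.cached_structs[name] = values  # cache for later elements
    return values
```

With a node for the div rule at the top and a node for #div2 below it, asking the #div2 node for the margin structure returns the very object cached on the div node, while asking for the color structure produces a new structure cached on the #div2 node itself.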
For structures that contain rules inherited from the parent, caching is done on the context tree (the color property is actually inherited, but Firefox treats it as reset and caches it on the rule tree). For instance, suppose we add font rules for a paragraph:
p {font-family:Verdana; font-size:10px; font-weight:bold}
Then the paragraph element, which is a child of the div in the context tree, could share the same font structure as its parent, provided that no font rules were specified for the paragraph.
WebKit has no rule tree, so the matched declarations are traversed four times. First the non-important high-priority properties are applied (properties that must be applied first because others depend on them, such as display), then the high-priority important ones, then the normal-priority non-important ones, and finally the normal-priority important ones. This means that properties that appear multiple times are resolved according to the correct cascade order. The last one wins.
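The four passes can be sketched as follows. This is an illustration of the ordering only: apply_declarations is a hypothetical function, and the contents of HIGH_PRIORITY_PROPERTIES are an assumption made for the example rather than WebKit's actual list.

```python
# Assumed set of properties that other properties depend on.
HIGH_PRIORITY_PROPERTIES = {"display", "font-size"}


def apply_declarations(matched_declarations):
    """Apply declarations in four passes; later writes override earlier ones.

    matched_declarations is a list of (property, value, important) tuples,
    already sorted in cascade order.
    """
    style = {}
    passes = [
        (True, False),   # high priority, non-important
        (True, True),    # high priority, important
        (False, False),  # normal priority, non-important
        (False, True),   # normal priority, important
    ]
    for high_priority, important in passes:
        for prop, value, is_important in matched_declarations:
            is_high = prop in HIGH_PRIORITY_PROPERTIES
            if is_high == high_priority and is_important == important:
                style[prop] = value
    return style
```

Because important declarations are applied in a later pass than normal ones, an important value survives even when a normal declaration for the same property appears after it in the list.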
To summarize: sharing the style objects (entirely or some of the structures inside them) solves issues 1 and 3. The Firefox rule tree also helps in applying the properties in the correct order.
Rules are processed to simplify matching
Style rules come from several sources:
- CSS rules, either in external stylesheets or in style elements
p {color:blue}
- Inline style attributes such as
<p style="color:blue" />
- HTML visual attributes (which are mapped to relevant style rules)
<p bgcolor="blue" />
The last two are easily matched to the element, since the element owns the style attributes and the HTML attributes can be mapped using the element as the key.
As noted in issue 2 above, CSS rule matching can be trickier. To solve the difficulty, the rules are manipulated for easier access.
After the stylesheet is parsed, the rules are added to one of several hash maps, according to the selector. There are maps by ID, by class name, and by tag name, plus a general map for anything that does not fit into those categories. If the selector is an ID, the rule is added to the ID map; if it is a class, it is added to the class map, and so on. This manipulation makes it much easier to match rules. There is no need to look at every declaration: we can extract the relevant rules for an element from the maps. This optimization eliminates more than 95% of the rules, so they need not even be considered during the matching process (4.1).
Let’s take the following style rule as an example:
p.error {color:red}
#messageDiv {height:50px}
div {margin:5px}
The first rule is inserted into the class map, the second into the ID map, and the third into the tag map.
For the following HTML snippet:
<p class="error">an error occurred </p>
<div id="messageDiv">this is a message</div>
We first try to find rules for the p element. The class map contains the key “error”, under which the rule for “p.error” is found. The div element has relevant rules in the ID map (the key is the ID) and in the tag map. The only work left is finding out which of the rules extracted by the keys really match. For example, if the rule for the div were:
table div {margin:5px}
it would still be extracted from the tag map, because the key is the rightmost selector, but it would not match our div element, which has no table ancestor.
Both WebKit and Firefox do this.
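Here is a minimal sketch of such maps, keyed by the rightmost selector. RuleIndex and its methods are hypothetical names, and the selector handling is deliberately naive: it only understands space-separated descendant selectors made of tag names, classes, and IDs.

```python
class RuleIndex:
    """Buckets selectors by the key of their rightmost simple selector."""

    def __init__(self):
        self.by_id = {}
        self.by_class = {}
        self.by_tag = {}
        self.general = []

    def add(self, selector):
        key = selector.split()[-1]  # the rightmost selector is the key
        if "#" in key:
            name = key.split("#", 1)[1].split(".", 1)[0]
            self.by_id.setdefault(name, []).append(selector)
        elif "." in key:
            name = key.split(".", 1)[1].split(".", 1)[0]
            self.by_class.setdefault(name, []).append(selector)
        elif key.isalnum():
            self.by_tag.setdefault(key, []).append(selector)
        else:
            self.general.append(selector)  # anything else, such as "*"

    def candidates(self, tag, element_id=None, classes=()):
        """Return the rules worth checking; each must still be fully matched."""
        found = list(self.by_id.get(element_id, []))
        for name in classes:
            found.extend(self.by_class.get(name, []))
        found.extend(self.by_tag.get(tag, []))
        found.extend(self.general)
        return found
```

For the snippet above, the p element yields only “p.error” as a candidate, while the div yields “#messageDiv”, “div”, and “table div”; the last one is then rejected by the full match because the div has no table ancestor.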
Apply the rules in the correct cascading order
The style object has properties corresponding to every visual attribute (all CSS attributes, but more generic). If a property is not defined by any of the matched rules, some properties can be inherited from the parent element’s style object. Other properties have default values.
If there is more than one definition, there is a problem that needs to be solved by cascading order.
Style sheet cascading order
A declaration for a style property can appear in several stylesheets, and several times inside one stylesheet. This means the order in which the rules are applied is very important. This is called the “cascade” order. According to the CSS2 specification, the cascade order is (from low to high):
- Browser declarations
- User normal declarations
- Author normal declarations
- Author important declarations
- User important declarations
The browser declarations are least important, and the user overrides the author only if the declaration was marked as important. Declarations with the same order are sorted by specificity and then by the order in which they are specified. The HTML visual attributes are translated to matching CSS declarations. They are treated as author rules with low priority.
Specificity
The selector specificity is defined by the CSS2 specification as follows:
- Count 1 if the declaration comes from a “style” attribute rather than a rule with a selector, 0 otherwise (= a)
- Count the number of ID attributes in the selector (= b)
- Count the number of other attributes and pseudo-classes in the selector (= c)
- Count the number of element names and pseudo-elements in the selector (= d)
Concatenating the four numbers a-b-c-d (in a number system with a large base) gives the specificity.
The number base you need is defined by the highest count in any one of the categories. For example, if the highest count is 14 you can use hexadecimal. In the unlikely case where a count reaches 17 you need a base-17 number system; this can happen with a selector like html body div div p… (17 tags in the selector, which is not very likely).
Some examples:
 *             {}  /* a=0 b=0 c=0 d=0 -> specificity = 0,0,0,0 */
 li            {}  /* a=0 b=0 c=0 d=1 -> specificity = 0,0,0,1 */
 li:first-line {}  /* a=0 b=0 c=0 d=2 -> specificity = 0,0,0,2 */
 ul li         {}  /* a=0 b=0 c=0 d=2 -> specificity = 0,0,0,2 */
 ul ol+li      {}  /* a=0 b=0 c=0 d=3 -> specificity = 0,0,0,3 */
 h1 + *[rel=up]{}  /* a=0 b=0 c=1 d=1 -> specificity = 0,0,1,1 */
 ul ol li.red  {}  /* a=0 b=0 c=1 d=3 -> specificity = 0,0,1,3 */
 li.red.level  {}  /* a=0 b=0 c=2 d=1 -> specificity = 0,0,2,1 */
 #x34y         {}  /* a=0 b=1 c=0 d=0 -> specificity = 0,1,0,0 */
 style=""          /* a=1 b=0 c=0 d=0 -> specificity = 1,0,0,0 */
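The counting rules can be tried out with a small sketch. The specificity function below is a hypothetical helper built on regular expressions; it handles only simple CSS2-style selectors and is not a real selector parser. Returning a tuple sidesteps the number-base question, because Python compares tuples position by position, which gives the same ordering as concatenating the digits in a sufficiently large base.

```python
import re

# CSS2 pseudo-elements, which count toward d rather than c.
PSEUDO_ELEMENTS = {"first-line", "first-letter", "before", "after"}


def specificity(selector, from_style_attribute=False):
    """Return the specificity of a simple selector as the tuple (a, b, c, d)."""
    if from_style_attribute:
        return (1, 0, 0, 0)
    attributes = re.findall(r"\[[^\]]*\]", selector)
    rest = re.sub(r"\[[^\]]*\]", " ", selector)  # drop attribute selectors
    ids = re.findall(r"#[\w-]+", rest)
    classes = re.findall(r"\.[\w-]+", rest)
    pseudos = re.findall(r":([\w-]+)", rest)
    pseudo_elements = [p for p in pseudos if p in PSEUDO_ELEMENTS]
    # Element names are identifiers that do not follow '#', '.', ':' or '-'.
    elements = re.findall(r"(?<![\w#.:-])[a-zA-Z][\w-]*", rest)
    b = len(ids)
    c = len(attributes) + len(classes) + len(pseudos) - len(pseudo_elements)
    d = len(elements) + len(pseudo_elements)
    return (0, b, c, d)
```

Comparing the results directly, for example `specificity("#x34y") > specificity("ul ol li.red")`, shows that a single ID outweighs any number of classes and element names.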
Rule ordering
After the rules are matched, they are sorted according to the cascade rules. WebKit uses bubble sort for small lists and merge sort for big ones. WebKit implements sorting by overriding the “>” operator for the rules:
static bool operator >(CSSRuleData& r1, CSSRuleData& r2)
{
    int spec1 = r1.selector()->specificity();
    int spec2 = r2.selector()->specificity();
    // Equal specificity: the rule that appears later wins.
    return (spec1 == spec2) ? r1.position() > r2.position() : spec1 > spec2;
}
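Putting the cascade order, specificity, and position together, the whole ordering can be expressed as one sort key. The sketch below is an illustration only: the dictionary fields, ORIGIN_RANK, cascade_sort, and winning_value are hypothetical names, and Python's built-in sort is used instead of the bubble and merge sorts mentioned above.

```python
# Rank of each (origin, important) pair, from lowest to highest priority.
ORIGIN_RANK = {
    ("browser", False): 0,
    ("user", False): 1,
    ("author", False): 2,
    ("author", True): 3,
    ("user", True): 4,
}


def cascade_sort(declarations):
    """Sort declarations for one property from weakest to strongest."""
    return sorted(
        declarations,
        key=lambda d: (
            ORIGIN_RANK[(d["origin"], d["important"])],
            d["specificity"],  # an (a, b, c, d) tuple
            d["position"],     # order of appearance
        ),
    )


def winning_value(declarations):
    """The last declaration after sorting is the one that takes effect."""
    return cascade_sort(declarations)[-1]["value"]
```

A user declaration marked as important ends up last in the sorted list regardless of its specificity, so it overrides every author declaration for the same property.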
Progressive processing
WebKit uses a flag that marks whether all top-level stylesheets (including @imports) have been loaded. If the style is not fully loaded when attaching, placeholders are used and this is marked in the document; the styles are recalculated once the stylesheets are loaded.
Layout
When a renderer is created and added to the tree, it does not have a position and size. Calculating these values is called layout or reflow.
HTML uses a flow-based layout model, meaning that most of the time the geometry can be computed in a single pass. Elements later in the flow typically do not affect the geometry of elements earlier in the flow, so layout can proceed left-to-right, top-to-bottom through the document. There are exceptions: for example, HTML tables may require more than one pass (3.5).
The coordinate system is relative to the root frame. Top and left coordinates are used.
Layout is a recursive process. It begins at the root renderer, which corresponds to the <html> element of the HTML document, and continues recursively through some or all of the frame hierarchy, computing geometric information for each renderer that requires it.
The position of the root renderer is 0,0 and its dimensions are the viewport (the visible part of the browser window).
All renderers have a “layout” or “reflow” method, and each renderer invokes the layout method of the children that need layout.
Dirty bit system
To avoid doing a full layout for every small change, browsers use a “dirty bit” system. A renderer that is changed or added marks itself and its children as “dirty”, meaning that they need layout.
There are two flags: “dirty” and “children are dirty”. “Children are dirty” means that although the renderer itself may be fine, at least one of its children needs layout.
Global layout and incremental layout
A global layout is a layout triggered on the entire rendering tree. This can happen as a result of:
- A global style change that affects all renderers, such as a font size change.
- The screen being resized.
Layout can be incremental: only the dirty renderers are laid out (this can cause some damage that requires extra layouts). Incremental layout is triggered asynchronously when renderers are dirty, for example when new renderers are appended to the rendering tree after extra content arrives from the network and is added to the DOM tree.
Asynchronous and synchronous layouts
Incremental layout is done asynchronously. Firefox queues “reflow commands” for incremental layouts, and a scheduler triggers batch execution of these commands. WebKit also has a timer that executes an incremental layout: the tree is traversed and the “dirty” renderers are laid out.
Scripts that ask for style information, such as “offsetHeight”, can trigger incremental layout synchronously.
Global layout is usually triggered synchronously.
Sometimes layout is triggered as a callback after an initial layout, because some attributes, such as the scroll position, have changed.
Optimizations
When a layout is triggered by a “resize” or by a change in the renderer’s position (and not its size), the renderer sizes are taken from a cache and not recalculated.
In some cases only a subtree is modified and layout does not start from the root. This can happen when the change is local and does not affect its surroundings, such as text inserted into text fields (otherwise every keystroke would trigger a layout starting from the root).
Layout processing
Layout usually follows this pattern:
- The parent renderer determines its own width.
- The parent goes over its children and:
  - Places the child renderer (sets its x and y).
  - Calls the child’s layout if needed (the child is dirty, this is a global layout, or for some other reason), which calculates the child’s height.
- The parent uses the children’s accumulated heights and the heights of margins and padding to set its own height, which will in turn be used by its own parent.
- The parent sets its dirty bit to false.
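The pattern above, combined with the two dirty flags, can be sketched as a small recursive routine. The Renderer class here is a hypothetical model and not browser code: it stacks children vertically, ignores margins and padding, and uses an assumed own_height field as the content height of leaf renderers.

```python
class Renderer:
    def __init__(self, own_height=0, children=None):
        self.own_height = own_height  # assumed content height of a leaf
        self.children = children or []
        self.parent = None
        for child in self.children:
            child.parent = self
        self.x = self.y = self.width = self.height = 0
        self.dirty = True             # a new renderer needs layout
        self.children_dirty = False
        self.layout_count = 0         # only used to observe the sketch

    def mark_dirty(self):
        """Mark this renderer dirty and flag every ancestor accordingly."""
        self.dirty = True
        ancestor = self.parent
        while ancestor is not None:
            ancestor.children_dirty = True
            ancestor = ancestor.parent

    def layout(self, available_width, force=False):
        self.layout_count += 1
        self.width = available_width          # 1. determine own width
        y = 0
        for child in self.children:           # 2. go over the children
            child.x, child.y = 0, y           #    place the child
            if force or child.dirty or child.children_dirty:
                child.layout(self.width, force)
            y += child.height
        # 3. own height from the accumulated heights of the children
        self.height = y if self.children else self.own_height
        self.dirty = False                    # 4. clear the dirty bits
        self.children_dirty = False
```

After a first layout, changing one leaf and calling mark_dirty on it causes the next layout to visit only the path from the root to that leaf; untouched siblings keep their cached heights and are merely repositioned.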
Firefox uses a “state” object (nsHTMLReflowState) as a parameter to layout (termed “reflow”). Among other things, the state includes the parent’s width. The output of the Firefox layout is a “metrics” object (nsHTMLReflowMetrics), which contains the renderer’s computed height.
Width calculation
The renderer’s width is calculated from the width of the container block, the “width” property in the renderer’s style, and the margins and borders. For example, take the width of the following div:
<div style="width:30%"/>
WebKit calculates it as follows (RenderBox class, calcWidth method):
- The container width is the maximum of the container’s availableWidth and 0. In this case availableWidth is the contentWidth, which is calculated as:
clientWidth() - paddingLeft() - paddingRight()
clientWidth and clientHeight represent the interior of an object, excluding the border and scrollbar.
- The element’s width is the “width” style attribute. It is calculated as an absolute value by computing the percentage of the container width.
- The horizontal borders and padding are then added.
So far this was the calculation of the “preferred width”. Next the minimum and maximum widths are calculated.
If the preferred width is greater than the maximum width, the maximum width is used. If it is less than the minimum width (the smallest unbreakable unit), the minimum width is used.
The values are cached in case a layout is needed but the width does not change.
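The arithmetic of these steps is easy to follow with concrete numbers. The function below is a sketch of the description above, not a port of calcWidth: calc_width and all of its parameter names are hypothetical, and the clamping is done in the order stated in the text.

```python
def calc_width(container_client_width, padding_left, padding_right,
               width_percent, borders_and_padding, min_width, max_width):
    """Compute a renderer width for a percentage 'width' style value."""
    # Container width: the larger of availableWidth and 0.
    available_width = container_client_width - padding_left - padding_right
    container_width = max(0, available_width)
    # Preferred width: the percentage of the container width, plus the
    # element's own horizontal borders and padding.
    preferred = container_width * width_percent / 100 + borders_and_padding
    # Clamp to the maximum and minimum widths.
    if preferred > max_width:
        return max_width
    if preferred < min_width:
        return min_width
    return preferred
```

For a container with a clientWidth of 1000 and 10 units of padding on each side, a width of 30% with 4 units of borders and padding gives 980 × 0.30 + 4 = 298.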
Line breaking
When a renderer in the middle of layout decides that it needs to break, it stops and tells its parent that it needs to be broken. The parent creates the extra renderers and calls layout on them.
Painting
In the painting stage, the rendering tree is traversed and each renderer’s “paint” method is called to display its content on the screen. Painting uses the UI infrastructure component.
Global and incremental drawing
Like layout, painting can be global (the entire tree is painted) or incremental. In incremental painting, some of the renderers change in a way that does not affect the entire tree. The changed renderer invalidates its rectangle on the screen. This causes the OS to see it as a “dirty region” and generate a “paint” event. The OS does this cleverly and coalesces several regions into one. In Chrome it is more complicated, because the renderer runs in a different process from the main process; Chrome simulates the OS behavior to some extent. The presentation listens to these events and delegates the message to the render root. The tree is traversed until the relevant renderer is reached, and it repaints itself (and usually its children).
Drawing order
CSS2 defines the order of the painting process. For a block renderer it is:
- Background color
- Background image
- Border
- Children
- Outline
Firefox display list
Firefox goes over the rendering tree and builds a display list for the painted rectangle. The list contains the renderers relevant to the rectangle, in the right painting order (backgrounds of the renderers, then borders, and so on). That way the tree needs to be traversed only once for a repaint instead of several times (painting all backgrounds, then all images, then all borders, and so on).
Firefox optimizes the process by not adding elements that will be hidden, such as elements completely beneath other opaque elements.
WebKit rectangle storage
Before repainting, WebKit saves the old rectangle as a bitmap. It then paints only the delta between the new and old rectangles.
Dynamic changes
Browsers try to do the minimum possible in response to a change. A change to an element’s color causes only a repaint of that element. A change to an element’s position causes layout and repaint of the element, its children, and possibly its siblings. Adding a DOM node causes layout and repaint of that node. Major changes, such as increasing the font size of the “html” element, invalidate the caches and cause layout and repaint of the entire tree.
Render engine threads
The rendering engine is single-threaded. Almost everything, except network operations, happens in a single thread. In Firefox and Safari this is the main thread of the browser. In Chrome it is the main thread of the tab process.
Network operations can be performed by several parallel threads. The number of parallel connections is limited (usually 2 to 6 connections; Firefox 3, for example, uses 6).
Event loop
The browser’s main thread is an event loop. It is an infinite loop that keeps the process alive: it waits for events (such as layout and paint events) and processes them. This is the Firefox code for the main event loop:
while (!mExiting)
    NS_ProcessNextEvent(thread);
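The same shape can be shown in a toy model. The EventLoop class below is a hypothetical sketch and not how Firefox implements its loop; in particular it stops when the queue is empty so that the example terminates, whereas a real event loop would block and wait for the next event.

```python
from collections import deque


class EventLoop:
    def __init__(self):
        self.queue = deque()
        self.exiting = False
        self.processed = []  # names of handled events, for observation

    def post(self, name, handler):
        """Queue an event together with the function that handles it."""
        self.queue.append((name, handler))

    def run(self):
        while not self.exiting:
            if not self.queue:
                break  # a real loop would wait here instead of stopping
            name, handler = self.queue.popleft()
            handler()  # handlers run one at a time on the single thread
            self.processed.append(name)
```

A layout handler can itself post a paint event; the loop picks it up on the next iteration, which is how layout and painting end up serialized on the one thread.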
CSS2 visual model
The canvas
According to the CSS2 specification, the term “canvas” describes “the space where the formatting structure is rendered”, that is, where the browser paints the content. The canvas is infinite in each dimension of the space, but browsers choose an initial width based on the dimensions of the viewport.
According to www.w3.org/TR/CSS2/zin…, the canvas is transparent if it is contained within another canvas. Otherwise, the browser gives it a browser-defined color.
CSS box model
The CSS box model describes the rectangular boxes that are generated for elements in the document tree and laid out according to the visual formatting model. Each box has a content area (text, an image, and so on) and optional surrounding padding, border, and margin areas.
Each node generates 0..n such boxes. All elements have a “display” property that determines the type of box that is generated. Examples:
block - generates a block box.
inline - generates one or more inline boxes.
none - no box is generated.
The default is inline, but the browser stylesheet sets other defaults. For example, the default display for the “div” element is block.
You can find examples of default style representations here:
www.w3.org/TR/CSS2/sam…
Positioning scheme
There are three schemes:
- Normal: the object is positioned according to its place in the document. This means its place in the rendering tree is like its place in the DOM tree, and it is laid out according to its box type and dimensions.
- Float: the object is first laid out as in normal flow, then moved as far left or right as possible.
- Absolute: the object is put in the rendering tree in a different place than in the DOM tree.
The positioning scheme is set by the “position” and “float” properties.
- static and relative cause normal flow
- absolute and fixed cause absolute positioning
In static positioning no position is defined and the default positioning is used. In the other schemes, the author specifies the position: top, bottom, left, right.
The way a box is laid out is determined by:
- Box type
- Box dimensions
- Positioning scheme
- External information, such as image size and the size of the screen
Box type
Block box: forms a block, with its own rectangle in the browser window.
Inline box: does not have its own block, but sits inside a containing block.
Blocks are formatted vertically, one after the other. Inlines are formatted horizontally.
Inline boxes are put inside lines or “line boxes”. The lines are at least as tall as the tallest box but can be taller when the boxes are aligned on the “baseline”, meaning that the bottom part of an element is aligned at a point of another box other than its bottom. If the container width is not enough, the inlines are put on several lines. This is usually what happens in a paragraph.
Positioning
Relative
Relative positioning: the box is positioned as usual and then moved by the required delta.
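As a small illustration, the shift can be written as plain coordinate arithmetic. This sketch assumes the normal-flow position is already known and handles only the left and top offsets. `relative_position` is a hypothetical helper.

```python
def relative_position(normal_x, normal_y, left=0, top=0):
    """Shift a box from its normal-flow position by the given offsets."""
    # A positive left moves the box to the right, a positive top moves it down.
    return (normal_x + left, normal_y + top)
```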
Floats
A float box is shifted to the left or right of a line. The interesting feature is that the other boxes flow around it. Take this HTML:
<p>
  <img style="float:right" src="images/image.gif" width="100" height="100">
  Lorem ipsum dolor sit amet, consectetuer...
</p>
In the result, the image sits at the right edge of the paragraph and the text flows around it on the left.
Absolute and fixed positioning
This layout is defined exactly, regardless of the normal flow. The element does not participate in the normal flow. Its dimensions are relative to the container. In fixed positioning, the container is the viewport.
Note: a fixed box does not move even when the document is scrolled.
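The difference under scrolling can be sketched as follows. The model is deliberately simplified: it assumes vertical scrolling only and a container that scrolls with the document, and `screen_position` and its parameters are hypothetical names.

```python
def screen_position(position, left, top, container_origin=(0, 0), scroll_y=0):
    """Return where a box appears on screen for a given scroll offset."""
    if position == "fixed":
        # The container is the viewport, so scrolling has no effect.
        return (left, top)
    if position == "absolute":
        # Offsets are measured from the container, which is given in
        # document coordinates and therefore scrolls with the document.
        container_x, container_y = container_origin
        return (container_x + left, container_y + top - scroll_y)
    raise ValueError("only absolute and fixed are handled in this sketch")
```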
Layered representation
This is specified by the z-index CSS property. It represents the third dimension of the box: its position along the “z axis”.
The boxes are divided into stacks (called stacking contexts). In each stack the back elements are painted first and the forward elements are painted on top, closer to the user. In case of overlap, the foremost element hides the one behind it. The stacks are ordered according to the z-index property. Boxes with a “z-index” property form a local stack. The viewport has the outer stack.
Example:
<style type="text/css">
  div {
    position: absolute;
    left: 2in;
    top: 2in;
  }
</style>
<p>
  <div
    style="z-index: 3; background-color:red; width: 1in; height: 1in;">
  </div>
  <div
    style="z-index: 1; background-color:green; width: 2in; height: 2in;">
  </div>
</p>
The result: the red square is painted on top of the green one.
Although the red div precedes the green one in the markup, and would have been painted first in the regular flow, its z-index is higher, so it is further forward in the stack held by the root box.
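The painting order in this example can be sketched with a stable sort. The sketch assumes a single stack with no nested stacking contexts, and `paint_order` is a hypothetical helper rather than engine code.

```python
def paint_order(boxes):
    """Return box names in painting order, back to front.

    `boxes` is a list of (name, z_index) pairs in document order.
    """
    # sorted() is stable, so boxes with equal z-index keep document order.
    return [name for name, _ in sorted(boxes, key=lambda box: box[1])]


# The two divs from the example above, in markup order.
order = paint_order([("red", 3), ("green", 1)])
```

Here `order` is `["green", "red"]`: the green box is painted first and the red box ends up on top.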
Resources
- Browser architecture
- Grosskurth, Alan. A Reference Architecture for Web Browsers (pdf)
- Gupta, Vineet. How Browsers Work – Part 1 – Architecture
- Parsing
- Aho, Sethi, Ullman, Compilers: Principles, Techniques, and Tools (i.e. “Dragon Book”), Addison-Wesley, 1986
- Rick Jelliffe. The Bold and the Beautiful: two new drafts for HTML 5.
- Firefox
- L. David Baron, Faster HTML and CSS: Layout Engine Internals for Web Developers.
- L. David Baron, Mozilla’s Layout Engine
- L. David Baron, Mozilla Style System Documentation
- Chris Waterson, Notes on HTML Reflow
- Chris Waterson, Gecko Overview
- Alexander Larsson, The life of an HTML HTTP request
- WebKit
- David Hyatt, Implementing CSS (Part 1)
- David Hyatt, An Overview of WebCore
- David Hyatt, WebCore Rendering
- David Hyatt, The FOUC Problem
- W3C specifications
- The HTML 4.01 specification
- The W3C HTML 5 specification
- Cascading Style Sheets Level 2 Revision 1 (CSS 2.1) specification
- Browser Build Instructions
- Firefox. Developer.mozilla.org/en/Build_Do…
- WebKit. Webkit.org/building/bu…