Blast, 2015/03/26 10:04

Preface


This series briefly introduces topics related to IE. Space and my own knowledge are both limited, so there are bound to be shortcomings; if you spot a mistake, please point it out.

Article layout: each article contains at most three parts. The parts are related, but they do not necessarily follow a single thread; please bear with me.

The first part is usually background.

The second part is a summary description.

The third part is a detailed introduction or practice.

I.1 Historical Changes in Internet Explorer


As far as I remember, since I first got online in 1999, when Internet Explorer 4 shipped with Windows 95, Microsoft has released eight versions of Internet Explorer (counting Spartan as Internet Explorer 12). What has changed in Internet Explorer along the way? The milestones are:

·IE1, IE2 (1995): The simplest "browsers" in the family. They only support static pages and lack many of the features you use today, although prototypes of many of those features were already present.

·IE3 (1996): Improvements over earlier versions, with support for ActiveX controls and for JavaScript and VBScript (then called Microsoft JScript and Microsoft VBScript, for no other reason than trademark issues). From this version on, the well-known WebBrowser ActiveX control is supported, which makes the browser reusable by other applications.

·IE4 (1997): Introduced DHTML and data binding support, enhanced the WebBrowser control, and added new features such as sidebars and BHOs.

·IE5 (1999): Released alongside Windows 98. It added persistence support, followed by XMLHttpRequest, which led to the development of AJAX (although the term AJAX had not even been coined yet...), and introduced HTA, automatic form filling, and other features. Internet Explorer 5.5 also supports 128-bit encryption.

·Internet Explorer 6 (2001): Released alongside Windows XP and probably the most memorable browser of all. Internet Explorer 6 reached 90% market share in 2002-2003, and the IE family as a whole reached 95%. It is also one of the most maligned browsers because of its security problems. Most of the additions in this release are partial support for web standards such as CSS1 and DOM Level 1.

·Internet Explorer 7 (2006), Internet Explorer 8 (2009): The new versions Microsoft released after Internet Explorer 6 lost market share to Firefox. They mostly bring performance tuning plus rendering tweaks and enhancements.

·IE9 (2011): A stable release with improved performance, HTML5 support, and multi-process support, so that a page that freezes or crashes does not affect other pages.

No screenshots for IE9 and later; these versions are easy to find.

·IE10 (2012) / IE11 (2013): Substantial performance gains, better rendering and compatibility, and added DNT (Do Not Track) support. IE10's performance really is a big step up from earlier versions, but anything carrying the IE name still lives under the long shadow of IE6.

·Spartan (2015): Informally known as IE12. It integrates a voice assistant and improves performance; Microsoft may be ditching the IE brand and starting from scratch. At the very least, it does use a different set of DLLs from IE.

I.2 Composition of IE


In early, single-process versions of IE, the structure looked like this:

Figure: Single-process, tabless model, as represented by IE6

With the introduction of multiple processes, the structure of the web page part of IE stays much the same, but the host of the user interface changes, and the picture looks like this:

Note that the shell pages in the figure above belong to different processes.

In IE7, a single process can host a group of web page windows, but a new window does not necessarily mean a new process. (For example, a window opened with Ctrl+N is actually created in the current process.) If this description sounds odd because you have never seen it, install IE7 and try it out. In Protected Mode, processes run at low integrity level and communicate through a broker process.

The simplified process model is as follows:

Figure: Process model in IE7

In Internet Explorer 8, Microsoft introduced the Loosely-Coupled IE (LCIE) process framework, which uses job objects to restrict process privileges. The structure of Internet Explorer 8 with Protected Mode enabled and with it disabled looks roughly like this:

You can see that in this version of IE, the UI Frame and some management functions run at medium integrity level, while in Protected Mode the tab and web page processes run at low integrity level (in zones where Protected Mode is disabled, they still run at medium integrity level).

As mentioned above, HTML and ActiveX controls live in the web page process. One notable detail is that toolbars also live in the web page process.

What are the benefits of this model? The first is that each tab is independent, so if one tab crashes it does not affect the others. The reason for moving the UI Frame into the broker process is to speed up startup.

In addition, because each page gets its own process, pages and tabs at different integrity levels can belong to the same UI Frame, which makes them easier to manage. If you have ever used Modern IE (Metro IE), you may have noticed that its web page processes are 64-bit, because that version does not load any plug-ins. Even though many plug-ins, such as Adobe Flash Player, are now available in 64-bit versions, insisting on 64-bit would still cause plug-in incompatibilities. Therefore, in the 64-bit versions of IE10 and IE11, the UI Frame runs as a 64-bit process, while web page processes are still 32-bit by default to keep plug-ins compatible. Even if you open the 64-bit Internet Explorer, you end up with 32-bit web page processes.

So you may see both a 64-bit and a 32-bit IE on your computer:

And after starting IE, one 64-bit process and n 32-bit processes appear:

Figure: IE11 64-bit Frame process and 32-bit Content process

In contrast to IE7's model, in IE11 you end up with a single 64-bit UI process even if you manually run iexplore http://www.wooyun.org twice.

Figure: Process model in IE11

Of course, if you enable Enhanced Protected Mode, the web page processes also become 64-bit.

Figure: Enhanced Protected Mode enabled in Internet Explorer 11

On Windows 7, the only effect of this mode is to make the web page processes 64-bit, but on Windows 8 it also brings in AppContainer process isolation. See Resources (1) for details.

Space is limited, so related content will be covered later.

I.3 Key Concepts: What is Markup Service?


Back to the core function of IE as a web page renderer: HyperText Markup Language, HT(Markup)L, is inseparable from markup. So what exactly is markup? Historically, markup was meant for actors. Put simply, it is a script annotated, usually with blue marks, to show who should play which part and how it should be played. As for Markup Services, I recommend that readers just get a general idea at first.

A marked-up script. Of course, this one is for actors. Image from Google Images.

Figure: IE can recognize HyperText Markup Language

For example, an HTML file might have the following content:

#! html
<DIV>blast<DIV>off

When the browser parses this text, it normalizes the content (as I prefer to call it), so that the DOM ends up looking something like this:

#! html
<HTML><HEAD><TITLE></TITLE></HEAD><BODY><DIV>blast<DIV>off</DIV></DIV></BODY></HTML>

You can see the result of this process in the page's DOM:

Figure: Document normalization in IE11

This behavior can introduce additional security risks because of the inserted elements, as in an issue I posted earlier:

http://wooyun.org/bugs/wooyun-2010-033834

In other words, after this pass the parser has turned the HTML text into elements. To complete the document, elements that were never written, such as html, head, title, and body, are constructed automatically by the parser.

At the same time, when the parser encounters the second div, it automatically closes the first div (how it is closed depends on the browser implementation). Tags that are required but were not written, such as <HTML>, are automatically added and closed by IE.

The second concept to note is the difference between a tree and a stream. For example:

#! html
This <B>is</B> a test

In this example, the text "This is a test" and a pair of B tags are converted into the following tree. Text is treated as a leaf node and an element as an inner node.

        ROOT
          |
  +-------+--------+
  |       |        |
"this"    B    "a test"
          |
         "is"

Once a document has been converted to a tree, all operations become tree operations, such as adding and removing child nodes. The API that provides these operations is called the Tree Service.
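To make the tree view concrete, here is a minimal C++ sketch of the "This is a test" example. The Node type and the helper names (makeElement, makeText, appendChild, removeChild, textOf, buildExample) are hypothetical and only illustrate tree-style operations; they are not IE's Tree Service API.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type: an element (inner node) or a text run (leaf).
struct Node {
    std::string name;   // tag name for elements, empty for text leaves
    std::string text;   // text content for leaves, empty for elements
    std::vector<std::unique_ptr<Node>> children;

    bool isText() const { return name.empty(); }
};

std::unique_ptr<Node> makeElement(const std::string& tag) {
    auto n = std::make_unique<Node>();
    n->name = tag;
    return n;
}

std::unique_ptr<Node> makeText(const std::string& s) {
    auto n = std::make_unique<Node>();
    n->text = s;
    return n;
}

// Tree-style operation: append a child, return a non-owning pointer to it.
Node* appendChild(Node& parent, std::unique_ptr<Node> child) {
    assert(!parent.isText());  // text leaves cannot have children
    parent.children.push_back(std::move(child));
    return parent.children.back().get();
}

// Tree-style operation: remove the child at position `index`.
void removeChild(Node& parent, std::size_t index) {
    assert(index < parent.children.size());
    parent.children.erase(parent.children.begin() +
                          static_cast<std::ptrdiff_t>(index));
}

// Concatenate all text leaves in document order.
std::string textOf(const Node& n) {
    if (n.isText()) return n.text;
    std::string out;
    for (const auto& c : n.children) out += textOf(*c);
    return out;
}

// Build the tree for: This <B>is</B> a test
std::unique_ptr<Node> buildExample() {
    auto root = makeElement("ROOT");
    appendChild(*root, makeText("This "));
    Node* b = appendChild(*root, makeElement("B"));
    appendChild(*b, makeText("is"));
    appendChild(*root, makeText(" a test"));
    return root;
}
```

Calling buildExample() yields a ROOT node with three children (a text leaf, the B element, and another text leaf), and removeChild(*root, 1) drops the B element together with its text.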

Of course, since IE 4.0 the element model has been far more powerful than a simple tree, as in this example:

#! html
An <B>example <I> of </B> elements </I> cross

The B and I ranges overlap each other, which is common in HTML but hard to describe with a tree. Therefore, Markup Services does not offer tree-style operations on such content, and instead exposes a stream-based model that makes it easy to control.

Figure: Intersecting ranges

In that sense, Markup Services exists to avoid the confusion that would otherwise arise between the two models.

Where the Tree Service cannot express the content, the browser falls back on Markup Services and its stream-based model.

In the tree-based model, web content is treated as nodes of a tree: each element, or each run of text, is a node. Nodes are manipulated in tree terms, such as adding a child to its parent or deleting a child from it.

In a stream-based model (such as the one exposed by Markup Services), the content of a document is manipulated through iterator-like objects. In the overlapping-elements example above, the partially overlapping elements are told apart by Markup Pointers, two per element, marking where its tag starts and where it ends. The stream-based model is therefore a superset of the tree-based model.
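The difference can be sketched in a few lines of C++. This is an illustration under simplifying assumptions, not IE's implementation: the document is reduced to a plain character stream, and each element is a hypothetical ElementRange, a pair of offsets standing in for two markup pointers. The names ElementRange, nests, disjoint, crosses, and covered are made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a pair of markup pointers: an element is the
// half-open range [begin, end) of stream positions that it covers.
struct ElementRange {
    std::string tag;
    std::size_t begin;
    std::size_t end;
};

// True when one range lies entirely inside the other. Such a pair can be
// expressed as parent and child in a tree.
bool nests(const ElementRange& a, const ElementRange& b) {
    return (a.begin <= b.begin && b.end <= a.end) ||
           (b.begin <= a.begin && a.end <= b.end);
}

// True when the two ranges share no position. Such a pair can be
// expressed as siblings in a tree.
bool disjoint(const ElementRange& a, const ElementRange& b) {
    return a.end <= b.begin || b.end <= a.begin;
}

// True when the ranges partially overlap: neither nested nor disjoint.
// This is the case a plain tree cannot express.
bool crosses(const ElementRange& a, const ElementRange& b) {
    return !nests(a, b) && !disjoint(a, b);
}

// The text covered by an element.
std::string covered(const std::string& stream, const ElementRange& e) {
    assert(e.begin <= e.end && e.end <= stream.size());
    return stream.substr(e.begin, e.end - e.begin);
}
```

For the stream "An example of elements cross" (whitespace simplified), a B range of [3, 13) covers "example of" and an I range of [11, 22) covers "of elements"; crosses() reports true for that pair. Any pair of ranges that a tree could express is either nested or disjoint, which is why the range view can describe everything the tree view can, plus the overlapping case.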

With that said, it is time to introduce the Markup Pointer. In C++, for example, iterators are a very convenient way to manipulate a vector:

Figure: Insert an element into a vector using an iterator
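The figure is an image, so here is a minimal C++ sketch of the same idea. The function name insertAt is hypothetical; std::vector::insert is the standard library call, and it returns an iterator to the newly inserted element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Insert `value` before the element at `index` and return the new vector.
std::vector<int> insertAt(std::vector<int> v, std::size_t index, int value) {
    assert(index <= v.size());
    // The iterator plays the role of a position marker inside the container.
    auto it = v.begin() + static_cast<std::ptrdiff_t>(index);
    // insert() may reallocate, which invalidates the old iterator, so keep
    // the iterator it returns instead of reusing the previous one.
    it = v.insert(it, value);
    assert(*it == value);
    return v;
}
```

For example, insertAt({1, 2, 4}, 2, 3) produces {1, 2, 3, 4}. The note about invalidation matters for the comparison: an iterator, like a markup pointer, is only meaningful while the thing it points into is still in the state it expects.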

As you can see, the Markup Pointer behaves a lot like such an iterator. You can get a feel for it by looking at how invalid documents are created and manipulated.

Recall the earlier "This is a test" example; a browser may not even consider it a valid HTML document.

A minimal valid HTML document should have at least four elements: html, head, title, and body. When these elements are missing from your content, the parser creates them automatically and puts them in place.

During document parsing, Markup Services can be used to delete or rearrange the DOM. For example, you can delete the html and body elements entirely, or move head inside body (although doing so makes the document invalid).

In IE, many classes provide this service. The most common one is CMarkup. The markup pointer class, which is responsible for "pointing to elements and ranges", is named CMarkupPointer and derives from CBase.

If you have followed my earlier posts, the CMarkupPointer null pointer dereference issues I reported are related to this (http://wooyun.org/bugs/wooyun-2010-079690).

There are a few things to note about CMarkupPointer that can cause IE to crash or misbehave:

(1) A Markup Pointer is unpositioned when it is newly created, or when it is created as a constructor parameter of an invalid object; that is, it points to nothing. Usually this value is 0, which may result in a null pointer dereference.

(2) A Markup Pointer can have stickiness (cling) set, which decides whether the pointer is recalculated to follow along when the region it sits in is moved. It can also have gravity set, either left or right: simply put, when content is inserted at the pointer, gravity decides whether the pointer afterwards stays attached to the content on its left or to the content on its right. With both in play, some operations lead to ambiguity. After the part a Markup Pointer points to has been moved or deleted, the pointer may revert to the unpositioned state, because the object it pointed to no longer exists or is invalid: the pointer has been detached from the document, but the pointer object itself has not been deleted. If the pointer is reused later without validation, it may cause problems.

(3) Errors may occur while a Markup Pointer is being moved around.
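The gravity and unpositioned-state behavior in points (1) and (2) can be sketched with a toy model in C++. Everything here is an assumption made for illustration: ToyPointer, Gravity, insertText, removeText, and charAt are hypothetical names, the document is a plain string, and the rule that a pointer inside deleted content becomes unpositioned is a simplification of what the text describes, not IE's actual CMarkupPointer logic.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class Gravity { Left, Right };

// Hypothetical toy pointer into a text stream; not IE's CMarkupPointer.
struct ToyPointer {
    bool positioned = false;          // a new pointer points at nothing
    std::size_t pos = 0;              // only meaningful when positioned
    Gravity gravity = Gravity::Left;
};

// Insert `text` at offset `at` and adjust the pointer.
// Left gravity: a pointer exactly at `at` stays with the content on its left.
// Right gravity: it stays with the content on its right, so it moves along.
void insertText(std::string& stream, std::size_t at,
                const std::string& text, ToyPointer& p) {
    assert(at <= stream.size());
    stream.insert(at, text);
    if (!p.positioned) return;
    if (p.pos > at || (p.pos == at && p.gravity == Gravity::Right)) {
        p.pos += text.size();
    }
}

// Remove [begin, end). In this toy model, a pointer strictly inside the
// removed span loses its target and becomes unpositioned.
void removeText(std::string& stream, std::size_t begin, std::size_t end,
                ToyPointer& p) {
    assert(begin <= end && end <= stream.size());
    stream.erase(begin, end - begin);
    if (!p.positioned) return;
    if (p.pos >= end) {
        p.pos -= (end - begin);
    } else if (p.pos > begin) {
        p.positioned = false;
    }
}

// Validated read: report failure instead of using an unpositioned pointer.
bool charAt(const std::string& stream, const ToyPointer& p, char& out) {
    if (!p.positioned || p.pos >= stream.size()) return false;
    out = stream[p.pos];
    return true;
}
```

With the stream "abcd" and a pointer at offset 2, inserting "XY" at offset 2 leaves a left-gravity pointer at 2 and moves a right-gravity pointer to 4. Removing a span that contains the pointer leaves it unpositioned, and charAt() then returns false rather than reading through it, which is the kind of check whose absence point (2) warns about.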

As a rule of thumb, IE code relies heavily on validation checks done by upper layers, so once lower-level code receives invalid data, IE is likely to hit an exception.

Once again, regarding Markup Services: if you have not come across it before, a general idea is enough for now. Later, when you know more about IE, it will be easy to connect this part with the others. To be honest, it gave me a bit of a headache at first too... so if some of this reads as confusing, do not worry too much; an impression is enough.

Resources


(1) Tencent Antivirus Lab: In-depth analysis of the working mechanism of AppContainer

(2) Q&A: 64-Bit Internet Explorer

(3) Windows 8 Metro/Modern Style IE 10

(4) Enhanced Memory Protections in IE10