  • Micro front-end 01: Js isolation mechanism analysis (snapshot sandbox, two kinds of proxy sandbox)
  • Micro front-end 02: Analysis of the micro-application loading process (from the registration of a micro-application to the internal implementation of the loadApp method)
  • Micro front-end 03: Qiankun sandbox container analysis (concrete application of the Js sandbox mechanism once it is established)
  • Micro front-end 04: Qiankun resource loading mechanism (internal implementation of import-html-entry)
  • Micro front-end 05: Implementation of the loadMicroApp method and analysis of the data communication mechanism

In the previous article, Micro front-end 02: Analysis of the Qiankun micro-application loading process (from the registration of a micro-application to the internal implementation of the loadApp method), we mentioned that the js, css, html and other resources of a micro-application must be fetched when it is loaded, but we did not say how they are fetched. Recall that loadApp executes this line of code:

const { template, execScripts, assetPublicPath } = await importEntry(entry, importEntryOpts);

The importEntry function here comes from a dependency, the import-html-entry library. Starting from importEntry, we will explore what import-html-entry does as a whole.

Let’s start with a flow chart:

Next, I will walk through the important steps one by one, in the order of the flow chart.

importEntry

Let’s take a look at importEntry, which parameters it accepts and what they mean:

// Code snippet 1, file: src/index.js
export function importEntry(entry, opts = {}) {
	const { fetch = defaultFetch, getTemplate = defaultGetTemplate, postProcessTemplate } = opts;
	const getPublicPath = opts.getPublicPath || opts.getDomain || defaultGetPublicPath;
	// Omit some less critical code...
	if (typeof entry === 'string') {
		return importHTML(entry, {
			fetch,
			getPublicPath,
			getTemplate,
			postProcessTemplate,
		});
	}
    // A lot of code is omitted here... Placeholder 1
}

The function and parameter types of importEntry are as follows:

Function

  • Load the css/js resources and embed the loaded resources into the html;
  • Obtain the exports object of the scripts resources.

Types

  • Entry (type of the parameter entry, required):
    string | { styles?: string[], scripts?: string[], html?: string }
    If entry is a string, importHTML is called to run the related logic. Otherwise, the resources listed in styles and scripts are loaded and embedded into the string html. Note that html here is a string, which differs from step 3 in the flow chart, where a remote html resource is loaded. Also note that styles is an array of style resource URLs and, similarly, scripts is an array of js resource URLs, while html is a string holding the content of an html page.
  • ImportEntryOpts (type of the parameter opts, optional):
    • fetch: custom method for loading resources, optional. Its type is typeof window.fetch | { fn?: typeof window.fetch, autoDecodeResponse?: boolean }, where autoDecodeResponse is optional, is meant for character sets other than utf-8 (e.g. gbk or gb2312), and defaults to false.
    • getPublicPath: of type (entry: Entry) => string, a custom function that returns the public path used to access resources, optional.
    • getTemplate: of type (tpl: string) => string, a custom preprocessing function for the html resource, optional.
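The two accepted shapes of the fetch option described above can be folded into one internal form. The following is only a minimal sketch of that idea: normalizeFetchOption, defaultFetch and customFetch are hypothetical names invented for illustration and are not claimed to be the library's own code.

```javascript
// Hypothetical helper (not part of import-html-entry): turns either accepted
// shape of the `fetch` option into { fn, autoDecodeResponse }.
function normalizeFetchOption(fetchOpt, defaultFetch) {
  // Shape 1: a plain fetch-like function.
  if (typeof fetchOpt === 'function') {
    return { fn: fetchOpt, autoDecodeResponse: false };
  }
  // Shape 2: an object { fn?, autoDecodeResponse? }.
  if (fetchOpt && typeof fetchOpt === 'object') {
    return {
      fn: fetchOpt.fn || defaultFetch,
      autoDecodeResponse: Boolean(fetchOpt.autoDecodeResponse),
    };
  }
  // Nothing supplied: fall back to the default fetch.
  return { fn: defaultFetch, autoDecodeResponse: false };
}

// Stand-ins for real fetch implementations (assumptions for this sketch).
const defaultFetch = (url) => Promise.resolve(`default:${url}`);
const customFetch = (url) => Promise.resolve(`custom:${url}`);

const fromFunction = normalizeFetchOption(customFetch, defaultFetch);
const fromObject = normalizeFetchOption({ autoDecodeResponse: true }, defaultFetch);
const fromNothing = normalizeFetchOption(undefined, defaultFetch);
```

With this shape, the rest of the loading code only ever has to deal with one function and one boolean, regardless of how the caller passed the option.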

By now the function parameters in code snippet 1 should be clear. Next, let’s move on to importHTML.

importHTML

importHTML accepts the same parameters as importEntry, so I will not repeat them here. Let’s look at the overall structure of importHTML.

// Code snippet 2, file: src/index.js
export default function importHTML(url, opts = {}) {
	// A lot of code is omitted here... Placeholder 1
	return embedHTMLCache[url] || (embedHTMLCache[url] = fetch(url)
		.then(response => readResAsString(response, autoDecodeResponse))
		.then(html => {
			const assetPublicPath = getPublicPath(url);
			const { template, scripts, entry, styles } = processTpl(getTemplate(html), assetPublicPath, postProcessTemplate);
			return getEmbedHTML(template, styles, { fetch }).then(embedHTML => ({
				template: embedHTML,
				assetPublicPath,
				getExternalScripts: () => getExternalScripts(scripts, fetch),
				getExternalStyleSheets: () => getExternalStyleSheets(styles, fetch),
				execScripts: (proxy, strictGlobal, execScriptsHooks = {}) => {
					if (!scripts.length) {
						return Promise.resolve();
					}
					return execScripts(entry, scripts, proxy, {
						fetch,
						strictGlobal,
						beforeExec: execScriptsHooks.beforeExec,
						afterExec: execScriptsHooks.afterExec,
					});
				},
			}));
		}));
}

We’ve omitted some of the code for ease of understanding, but placeholder 1 in snippet 2 preprocesses the parameters passed in.

One small trick is worth pointing out: embedHTMLCache[url] || (embedHTMLCache[url] = fetch(url)...) reads from the cache and, on a miss, assigns to it in the same expression. It is a pattern worth borrowing in everyday development.
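To make the read-or-assign cache pattern concrete, here is a small standalone sketch. The names fakeFetch, htmlCache and loadHTML are hypothetical and exist only for this illustration; fakeFetch stands in for a real network request.

```javascript
// Counts how many times the (stubbed) network is actually hit.
let requestCount = 0;

// Stand-in for a real fetch; an assumption made for this sketch only.
function fakeFetch(url) {
  requestCount += 1;
  return Promise.resolve(`<html><!-- content of ${url} --></html>`);
}

const htmlCache = {};

// Read from the cache, or start the request and store its Promise.
function loadHTML(url) {
  return htmlCache[url] || (htmlCache[url] = fakeFetch(url));
}

const first = loadHTML('/app/index.html');
const second = loadHTML('/app/index.html');
```

Because the Promise itself is cached rather than its result, two callers that ask for the same url at nearly the same time share one request. One trade-off to keep in mind is that a rejected Promise stays in the cache as well, unless it is removed explicitly.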

importHTML does three core things:

  • call fetch to request the html resource (note: not the js or css resources);
  • call processTpl to process the resource;
  • call getEmbedHTML to fetch the remote resources linked in the output of processTpl and embed them into the html (as the getEmbedHTML code later in this article shows, these are the style sheets).

Requesting the html resource is essentially a call to the fetch method, so I will not dwell on it here. Let’s focus on processTpl and getEmbedHTML.

processTpl

I am not going to go through the processTpl code line by line. Instead, I will focus on a point that may not look important at first: the regular expressions involved. They seem basic, but they are actually the key to understanding processTpl. In the code snippet below, I comment on what each regular expression matches, and afterwards describe the main logic as a whole. With this introduction, you should be able to read the rest of the function on your own.

// Code snippet 3, file: src/process-tpl.js
/*
 * Matches an entire script tag together with its content.
 * [\s\S] matches any character: \s matches any whitespace character, including newlines, and \S matches any non-whitespace character.
 * * matches the preceding subexpression zero or more times.
 * + matches the preceding subexpression one or more times.
 * The global flag g after the regular expression applies the expression to as many matches as can be found in the input string.
 * The flag i at the end of the expression makes the match case-insensitive.
 */
const ALL_SCRIPT_REGEX = /(<script[\s\S]*?>)[\s\S]*?<\/script>/gi;
/*
 * . matches any single character except the newline \n.
 * ? matches the preceding subexpression zero or one time, or marks a quantifier as non-greedy.
 * Parentheses have a side effect: the text they match is captured. Placing ?: at the start of the group removes this side effect.
 * ?: is one of the non-capturing constructs; the other two are ?= and ?!.
 * ?= is a positive lookahead: it matches at any position where the pattern in the parentheses matches what follows.
 * ?! is a negative lookahead: it matches at any position where the pattern in the parentheses does not match what follows.
 * Example: exp1(?!exp2) finds exp1 that is not followed by exp2.
 * So this expression matches a script tag whose type is not text/ng-template.
 */
const SCRIPT_TAG_REGEX = /<(script)\s+((?!type=('|")text\/ng-template\3).)*?>.*?<\/\1>/is;
/*
 * Matches a script tag containing a src attribute.
 * ^ matches the beginning of the input string, but inside a bracket expression it negates the set, i.e. it matches any character that is not listed in the brackets.
 */
const SCRIPT_SRC_REGEX = /.*\ssrc=('|")?([^>'"\s]+)/;
// Matches a tag containing the type attribute
const SCRIPT_TYPE_REGEX = /.*\stype=('|")?([^>'"\s]+)/;
// Matches a tag with the entry attribute
const SCRIPT_ENTRY_REGEX = /.*\sentry\s*.*/;
// Matches a tag with the async attribute
const SCRIPT_ASYNC_REGEX = /.*\sasync\s*.*/;
// Matches a tag with the backward-compatible nomodule attribute
const SCRIPT_NO_MODULE_REGEX = /.*\snomodule\s*.*/;
// Matches a tag containing type=module
const SCRIPT_MODULE_REGEX = /.*\stype=('|")?module('|")?\s*.*/;
// Matches the link tag
const LINK_TAG_REGEX = /<(link)\s+.*?>/isg;
// Matches a tag with rel=preload or rel=prefetch. Tip: rel specifies the relationship between the current document and the linked document, e.g. rel="icon"
const LINK_PRELOAD_OR_PREFETCH_REGEX = /\srel=('|")?(preload|prefetch)\1/;
// Matches a tag with the href attribute
const LINK_HREF_REGEX = /.*\shref=('|")?([^>'"\s]+)/;
// Matches a tag with as=font
const LINK_AS_FONT = /.*\sas=('|")?font\1.*/;
// Matches the style tag
const STYLE_TAG_REGEX = /<style[^>]*>[\s\S]*?<\/style>/gi;
// Matches a tag with rel=stylesheet
const STYLE_TYPE_REGEX = /\s+rel=('|")?stylesheet\1.*/;
// Matches a tag with the href attribute
const STYLE_HREF_REGEX = /.*\shref=('|")?([^>'"\s]+)/;
// Matches comments
const HTML_COMMENT_REGEX = /<!--([\s\S]*?)-->/g;
// Matches a link tag with the ignore attribute
const LINK_IGNORE_REGEX = /<link(\s+|\s+.+\s+)ignore(\s*|\s+.*|=.*)>/is;
// Matches a style tag with the ignore attribute
const STYLE_IGNORE_REGEX = /<style(\s+|\s+.+\s+)ignore(\s*|\s+.*|=.*)>/is;
// Matches a script tag with the ignore attribute
const SCRIPT_IGNORE_REGEX = /<script(\s+|\s+.+\s+)ignore(\s*|\s+.*|=.*)>/is;
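A quick way to get a feel for these expressions is to run a few of them against sample tags. The sketch below copies three of the expressions from the snippet above; the sample tag strings and variable names are made up for illustration.

```javascript
// Three of the expressions listed in the snippet above.
const SCRIPT_TAG_REGEX = /<(script)\s+((?!type=('|")text\/ng-template\3).)*?>.*?<\/\1>/is;
const SCRIPT_SRC_REGEX = /.*\ssrc=('|")?([^>'"\s]+)/;
const LINK_PRELOAD_OR_PREFETCH_REGEX = /\srel=('|")?(preload|prefetch)\1/;

// Sample tags (assumptions for this sketch).
const externalScript = '<script src="/static/js/main.js"></script>';
const ngTemplate = '<script type="text/ng-template" id="tpl"><div></div></script>';
const preloadLink = '<link rel="preload" href="/font.woff2" as="font">';
const styleLink = '<link rel="stylesheet" href="/main.css">';

// The negative lookahead lets ordinary script tags through...
const matchesNormalScript = SCRIPT_TAG_REGEX.test(externalScript);
// ...but rejects a tag whose type is text/ng-template.
const matchesNgTemplate = SCRIPT_TAG_REGEX.test(ngTemplate);

// Capture group 2 holds the value of the src attribute.
const srcMatch = externalScript.match(SCRIPT_SRC_REGEX);
const scriptSrc = srcMatch ? srcMatch[2] : null;

// The backreference \1 requires the closing quote to match the opening one.
const isPreload = LINK_PRELOAD_OR_PREFETCH_REGEX.test(preloadLink);
const isStylePreload = LINK_PRELOAD_OR_PREFETCH_REGEX.test(styleLink);

console.log({ matchesNormalScript, matchesNgTemplate, scriptSrc, isPreload, isStylePreload });
```

None of the three expressions used here carries the g flag, so calling test repeatedly is safe; with a g-flagged expression, test keeps state in lastIndex between calls.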

Knowing what these regular expressions match prepares us for the analysis that follows. Since processTpl contains a lot of code, I will replace most of the actual source with my own comments to make it easier to follow.

// Code snippet 4, file: src/process-tpl.js
export default function processTpl(tpl, baseURI, postProcessTemplate) {
	// A lot of code is omitted here...
	let styles = [];
	const template = tpl
		.replace(HTML_COMMENT_REGEX, '') // Remove the comments
		.replace(LINK_TAG_REGEX, match => {
			// A lot of code is omitted here...
			// If the link tag has an ignore attribute, replace it with the placeholder `<!-- ignore asset ${ href || 'file'} replaced by import-html-entry -->`
			// If the link tag has no ignore attribute, replace it with the placeholder `<!-- ${preloadOrPrefetch ? 'prefetch/preload' : ''} link ${linkHref} replaced by import-html-entry -->`
		})
		.replace(STYLE_TAG_REGEX, match => {
			// A lot of code is omitted here...
			// If the style tag has an ignore attribute, replace it with the placeholder `<!-- ignore asset style file replaced by import-html-entry -->`
		})
		.replace(ALL_SCRIPT_REGEX, (match, scriptTag) => {
			// A lot of code is omitted here...
			// There is a lot of code here, but it boils down to matching the regular expressions and replacing the tags with placeholders
		});

	// Omit some code here...
	let tplResult = {
		template,
		scripts,
		styles,
		entry: entry || scripts[scripts.length - 1],
	};
	// Omit some code here...
	return tplResult;
}

As the code above shows, once the relevant tags have been replaced with placeholders, a tplResult object is returned. The scripts and styles properties of this object are arrays that hold links (i.e. the original href or src values of the tags that were replaced by placeholders).
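The replace-and-collect idea can be tried out in isolation. The sketch below is a heavily simplified stand-in, not the library's processTpl: collectStyleLinks is a hypothetical name, it only handles comments and stylesheet link tags, it uses a simplified placeholder text, and it does not resolve links against a base URI.

```javascript
// Expressions taken from code snippet 3.
const HTML_COMMENT_REGEX = /<!--([\s\S]*?)-->/g;
const LINK_TAG_REGEX = /<(link)\s+.*?>/isg;
const STYLE_TYPE_REGEX = /\s+rel=('|")?stylesheet\1.*/;
const STYLE_HREF_REGEX = /.*\shref=('|")?([^>'"\s]+)/;

// Hypothetical, simplified stand-in for processTpl.
function collectStyleLinks(tpl) {
  const styles = [];
  const template = tpl
    .replace(HTML_COMMENT_REGEX, '') // drop existing comments first
    .replace(LINK_TAG_REGEX, (match) => {
      const hrefMatch = match.match(STYLE_HREF_REGEX);
      if (!STYLE_TYPE_REGEX.test(match) || !hrefMatch) {
        return match; // other link tags are left untouched in this sketch
      }
      const href = hrefMatch[2];
      styles.push(href); // remember the link so it can be fetched later
      return `<!-- link ${href} replaced by import-html-entry -->`;
    });
  return { template, styles };
}

// Sample page (an assumption for this sketch).
const page = [
  '<head>',
  '<!-- build info -->',
  '<link rel="stylesheet" href="/static/css/main.css">',
  '<link rel="icon" href="/favicon.ico">',
  '</head>',
].join('\n');

const result = collectStyleLinks(page);
console.log(result.styles);
console.log(result.template);
```

Removing the comments before inserting placeholders matters here: the placeholders are themselves html comments, so the order of the two replace calls keeps them from being stripped.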

If you turn your attention back to code snippet 2 of this article, you will see that the next function called is getEmbedHTML, which we examine now.

getEmbedHTML

Let’s start with the code of the getEmbedHTML function:

function getEmbedHTML(template, styles, opts = {}) {
	const { fetch = defaultFetch } = opts;
	let embedHTML = template;

	return getExternalStyleSheets(styles, fetch)
		.then(styleSheets => {
			embedHTML = styles.reduce((html, styleSrc, i) => {
				html = html.replace(genLinkReplaceSymbol(styleSrc), `<style>/* ${styleSrc} */${styleSheets[i]}</style>`);
				return html;
			}, embedHTML);
			return embedHTML;
		});
}

export function getExternalStyleSheets(styles, fetch = defaultFetch) {
	return Promise.all(styles.map(styleLink => {
		if (isInlineCode(styleLink)) {
			// if it is inline style
			return getInlineCode(styleLink);
		} else {
			// external styles
			return styleCache[styleLink] ||
				(styleCache[styleLink] = fetch(styleLink).then(response => response.text()));
		}
	}));
}

getEmbedHTML actually does two main things. The first is to fetch the content behind the style links collected by processTpl. The second is to wrap that content in style tags and use them to replace the placeholders that processTpl left in the template.
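These two steps can be reproduced with a stubbed fetch. The following is a sketch under stated assumptions rather than the library's implementation: embedStyles, placeholderFor, fakeFetch and fakeCssFiles are hypothetical names, and the placeholder text is a simplified version of the real one.

```javascript
// Hypothetical placeholder format, mirroring the comments left in the template.
const placeholderFor = (href) => `<!-- link ${href} replaced by import-html-entry -->`;

// Stand-in for fetch: resolves with an object exposing text(), like a Response.
const fakeCssFiles = { '/a.css': 'body{margin:0}', '/b.css': 'h1{color:red}' };
const fakeFetch = (url) =>
  Promise.resolve({ text: () => Promise.resolve(fakeCssFiles[url]) });

function embedStyles(template, styles, fetchFn) {
  // Step 1: fetch every style sheet in parallel; Promise.all keeps the order.
  return Promise.all(styles.map((href) => fetchFn(href).then((res) => res.text())))
    // Step 2: swap each placeholder for an inline <style> tag.
    .then((sheets) =>
      styles.reduce(
        (html, href, i) =>
          html.replace(placeholderFor(href), `<style>/* ${href} */${sheets[i]}</style>`),
        template
      )
    );
}

const template = `<head>${placeholderFor('/a.css')}${placeholderFor('/b.css')}</head>`;
const embedded = embedStyles(template, ['/a.css', '/b.css'], fakeFetch);

embedded.then((html) => console.log(html));
```

Since Promise.all preserves the order of its input, index i in the fetched sheets always lines up with index i in styles, which is what lets the reduce step pair each placeholder with its content.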

Back in code snippet 2 of this article, getEmbedHTML returns a Promise that will eventually resolve an object:

{
    template: embedHTML,
    assetPublicPath,
    getExternalScripts: () => getExternalScripts(scripts, fetch),
    getExternalStyleSheets: () => getExternalStyleSheets(styles, fetch),
    execScripts: (proxy, strictGlobal, execScriptsHooks = {}) => {
        if (!scripts.length) {
            return Promise.resolve();
        }
        return execScripts(entry, scripts, proxy, {
            fetch,
            strictGlobal,
            beforeExec: execScriptsHooks.beforeExec,
            afterExec: execScriptsHooks.afterExec,
        });
    },
}

The most important properties of this object are template and execScripts. template holds the content of the page (html/css), and execScripts is responsible for the scripts the page needs to execute. Let’s take a look at the internal implementation of execScripts.

execScripts

As before, I will omit most of the code and replace it with comments to keep the description simple.

export function execScripts(entry, scripts, proxy = window, opts = {}) {
	// A lot of code is omitted here...
	return getExternalScripts(scripts, fetch, error) // Fetch the content behind each js resource link
		.then(scriptsText => {
			const geval = (scriptSrc, inlineScript) => {
				// A lot of code is omitted here...
				// The js code is processed and assembled into a self-executing function, which is then executed with eval
				// The key here is the call to getExecutableScript, which binds window.proxy to change what this refers to in the js code
			};

			function exec(scriptSrc, inlineScript, resolve) {
				// A lot of code is omitted here...
				// Call geval at different times, depending on the conditions, to execute the js code, and return the object containing the micro-application lifecycle functions exposed after the entry script is executed
				// A lot of code is omitted here...
			}

			function schedule(i, resolvePromise) {
				// A lot of code is omitted here...
				// Call the exec function in turn to execute the code of each js resource
			}

			return new Promise(resolve => schedule(0, success || resolve));
		});
}
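The two ideas in this outline, running scripts one after another and making window inside them point at a proxy object, can be sketched in a few lines. This is a simplified illustration and not the library's code: runScripts is a hypothetical name, it uses new Function instead of eval, it has no hooks or entry handling, and it resolves with the proxy rather than with the entry script's exports.

```javascript
// Hypothetical, simplified stand-in for the script runner.
function runScripts(scriptTexts, proxy) {
  // Wrap the code in a function whose `window`, `self` and `this` are the proxy.
  const run = (code) => new Function('window', 'self', code).call(proxy, proxy, proxy);

  return new Promise((resolve, reject) => {
    // Run script i, then move on to i + 1; resolve once all have run.
    function schedule(i) {
      if (i >= scriptTexts.length) {
        resolve(proxy);
        return;
      }
      try {
        run(scriptTexts[i]);
      } catch (err) {
        reject(err);
        return;
      }
      schedule(i + 1);
    }
    schedule(0);
  });
}

// A plain object plays the role of the sandbox proxy in this sketch.
const proxy = {};
const scripts = [
  'window.counter = 1;',
  'window.counter += 1; window.mount = function () { return "mounted"; };',
];

const done = runScripts(scripts, proxy);
done.then((sandbox) => console.log(sandbox.counter, sandbox.mount()));
```

Because the scripts only see the window that is passed in as a parameter, their writes land on the proxy object and the real global object stays untouched, which is the property a sandbox relies on.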

Now turn back to code snippet 1, where a comment marks placeholder 1. The logic there corresponds to steps 6 to 8 in the flow chart. With the groundwork above, you should be able to read this part on your own; if you have questions, feel free to raise them in the comment section. At this point we have a fairly clear picture of the main logic of the import-html-entry library. I suggest opening the project’s source code in an editor while reading this article, which makes the details easier to follow.

Please follow my WeChat subscription account, Yang Yitao, to get the latest updates.

If you found this article useful, please give it a like. It helps raise my Juejin power value, and I hope to become an outstanding Juejin author this year.