1. Background

In mini program development, we often run into all sorts of “configuration” requirements, such as product detail displays, membership benefits, and promotional landing pages. These requirements have no fixed template: operations staff compose the content with a rich text editor on the B-side (the admin back end), and the mini program then displays it.

This requires the mini program to have sufficient string parsing capability, and the platform does provide a rich-text component for rendering rich text. However, that component does not support some semantic tags, interaction such as click events, or audio and video tags, which limits where it can be used.

Several open-source rich text components for mini programs are also available on GitHub, of which mp-html is the most widely used thanks to its powerful features.

However, different businesses use different rich text editors on the B-side and emphasize different rich text tags, and each of these choices affects how mini program performance has to be optimized. The mp-html component is therefore not ready to use out of the box and needs to be customized for specific business requirements. That makes it particularly important to understand how rich text components render and to master the optimization methods for the various rich text tags, which is the focus of this article.

2. The official rich-text component

According to the mini program developer documentation, the rich-text component accepts either an HTML String or a Nodes array. As shown in the figure below, to render rich text with the rich-text component, pass in either the HTML String on the left or the Nodes array on the right.
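As a rough illustration of the two input formats, the same content can be written either as an HTML string or as a nodes array. The markup below is made up for this example; only the field names (name, attrs, children, and text nodes with type and text) follow the rich-text documentation.

```javascript
// The same content expressed in the two formats accepted by rich-text.
// The markup itself is an arbitrary example.
const htmlString = '<div class="title" style="color:red;">Hello <b>world</b></div>';

const nodes = [{
  name: 'div',
  attrs: { class: 'title', style: 'color:red;' },
  children: [
    { type: 'text', text: 'Hello ' },
    { name: 'b', children: [{ type: 'text', text: 'world' }] }
  ]
}];

// In a page, either value would be assigned with this.setData({ nodes: ... })
// and rendered by <rich-text nodes="{{nodes}}"></rich-text>.
```

The array form is more verbose, but it spares the component the work of parsing the string first.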

But you cannot have it both ways: someone has to carry the load behind the scenes. According to the official tips, passing a String to nodes is not recommended because it may degrade performance. Presumably, the rich-text component wraps an HTML string parser that parses the HTML string into a manageable data structure, namely nodes.

Indeed, from the article “LLParser: High performance Parser Generator in Web environment and application to ASM.js” we learned that ExParser, the kernel of the mini program component system, wraps an LLParser generator that can produce a variety of string parsers from a syntax definition. One of the most important parsers produced by LLParser is the HTML parser, which is used by the mini program's rich-text component and for some precompile-time analysis. The HTML parser parses the HTML string into a manageable data structure, which the rich-text component then filters for safety.

The data structure of nodes is as follows:

| Attribute | Description | Type | Required | Notes |
| --- | --- | --- | --- | --- |
| name | Tag name | string | Yes | Only some trusted HTML nodes are supported |
| attrs | Attributes | object | No | Only some trusted attributes are supported, following Pascal naming, such as style, class |
| children | List of child nodes | array | No | Same structure as nodes |

For security reasons, the HTML parser keeps only a set of trusted nodes. If an untrusted HTML node is passed in as part of an HTML string, that node and all its children are removed. As a result, the rich-text component does not support some semantic tags, and some rich text content is lost (Defect 1).

During parsing, the HTML parser adds only a few trusted attributes, such as style and class, to the corresponding HTML tag. Other attributes are filtered out, and id is not supported. As a result, the rich-text component blocks interactions such as image previews, link jumps, anchors, and click events (Defect 2).

Due to the limitations of the nodes data structure, the img tag in the rich-text component supports only network images, not Base64 or SVG. For the table tag, only the width attribute is supported, and tables that nest complex tags or merge cells cannot be handled. For security reasons, the rich-text component also filters out all media tags, such as video and audio, which greatly limits its rich text capability (Defects 3, 4, and 5).

3. The wxParse component

As mentioned above, ExParser, the kernel of the mini program component system, wraps an LLParser generator (C + JavaScript), and the HTML parser derived from it can parse HTML strings into a data structure (a node tree) that code can work with easily. This is similar to how a browser parses an HTML document into a DOM tree. The article “How Browsers Work: Behind the Scenes of Modern Web Browsers” explains the HTML parsing process in detail and is a good place to start.

This inspired us to use a mature HTML parser from the open source community to parse the HTML string into the expected data structure, the node tree, and then iterate over the node tree and render it with the tags that mini programs support.

The wxParse component, popular early on, worked this way. It parsed an HTML string into a node tree (an array of nodes) by regular expression matching, then iterated over the array and rendered each HTML tag as the corresponding mini program tag.

But it also has three fatal flaws:

First, low fault tolerance. The parsing script of the wxParse component relies on regular expression matching. As soon as a malformed string fails to match the rules, it is parsed as text, which produces an incorrect result.

Second, limited depth. To balance performance and accuracy, the wxParse component sets a maximum number of levels. When the node tree to be rendered is deeper than that maximum, all child tags below that level are discarded and cannot be displayed, so content is lost.

Third, limited functionality. wxParse has poor support for table, ol, ul, and similar tags: it cannot render merged table cells, ordered lists, or multi-level lists. It also cannot handle audio and video rendering.

4. The mp-html component

4.1 Implementation Scheme

With the help of an HTML parser, we can convert the HTML string into the data structure we want, a node tree, and then iterate over it and render the tag for each node. However, when the node tree is too deep, the performance cost for a mini program can be severe.

In view of this, the author of the mp-html component proposes a two-pronged solution: iterate through the node tree and, at each node, check whether any of its descendants are images, audio and video, links, tables, lists, or other tags that need to be “customized”. If not, the node and all its children are rendered directly with the rich-text component. If so, the node is rendered as a view tag and iteration continues into its children.

As shown in the figure below, we start with the div node at the first level. Since the descendants of this node include img and a tags that need to be “customized”, we render the div tag as a view tag and continue with its children. The first node at the second level is h1. Its descendants contain no tags that need to be customized, so it is rendered directly by the rich-text component. The descendants of the second node, p, also contain no such tags, so it is rendered with the rich-text component as well. The third node, p, has img and a among its children, which need to be “customized”, so the p tag is rendered as a view tag and its children are iterated: the img tag is rendered as the image tag supported by the mini program, and the a tag is implemented with whatever the mini program supports.
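The walk described above can be sketched as follows. This is a minimal illustration, not mp-html's actual code: the CUSTOM_TAGS list, the plan function, and the sample tree are assumptions made for the example.

```javascript
// Tags assumed to need custom rendering (illustrative list).
const CUSTOM_TAGS = { img: true, a: true, video: true, audio: true, table: true, ul: true, ol: true };

// True if any descendant of the node needs custom rendering.
function hasCustomDescendant(node) {
  return (node.children || []).some(
    child => CUSTOM_TAGS[child.name] || hasCustomDescendant(child)
  );
}

// Returns a flat description of how each visited node would be rendered.
function plan(node, out = []) {
  if (CUSTOM_TAGS[node.name]) {
    out.push(node.name + ':custom');
  } else if (hasCustomDescendant(node)) {
    out.push(node.name + ':view');
    node.children.forEach(child => plan(child, out));
  } else {
    // The whole subtree goes to rich-text, so its children are not visited.
    out.push(node.name + ':rich-text');
  }
  return out;
}

const tree = {
  name: 'div',
  children: [
    { name: 'h1', children: [{ name: 'span' }] },
    { name: 'p', children: [{ name: 'b' }] },
    { name: 'p', children: [{ name: 'img' }, { name: 'a' }] }
  ]
};
console.log(plan(tree).join(', '));
// div:view, h1:rich-text, p:rich-text, p:view, img:custom, a:custom
```

Only six nodes are visited here even though the tree has eight, because the h1 and the first p are handed to rich-text as whole subtrees.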

With this scheme, both the number of iterations and the number of rendered tags are visibly reduced, which can significantly improve rendering efficiency.

4.2 How htmlparser2 works

Under the hood, the mp-html component parses rich text strings with htmlparser2, which is regarded as the best-performing HTML parser written in JavaScript.

It is understood that the LLParser generator wrapped by ExParser, the kernel of the mini program component system, outperforms htmlparser2, but its core algorithm is implemented in C, which cannot be used here, and it is not specifically optimized for HTML parsing. htmlparser2 is among the most popular HTML parsers in the industry, has gone through many iterations, and is written in JavaScript, which fits the business scenario of developing a rich text rendering component for mini programs.

htmlparser2 contains a lexer and a Parser class. The lexer turns the input into valid tokens, such as start tags, end tags, attribute names, and attribute values. The parser analyzes the structure of the string according to the syntax rules and builds the node tree.

The lexer processes the string as a state machine. Each state consumes one or more characters and updates the next state according to those characters. Each step is influenced by the current tokenization state and by the state of tree construction.

Parsing is an iterative process. The parser usually requests a token from the lexer and tries to match it against a grammar rule. If a rule matches, the node corresponding to the token is added to the node tree, and the parser requests the next token from the lexer.

If no rule matches yet, the parser keeps the token on a stack and continues to request tokens until a rule matches all the tokens stored internally. Put plainly, unclosed tags are pushed onto the stack, and closed tags are popped from it. If no match is ever found, the parser throws an exception, which means the document contains a syntax error; this can be handled in the corresponding hook function.
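To make the push and pop behaviour concrete, here is a toy sketch. It is not htmlparser2's code: a regular expression stands in for the state machine lexer, attributes and void elements such as img are ignored, and the function names are chosen for this example only.

```javascript
// Simplified lexer: a regex stands in for the state machine described above.
function tokenize(html) {
  const tokens = [];
  const re = /<\/([a-z0-9]+)>|<([a-z0-9]+)[^>]*>|([^<]+)/gi;
  let m;
  while ((m = re.exec(html)) !== null) {
    if (m[1]) tokens.push({ type: 'close', name: m[1].toLowerCase() });
    else if (m[2]) tokens.push({ type: 'open', name: m[2].toLowerCase() });
    else tokens.push({ type: 'text', text: m[3] });
  }
  return tokens;
}

// Parser: open tags are pushed onto the stack, matching close tags pop them.
function buildTree(tokens) {
  const root = { name: 'root', children: [] };
  const stack = [root];
  for (const token of tokens) {
    const parent = stack[stack.length - 1];
    if (token.type === 'open') {
      const node = { name: token.name, children: [] };
      parent.children.push(node);
      stack.push(node);
    } else if (token.type === 'close') {
      if (stack.length > 1 && parent.name === token.name) stack.pop();
      else throw new Error('Unexpected closing tag: ' + token.name);
    } else {
      parent.children.push({ type: 'text', text: token.text });
    }
  }
  return root.children;
}

const parsed = buildTree(tokenize('<div><p>Hi</p></div>'));
console.log(parsed[0].children[0].children[0].text); // Hi
```

At any moment the stack holds exactly the chain of unclosed ancestors of the current position, which is the property the later sections rely on.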

As we can see, the Parser class is mainly responsible for building the node tree, and while doing so it exposes a number of hook functions for different states. Under the hood, the mp-html component is a second-stage adaptation built on htmlparser2: it adds custom logic for different tags in the different hook functions to obtain the data structure it wants.

To learn more about how htmlparser2 works, check out html-parser.

4.3 The subtleties of the mp-html component

4.3.1 Reduce the number of render nodes

How do we determine whether the descendants of a node include images, audio and video, links, tables, lists, or other tags that need to be “customized”? The principle used here is that unclosed tag nodes are kept on the stack while the parser runs. That is, when the parser reaches a tag that needs to be “customized”, all the nodes on the stack are the parent and ancestor nodes of that tag. Every node on the stack is then given a marker (such as the continue attribute), which indicates that the node has a descendant that needs to be “customized”.

As shown in the figure above, when the img tag that needs to be “customized” is reached, the continue attribute of its parent and ancestor nodes, p and div, is set to true. The same applies to the a tag.

Then, when iterating through the node tree, if a node's continue attribute is false, its descendants contain no tags that need to be “customized” and it is rendered directly with the template. Otherwise, it is rendered as a view tag and iteration continues.
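The marking step can be sketched like this. It is a simplified illustration under the assumption that the parser emits plain open and close events; the event format, the CUSTOM_TAGS list, and the function name are hypothetical.

```javascript
// Tags assumed to need custom rendering (illustrative list).
const CUSTOM_TAGS = { img: true, a: true, video: true, audio: true };

// Walks a sequence of open/close events. When a custom tag opens, every
// node still on the stack (its parent and ancestors) is marked.
function markAncestors(events) {
  const stack = [];
  const seen = [];
  for (const ev of events) {
    if (ev.type === 'open') {
      if (CUSTOM_TAGS[ev.name]) {
        stack.forEach(ancestor => { ancestor.continue = true; });
      }
      const node = { name: ev.name, continue: false };
      stack.push(node);
      seen.push(node);
    } else {
      stack.pop();
    }
  }
  return seen;
}

// <div><h1></h1><p><img></img></p></div>
const marked = markAncestors([
  { type: 'open', name: 'div' },
  { type: 'open', name: 'h1' }, { type: 'close' },
  { type: 'open', name: 'p' },
  { type: 'open', name: 'img' }, { type: 'close' },
  { type: 'close' },
  { type: 'close' }
]);
console.log(marked.map(n => n.name + '=' + n.continue).join(' '));
// div=true h1=false p=true img=false
```

Because the marking happens during parsing, no second pass over the tree is needed to find out which subtrees contain custom tags.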

4.3.2 Optimize traversal hierarchy

As mentioned earlier, the continue attribute determines whether a node is rendered with the view component and iterated further. Nodes that are not iterated further are rendered either as “custom” tags or with the rich-text component, and both cases can be abstracted into a template.

We encapsulate a utility function named use that decides whether a tag is rendered as a view tag and iterated further, or rendered with the template. The decision logic is as follows:

  1. If the node has descendants that need to be “customized”, render it as a view tag and continue iterating.
  2. If the node has no child nodes or the current tag is an a tag, render it with the template as a “custom” tag.
  3. If the current node is an inline tag or has an inline style, it cannot be rendered directly with the rich-text component (the rich-text component does not support the inline style effect), so render it as a view tag and continue iterating. Otherwise, render it with the template as a rich-text component.
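The three rules above might be expressed as follows. This is only a sketch of the decision logic as described in the text, not the component's real use function; the INLINE_TAGS list is an illustrative subset, and the style string is assumed to be normalized without spaces around the colon.

```javascript
// Inline-level tags (illustrative subset).
const INLINE_TAGS = { span: true, b: true, i: true, em: true, strong: true, code: true };

// Returns true if the node should be handed to the template,
// false if it should be rendered as a view whose children are iterated.
function use(node) {
  // Rule 1: a descendant needs custom rendering.
  if (node.continue) return false;
  // Rule 2: leaf nodes and a tags go to the template as custom tags.
  if (!node.children || node.children.length === 0 || node.name === 'a') return true;
  // Rule 3: inline tags or inline display styles cannot go through rich-text.
  const style = (node.attrs && node.attrs.style) || '';
  if (INLINE_TAGS[node.name] || style.indexOf('display:inline') !== -1) return false;
  return true;
}

console.log(use({ name: 'p', children: [{ type: 'text', text: 'x' }] })); // true
console.log(use({ name: 'div', continue: true, children: [{ name: 'img' }] })); // false
```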

The template implementation logic is shown in the figure below. The img, br, a, video, audio, and similar tags, as well as text, are rendered in a custom way, and all other tags are rendered with the rich-text component.

Following the level-by-level traversal logic above, the code can be implemented as follows: use nested wx:for loops to traverse the nodes level by level, and use the use function to decide whether a node is rendered with the template; if not, render it as a view tag and traverse the next level. Note that each template is passed an i attribute, which is bound to the corresponding tag via data-i: the first level is i1, the second level is i1_i2, and so on. This attribute can be used in logic-layer event handlers: e.currentTarget.dataset.i gives the index of the node that triggered the event, and walking the node tree along that index yields the corresponding node, which reduces the data passed between the render layer and the logic layer.
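Resolving such an index back to a node is a short walk down the tree. The sketch below assumes the index is a string of child positions joined by underscores, as described above; the function name and the sample tree are hypothetical.

```javascript
// Resolves an index path such as "0_2_1" against the node tree.
function getNode(nodes, path) {
  const indexes = path.split('_').map(Number);
  let node = nodes[indexes[0]];
  for (let k = 1; k < indexes.length && node; k++) {
    node = (node.children || [])[indexes[k]];
  }
  return node;
}

const sampleNodes = [{
  name: 'div',
  children: [
    { name: 'h1' },
    { name: 'p', children: [{ name: 'img' }, { name: 'a' }] }
  ]
}];

// In an event handler the path would come from e.currentTarget.dataset.i.
console.log(getNode(sampleNodes, '0_1_1').name); // a
```

Only a short string crosses from the render layer to the logic layer, while the node data itself stays on the logic side.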

However, the depth of the parsed node tree cannot be controlled, and even if you write 20 levels of nested traversal, many nodes may still be “discarded”. To break through this depth limit, the component borrows the idea of function recursion: the rich text node rendering component is named node, and when traversal reaches the 5th level and the template still cannot be used, the component references itself (node) to render the rest.

4.3.3 Rendering untrusted tags

As mentioned earlier, when the rich-text component encounters an untrusted node, that node and all its children are removed, so part of the rich text content is lost in rendering.

So how can this be solved?

Untrusted tags can be converted in the onCloseTag hook function during the parser process, so that as much text as possible is displayed. For example, untrusted block-level tags such as address, article, aside, body, caption, center, cite, footer, header, html, nav, pre, and section are converted to div tags. All other untrusted tags are converted to inline span tags so that their text is displayed as completely as possible.
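A minimal sketch of this conversion is shown below. The block-level list is taken from the paragraph above, while the TRUSTED list is only a small illustrative subset and normalizeTag is a hypothetical helper name.

```javascript
// Tags assumed to be accepted by rich-text (illustrative subset).
const TRUSTED = { div: true, span: true, p: true, h1: true, img: true, ul: true, li: true, strong: true };

// Untrusted block-level tags listed in the text above.
const UNTRUSTED_BLOCK = {
  address: true, article: true, aside: true, body: true, caption: true,
  center: true, cite: true, footer: true, header: true, html: true,
  nav: true, pre: true, section: true
};

// Maps a tag name to one that rich-text can render.
function normalizeTag(name) {
  const tag = name.toLowerCase();
  if (TRUSTED[tag]) return tag;
  return UNTRUSTED_BLOCK[tag] ? 'div' : 'span';
}

console.log(normalizeTag('SECTION')); // div
console.log(normalizeTag('kbd'));     // span
```

Falling back to div or span loses the original semantics of the tag but keeps its text visible, which is the goal here.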

In addition, mp-html defines a number of hook functions on the prototype of the parser class, as shown in the table below. Among them, the onOpenTag and onCloseTag functions carry a lot of carefully designed logic for the tags that need to be customized. The optimization principle of each “customized” tag is explained in the sections that follow.

| Function name | Description | Main processing logic |
| --- | --- | --- |
| onTagName | Parses the tag name | Converts to lowercase to handle case insensitivity |
| onAttrName | Parses an attribute name | Handles data- attributes and converts to the corresponding attribute name |
| onAttrVal | Parses an attribute value | Decodes entities in some attributes and prepends the domain name |
| onText | Parses text | Merges whitespace, decodes entities |
| parseStyle | Parses styles | Converts rpx units, converts width, height, etc. |
| onOpenTag | Reached the start of a tag | Custom logic per tag, adds required attributes, pushes onto the stack |
| onCloseTag | Reached the end of a tag | Custom logic per tag, converts attributes, pops from the stack |

4.3.4 Style Settings

As mentioned in the previous two sections, the rich-text component does not support inline layout styles. For example, when display:inline-block is set on the top-level div inside rich-text, the inline effect cannot be achieved unless rich-text itself is given the same style. Similar cases include float, width (when set to a percentage), and so on.

The solution is to extract display, float, width, and similar styles from the top-level tag during the parser process and put them into the style of the rich-text component.

That raises a question: the style attribute is parsed as a string such as style="display:inline-block; padding:10px; color:#ff00ff;", so how do we resolve conflicts between style property names?

This is where the {key: value} data structure comes in handy. In the parseStyle hook function, the style attribute is matched in a state machine-like manner and converted to a {key: value} data structure, so that duplicate style names are overwritten; rpx units, which the rich-text component does not recognize, are converted as well.
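The idea can be sketched as below. For brevity this version splits the string on semicolons instead of running a state machine, so it would mishandle values that themselves contain a semicolon (for example a data URI); the windowWidth parameter and the stringifyStyle helper are assumptions made for the example, with 750rpx taken as the full screen width.

```javascript
// Parses an inline style string into a { key: value } map.
// windowWidth is the screen width in px; 750rpx spans the full width.
function parseStyle(styleText, windowWidth) {
  const styles = {};
  styleText.split(';').forEach(decl => {
    const pos = decl.indexOf(':');
    if (pos === -1) return;
    const key = decl.slice(0, pos).trim().toLowerCase();
    let value = decl.slice(pos + 1).trim();
    if (!key || !value) return;
    // Convert rpx to px, since rich-text does not understand rpx.
    value = value.replace(/(-?\d*\.?\d+)rpx/g, (_, n) => (parseFloat(n) * windowWidth / 750) + 'px');
    styles[key] = value; // a later declaration overwrites an earlier one
  });
  return styles;
}

// Turns the map back into a string for attrs.style.
function stringifyStyle(styles) {
  return Object.keys(styles).map(key => key + ':' + styles[key]).join(';');
}

const parsedStyle = parseStyle('display:inline-block; padding:10px; color:#ff00ff; color:red; width:375rpx', 375);
console.log(stringifyStyle(parsedStyle));
// display:inline-block;padding:10px;color:red;width:187.5px
```

Keeping styles as a map until the node is closed means that later fixes, whether from the node itself or from a descendant, simply overwrite a key instead of appending a conflicting declaration.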

So how is style enhancement for child nodes implemented?

When parsing reaches the onCloseTag hook (the tag is closed), the operation is performed on that specific node while its parent and ancestor nodes are still on the stack. When the node carries a particular style, the corresponding change is made to the style of the elements on the stack; because styles are held in the {key: value} data structure, overwriting is handled cleanly. Before a node is popped from the stack, its {key: value} data structure is converted back into a string and assigned to the node's attrs.style property.

In this way, not only is the styling problem solved, but improper styling on parent nodes can also be corrected in time. For example, in the onOpenTag hook (reached at the start of a tag), for an img tag whose style contains flex:1 and no width, the width should be set to 100% !important, and its parent/ancestor nodes should not contain inline styles.

5. Rich text rendering of the img tag

5.1 Problems

According to the mini program developer documentation, an img tag passed to the rich-text component supports only the class, style, alt, src, width, and height attributes.

Rendering img tags with the rich-text component has several problems:

  1. No interaction: neither image preview nor long-press to save is supported.
  2. No image scaling or compression. Downloading the original image slows loading and wastes bandwidth.
  3. No lazy loading. Performance suffers when there are many images.
  4. Only network images are supported. SVG and Base64 are not.
  5. Poor user experience. Nothing is displayed while the image is loading, and a broken-image icon is shown when loading fails.
  6. Poor mobile adaptation. An image wider than the logical screen width overflows the screen, and making the width adaptive easily distorts the image.

5.2 Optimization Scheme

The implementation principle is to parse the content into the nodes data structure with htmlparser2, then iterate over the nodes and render each tag; when a tag named img is encountered, a custom template is rendered.

In the parser's onOpenTag hook (reached at the start of a tag), if the tag is an img tag, the following steps are performed:

  1. At this point the stack holds the parent/ancestor nodes of the img tag. Because img needs custom rendering, traverse all parent/ancestor nodes in the stack and set their continue attribute to true. Also inspect their styles: when styles such as flex or inline-block are encountered, adjust the style of the img tag itself and fix the styles of the parent/ancestor nodes.
  2. Maintain an imgList array for tap-to-preview. Each time an img tag is reached, push its src attribute into imgList. Because the current parameter passed to wx.previewImage identifies the current image by its link, several images with the same link in one rich text can cause the wrong image to be shown. The solution is to check for duplicates before pushing into imgList and, if the link is a duplicate, randomly change the letter case of the image's domain name.
  3. Check whether the img tag has width and height attributes. When both have valid values, mode is scaleToFill (the image is scaled without keeping its aspect ratio, so that it is stretched to fill the image element); otherwise it defaults to widthFix (the width stays unchanged and the height changes automatically, keeping the original aspect ratio). In addition, because the component sets max-width as a fallback, an image whose width exceeds the screen is forced down to max-width, and scaling it under the scaleToFill rule would distort it. So when the image width exceeds the screen, drop the height and let it scale under the widthFix rule.
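Steps 2 and 3 above can be sketched as two small helpers. These are illustrative only: the function names, the windowWidth parameter, the retry limit, and the assumption that width and height are pixel values are all choices made for this example.

```javascript
// Step 3: choose the image mode from the parsed attributes.
// Assumes width and height are px values; windowWidth is the screen width in px.
function pickMode(attrs, windowWidth) {
  const width = parseFloat(attrs.width);
  const height = parseFloat(attrs.height);
  // Both sizes valid and the image fits the screen: stretch to the given box.
  if (width > 0 && height > 0 && width <= windowWidth) return 'scaleToFill';
  // Otherwise ignore the height and keep the aspect ratio.
  return 'widthFix';
}

// Step 2: add src to imgList; a duplicate link gets the letter case of its
// host name changed at random so that every entry is a distinct string.
function pushImage(imgList, src) {
  let url = src;
  for (let tries = 0; imgList.indexOf(url) !== -1 && tries < 20; tries++) {
    url = src.replace(/^(https?:\/\/)([^/]+)/i, (_, scheme, host) =>
      scheme + host.split('').map(ch => (Math.random() < 0.5 ? ch.toUpperCase() : ch.toLowerCase())).join('')
    );
  }
  imgList.push(url);
  return imgList.length - 1; // index used later for preview
}

const imgList = [];
pushImage(imgList, 'https://example.com/a.png');
pushImage(imgList, 'https://example.com/a.png');
console.log(imgList[0] !== imgList[1]); // the two entries differ only in host case
```

Host names are case-insensitive, so the altered link still points to the same image while giving wx.previewImage a distinct value for current.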

In the onCloseTag function of the parser process, the content of the svg tag is transformed: a MIME header is added to turn it into a Data URI that the WebView can recognize. The svg tag can then be rendered by the image component. For details, see “How to Display SVG in Small Programs”.
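One way to build such a Data URI is sketched below. The helper name is hypothetical, and percent-encoding the whole source with encodeURIComponent is a conservative choice made for this example rather than a description of what the component does.

```javascript
// Wraps SVG source in a Data URI so it can be used as an image src.
function svgToDataUri(svgSource) {
  return 'data:image/svg+xml;charset=utf-8,' + encodeURIComponent(svgSource);
}

const svg = '<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10"><rect width="10" height="10" fill="#f00"/></svg>';
const dataUri = svgToDataUri(svg);
// dataUri can now be assigned to the src of an image component.
```

Encoding matters because characters such as # in a colour value would otherwise be read as the start of a URL fragment.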

Since the image component of the mini program does not accept width and height attributes, the values of these two attributes are appended to the style in the parseStyle function of the parser process. The resulting nodes data structure is shown in the following table.

| Property | Sub-property | Description | Value type | Default value |
| --- | --- | --- | --- | --- |
| name | | Tag name | string | img |
| attr | id | Unlike rich-text, the tag's id attribute can be parsed | string | |
| attr | style | Style; width and height are merged into style | string | |
| attr | class | Class | string | |
| attr | src | Image link | string | |
| mode | | Image cropping and scaling mode | string | scaleToFill/widthFix |
| index | | Image index, used for tap-to-preview | number | |

Once you have this data structure, you can customize the template that renders the img tag.

5.2.1 Image processing

In general, the material images inserted by the B-side rich text editor are not processed when they are uploaded to the material library. As a result, when the mini program renders rich text, the image links referenced by img tags download the high-quality originals. When many images are downloaded, bandwidth is wasted and performance suffers.

In actual rich text presentations, there is no need to show such high-definition images. To address this, Tencent Cloud Object Storage provides image processing through the imageMogr2 interface of Cloud Infinite.

In WXS, we can wrap an image processing function imageMogr2(src, options) and use it to convert the value of the image tag's src attribute.

In that function, if the img tag has both width and height attributes, the image needs to be cropped to prevent distortion. When the actual pixel size of the image is larger than the width and height passed in, the image needs to be scaled down. If the webp attribute of the image component is true, WebP compression is enabled. Other processing can be added as well, such as stripping meta information or progressive loading of .jpg files.
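A possible shape for such a function is sketched below in plain JavaScript rather than WXS. It is not the author's code. The option names are hypothetical, the URL is assumed to have no query string, and the parameter spellings (thumbnail, gravity, crop, format) reflect my reading of the Cloud Infinite imageMogr2 documentation and should be checked against it before use.

```javascript
// Appends imageMogr2 processing parameters to an image URL.
// options.width / options.height: sizes requested by the img tag (px).
// options.webp: mirrors the webp attribute of the image component.
// Assumes src has no query string of its own.
function imageMogr2(src, options) {
  const opts = options || {};
  const ops = [];
  if (opts.width && opts.height) {
    // Both sizes given: scale to cover the box, then crop from the center.
    ops.push('thumbnail/!' + opts.width + 'x' + opts.height + 'r');
    ops.push('gravity/center');
    ops.push('crop/' + opts.width + 'x' + opts.height);
  } else if (opts.width) {
    // Only the width given: scale proportionally to that width.
    ops.push('thumbnail/' + opts.width + 'x');
  }
  if (opts.webp) ops.push('format/webp');
  if (ops.length === 0) return src;
  return src + '?imageMogr2/' + ops.join('/');
}

console.log(imageMogr2('https://a.com/x.jpg', { width: 300, webp: true }));
// https://a.com/x.jpg?imageMogr2/thumbnail/300x/format/webp
```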

5.2.2 Experience Optimization

During parsing, all image links are placed in the imgList array, and a data-index attribute is bound to each image component. When tapping an image triggers the catchtap event, the image link can be looked up by its index, and wx.previewImage is called to preview the image.

When the lazy-load attribute of the image component is set to true, lazy loading is enabled: an image starts loading only when it is about to enter a certain range (three screens above or below).

When the show-menu-by-longpress attribute of the image component is set to true, long-pressing the image shows a menu with options to send to a friend, add to favorites, save, search, and open a name card, go to a group chat, or open a mini program (if the image contains the corresponding QR code or mini program code).

If all (or most) of the rich text content is images, even a large number of them will enter the viewport at once, because an image has zero size before it loads, and lazy loading fails. Fortunately, the image tag also supports binderror and bindload event bindings, in which logic can be added to show a placeholder image while the image has not loaded and an error placeholder when loading fails.

6. Rich text rendering of the a tag

6.1 Problems

As mentioned earlier, when rich-text renders the a tag, it filters out the href attribute and supports only the class and style attributes. In other words, it renders the a tag as plain text, and the link no longer jumps anywhere.

6.2 Optimization Scheme

The data structure of the a tag produced by the parser process is as follows:

| Property | Sub-property | Description | Value type | Default value |
| --- | --- | --- | --- | --- |
| name | | Tag name | string | a |
| attr | id | Unlike rich-text, the tag's id attribute can be parsed | string | |
| attr | style | Style | string | |
| attr | class | Class | string | |
| attr | href | Jump link | string | |
| children | | Array of child nodes | array | |

In the template, if the node is an a tag, it is rendered as follows.

A view tag simulates the a tag. Two default classes, _a and _hover, simulate the original style and hover effect of the a tag. A catchtap event is bound to implement the link click; the current node index is passed through the data-i attribute, so the logic layer can obtain the node directly by traversal. Because the child nodes nested inside an a tag are unpredictable, they are rendered with the node component.

The linkTap function handles the click. The following types of jump are supported:

  1. Anchor jump. Jumping to an internal anchor is supported: set the href attribute of the a tag to #id, and a click jumps to the position of that id (set it to # to jump to the top).

  2. Jump to an internal path. To jump to a page inside the mini program when the a tag is tapped, set the href attribute to the page path; the jump is made with wx.navigateTo or wx.switchTab.

  3. Copy external links. Since mini programs cannot open external links directly, the link is automatically copied to the clipboard with wx.setClipboardData.
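The branching inside such a handler can be sketched as a pure function that classifies the href. The function name and the returned shape are hypothetical; in a real linkTap handler, each result would be followed by the corresponding scroll logic, wx.navigateTo or wx.switchTab, or wx.setClipboardData call.

```javascript
// Classifies an href into one of the three jump types described above.
function classifyLink(href) {
  if (!href) return { type: 'none' };
  // "#id" jumps to an anchor; a bare "#" means the top of the content.
  if (href[0] === '#') return { type: 'anchor', id: href.slice(1) };
  // External links cannot be opened directly, so they are copied instead.
  if (/^https?:\/\//i.test(href)) return { type: 'copy', url: href };
  // Anything else is treated as a page path inside the mini program.
  return { type: 'navigate', url: href };
}

console.log(classifyLink('#section-2').type);          // anchor
console.log(classifyLink('/pages/detail/index').type); // navigate
console.log(classifyLink('https://example.com').type); // copy
```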

In practice, anchor jumps come in two kinds: jumps within the page and jumps within a container. For a container anchor jump, the jump must be confined to the container, which is usually a scroll-view, so the following three parameters need to be passed to the mp-html rich text rendering component.

| Parameter | Type | Required | Default value | Description |
| --- | --- | --- | --- | --- |
| page | object | Yes | | Instance of the page that contains the scroll-view tag |
| selector | string | Yes | | Selector of the scroll-view tag |
| scrollTop | string | Yes | | Name of the variable bound to the scroll-top attribute of the scroll-view tag |

For example, to display rich text inside a scroll-view, write as follows:

Get the mp-html rich text rendering component instance with id="article" from the current page and call its in method with the three parameters, as shown below:

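// Get the component instance with id "article" from the current page
const ctx = this.selectComponent('#article');
// The three parameters are page, selector, and scrollTop
ctx.in(this, '#scroll', 'top');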

The anchor jump logic is shown in the figure below. If this._in has a value, the anchor jump happens inside the container; otherwise it happens in the page. The two implementations differ slightly, but the principle is the same: get the scrollTop value of the target element relative to the page or container. For a container anchor jump, change the scroll-top attribute of the scroll-view to scroll to the position; for a page anchor jump, use the wx.pageScrollTo method.

7. Optimizing other customized tags

7.1 table

When a table is wider than the phone screen, it stretches its container and makes the entire rich text content scroll horizontally, which hurts the user experience. In this case, in the onCloseTag hook (the tag is closed), wrap the table node in a container that can scroll horizontally, such as a view tag, and set its style to overflow-x:auto; padding:1px.

Since the mini program does not support the table tag, different rendering schemes can be used for different tables, as shown below:

| Display mode | Applicable when | Description |
| --- | --- | --- |
| rich-text tag | The table contains no special tags such as links or images | Best results; almost no conversion is required |
| table layout | The table contains special tags but no merged cells | Some conversion is needed: table, tr, td, and similar tags are converted to the corresponding layout |
| grid layout | The table contains special tags and uses merged cells | Complex conversion is required; merged cells have to be laid out with grid |

7.2 list

The rich-text component does not support multi-level nested lists. Predefined styles can be used to support them. For unordered lists, different levels show different markers, which can be implemented with list-style-type.

The type attribute can also be exposed so that users can choose numbers, letters, Roman numerals, or other marker forms. Setting list-style:none hides the li marker.

7.3 audio and video

With htmlparser2, the embed tag can be converted to the corresponding audio or video tag, and an id is set on each audio and video tag so that its context can be obtained. An array _source stores all available sources, and an array _videos stores all audio and video instances.

When there are multiple videos, playing them at the same time may hurt the experience. In the video's play callback, use the id from the current event to identify the current video, traverse the _videos array, and pause all the other videos.
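The pause logic can be sketched as follows. The shape of the entries (an id plus a context with a pause method) is an assumption made for this example; in a mini program the context would typically come from wx.createVideoContext, and stub objects stand in for it here.

```javascript
// Pauses every video except the one that just started playing.
// Each entry is assumed to look like { id, context }, where context has pause().
function pauseOthers(videos, playingId) {
  videos.forEach(video => {
    if (video.id !== playingId) video.context.pause();
  });
}

// Stub contexts that only record whether pause() was called.
const paused = [];
const videos = ['v0', 'v1', 'v2'].map(id => ({
  id,
  context: { pause: () => paused.push(id) }
}));

// In the play callback, the id would come from the event, e.g. e.currentTarget.id.
pauseOthers(videos, 'v1');
console.log(paused.join(',')); // v0,v2
```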

Different platforms support different playback formats, and setting only one src may cause compatibility problems and playback failure. Therefore, this component supports setting multiple sources for video and audio, just as in HTML. During traversal they are stored in the _source array and loaded in sequence until one can be played, which minimizes playback failures.

8. Conclusion

In this article, the mp-html component was adapted to business requirements, and the design ideas of the component and the rendering optimizations for some tags were described. The author of the component has also shared the article “In-depth Study and Application of the Rich text capability of small programs”, and the code is open source on GitHub. For more optimization ideas for table, list, audio, video, and other tags, consult the source code.

Writing this was not easy, so please leave a like before you go.