Author: Idle Fish Technology — Jing Page

Background

Xianyu's Will Play community is a content community for sharing personal interests and everyday life. In running the community, operators often publish content in article form on the Will Play square. Previously, articles were created on a platform built mainly for marketing pages, which was not designed for the article scenario. As a result, the generated page was static: its data could not be parsed into structured form, and it could not enter the audit and distribution pipeline. Moreover, that page-building tool is not open to outsiders, so external writers could not take part in creating articles. We therefore set out to build, from scratch, a publishing tool that supports fast publishing and a highly extensible article structure.

Research

There are many similar products in the industry, such as Zhihu, Juejin, and WeChat Official Accounts. WeChat Official Account articles do well in publishing experience, reading experience, and extensibility of content structure. We summarize their characteristics as follows:

1. Rich content and forms; 2. Excellent reading experience; 3. Support for third-party content templates.

We want Xianyu Will Play articles to match WeChat Official Account articles as closely as possible in content form and reading experience. There are differences between the two, though. WeChat Official Accounts serve many different self-media publishers, and accounts in different verticals present fundamentally different kinds of articles, so WeChat places few restrictions on article style. The Xianyu community has its own tone and its own brand theme color, so its content forms will be more convergent.

Goals

The content structure of an article has a high degree of freedom: content can be arranged as any combination of headings, text paragraphs, images, and interactive components (such as voting widgets and link cards), so extensibility is essential. In addition, as the entry point for article creation, the publisher must store all of the creator's content information, including font, size, color, paragraphs, images, and so on. The performance of the display page on the client is also very important. We therefore set the following key goals for the system:

1. 100% fidelity in restoring content information; 2. An extensible content structure; 3

The overall flow of the publishing pipeline is as follows:

Solution

How do we achieve these design goals? Our solution is to express all of an article's content according to an agreed schema protocol, which records everything that needs to be displayed on the client. The publisher and the display page are connected through this protocol, so designing a general, concise article content protocol is the key to the solution. The article publisher generates the schema. After structured storage, the article display page fetches the schema, parses the protocol on the client, and renders the corresponding content. The schematic diagram of the solution is shown below; orange marks the protocol-related parts, which appear across almost the entire pipeline.

Protocol design

The protocol design objectives are as follows:

1. Simple, easy-to-understand rules. The simpler the rules, the easier it is to create and parse schema data according to the protocol.
2. Extensibility. Future article forms may be very rich, and the protocol must be able to express any form of content.
3. Structured storage. Text and image content must be extractable and stored in a structured way. An article's schema can be very large, and database fields generally have a character limit, so the schema needs to be compressed. In addition, the actual content of the article has to go through the security audit pipeline, which also requires structured content extraction.

Based on these three requirements, we customized a set of protocol rules on top of the DingTalk rich text protocol.

Protocol logic

Existing element tags

• ROOT – root element
• P – text paragraph tag
• SPAN – inline text tag
• IMG – image tag
• H2 – second-level heading tag
• Card – card tag
• idleVideoCard – video card

Existing attributes

General attributes

Image attributes

Video attributes

How to extend

We stipulate that every new plug-in added in the future is a sub-type of card with a custom card type. The card's data is placed uniformly in the metadata field. The client then builds a component for each card type and passes the card data in metadata to that component as parameters. In this way, any future plug-in can be mapped into the protocol.
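
As a minimal sketch of this extension rule, the TypeScript below models a schema node and a card registry. The field names (`tag`, `attrs`, `children`, `cardType`, `metadata`) and the helpers `registerCard` and `renderCard` are hypothetical; the article only states that a card carries a custom type and a metadata field, and that the client maps each card type to a component.

```typescript
// Hypothetical node shape; the real protocol's field names are not given in the article.
interface SchemaNode {
  tag: "ROOT" | "P" | "SPAN" | "IMG" | "H2" | "Card";
  attrs?: Record<string, string | number>;
  text?: string;
  children?: SchemaNode[];
  // Only meaningful when tag === "Card".
  cardType?: string;
  metadata?: Record<string, unknown>;
}

// A renderer turns card metadata into a client-side representation (a string here).
type CardRenderer = (metadata: Record<string, unknown>) => string;

const cardRegistry = new Map<string, CardRenderer>();

// Adding a new plug-in only means registering one more card type.
function registerCard(cardType: string, renderer: CardRenderer): void {
  cardRegistry.set(cardType, renderer);
}

function renderCard(node: SchemaNode): string {
  if (node.tag !== "Card" || !node.cardType) {
    throw new Error("renderCard expects a Card node with a cardType");
  }
  const renderer = cardRegistry.get(node.cardType);
  // Unknown card types degrade to an empty placeholder instead of breaking the page.
  return renderer ? renderer(node.metadata ?? {}) : "";
}

registerCard("idleVideoCard", (metadata) => `<video src="${String(metadata["src"])}">`);

const videoCard: SchemaNode = {
  tag: "Card",
  cardType: "idleVideoCard",
  metadata: { src: "https://example.com/demo.mp4" },
};
const rendered = renderCard(videoCard);
```

Under these assumptions, a new interactive component needs no change to the node shape: it only adds a card type and a renderer.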

Structured storage

The protocol is a JSON schema, which is awkward to store in a relational database. The text-and-image content store divides content into three fields: text, an image array, and a custom extension field. Structured information such as text and images is used for security audits and for recognition by recommendation algorithms, while the custom field stores other business information. Our first version extracted the text and image content separately and put the entire JSON string into the custom field, so that it could be saved and read back. In real scenarios, however, the custom field has a character limit. The JSON string therefore has to be converted and compressed so that it retains only the necessary style and layout information.
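
The three-field split can be sketched as follows. This is an illustration under assumptions, not the team's actual algorithm: the node shape, the `KEPT_ATTRS` whitelist, and the function names are hypothetical, and the compression shown here only drops non-whitelisted attributes and empty fields.

```typescript
// Hypothetical node shape, for illustration only.
interface SchemaNode {
  tag: string;
  attrs?: Record<string, string>;
  text?: string;
  children?: SchemaNode[];
}

interface StructuredContent {
  text: string;      // plain text for security audit and recommendation
  images: string[];  // image URLs in document order
  extension: string; // compressed schema stored in the custom field
}

// Assumed whitelist of attributes worth keeping; the real list is not given in the article.
const KEPT_ATTRS = ["src", "align", "bold", "width", "height"];

// Depth-first walk that collects text and image URLs.
function collect(node: SchemaNode, texts: string[], images: string[]): void {
  if (node.text) texts.push(node.text);
  if (node.tag === "IMG" && node.attrs?.["src"]) images.push(node.attrs["src"]);
  for (const child of node.children ?? []) collect(child, texts, images);
}

// Drop attributes outside the whitelist and omit empty fields to shorten the JSON.
function compress(node: SchemaNode): SchemaNode {
  const out: SchemaNode = { tag: node.tag };
  const attrs: Record<string, string> = {};
  for (const key of KEPT_ATTRS) {
    const value = node.attrs?.[key];
    if (value !== undefined) attrs[key] = value;
  }
  if (Object.keys(attrs).length > 0) out.attrs = attrs;
  if (node.text) out.text = node.text;
  if (node.children?.length) out.children = node.children.map((child) => compress(child));
  return out;
}

function toStructured(root: SchemaNode): StructuredContent {
  const texts: string[] = [];
  const images: string[] = [];
  collect(root, texts, images);
  return { text: texts.join("\n"), images, extension: JSON.stringify(compress(root)) };
}

const article: SchemaNode = {
  tag: "ROOT",
  children: [
    { tag: "H2", text: "Weekend camping", attrs: { editorId: "a1" } },
    { tag: "P", children: [{ tag: "SPAN", text: "Pack light.", attrs: { bold: "true", cursor: "3" } }] },
    { tag: "IMG", attrs: { src: "https://example.com/tent.png", width: "750" } },
  ],
};
const stored = toStructured(article);
```

In this example the editor-only attributes `editorId` and `cursor` are dropped from the stored string, while `bold`, `src`, and `width` survive.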

The publisher

The publisher is the article creation tool for creators. It consists of a header image, an article title, and an article body editor. Its core is a rich text editor, which generates a protocol schema that conforms to our convention and records all content information. Mainstream editors on the market include the open-source Slate.js and Facebook's Draft.js, and mature rich text tools within the group include Yuque's rich text editor and we-editor, developed by the DingTalk Docs team. As mentioned above, the protocol of the DingTalk rich text editor basically meets our requirements, so we chose we-editor directly.

A rich text editor is used in only two scenarios: writing an article by hand in the editor, or writing it elsewhere and pasting it in. The handwriting scenario is relatively easy to control, because all styles are produced by the editor's plug-ins and are therefore manageable. Pasted content, however, carries its own rich text styles, which makes the article style uncontrollable and the rich text protocol content messy, and that hurts maintenance and extension.

Handling paste scenarios

Pasted content consists of tags with inline style attributes, such as div, span, a, h1, h2, img, and video. Our approach is to clear all inline styles at paste time and to process only the tags that the protocol supports. The img and video tags need extra handling, because their src attributes are URLs. If these URLs point to off-site resources, they introduce cross-origin access and security risks for the platform. The fix is to re-host off-site resources: an internal service downloads the off-site resource and transfers it to a trusted resource server.
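
The paste cleanup can be sketched roughly as below. To stay self-contained, the sketch works on a simplified node tree (`PastedNode`) rather than real DOM elements. The tag whitelist, the trusted host name, and the choice to unwrap unsupported tags are all assumptions for illustration. The sketch only collects the off-site URLs; the download and transfer step is left to the internal service the article mentions.

```typescript
// Simplified stand-in for a parsed HTML element; a real editor would walk DOM nodes.
interface PastedNode {
  tag: string; // lower-case tag name, or "#text" for a text node
  attrs: Record<string, string>;
  text?: string;
  children: PastedNode[];
}

// Assumed whitelist; the article only says that supported tags are kept.
const ALLOWED_TAGS = ["p", "span", "h2", "img", "video", "#text"];
// Hypothetical trusted host for re-hosted media.
const TRUSTED_HOST = "img.example-cdn.com";

function hostOf(url: string): string {
  const match = /^https?:\/\/([^/?#]+)/i.exec(url);
  return match ? match[1].toLowerCase() : "";
}

// Returns cleaned nodes and records every off-site media URL that must be re-hosted.
function sanitize(node: PastedNode, offsite: string[]): PastedNode[] {
  const children: PastedNode[] = [];
  for (const child of node.children) children.push(...sanitize(child, offsite));
  // Unsupported wrappers (div, a, ...) are unwrapped: only their children are kept.
  if (ALLOWED_TAGS.indexOf(node.tag) === -1) return children;
  // Inline styles and every other attribute are dropped; media keeps only src.
  const attrs: Record<string, string> = {};
  const src = node.attrs["src"];
  if ((node.tag === "img" || node.tag === "video") && src) {
    attrs["src"] = src;
    if (hostOf(src) !== TRUSTED_HOST) offsite.push(src);
  }
  const clean: PastedNode = { tag: node.tag, attrs, children };
  if (node.text !== undefined) clean.text = node.text;
  return [clean];
}

const pasted: PastedNode = {
  tag: "div",
  attrs: { style: "color:red" },
  children: [
    {
      tag: "span",
      attrs: { style: "font-size:30px" },
      children: [{ tag: "#text", attrs: {}, text: "Hello", children: [] }],
    },
    { tag: "img", attrs: { src: "https://other-site.example/cat.png", style: "width:9999px" }, children: [] },
  ],
};
const offsiteUrls: string[] = [];
const cleaned = sanitize(pasted, offsiteUrls);
```

After this pass, `offsiteUrls` holds the media links that still need to be transferred before their `src` values are rewritten.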

Content structure extension

Content structure is extended through custom editor plug-ins, which we develop as rich text plug-ins that conform to our design specifications. We kept only the editor's basic capabilities, such as redo/undo, bold, text alignment, and adding images. The remaining capabilities, such as video and links, are implemented as custom plug-ins. By wrapping we-editor's plug-in architecture, we let developers write plug-ins the same way they write React components. The wrapper treats every extension as a kind of card: in the plug-in's schema, you specify the toolbar content, the corresponding click handler, and the style of the card to insert into the rich text, and any plug-in can then be inserted. Take a plug-in that inserts a video as an example: clicking the plug-in's toolbar button inserts a card of type idleVideoCard.
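
The card-style plug-in wrapper might look something like the sketch below. We-editor's real plug-in API is not shown in the article, so `CardPlugin`, `EditorSketch`, and their methods are hypothetical names that only illustrate the idea of "toolbar entry plus click handler plus card to insert".

```typescript
// Hypothetical card node written into the article schema.
interface CardNode {
  tag: "Card";
  cardType: string;
  metadata: Record<string, unknown>;
}

// Hypothetical plug-in descriptor; we-editor's actual plug-in API is not shown in the article.
interface CardPlugin {
  cardType: string;
  toolbar: { title: string };
  // Runs on toolbar click; turns the user's input into the card's metadata.
  onClick: (input: Record<string, string>) => Record<string, unknown>;
}

class EditorSketch {
  readonly nodes: CardNode[] = [];
  private plugins = new Map<string, CardPlugin>();

  use(plugin: CardPlugin): void {
    this.plugins.set(plugin.cardType, plugin);
  }

  // Titles shown in the toolbar, one button per registered plug-in.
  toolbarTitles(): string[] {
    return Array.from(this.plugins.values()).map((p) => p.toolbar.title);
  }

  // Simulates clicking a plug-in's toolbar button with the given user input.
  click(cardType: string, input: Record<string, string>): CardNode {
    const plugin = this.plugins.get(cardType);
    if (!plugin) throw new Error(`No plug-in registered for ${cardType}`);
    const node: CardNode = { tag: "Card", cardType, metadata: plugin.onClick(input) };
    this.nodes.push(node);
    return node;
  }
}

const editor = new EditorSketch();
editor.use({
  cardType: "idleVideoCard",
  toolbar: { title: "Insert video" },
  onClick: (input) => ({ src: input["src"], cover: input["cover"] }),
});
const inserted = editor.click("idleVideoCard", {
  src: "https://example.com/demo.mp4",
  cover: "https://example.com/cover.png",
});
```

With this shape, the editor core never needs to know what a video is; it only stores a card with a type and metadata.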

Function service layer

In the publisher's publishing pipeline, we designed a FaaS function service layer, for the following reasons.

1. Security. The structured information extraction algorithm must run on the server side; otherwise, callers could hit the interface directly and bypass the security pipeline.
2. Extraction and restoration are both implemented in JS, which keeps the rule set on a single technology stack.

Below is a schematic diagram of the data flow:

Article display

The principle of article display is to restore the processed schema to the full schema according to the protocol rules, and then convert the parsed information into the corresponding visual components. This section focuses on protocol parsing and performance tuning.

Protocol parsing

In theory, as long as the rich text information expressed by the schema protocol can be parsed correctly, it can be rendered on the client according to any design specification. This is also the foundation for building a unified article publishing tool for the whole group in the future: as long as the protocol stays the same, the presentation on each client can differ considerably. The pseudocode of the restoration function is as follows: ![code5.PNG](img.alicdn.com/imgextra/i4…)
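
Since the original pseudocode image is not available here, the following is only a sketch of the tag-to-component mapping step, not a reconstruction of the team's code. The node shape and the `renderers` table are hypothetical, and HTML strings stand in for real client components.

```typescript
// Hypothetical node shape, for illustration only.
interface SchemaNode {
  tag: string;
  attrs?: Record<string, string>;
  text?: string;
  children?: SchemaNode[];
}

// A renderer receives the node and its already-rendered inner content.
type Renderer = (node: SchemaNode, inner: string) => string;

function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Swapping this table changes the presentation without touching the protocol.
const renderers = new Map<string, Renderer>([
  ["ROOT", (_n, inner) => `<article>${inner}</article>`],
  ["H2", (_n, inner) => `<h2>${inner}</h2>`],
  ["P", (_n, inner) => `<p>${inner}</p>`],
  ["SPAN", (n, inner) => (n.attrs?.["bold"] === "true" ? `<b>${inner}</b>` : `<span>${inner}</span>`)],
  ["IMG", (n) => `<img src="${escapeHtml(n.attrs?.["src"] ?? "")}">`],
]);

function render(node: SchemaNode): string {
  const inner =
    escapeHtml(node.text ?? "") + (node.children ?? []).map((child) => render(child)).join("");
  const renderer = renderers.get(node.tag);
  // Unknown tags fall back to their inner content so new tags never break old clients.
  return renderer ? renderer(node, inner) : inner;
}

const html = render({
  tag: "ROOT",
  children: [
    { tag: "H2", text: "Title" },
    { tag: "P", children: [{ tag: "SPAN", text: "Hi <3", attrs: { bold: "true" } }] },
    { tag: "IMG", attrs: { src: "https://example.com/a.png" } },
  ],
});
// html === '<article><h2>Title</h2><p><b>Hi &lt;3</b></p><img src="https://example.com/a.png"></article>'
```

A different client could register another renderer table against the same schema, which is the property the paragraph above relies on.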

Experience optimization

Front-end performance optimization is an old topic, and because space is limited, we cover only the major optimizations on the details page. In terms of measured results, first-screen rendering time across clients dropped from 1700 ms to about 1000 ms, so the page opens in about a second. First, a comparison of the effect before and after optimization:

Before going into the specific measures, let's look at how an H5 page loads in a WebView:

In this pipeline, the most time-consuming part is the I/O for various resources: the page document, style files, JS files, images, and data interface requests. The second most time-consuming part is WebView startup. Our optimization therefore centers on two ideas: reducing I/O and moving I/O earlier.

1. Resource combo

Besides the business JS file, the page also loads jstracker resources, Rax framework resources, security-related JS, and so on. Combining these into a single resource request eliminates many request round trips and thus reduces first-screen rendering time.

2. Lazy loading of images

Articles usually contain many images, most of which are not on the first screen. Images that are off-screen therefore do not need to be loaded up front; the image resource is requested only when the user scrolls down and the image enters the screen.
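
The core decision in lazy loading can be sketched as a pure function. In a browser, the standard `IntersectionObserver` API or the `loading="lazy"` image attribute would normally do this work; the `LazyImage` shape, the `imagesToLoad` helper, and the 200-pixel preload margin below are assumptions made for illustration.

```typescript
interface LazyImage {
  src: string;
  top: number;    // distance from the top of the document, in pixels
  height: number;
  loaded: boolean;
}

// Returns the not-yet-loaded images that intersect the viewport, extended by a preload margin.
function imagesToLoad(
  images: LazyImage[],
  scrollTop: number,
  viewportHeight: number,
  preloadMargin = 200,
): LazyImage[] {
  const visibleTop = scrollTop - preloadMargin;
  const visibleBottom = scrollTop + viewportHeight + preloadMargin;
  return images.filter(
    (img) => !img.loaded && img.top < visibleBottom && img.top + img.height > visibleTop,
  );
}

const images: LazyImage[] = [
  { src: "a.png", top: 100, height: 300, loaded: false },
  { src: "b.png", top: 900, height: 300, loaded: false },
  { src: "c.png", top: 2400, height: 300, loaded: false },
];

// First screen: an 800px viewport at the top of the page requests a.png and b.png only.
const firstBatch = imagesToLoad(images, 0, 800);
for (const img of firstBatch) img.loaded = true;

// After the user scrolls to 1500px, c.png comes within range and is requested.
const secondBatch = imagesToLoad(images, 1500, 800);
```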

3. Local resource cache

Document and JS resources generally stay unchanged for a long time. If the client downloads these resources in advance while it is idle, then when the page requests them, the client finds a local resource with the same name, intercepts the request, and returns the locally cached copy, which can greatly reduce first-screen rendering time.
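
The interception logic lives in the native client, so the TypeScript below is only a language-neutral sketch of the lookup. The `OfflinePackage` map, the `interceptRequest` helper, the synchronous `fetchFromNetwork` callback, and the choice to ignore the query string when matching names are all assumptions.

```typescript
// Hypothetical offline package: resource URL -> body downloaded while the client was idle.
type OfflinePackage = Map<string, string>;

interface ResourceResponse {
  source: "local" | "network";
  body: string;
}

// Assumption: resources are matched by URL without query string or hash.
function cacheKey(url: string): string {
  return url.split(/[?#]/)[0];
}

// Serves the local copy when one exists; otherwise falls through to the network.
function interceptRequest(
  url: string,
  offlinePackage: OfflinePackage,
  fetchFromNetwork: (url: string) => string,
): ResourceResponse {
  const cached = offlinePackage.get(cacheKey(url));
  if (cached !== undefined) return { source: "local", body: cached };
  return { source: "network", body: fetchFromNetwork(url) };
}

const offlinePackage: OfflinePackage = new Map([
  ["https://example.com/app.js", "/* cached app.js */"],
]);
let networkRequests = 0;
const fetchFromNetwork = (url: string): string => {
  networkRequests += 1;
  return `/* downloaded ${url} */`;
};

const hit = interceptRequest("https://example.com/app.js?v=3", offlinePackage, fetchFromNetwork);
const miss = interceptRequest("https://example.com/other.js", offlinePackage, fetchFromNetwork);
```

Only the second request reaches the network in this example; the first is answered from the local package.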

4. Data prefetch

First-screen rendering time depends in part on how fast the first interface call returns, and that request is normally not sent until the JS logic triggers it. If the first-screen request has known parameters, can we send it earlier? Our data prefetch scheme carries the interface parameters needed for the first screen in the URL of the page the user taps. After reading these parameters, the client requests the data asynchronously and caches the result on the client. When the JS logic later needs to send the request, it checks whether the same request has already been made. If so, it directly returns the interface data cached on the client.
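
A minimal sketch of the prefetch cache follows. The `prefetch` query parameter name, the `PrefetchCache` class, and the one-shot reuse policy are assumptions; the article does not specify how parameters are encoded in the page URL or how long prefetched data stays valid.

```typescript
type Fetcher = (params: string) => Promise<string>;

class PrefetchCache {
  private pending = new Map<string, Promise<string>>();
  private fetcher: Fetcher;

  constructor(fetcher: Fetcher) {
    this.fetcher = fetcher;
  }

  // Client side: called as soon as the page URL is known, before any page JS has run.
  prefetchFromPageUrl(pageUrl: string): void {
    // Assumption: first-screen parameters travel in a "prefetch" query parameter.
    const match = /[?&]prefetch=([^&#]+)/.exec(pageUrl);
    if (!match) return;
    const params = decodeURIComponent(match[1]);
    this.pending.set(params, this.fetcher(params));
  }

  // Page side: reuse the in-flight or finished prefetch when the parameters match.
  request(params: string): Promise<string> {
    const hit = this.pending.get(params);
    if (hit) {
      // One-shot reuse (a design choice of this sketch) so later refreshes get fresh data.
      this.pending.delete(params);
      return hit;
    }
    return this.fetcher(params);
  }
}

let networkCalls = 0;
const cache = new PrefetchCache((params) => {
  networkCalls += 1;
  return Promise.resolve(`data for ${params}`);
});

cache.prefetchFromPageUrl("https://example.com/article?prefetch=itemId%3D42");
// Served by the prefetch started above, so no second network call is made here.
const firstScreen = cache.request("itemId=42");
```

Storing the promise rather than the resolved data means the page can reuse a prefetch that is still in flight.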

5. Remove the client loading indicator

Client loading is a built-in WebView capability that indicates the current page is still loading. However, even while resources are still loading, the first-screen content may already be rendered. Removing the client loading indicator makes the page feel faster to users, although it does not actually speed up the loading process. The optimized loading process is shown below:

Performance optimization never ends, so we will keep optimizing the article display page. Options under consideration include starting the WebView container in advance, ESR, and NSR.

Outlook

So far we have built the article publishing tool from 0 to 1, and much remains to be done. On top of this highly extensible protocol, we can add richer article content forms and interactive features, such as voting widgets, bullet comments, and article templates. Eventually, we can distill an open template and plug-in development system based on the current protocol to fit different article content systems. Performance and experience will also continue to be optimized to a high standard.

At a higher level, articles are just one carrier of Will Play community content, and we want a creator-side tool with more creative capabilities. The author publishing center on PC and the one on mobile complement each other. Ordinary creators tend to post on the spot, casually and at any time, while PGC creators who pursue higher content quality care more about professional metrics such as content quality, publishing efficiency, and content consumption data. Therefore, after completing the basic publishing capability, we will gradually improve the whole creator publishing pipeline into a complete professional creator publishing tool that integrates ordinary text-and-image creation, video content creation, article creation, content management, a data center, and hot content discovery.