Exploring automatic code generation from design drafts (UI views)

1 Background

Turning design drafts (UI views) into code is daily, repetitive work for front-end engineers. The work is low in complexity but takes up a large share of their time, so making it more efficient has long been one of their goals. Previously, front-end engineers tried packaging business components into a modular, general-purpose view library and assembling business modules from it by drag-and-drop and splicing, which allowed views to be reused and lowered the R&D cost of going from design draft to code. However, as the business grew and personalization became a driver, the general-purpose view library could no longer cover every application scenario. This article proposes a solution that automatically generates code from design drafts.

There are currently two mainstream code generation approaches in the industry. The first trains a neural network to generate code directly from pictures or sketches, represented by Microsoft Sketch2JSON. The second starts from the Sketch source file, parses the layer information into a DSL, and then generates code, represented by imgCook. In practice, we found that although the neural-network-based algorithm of the first approach is simple and direct, its accuracy on complex layer layouts is low and it is hard to interpret, which makes continuous optimization difficult. In the second approach, the Sketch source file carries rich information, the algorithm is highly customizable, and there is plenty of room for optimization. We therefore surveyed the Sketch-based automatic code generation solutions in the industry (those already announced or open-sourced), found some shortcomings, and tried to address them. The comparison below covers algorithm accuracy, code readability, and R&D process coverage (the conclusions reflect applicability to our own business only; results elsewhere may differ):

Algorithm accuracy: Taobao imgCook supports AI-based component recognition but does not support group layout, and its accuracy is medium (according to its website it can recognize loop layouts, but it failed to recognize the loop layout in our test sample). 58 Picasso supports only native component recognition, generates more errors for complex components, does not support group, floating, or loop layouts, and has low accuracy.
Code readability: When Taobao imgCook generated the layout, it positioned the overlapping layers in the test sample absolutely against the root layout, which is not what RD expects, so its readability is average. Our solution uses relative positioning, so its readability is good.
Research and development process coverage: Taobao imgCook built an IDE from the RD's perspective, in which style adjustment and logic binding can be completed. Our solution takes the perspective of collaboration between product and R&D, and supports visual configuration and release of data, logic, and tracking points.

2 Solution Introduction

As shown in the figure, the configuration platform consists of three main parts: design draft to view tree (UI2DSL), view tree to code (DSL2Code), and business information binding. Each part is briefly introduced below.

Design draft to DSL view tree (UI2DSL): converts the design draft into a platform-independent DSL view tree.
View tree to code (DSL2Code): converts the DSL view tree into static MTFlexBox code based on Flex layout.
Business information binding: provides a visual configuration tool for binding the static MTFlexBox code to back-end data, business logic, and exposure/click tracking logic.

2.1 Design draft to view tree (UI2DSL)

UI2DSL goes through the following four steps:

2.1.1 Design draft import

In daily development we work with many components, such as buttons, titles, progress bars, and rating components. The Sketch data source, however, contains no components, only layers, which are the view elements designers use when designing UI views. Components map to layers one-to-many. Layers are represented in the Sketch data source by the JSON structure shown below, which describes each layer's coordinates, size, and other information. Subsequent layout generation is based on cutting these layers.

[
  {
    "class_name": "MSTextLayer",
    "font_face": "PingFangSC-Medium",
    "font_size": 13.44,
    "height": 36.5,
    "index": 8,
    "object_id": "ef55F482-A690-4ec2-8a6E-6E7D2C6a9d91",
    "opacity": 0.9000000357627869,
    "text": "400g±25g",
    "text_align": "left",
    "text_color": "#FF000000",
    "type": "text",
    "width": 171.8,
    "x": 164.2,
    "y": 726.7
  },
  //...
]
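As a rough illustration of how such layer records could be loaded for the later layout steps, here is a minimal Python sketch. The `Layer` class and `parse_layers` function are hypothetical names introduced for this example; only the JSON field names (`type`, `x`, `y`, `width`, `height`) come from the data shown above.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """One Sketch layer, reduced to the fields the layout steps need."""
    kind: str      # value of the "type" field, e.g. "text"
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float

    @property
    def right(self) -> float:
        return self.x + self.width

    @property
    def bottom(self) -> float:
        return self.y + self.height


def parse_layers(raw: str) -> list[Layer]:
    """Turn the exported JSON array into a flat list of Layer records."""
    return [
        Layer(d["type"], d["x"], d["y"], d["width"], d["height"])
        for d in json.loads(raw)
    ]


# A shortened version of the record shown above.
sample = '[{"type": "text", "x": 164.2, "y": 726.7, "width": 171.8, "height": 36.5}]'
layers = parse_layers(sample)
```

Keeping only the rectangle and the type is enough for the cutting and classification steps, which work purely on edge coordinates.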

2.1.2 Component Identification

From the data source above we can see that layers have basic types such as image, text, and rectangle. The component recognition step needs to convert layers into the component types used in daily development, such as text, image, progress bar, rating component, price component, and corner badge. At present, however, we can only recognize a layer as text or image. In the future we will integrate Taobao's open-source Pipcook framework to recognize a richer set of component types with neural network algorithms.

2.1.3 Visual intervention

The design draft is the input to automatic code generation, which places high demands on how well the draft follows design conventions. In practice, however, we found that designers use the basic shapes in Sketch (each shape ends up as a layer in the data source) to compose the visual effect of a single component. Redundant layers are therefore inevitable in design drafts, and they interfere with DSL generation. We also tried to delete redundant layers automatically, but some cases are beyond the algorithm. For example, when a text layer sits on top of an image but the same text is already rendered inside the image, the algorithm cannot decide whether the text layer should be deleted. Such layers still have to be deleted or merged manually, otherwise the DSL cannot be generated correctly. Design drafts mainly show the following kinds of problems.

Unmerged layers

The figure above shows the result of parsing a design draft. The "Meituan optimization" text at the top of the picture is covered by many red rectangles, each of which is a separate layer, whereas the algorithm expects a single layer as input. The layers therefore have to be merged into one before the algorithm runs. The three pictures on the right have a similar problem. We discussed this with the designers, who are willing to merge layers before delivering the design draft. However, because there is no detection mechanism (missed merges cannot be detected automatically), unmerged layers cannot be avoided completely.

Overlapping layer positions

In practice, we found that when text in different fonts, sizes, or colors is placed side by side in the design draft, the parsed layers often overlap. Because the DSL view tree algorithm relies on position to determine the constraints between components, overlapping positions greatly reduce the algorithm's accuracy.

Complex background layer

The red background in the image above is pieced together from two layers (the two blue rectangles): the blue layer on the left is a solid color and the blue layer on the right is a gradient. If the two layers are not merged, the code generated by the algorithm will fail.

It is hard to solve the problems above by imposing conventions on designers, so we provide visual intervention tools. Here is a brief summary of how each problem is handled:

Problem 1: unmerged layers are easy to spot with the naked eye, so users can quickly merge layers and delete redundant ones with the tool.
Problem 2: overlapping layers are hard to spot with the naked eye, so we provide a detection tool that quickly fixes the overlaps in the design draft.
Problem 3: complex backgrounds are hard to spot with the naked eye, and there is no effective detection tool yet. Users can intervene while the DSL is being generated.
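The detection tool for problem 2 is not described in detail. As a minimal sketch of what detecting overlapping layers could look like, the Python below checks every pair of layer rectangles for a shared area. The names `Rect`, `overlaps`, and `find_overlaps`, and the sample coordinates, are assumptions made for illustration.

```python
from itertools import combinations
from typing import NamedTuple


class Rect(NamedTuple):
    name: str
    x: float
    y: float
    width: float
    height: float


def overlaps(a: Rect, b: Rect) -> bool:
    """True when the two rectangles share a region of non-zero area."""
    return (a.x < b.x + b.width and b.x < a.x + a.width
            and a.y < b.y + b.height and b.y < a.y + a.height)


def find_overlaps(layers: list[Rect]) -> list[tuple[str, str]]:
    """Report every pair of layers whose rectangles overlap."""
    return [(a.name, b.name) for a, b in combinations(layers, 2) if overlaps(a, b)]


layers = [
    Rect("price", 0, 0, 50, 20),
    Rect("unit", 48, 4, 30, 16),    # starts 2 px before "price" ends
    Rect("title", 0, 30, 100, 20),
]
crossing = find_overlaps(layers)
```

Note that this check also reports one rectangle fully containing another. A real tool would probably separate containment from partial overlap, since only the latter is the problem described here.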

Visual intervention is an important part of UI2DSL. It turns a non-standard design draft into standard layer information before that information is fed to the algorithm, which greatly improves the algorithm's accuracy. This is where our approach differs from imgCook's. imgCook adds threshold-based processing to its algorithm (more intelligent, but with a higher chance of error), so its visual intervention mainly happens after generation. We instead let users intervene on the layers before the DSL is generated. Because users work with intuitive layer information, the tool is easier to use, more stable, and gives better results.

2.1.4 View tree generation

How would a human turn a flat data source into a tree-shaped DSL? First decide whether the overall layout is a row or a column, then decide the layout of each local area, and finally assemble the pieces into a view tree. This process resembles a recursive algorithm, so we use recursion as the main framework and introduce three tools: horizontal and vertical cutting, layout structure, and model evaluation.

Tool 1: horizontal and vertical cutting

The overall idea behind DSL generation is to keep breaking a large layout down into smaller ones. Here is a simplified DSL generation process, shown as an animation:

Treat part of the design draft as a subregion; at the start, the subregion is as large as the whole template. From the position and size of each layer, and from how its top, bottom, left, and right edges relate to those of the other layers, cut points can be found (such as the positions of the red arrows above). The subregion is then cut into smaller subregions at the cut points. If the cut point is horizontal, a column layout is generated; if it is vertical, a row layout is generated. The complete DSL view tree is produced by cutting subregions into ever smaller ones until only indivisible layers such as images or text remain. For the reader's convenience, the illustration shows only the splitting of row and column layouts. The real process involves other layout types and is much more complex.

One more thing to note: when there are three cut points, the illustration cuts the subregion directly into four subregions. In fact we could also cut at only one cut point, or at two. With N cut points there are actually (N factorial + 1) ways of cutting. How to choose among them is discussed under Tool 3.
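The simplified cutting process can be sketched in Python as follows. This is an illustration under stated assumptions, not the production algorithm: it always cuts at every cut point at once (as in the simplified illustration), it only produces row and column layouts, and the names `Box`, `split`, and `build` as well as the `"overlap"` placeholder type are hypothetical.

```python
from typing import NamedTuple, Union


class Box(NamedTuple):
    name: str
    x: float
    y: float
    w: float
    h: float


Tree = Union[str, dict]  # a leaf layer name, or {"type": ..., "children": [...]}


def split(boxes: list[Box], axis: str) -> list[list[Box]]:
    """Group boxes separated by empty gaps along one axis.

    axis="y" looks for horizontal cut lines, axis="x" for vertical ones.
    """
    def start(b: Box) -> float:
        return b.y if axis == "y" else b.x

    def end(b: Box) -> float:
        return b.y + b.h if axis == "y" else b.x + b.w

    groups: list[list[Box]] = []
    reach = float("-inf")              # furthest edge covered so far
    for b in sorted(boxes, key=start):
        if start(b) >= reach:          # nothing spans this position: cut here
            groups.append([])
        groups[-1].append(b)
        reach = max(reach, end(b))
    return groups


def build(boxes: list[Box]) -> Tree:
    """Recursively cut a region until only single layers remain."""
    if len(boxes) == 1:
        return boxes[0].name
    # Horizontal cuts give a column layout, vertical cuts give a row layout.
    for axis, layout in (("y", "column"), ("x", "row")):
        groups = split(boxes, axis)
        if len(groups) > 1:
            return {"type": layout, "children": [build(g) for g in groups]}
    # No clean cut exists: the layers intersect and need another layout type.
    return {"type": "overlap", "children": [b.name for b in boxes]}


tree = build([
    Box("image", 0, 0, 100, 60),
    Box("title", 0, 70, 100, 20),
    Box("price", 0, 100, 40, 20),
    Box("button", 60, 100, 40, 20),
])
```

For this sample, the region is first cut horizontally into a column of three parts, and the last part is then cut vertically into a row holding `price` and `button`.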

Tool 2: layout structure

Each layer is a rectangle, and the only information available for generating a layout structure is the rectangle's left, right, top, and bottom coordinates. We therefore classify layout structures by the positional relationship between rectangles (intersection, separation, and containment), as follows.

Note: In the generated DSL, the contain layout and the group layout are handled the same way: a stack-style layout similar to FrameLayout holds the inner elements. We nevertheless keep the classification principle (the positional relationship between rectangles) unchanged.
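The three positional relationships can be told apart from edge coordinates alone. The following Python sketch shows one way to do so; the `Rect`, `Relation`, and `relation` names are assumptions made for this example, and rectangles that merely touch along an edge are treated as separate.

```python
from enum import Enum
from typing import NamedTuple


class Rect(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


class Relation(Enum):
    SEPARATE = "separate"
    CONTAINS = "contains"
    INTERSECTS = "intersects"


def inside(inner: Rect, outer: Rect) -> bool:
    """True when inner lies entirely within outer."""
    return (outer.left <= inner.left and outer.top <= inner.top
            and inner.right <= outer.right and inner.bottom <= outer.bottom)


def relation(a: Rect, b: Rect) -> Relation:
    """Classify two layer rectangles by their positional relationship."""
    if (a.right <= b.left or b.right <= a.left
            or a.bottom <= b.top or b.bottom <= a.top):
        return Relation.SEPARATE
    if inside(a, b) or inside(b, a):
        return Relation.CONTAINS
    return Relation.INTERSECTS


side_by_side = relation(Rect(0, 0, 10, 10), Rect(20, 0, 30, 10))
nested = relation(Rect(0, 0, 100, 100), Rect(10, 10, 20, 20))
crossing = relation(Rect(0, 0, 10, 10), Rect(5, 5, 15, 15))
```

Separate rectangles lead to row or column layouts, containment leads to the contain layout, and intersection needs the further group/floating distinction discussed next.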

In the picture above, separation and containment are easy to understand. But why are there two layout structures, group and floating (hover), when two layers intersect? Look at the two design drafts above, one for group layout and one for floating layout, each marking intersecting elements A and B. Their relative positions are the same, because the rectangles of layers A and B intersect. However, the ideal DSL view trees we want are different, as shown below:

In a group layout, A and B logically form a whole, and their overlap is unavoidable. In the final DSL, A and B are placed in one stack layout that contains no other elements.
In a floating layout, A and B do not logically form a whole; they just happen to intersect, and they end up at different levels of the DSL.

Therefore, intersecting layers can produce two layout structures: group layout and floating layout. As the figure above shows, which one to use depends on the layer content, so the algorithm has to understand that content. One option is to build an AI-based sample library that remembers all corner badge styles (as described in table 4 above) and generates a group layout the next time a layer with one of those styles intersects another. Considering that an AI model is itself an abstraction of rules, we first built a set of custom recognition rules. For example, corner badges usually appear in the upper right corner, labels usually appear in the upper left corner, and avatars usually overlap horizontally or vertically. We therefore built a cross model of the positional relationship between layers, as shown in the picture below:

The cross model above remembers the positional relationships between group layout layers in historical templates. When a new pair of intersecting layers is encountered, we check whether its relationship is in the historical rule base. If so, it is treated as a group layout; otherwise, it is treated as a floating layout. The following figure shows a group rule base built from historical templates.
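The exact encoding used by the cross model is only shown in the figures. Purely as an illustration of the lookup idea, the Python sketch below invents a simple encoding: the overlay's center is placed on a 3x3 grid over the base layer, and the resulting key is looked up in a set of known group positions. The encoding, the names `zone`, `cross_key`, and `classify`, and the contents of `group_rules` are all hypothetical.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def zone(value: float, low: float, high: float, names: tuple[str, str, str]) -> str:
    """Name the third of [low, high] that value falls into (clamped at both ends)."""
    third = (high - low) / 3
    if value < low + third:
        return names[0]
    if value > high - third:
        return names[2]
    return names[1]


def cross_key(base: Rect, overlay: Rect) -> tuple[str, str]:
    """Encode where the overlay's center sits relative to the base layer."""
    cx = (overlay.left + overlay.right) / 2
    cy = (overlay.top + overlay.bottom) / 2
    return (zone(cx, base.left, base.right, ("left", "center", "right")),
            zone(cy, base.top, base.bottom, ("top", "middle", "bottom")))


# Hypothetical rule base, as if collected from historical templates:
# corner badges in the upper right, labels in the upper left.
group_rules = {("right", "top"), ("left", "top")}


def classify(base: Rect, overlay: Rect, rules: set[tuple[str, str]]) -> str:
    """Known position -> group layout, anything else -> floating layout."""
    return "group" if cross_key(base, overlay) in rules else "floating"


picture = Rect(0, 0, 90, 90)
badge = Rect(70, -5, 95, 15)      # pokes out of the upper right corner
banner = Rect(20, 80, 120, 100)   # crosses the bottom edge
```

With this made-up rule base, `classify(picture, badge, group_rules)` yields a group layout and `classify(picture, banner, group_rules)` falls back to a floating layout.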

These are the five layout types covered by our solution. So far they have been able to describe all template layouts and generate the code RD expects. Here are two examples of DSLs generated from design drafts:

Tool 3: model evaluation

In the description of horizontal and vertical cutting, all cut points were cut at the same time, but the real algorithm is more complicated. When there are three cut points, there are actually five ways of cutting, each of which produces a different DSL. Which DSL should be chosen? The model evaluation algorithm is designed to answer this question.

The current model evaluation algorithm has two indicators: the number of layout nodes and the inverse layout index.

The number of layout nodes: the fewer layout nodes the DSL has, the better the cutting method.
The inverse layout index measures how reasonable the row and column layouts in the DSL are. The larger the index, the less reasonable the layout; the smaller the index, the better the cutting method.

The following example shows how the model evaluation works for different ways of cutting the same view:

If the model evaluation algorithm measured only the number of layout nodes, the DSL generated by the first cutting method would be chosen as the final result. In fact, the second cutting method makes more sense. In the first method, "Ad" and "Book now" are placed in one column layout, but their horizontal (cross-axis) alignment differs: "Ad" is right-aligned while "Book now" is left-aligned. The inverse layout index counts the nodes whose cross-axis alignment is inconsistent, so it lets us avoid unreasonable cutting methods.
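To make the two indicators concrete, here is a minimal Python sketch. How the two indicators are combined is not stated above; this sketch assumes the inverse layout index is compared first and the node count breaks ties. It also assumes the index counts layout nodes whose leaf children disagree on cross-axis alignment. The `Node` class and both candidate trees are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                       # "row", "column" or "leaf"
    align: str = "start"            # a leaf's cross-axis alignment in its parent
    children: list["Node"] = field(default_factory=list)


def layout_node_count(node: Node) -> int:
    """Number of row/column nodes in the tree (leaves are not counted)."""
    if node.kind == "leaf":
        return 0
    return 1 + sum(layout_node_count(c) for c in node.children)


def inverse_layout_index(node: Node) -> int:
    """Number of layout nodes whose leaf children disagree on alignment."""
    if node.kind == "leaf":
        return 0
    aligns = {c.align for c in node.children if c.kind == "leaf"}
    own = 1 if len(aligns) > 1 else 0
    return own + sum(inverse_layout_index(c) for c in node.children)


def score(tree: Node) -> tuple[int, int]:
    # Assumption: consistency matters more than compactness.
    return (inverse_layout_index(tree), layout_node_count(tree))


def best(candidates: list[Node]) -> Node:
    return min(candidates, key=score)


# Cut 1: one column holding a right-aligned and a left-aligned leaf.
cut_one = Node("column", children=[Node("leaf", "end"), Node("leaf", "start")])
# Cut 2: a row of two columns, each internally consistent.
cut_two = Node("row", children=[
    Node("column", children=[Node("leaf", "start")]),
    Node("column", children=[Node("leaf", "end")]),
])
chosen = best([cut_one, cut_two])
```

Here the first candidate has fewer layout nodes but a non-zero inverse layout index, so the second candidate is chosen, mirroring the reasoning in the example above.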

2.1.5 List layout

The previous section introduced the basic layout structures, which can describe all UI layouts, but the result still differs in places from how RD would write the code.

For the layout above, RD usually would not write the same item five times. Instead, RD puts the item in a ListView-like list component so that the code is clean and easy to read. Therefore, in the DSL generation phase, in addition to the basic row/column/contain/group/hover layouts, we also need to identify whether the elements in a row or column layout form a list layout. In practice we found two types of list layout: single-state list components and multi-state list components. In the figure above, every item has the same layout structure, which we call a single-state list component. In a multi-state list component (shown in the figure below), each item has several states (selected and unselected), and the layout structure differs between states.

Identifying a single-state list component in a row or column layout only requires comparing the structure of the items' sub-view trees. If the structures are the same, it is a single-state list component. For multi-state list components we use automatic recognition plus manual intervention. The automatic recognition is coarse: as long as the child items in the row layout have similar widths and heights and are not basic components (basic components are prone to misjudgment), they are judged to be a multi-state list component. Concretely, the algorithm computes the standard deviation of the child items' widths and heights. If the standard deviation is below a threshold, the layout is judged to be a multi-state list component; otherwise it is not. The formula is as follows:
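Both checks can be sketched in a few lines of Python. This is a hedged illustration: the dictionary shape of the items, the function names, the threshold value of 2.0, and the choice of the population standard deviation (`statistics.pstdev`) are assumptions, since the original formula and threshold are not given in the text.

```python
from statistics import pstdev


def signature(node: dict) -> tuple:
    """Structure of a sub-view tree, ignoring sizes, positions and content."""
    return (node["type"], tuple(signature(c) for c in node.get("children", [])))


def is_single_state_list(items: list[dict]) -> bool:
    """All items share exactly the same sub-view tree structure."""
    return len(items) > 1 and len({signature(i) for i in items}) == 1


def is_multi_state_candidate(items: list[dict], threshold: float = 2.0) -> bool:
    """Items have near-identical width and height and are not basic components."""
    if len(items) < 2 or any(not i.get("children") for i in items):
        return False                      # basic components are skipped
    widths = [i["width"] for i in items]
    heights = [i["height"] for i in items]
    return pstdev(widths) < threshold and pstdev(heights) < threshold


def card(width: float, extra_text: bool = False) -> dict:
    """Build a made-up item; extra_text mimics a 'selected' state."""
    children = [{"type": "image"}, {"type": "text"}]
    if extra_text:
        children.append({"type": "text"})
    return {"type": "column", "width": width, "height": 80, "children": children}


same_items = [card(100), card(100), card(101)]
mixed_items = [card(100, extra_text=True), card(100), card(101)]
```

In this sample, `same_items` passes the structural comparison, while `mixed_items` fails it but still passes the size check, so it would be proposed as a multi-state list component for the user to confirm or change.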

So why is manual intervention needed? Because whether a list component is appropriate depends on the product logic, and at this point we cannot read the logic in the product documentation. We can only identify as many multi-state list components as possible and let the user change the generated result. Take the lovers-themed design draft as an example. The product may specify that every item has two states, selected and unselected. Or, from a business point of view, the "gift for a lover" item may need to be highlighted, in which case each item has only one fixed state. The two product logics call for different optimal technical solutions in code.

2.2 View Tree to Code (DSL2Code)

The DSL view tree is only an intermediate product of code generation; it still has to be converted into code. DSL2Code consists of two steps: attribute inference and attribute information adjustment.

2.2.1 Attribute inference

Attribute inference covers two kinds of attributes: style attributes and structural attributes. Style attributes include font, background color, rounded corners, and other attributes that can be read directly from the data source. Structural attributes include size, padding and margin, main-axis and cross-axis alignment, and other structural information that cannot be read directly from the data source, so inferring structural information is the focus of this part of the work.

The structural inference algorithm also uses recursion as its main framework, and completes the inference by visiting every element twice within a single recursion. As shown in the figure below, while recursively traversing the DSL nodes, every element is added to a queue in turn; after the recursion finishes, all nodes are removed from the queue in turn. This gives two traversals of all elements, which we call the in-queue traversal and the out-of-queue traversal.

The inference algorithm records the size and position of each node from the data source and, from the positional relationships, calculates each child node's expected main-axis and cross-axis alignment and its padding and margin within the parent node. During the out-of-queue traversal, the parent node determines its final main-axis and cross-axis alignment from the alignment its children expect, and adjusts its own size according to the children's stretching intention. A stretching intention means that a node's size is not fixed: it may grow or shrink horizontally or vertically depending on the content displayed. For example, a text node changes size with the length of the text displayed, and may even wrap onto a new line when the text is too long.
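The two-pass idea can be illustrated with a small Python sketch. It is deliberately narrow and rests on several assumptions: it only infers horizontal (cross-axis) alignment for children of a column, it decides the parent's alignment by majority vote, it represents a stretching intention as a boolean flag, and it drains the queue in reverse so that children are finalized before their parents. None of these details are specified above, and all names are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    kind: str                      # "column", "text" or "image"
    x: float = 0.0                 # left edge, from the data source
    width: float = 0.0
    children: list["Node"] = field(default_factory=list)
    expected_align: str = ""       # pass 1: where this node wants to sit
    align_items: str = ""          # pass 2: cross-axis alignment of a container
    stretch: bool = False          # size may change with the content


def enqueue(node: Node, parent: Optional[Node], queue: list[Node]) -> None:
    """Pass 1: record each node's expectation, then add it to the queue."""
    if parent is not None:
        left_gap = node.x - parent.x
        right_gap = (parent.x + parent.width) - (node.x + node.width)
        if abs(left_gap - right_gap) < 1:
            node.expected_align = "center"
        elif left_gap < right_gap:
            node.expected_align = "start"
        else:
            node.expected_align = "end"
    node.stretch = node.kind == "text"     # text grows with its content
    queue.append(node)
    for child in node.children:
        enqueue(child, node, queue)


def dequeue_all(queue: list[Node]) -> None:
    """Pass 2: each container settles its alignment from its children."""
    for node in reversed(queue):           # assumption: children before parents
        if not node.children:
            continue
        votes = Counter(child.expected_align for child in node.children)
        node.align_items = votes.most_common(1)[0][0]
        node.stretch = any(child.stretch for child in node.children)


root = Node("card", "column", x=0, width=100, children=[
    Node("title", "text", x=0, width=60),
    Node("price", "text", x=0, width=30),
    Node("tag", "image", x=80, width=20),
])
queue: list[Node] = []
enqueue(root, None, queue)
dequeue_all(queue)
```

In this sample two of the three children hug the left edge, so the column settles on `"start"`, and the `tag` node would need an individual override to stay right-aligned.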

2.2.2 Attribute information adjustment

Because the input is the static mock-up in the design draft, the elements in it carry no real business meaning. The same visual result can require different attributes in different business scenarios, and this cannot be inferred accurately from the input. We therefore provide a visual attribute adjustment function to assist code generation. The page is shown below; all node attributes in the DSL can be viewed and modified there.

After the business information has been added, the final automatic conversion is performed: the DSL is converted into MTFlexbox template code through syntax mapping.
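The MTFlexbox template syntax itself is not shown in this article, so the following Python sketch only illustrates the general idea of syntax mapping: each DSL node type is mapped to a tag and the tree is serialized as XML. The tag names in `TAGS`, the attribute names, and the DSL dictionary shape are hypothetical and should not be read as real MTFlexbox syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from DSL node types to template tags.
TAGS = {"row": "Container", "column": "Container", "text": "Text", "image": "Img"}


def to_element(node: dict) -> ET.Element:
    """Map one DSL node, and recursively its children, to an XML element."""
    attrs = {key: str(value) for key, value in node.get("props", {}).items()}
    if node["type"] in ("row", "column"):
        attrs["flex-direction"] = node["type"]
    element = ET.Element(TAGS[node["type"]], attrs)
    for child in node.get("children", []):
        element.append(to_element(child))
    return element


dsl = {
    "type": "column",
    "children": [
        {"type": "image", "props": {"width": 100}},
        {"type": "text", "props": {"font-size": 13}},
    ],
}
code = ET.tostring(to_element(dsl), encoding="unicode")
```

Because the mapping is a plain lookup table, supporting another target syntax would mostly mean swapping the table and the attribute naming, which is one reason to keep the DSL platform-independent.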

3 Results

Below is a screenshot from a phone showing the code generated directly from the design draft, without any modification. The design is restored well:

This is our recent exploration and practice in automatic code generation. We will go on to introduce machine learning and neural network algorithms to further optimize the process. If you have comments or suggestions, please feel free to leave a comment at the end of this article or contact us.

About the authors

Tian Bei, Shao Kuan, Fei Fei, and others are R&D engineers on the Meituan platform terminal business R&D team.

Team introduction

The Meituan platform terminal business R&D team is responsible for keeping the platform business iterating efficiently and stably, and for continuously improving user experience and R&D efficiency. The team's business mainly includes high-frequency businesses at the ten-million-DAU scale such as the Meituan homepage, basic services such as sharing, accounts, and audio/video, and support for and integration with more than 30 business parties such as food delivery and hotels. By building dynamic capabilities, the team speeds up business launches and helps product managers (PMs) quickly validate business options and make business decisions. By building a standardized architecture and service system, it improves the efficiency of communication and cooperation between front end and back end and between the platform and the business lines. Through business monitoring and experience optimization, it safeguards the success rate of core business services while improving the stability and smoothness of the Meituan App. The team's technology stack includes Android, iOS, React Native, Flexbox, and more.

This article was produced by the Meituan technical team, and the copyright belongs to Meituan. You are welcome to reprint or use its content for non-commercial purposes such as sharing and communication; please credit "Reprinted from Meituan technical team". This article may not be reproduced or used commercially without permission. For any commercial use, please send an email to [email protected] to apply for authorization.