By F(X) Team – 琻

Background

Layout restoration is the core of the D2C (Design to Code) pipeline. In the layout stage, imgcook uses a set of layout algorithms to convert design-draft layers into a proper layout structure, producing a developer-friendly hierarchical tree (such as a DOM structure for a web page or an XML descriptor for a native app).

Before the layout stage, the plug-in converts the entire visual draft into structured data: a flat JSON that records the absolute position, size, and style of each element. With this flat JSON we can accurately restore the entire visual draft. In everyday development, however, the relationships between components are not flat absolute positions but something more complex: containment (one element contains another or is contained by it), repeated use of the same module, different logical states of the same module, and so on (see the figures below). The layout algorithms therefore need to be upgraded to support these complex layouts.

(Loop body in the Double Eleven “Guess You Like” module)


(Polymorphism in “Double Eleven”)

Problem analysis

As the UI information architecture diagram below shows, a page is divided into six levels from top to bottom. From the design draft we can only obtain element-level information. To make the page structure generated by imgcook more consistent with how front-end developers write code, the layout algorithm needs to be able to combine elements into components, blocks, or modules.

(UI information architecture diagram)


Therefore, we use the following steps to generate a structured schema:

  • Page segmentation: split the page into submodules
  • Grouping: determine the containment relationships between components within a submodule
  • Loops: determine how identical components are tiled or stacked within a submodule
  • Polymorphism: determine the different states of the same component within a submodule
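
To make the grouping step concrete, here is a minimal sketch of how a flat list of absolutely positioned layers could be turned into a tree by containment. It is only an illustration under assumptions: the `Node` fields and the "smallest enclosing box becomes the parent" rule are hypothetical and are not taken from imgcook's implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One layer of the flat JSON (hypothetical fields): absolute position and size."""
    name: str
    x: int
    y: int
    w: int
    h: int
    children: List["Node"] = field(default_factory=list)

    def contains(self, other: "Node") -> bool:
        """True when other's box lies entirely inside this box."""
        return (self.x <= other.x and self.y <= other.y
                and other.x + other.w <= self.x + self.w
                and other.y + other.h <= self.y + self.h)


def build_tree(flat: List[Node]) -> List[Node]:
    """Attach every node to the smallest earlier node that fully contains it."""
    roots: List[Node] = []
    # Larger boxes first, so a parent is always visited before its children.
    ordered = sorted(flat, key=lambda n: n.w * n.h, reverse=True)
    for i, node in enumerate(ordered):
        parent: Optional[Node] = None
        for candidate in ordered[:i]:
            if candidate.contains(node):
                parent = candidate  # later candidates are smaller: keep the last match
        (parent.children if parent else roots).append(node)
    return roots


flat = [
    Node("title", 16, 16, 200, 24),
    Node("card", 0, 0, 375, 120),
    Node("page", 0, 0, 375, 667),
    Node("footer", 0, 600, 375, 67),
]
tree = build_tree(flat)
```

Here `tree` holds a single root, `page`, whose children are `card` and `footer`, and `title` ends up inside `card`.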

With this framework in mind, imgcook aims to make layout restoration smarter, so that the generated code is closer to what developers expect and easier for them to work with.

Technical solution

In the following sections, I’ll focus on three capabilities: page segmentation, loops, and polymorphism. As imgcook evolves, we are gradually replacing purely hand-written rules and manual intervention with more intelligent approaches (higher capability levels), so that the whole restoration pipeline scales and generalizes better. However, because the intelligent models are not 100% accurate in extreme cases, we retain the lower-level capabilities to keep the main pipeline robust.

(Capability levels of intelligence in the layout restoration stage)


Page segmentation

The first capability is page segmentation. When restoring a whole page with imgcook, we slice the page into several modules that can be maintained separately.

L1 stage, manually assisted generation: design draft module protocol

In the design draft, users can manually divide the page into specific modules through the design draft protocol, which gives them direct control over page segmentation. Adding the marker #module:Name# to a layer converts the corresponding part into a module (as shown below).
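
As a rough illustration of how such a marker could be read from a layer name, the sketch below extracts the module name with a regular expression. The pattern and the `module_name` helper are assumptions made for this example; the actual parsing inside the imgcook plug-in is not described in this article.

```python
import re
from typing import Optional

# Assumed pattern for the "#module:Name#" marker described above.
MODULE_TAG = re.compile(r"#module:([^#]+)#")


def module_name(layer_name: str) -> Optional[str]:
    """Return the module name if the layer name carries a module marker, else None."""
    match = MODULE_TAG.search(layer_name)
    return match.group(1).strip() if match else None


layers = ["#module:Header# top bar", "banner image", "#module:ItemList# feed"]
modules = {name: module_name(name) for name in layers}
```

Layers without the marker map to `None` and are left to the automatic segmentation described next.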

(Marked part)


(Generated result: the module is identified in the component tree on the left)


L2 stage, automatic rule-based generation: structured data segmentation algorithm

To reduce manual intervention and improve automatic segmentation, we use rules to match adjacent elements and decide whether they belong to the same submodule. The algorithm examines all adjacent rows and merges them into larger block structures. The effect is shown in the figure below:
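
The row-merging idea can be sketched as follows. This is a simplified stand-in, not the actual rule set: it assumes each element is an `(x, y, width, height)` box, groups vertically overlapping boxes into rows, and merges adjacent rows whose gap is at most a hypothetical `max_gap`.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height), absolute coordinates


def split_rows(boxes: List[Box]) -> List[List[Box]]:
    """Group boxes whose vertical ranges overlap into rows, ordered top to bottom."""
    rows: List[List[Box]] = []
    row_bottom = 0
    for box in sorted(boxes, key=lambda b: b[1]):
        top, bottom = box[1], box[1] + box[3]
        if rows and top < row_bottom:
            rows[-1].append(box)
            row_bottom = max(row_bottom, bottom)
        else:
            rows.append([box])
            row_bottom = bottom
    return rows


def merge_rows(rows: List[List[Box]], max_gap: int = 12) -> List[List[Box]]:
    """Merge adjacent rows into one block when the vertical gap between them is small."""
    blocks: List[List[Box]] = []
    block_bottom = 0
    for row in rows:
        top = min(b[1] for b in row)
        bottom = max(b[1] + b[3] for b in row)
        if blocks and top - block_bottom <= max_gap:
            blocks[-1].extend(row)
            block_bottom = max(block_bottom, bottom)
        else:
            blocks.append(list(row))
            block_bottom = bottom
    return blocks


boxes = [
    (0, 0, 375, 40),       # title
    (0, 48, 180, 100),     # left card
    (195, 48, 180, 100),   # right card
    (0, 200, 375, 60),     # banner, far below the cards
]
rows = split_rows(boxes)
blocks = merge_rows(rows)
```

With these boxes, the title and the two cards end up in one block and the banner in another.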


L3 stage, intelligent assisted generation: CV edge detection

However, rule-based page segmentation generalizes poorly (only the edges of text and images can be used to decide whether elements belong to the same module), and it does not give good results in practice. We therefore built a second version that uses computer vision (CV) to segment modules intelligently, comparing the design at the pixel level to improve how well recognition generalizes.


As the figure above shows, CV-based edge detection currently recognizes and generalizes better than the rules. However, it has two drawbacks:

  • Pixel-level segmentation can over-segment, splitting parts that belong together into two modules (for example, a title bar and its content section);
  • Pixel-based detection is very demanding on server performance. In our experiments, detecting one page design draft took 2 seconds in an 8-core, multi-threaded environment, which is very time-consuming for the whole restoration pipeline. We therefore plan to reduce the number of pixels through dimensionality reduction and pooling of the images, to cut the computing cost and speed up recognition.
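
To show why pooling helps, here is a small pure-Python sketch. It assumes a grayscale image stored as nested lists, downsamples it with average pooling, and then uses a deliberately simple row-brightness jump as a stand-in for real edge detection. Both functions are illustrative and are not imgcook's detector.

```python
from typing import List

Image = List[List[float]]  # grayscale pixels, row-major


def avg_pool(img: Image, k: int) -> Image:
    """Downsample by averaging each k x k tile (partial tiles at the edges are dropped)."""
    rows, cols = len(img) // k, len(img[0]) // k
    return [
        [
            sum(img[r * k + i][c * k + j] for i in range(k) for j in range(k)) / (k * k)
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def horizontal_edges(img: Image, threshold: float) -> List[int]:
    """Row indices where mean brightness jumps by more than threshold from the previous row."""
    means = [sum(row) / len(row) for row in img]
    return [r for r in range(1, len(means)) if abs(means[r] - means[r - 1]) > threshold]


# An 8 x 8 image: white upper half, black lower half.
img = [[255.0] * 8 for _ in range(4)] + [[0.0] * 8 for _ in range(4)]
pooled = avg_pool(img, 2)                    # 4 x 4, a quarter of the pixels
edges_full = horizontal_edges(img, 50.0)     # boundary at row 4 of the original
edges_pooled = horizontal_edges(pooled, 50.0)  # same boundary at row 2 after pooling
```

The boundary is still found after pooling, while the number of pixels to scan drops by a factor of k squared. The trade-off is that boundaries narrower than a tile may be blurred away.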

Loop identification

Loop layout is a particularly common pattern in interface design. Lists, navigation tabs, and carousel components all use loops. When writing code, sensible use of loops makes the code structure cleaner and can greatly improve coding efficiency. For example, we only need to implement one child component and repeat it vertically to get a complete list component (as shown below):


Loop handling is divided into three phases:

  • Identification phase
  • Marking phase
  • Generation phase

In the identification phase, we extract the loop body from the whole schema using algorithms, models, and other methods. In the marking phase, we filter the loop elements inside the loop body and label each with a serial number and the loop's unique identifier. Finally, the logic library binds the loop variables to the filtered loop body and generates the elements in batches.

This section focuses on the first of these, the identification phase.

Identification scheme



L1 stage, manually assisted generation: design draft loop protocol

As a basic annotation capability, imgcook provides a way to label nodes so that certain elements are forced to be identified as loops. Prefixing consecutive elements with the #loop# tag in the design software marks them as loop child elements (as shown below). Generating code from the annotated draft below yields a loop of five nodes.
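
A minimal sketch of how consecutive tagged layers could be collected is shown below. The `loop_runs` helper and the requirement of at least two tagged layers per run are assumptions for illustration; only the #loop# prefix itself comes from the protocol described above.

```python
from itertools import groupby
from typing import List

LOOP_TAG = "#loop#"


def loop_runs(layer_names: List[str]) -> List[List[str]]:
    """Return each run of two or more consecutive layers prefixed with the loop tag."""
    runs: List[List[str]] = []
    for tagged, group in groupby(layer_names, key=lambda n: n.startswith(LOOP_TAG)):
        items = list(group)
        if tagged and len(items) >= 2:  # assumed minimum; a single item is not a loop
            runs.append(items)
    return runs


names = ["title"] + [f"#loop#item {i}" for i in range(1, 6)] + ["footer"]
runs = loop_runs(names)
```

For this layer list the result is a single run containing the five tagged items.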



L2 stage, automatic rule-based generation: loop detection algorithm

Before we can identify loops, we first need to understand why loop layouts exist. A loop layout on the front end is largely determined by the abstract data structure behind it on the server side. In e-commerce (especially mobile shopping), most items are presented as lists or feeds, whose counterpart among abstract data structures is an array list (ArrayList). Data with the same structure inside one component is therefore also rendered as a loop on the front end. Loop layouts generally share two characteristics:


  • The elements of the loop are generally under the same parent node.
  • Loop elements generally have similar shape and style attributes.
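
These two characteristics suggest a simple heuristic, sketched below: scan the children of one parent and collect runs of consecutive siblings that look alike. The `Element` fields, the 10% size tolerance, and the minimum run length are all hypothetical choices for this example rather than imgcook's actual rules.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Element:
    width: int
    height: int
    kind: str  # hypothetical style signature, e.g. "card", "text", "image"


def similar(a: Element, b: Element, tolerance: float = 0.1) -> bool:
    """Same kind, with width and height within a relative tolerance."""
    def close(p: int, q: int) -> bool:
        return abs(p - q) <= tolerance * max(p, q)
    return a.kind == b.kind and close(a.width, b.width) and close(a.height, b.height)


def find_loops(siblings: List[Element], min_len: int = 2) -> List[List[int]]:
    """Return index runs of consecutive similar siblings under one parent."""
    loops: List[List[int]] = []
    run: List[int] = [0] if siblings else []
    for i in range(1, len(siblings)):
        if similar(siblings[i - 1], siblings[i]):
            run.append(i)
        else:
            if len(run) >= min_len:
                loops.append(run)
            run = [i]
    if len(run) >= min_len:
        loops.append(run)
    return loops


siblings = [
    Element(375, 44, "text"),
    Element(170, 220, "card"),
    Element(170, 220, "card"),
    Element(172, 218, "card"),
    Element(375, 60, "image"),
]
loops = find_loops(siblings)
```

Here the three cards at indices 1 to 3 form one loop candidate, while the text and image are left alone.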

Take, for example, the cards that are very common on marketing pages (see below). Each card has a similar layout: a square header image, a title in a large font, descriptive text, and a clearly identifiable action point.


L3 stage, intelligent assisted generation: loop recognition model

Both the L1 and L2 stages rely on hand-defined, rule-based node traversal, and a rule-based method cannot handle cases that do not conform to its rules. For this reason, D2C turns to artificial intelligence, using the feature extraction ability of deep learning to find layout features in massive amounts of data, achieve end-to-end layout recognition, and reduce or even eliminate hand-defined rules.

After some investigation, we chose the generative adversarial network, a data generation model that stands out in the AI field and is currently popular in CV. We introduce it into the intelligent layout recognition algorithm, expecting it to learn layout rules and features and to convert all elements that belong to the same layout into one style. The working principle and our practical experience are described below.

Conditional generative adversarial networks

In 2014, Ian Goodfellow et al. proposed a landmark deep learning model called the Generative Adversarial Network (GAN). As an excellent image generation algorithm, GAN has emerged as one of the most promising methods of unsupervised learning. A GAN has two components: a generator G (the generative model) and a discriminator D (the discriminative model). The generator produces synthetic data that matches the distribution of the real data, and the discriminator judges whether a sample is real or synthetic. The two compete against each other and improve together. Deep convolutional neural networks are often used for both the generator and the discriminator because they can approximate a very wide range of functions. The basic framework of GAN is as follows:



The Conditional Generative Adversarial Network (CGAN) is an evolution of GAN that generates synthetic data matching the real data distribution according to an input condition. It adds supervisory information to the original GAN. Specifically, a traditional GAN learns a mapping G: z → y from a random noise vector z to an image y. A CGAN instead learns a mapping conditioned on an image, G: (y, z) → s, where y is the conditional image and s is the composite image produced by the generator.
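
For reference, the two objectives can be written out as below. These are the standard formulations from the literature, not necessarily the exact loss used for the model in this article. The symbol x (a real sample) and t (the real label image paired with condition y) are notation introduced here for illustration; z, y, and G(y, z) = s follow the text above.

```latex
% Original GAN: D maximizes, G minimizes the same value function.
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]

% Conditional GAN: both networks also see the condition image y.
\min_G \max_D V_{c}(D, G) =
  \mathbb{E}_{y,\, t}\big[\log D(y, t)\big]
  + \mathbb{E}_{y,\, z}\big[\log\big(1 - D(y, G(y, z))\big)\big]
```

In the second objective the discriminator judges pairs, so the generator is pushed to produce an output that is plausible for the given condition rather than merely plausible on its own.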

Model training and practice

Dataset preparation: we first build a dataset in which the layouts belonging to the same group are converted to a uniform style. The corresponding label renders each group as a white block. The generated training data is shown below:

Model training:


Results: we randomly selected images from the test set for testing, and the results indicate that the model is effective.




(Recognition effect)


As the figure above shows, the model replaces each group in the image with a white area, so the layout is grouped successfully.

Representation scheme

When the layout algorithm identifies a loop, the loop body information is recorded under the smart.repeat field of the schema. This information describes which elements are in the loop body, the position of each element within the loop body, and so on.

Finally, in the logic library, imgcook maps an array onto the component using array.map. In the following example, a list of items is generated in a loop:
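
The sketch below illustrates the idea with a small Python helper that collapses loop-marked schema nodes into one array.map expression. The node shape (`componentName`, a `smart.repeat` value) and the emitted JSX are assumptions for illustration only; the real schema fields and generated code may differ.

```python
from typing import Any, Dict, List


def render_loop(children: List[Dict[str, Any]], data_name: str = "items") -> str:
    """Collapse children marked as one loop into a single array.map expression."""
    looped = [c for c in children if c.get("smart", {}).get("repeat") is not None]
    if not looped:
        return ""
    # Loop elements share one structure, so the first one serves as the template.
    tag = looped[0]["componentName"]
    return (
        "{" + data_name + ".map((item, index) => (\n"
        f"  <{tag} key={{index}} data={{item}} />\n"
        "))}"
    )


# Hypothetical schema fragment: three cards marked as members of the same loop.
children = [
    {"componentName": "ItemCard", "smart": {"repeat": 0}},
    {"componentName": "ItemCard", "smart": {"repeat": 1}},
    {"componentName": "ItemCard", "smart": {"repeat": 2}},
]
jsx = render_loop(children)
```

Three sibling nodes in the schema become one template plus a data array, which is the structure a developer would normally write by hand.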


Polymorphism identification

Besides loop layouts, multiple states are another very important part of front-end coding. An element may look and behave differently depending on its state. For example, in the item card below, the “Buy” button in the lower right corner has three states depending on availability: “Temporarily out of stock,” “Book now,” and “Order now.” The three share a similar appearance, position, and layout, but they also differ in part:

  • Display differences (color, length, background)
  • Logic differences (different code runs after a click)

Such front-end patterns are called polymorphism, and imgcook is building up its ability to recognize them.



Identification scheme

L1 stage, manually assisted generation: design draft polymorphism protocol

In addition to algorithmic recognition, we provide a manual labeling method, so that users can intervene when the algorithm fails to identify polymorphism accurately. To have elements recognized as polymorphic by the layout algorithm, bind them as multiple states of a single element by selecting “generate Element Multistates” from imgcook’s menu (or by using the shortcut Ctrl + Shift + M).


L2 stage, automatic rule-based generation: polymorphism detection algorithm

The polymorphism recognition algorithm uses logic similar to loop recognition. It runs as the last layer of the whole restoration pipeline, where it extracts the different states of the same element, unifies their styles, and restores and binds the states. The algorithm proceeds as follows:


  • Start by finding the different states of the same container and analyzing its child elements
  • Find the child elements that are close to each other and check whether they have multiple states
  • If the child elements differ in style, and a similarity calculation shows them to be similar but not identical, mark them as polymorphic
  • Finally, extract the polymorphic child elements and merge the style layouts of their different states

The algorithm is visualized in the following GIF:
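
The "similar but not identical" test in the third step can be sketched as follows. The similarity measure (the share of style keys with equal values) and the 0.5 threshold are assumptions chosen for this example; the article does not specify the actual similarity calculation.

```python
from typing import Dict, List, Tuple

Style = Dict[str, str]


def similarity(a: Style, b: Style) -> float:
    """Fraction of style keys on which the two elements agree."""
    keys = set(a) | set(b)
    if not keys:
        return 1.0
    return sum(1 for k in keys if a.get(k) == b.get(k)) / len(keys)


def polymorphic_pairs(states: List[Style], threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Index pairs that are similar (>= threshold) but not identical (< 1.0)."""
    return [
        (i, j)
        for i in range(len(states))
        for j in range(i + 1, len(states))
        if threshold <= similarity(states[i], states[j]) < 1.0
    ]


button = {"width": "120", "height": "40", "radius": "20"}
states = [
    {**button, "background": "#cccccc", "text": "Temporarily out of stock"},
    {**button, "background": "#ff9900", "text": "Book now"},
    {**button, "background": "#ff0036", "text": "Order now"},
    {"width": "80", "height": "20", "radius": "0", "background": "none", "text": "price"},
]
pairs = polymorphic_pairs(states)
```

The three buttons agree on three of five style keys, so every pair of buttons is flagged, while the price label matches none of them.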


L3 stage, intelligent assisted generation: polymorphism recognition model

We are currently moving polymorphism recognition from L2 to L3, using models to identify possible polymorphism in design drafts. The approach has three steps:


  • Take the whole design draft as image input;
  • Detect element clusters that may be polymorphic using YOLO object detection;
  • Calculate the semantic similarity of the elements within each cluster.

Representation scheme

When the layout algorithm identifies polymorphism, the polymorphism information is recorded in the smart.layerProtocol.multiStatus field of the schema. This information describes which elements belong to a polymorphic cluster and which state each element corresponds to.

Finally, in the logic library, imgcook uses the condition field to map the display criteria of each state to abstract logical data. Once the condition field is bound, you can preview the style and logic of the module in different states by switching between different data (see the figure below).
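
A minimal sketch of condition-based state selection follows. It assumes each state carries a `condition` expressed as a Python callable and that the data has `stock` and `presale` fields. These names are hypothetical and stand in for whatever expression format the logic library actually uses.

```python
from typing import Any, Dict, List

State = Dict[str, Any]


def pick_state(states: List[State], data: Dict[str, Any]) -> State:
    """Return the first state whose condition holds for the given data."""
    for state in states:
        if state["condition"](data):
            return state
    raise LookupError("no state matches the data")


# Hypothetical states of the buy button, checked in order.
buy_button_states: List[State] = [
    {"text": "Temporarily out of stock", "condition": lambda d: d["stock"] == 0},
    {"text": "Book now", "condition": lambda d: d["presale"]},
    {"text": "Order now", "condition": lambda d: True},
]

sold_out = pick_state(buy_button_states, {"stock": 0, "presale": False})
presale = pick_state(buy_button_states, {"stock": 5, "presale": True})
normal = pick_state(buy_button_states, {"stock": 5, "presale": False})
```

Switching the input data is enough to switch the rendered state, which is what makes the preview described above possible.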


Layout maintainability measures

While continuously optimizing the layout algorithm, we need a way to measure how much each optimization contributes to the whole restoration pipeline. How much do our algorithm and model optimizations help the generated code? Are these optimizations reasonable? Do they really improve R&D efficiency? We therefore developed two metrics to evaluate the accuracy of layout restoration:

  • UI restoration metric: measures whether the layout algorithm restores the design with 100% visual fidelity.
  • Layout maintainability metric: measures the soundness of the generated layout structure against the schema the user last saved.

UI restoration accuracy

The UI restoration metric uses CV to compare the original design image with the view rendered from the schema after layout restoration and DSL code generation. Visual similarity and DOM structure complexity serve as the criteria for calculating the restoration score.


However, the UI restoration metric only measures how consistent the rendered UI is with the visual draft. It cannot guarantee that the code structure is reasonable, so we need another way to measure the maintainability of the layout structure.

Layout maintainability

To measure layout maintainability, we compare the schema generated by layout restoration with the schema that users saved after editing it in the editor, and estimate the restoration quality and availability from the amount of modification users made.

A schema change falls into one of four types: node changes, position changes, style changes, and attribute changes. The last two are general capabilities with little impact on the layout, while the first two are the core capabilities of the layout algorithm. When calculating availability, the first two are therefore weighted more heavily than the last two. Overall availability is then calculated by combining the proportions of the changed parts.

We define a formula to calculate the maintainability of layout restoration:
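
As an illustration of this kind of weighting, here is a hedged sketch. The weights and the "one minus weighted change ratio" form are assumptions invented for this example; they are not the team's formula, which is not reproduced in this text.

```python
from typing import Dict

# Hypothetical weights: structural changes count more than style or attribute changes.
WEIGHTS = {"node": 0.4, "position": 0.4, "style": 0.1, "attribute": 0.1}


def availability(changed: Dict[str, int], total: Dict[str, int]) -> float:
    """One minus the weighted share of items the user changed, per change type."""
    score = 1.0
    for kind, weight in WEIGHTS.items():
        if total.get(kind, 0):
            score -= weight * changed.get(kind, 0) / total[kind]
    return score


total = {"node": 100, "position": 100, "style": 100, "attribute": 100}
changed = {"node": 10, "position": 5, "style": 50, "attribute": 20}
score = availability(changed, total)  # roughly 0.87 with these made-up numbers
```

With these weights, changing half of the styles costs about as much as changing an eighth of the nodes, reflecting the idea that structural edits say more about layout quality than cosmetic ones.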



(The layout can be modified in the view panel)


Besides field bindings, node attributes, and style changes, a large number of loop nodes were also removed in the final saved schema. These nodes were not removed by hand: once loops were identified in the layout restoration stage, the redundant nodes were deleted automatically in the subsequent business logic generation stage.

For Double 11 in 2020, modules with loop bodies accounted for 67.31% of the newly added modules, and modules identified in the layout restoration stage accounted for 43%. In the business logic generation stage, redundant loop bodies were deleted automatically and loop code was generated according to the layout recognition results.


