Front-end intelligence: technical difficulties and thoughts on image recognition for visual drafts

Article / 8, fai

What is image-to-code generation

Imgcook's image-to-code service converts an image into code for targets including Flutter and H5, and obtains the position and attributes of each element, as shown below.

What does image-to-code generation do

Image-to-code generation can be divided into the following steps:

1) Layout analysis: extract outlines and elements
2) Attribute extraction: obtain text, image, outline and other attributes
3) Layout derivation: identify repeated layouts, GridView, ListView and other layout types
4) Code generation: translate the result into the corresponding Flutter, H5 or other code
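To make the layout derivation step more concrete, here is a minimal sketch of how repeated elements might be labelled as a ListView or GridView. The `Element` structure, the `derive_layout` function, the tolerance value and the "Single" and "Free" labels are all hypothetical names chosen for illustration; the article does not describe how Imgcook actually implements this step.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """One element found by layout analysis (hypothetical structure)."""
    x: int
    y: int
    w: int
    h: int
    attrs: dict = field(default_factory=dict)  # filled in by attribute extraction


def _distinct(values, tol):
    """Count clusters of values that are more than `tol` apart."""
    ordered = sorted(values)
    return 1 + sum(1 for a, b in zip(ordered, ordered[1:]) if b - a > tol)


def derive_layout(elements, tol=2):
    """Guess a layout type from element geometry (illustrative heuristic only)."""
    if len(elements) < 2:
        return "Single"
    first = elements[0]
    same_size = all(
        abs(e.w - first.w) <= tol and abs(e.h - first.h) <= tol for e in elements
    )
    if not same_size:
        return "Free"
    cols = _distinct([e.x for e in elements], tol)
    rows = _distinct([e.y for e in elements], tol)
    if cols == 1 or rows == 1:
        return "ListView"  # a single column or a single row of repeated items
    if cols * rows == len(elements):
        return "GridView"  # a full rows x cols grid of repeated items
    return "Free"
```

For example, three equally sized cards stacked in one column would be labelled "ListView", while four equally sized cards in a 2 x 2 arrangement would be labelled "GridView".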

This article introduces some of the difficulties we met in the layout analysis module and our thinking about them.

Technical difficulties and thinking

Iteration 1: Traditional image processing

In the first version, we naturally thought of using machine vision: edge detection plus row and column projection to obtain the corresponding contours and elements.
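The idea can be sketched in a few lines. The code below is a simplified illustration and not the original implementation: it uses plain Python lists instead of an image library, a neighbour-difference threshold as a stand-in for a real edge detector, and hypothetical function names (`edge_mask`, `project_runs`, `segment`).

```python
def edge_mask(gray, threshold=30):
    """Mark pixels whose right or lower neighbour differs by more than threshold."""
    h, w = len(gray), len(gray[0])
    mask = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            right = abs(gray[y][x] - gray[y][x + 1]) if x + 1 < w else 0
            down = abs(gray[y][x] - gray[y + 1][x]) if y + 1 < h else 0
            if max(right, down) > threshold:
                mask[y][x] = 1
    return mask


def project_runs(profile):
    """Return [start, end) index ranges where a projection profile is non-zero."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment(mask):
    """Split a binary mask into boxes (x0, y0, x1, y1) by row, then column projection."""
    boxes = []
    row_profile = [sum(row) for row in mask]
    for y0, y1 in project_runs(row_profile):
        band = mask[y0:y1]
        col_profile = [sum(col) for col in zip(*band)]
        for x0, x1 in project_runs(col_profile):
            boxes.append((x0, y0, x1, y1))
    return boxes
```

Rows are projected first to find horizontal bands, and each band is then projected onto columns to split it into elements. If an element's gray level is within the threshold of its background, `edge_mask` marks nothing and `segment` returns no box for it.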

But this version has obvious problems:

1) If the foreground and background colors are similar, the element cannot be recalled
2) When elements are stacked, the stacked elements are lost

For example, the play button shown below.

Iteration 2: Introducing deep learning

In the second version we introduced deep learning to try to understand the semantics. A deep network has many layers, and each convolutional layer produces different feature maps. Fusing the features of multiple layers extracts feature information well, enabling the machine to “understand” semantic information.
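As a rough illustration of multi-layer fusion, the sketch below upsamples a coarse feature map and adds it element-wise to a finer one. This is only one common way to fuse layers and is an assumption made for illustration; the article does not say which network or fusion scheme was used, and the function names are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbour upsampling: every value becomes a 2x2 block."""
    out = []
    for row in fmap:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fuse(fine, coarse):
    """Element-wise sum of a fine map and a 2x-upsampled coarse map."""
    up = upsample2x(coarse)
    if len(up) != len(fine) or len(up[0]) != len(fine[0]):
        raise ValueError("fine map must be exactly twice the size of the coarse map")
    return [
        [f + u for f, u in zip(fine_row, up_row)]
        for fine_row, up_row in zip(fine, up)
    ]
```

The fine map keeps spatial detail while the coarse map carries more abstract information, so the fused map holds both at the finer resolution.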

Iteration 3: Fusing deep learning with traditional image processing algorithms

Object detection, whether one-stage or two-stage, suffers from inaccurate localization, as shown in the figure below. We therefore tried to integrate the methods of Iteration 1 and Iteration 2, and built the third version by combining the semantic understanding of deep learning with the high precision of traditional image processing.
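One possible way to combine the two is to let the detector propose a coarse box and then snap it to the edge pixels found nearby by traditional processing. The sketch below assumes a binary edge mask as plain Python lists and a hypothetical `refine_box` helper with an arbitrary padding value; it illustrates the idea rather than reproducing the fusion logic actually used.

```python
def refine_box(mask, box, pad=2):
    """Snap a coarse box (x0, y0, x1, y1) to the tight bounds of nearby edge pixels."""
    height, width = len(mask), len(mask[0])
    x0, y0, x1, y1 = box
    # Search window: the detector's box grown by `pad` pixels, clipped to the image.
    sx0, sy0 = max(0, x0 - pad), max(0, y0 - pad)
    sx1, sy1 = min(width, x1 + pad), min(height, y1 + pad)
    ys = [y for y in range(sy0, sy1) if any(mask[y][sx0:sx1])]
    xs = [x for x in range(sx0, sx1) if any(mask[y][x] for y in range(sy0, sy1))]
    if not xs or not ys:
        return box  # no edge evidence: keep the detector's box unchanged
    return (xs[0], ys[0], xs[-1] + 1, ys[-1] + 1)
```

When the edge mask is empty inside the window, for instance because the foreground and background colors are similar, the detector's box is returned unchanged, so the element is still recalled.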

Analysis results

By combining the semantic understanding of deep learning with the high precision of traditional image processing, layout analysis achieves high accuracy and recall. The analysis results for more than 1,000 layouts are as follows:
