Hi ~ I'm Yefeng (@malpor), a front-end apprentice. Today I'm bringing you a hands-on front-end intelligence tutorial that teaches you, step by step, how to use machine learning to build an icon recognition tool that runs entirely in the front end. Complete code is attached, so come experience the charm of front-end intelligence ~
Background
Our current front-end component library uses Iconfont to manage icons. As time goes on, there are more and more icons, named in all sorts of ways that are hard to constrain. Developers often have to sift through hundreds of icons to find the right one; sometimes even the designers can't find one, which leads to duplicated icons.
Recently I noticed that the AntDesign official website has a search-icons-by-image feature. Users can click/drag/paste to upload a screenshot of an icon from a design draft, or any other picture, and search for the icons with the highest matching degree: AntDesign Icon, and the feature developer's article.
This function solves the above problems well, but it still has some shortcomings:
- Screenshots must be square; otherwise the image gets stretched and the recognition rate drops (explained later).
- Only AntDesign icons are recognized.
To solve these problems, we decided to build our own front-end icon recognition tool. The following takes Cloud Design, our team's open source component library, as an example and teaches you how to build your own pure front-end icon recognition tool. (Complete code at the end of the article.)
Terminology
A quick introduction to a few terms; skip ahead if you already know them.
Machine learning
Machine learning studies and builds special algorithms (rather than one specific algorithm) that allow computers to make predictions by learning from data on their own.
Machine learning is therefore not a specific algorithm but an umbrella term for many algorithms.
It covers linear regression, Bayesian methods, clustering, decision trees, deep learning, and more. The AntDesign model mentioned above was trained with CNN, a representative deep learning algorithm.
CNN (convolutional neural network)
A Convolutional Neural Network (CNN) is a kind of feedforward neural network with convolutional computation and a deep structure, most commonly used to analyze visual images.
A CNN can reduce a large image to a much smaller amount of data while retaining its features, which makes it very well suited to image processing. It can still recognize an image that has been flipped, rotated, or shifted, and is commonly used for image classification and retrieval, object localization and detection, face recognition, and so on.
Let’s get started
What we need is icon recognition, a classic "image classification" problem in machine learning. A CNN can recognize icons effectively, but it does not cope well with stretched or deformed input: the model input must first be resized to a square, so a screenshot with the wrong aspect ratio gets stretched and deformed, which lowers the recognition rate and can even cause misrecognition.
There are two common solutions:
1. Pure machine learning: make the model tolerate deformed images by adding samples in various stretched states.
2. Machine learning + image processing: use an image processing algorithm to crop the input so that it is close to square.
The first method needs a huge amount of training data, trains slowly, and can never enumerate every possible stretch deformation. The second only needs some simple image processing to noticeably improve the recognition rate, so I chose it. The final workflow looks like this: input image -> crop to the smallest square containing the icon -> scale to the model's input size -> CNN classification -> best-matching icons.
Next, I will walk through the complete process in three parts: sample generation, model training, and model use.
Sample generation
The training samples for image classification are all pictures, while our icons are rendered on the page by Iconfont. The natural idea is a sample page plus Puppeteer screenshots, but screenshots were slow and I didn't want to use FaaS, so I came up with a fully local generation scheme:
First, manually convert the CSS part of the icon library to JS:
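Iconfont ships the glyphs as an @font-face plus one CSS class per icon; converted to JS, that boils down to a name-to-code-point map and a FontFace registration. A minimal sketch, where the icon names, code points, and font URL are illustrative assumptions rather than the real Cloud Design values:
// A sketch of the converted CSS (names, code points, and URL are assumptions)
const labelMap = {
  smile: '\ue615', // hypothetical code point
  cry: '\ue616',
  // ...the rest of the icon library
};
// Load the icon font so the canvas can render icons as text
const iconFont = new FontFace('NextIcon', 'url(/fonts/next-icon.woff2)');
iconFont.load().then((font) => document.fonts.add(font));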
This allows you to draw the icon as text on the canvas and crop the surrounding white space using an image algorithm:
// Draw the icon with an off-screen canvas
offscreenCtx.font = `20px NextIcon`;
offscreenCtx.textBaseline = 'top'; // baseline and coordinates assumed; the original snippet omitted them
offscreenCtx.fillText(labelMap[labelName], 0, 0);
// Use getImageData to read the pixel data and compute the crop coordinates
const { x, y, width: w, height: h } = getCutPosition(
  canvasSize,
  canvasSize,
  offscreenCtx.getImageData(0, 0, canvasSize, canvasSize).data
);
// Compute the coordinates to crop to
function getCutPosition(width, height, imgData) {
  let lOffset = width;
  let rOffset = 0;
  let tOffset = height;
  let bOffset = 0;
  // Walk the pixels to find the smallest non-blank rectangle
  for (let i = 0; i < width; i++) {
    for (let j = 0; j < height; j++) {
      const pos = (i + width * j) * 4;
      if (notEmpty(imgData[pos], imgData[pos + 1], imgData[pos + 2], imgData[pos + 3])) {
        // Adjust lOffset, rOffset, tOffset, bOffset
        // ...omitted
      }
    }
  }
  // If the rectangle is not square, expand it into a square
  const r = (rOffset - lOffset) / (bOffset - tOffset);
  if (r !== 1) {
    // ...omitted
  }
  return { x: lOffset, y: tOffset, width: rOffset - lOffset, height: bOffset - tOffset };
}
// Threshold in the 0-255 range
const d = 5;
// Whether a pixel is non-blank
function notEmpty(r, g, b, a) {
  return r < 255 - d && g < 255 - d && b < 255 - d;
}

// Crop & scale the image with canvas and export it as base64
ctx.drawImage(offscreenCanvas, x, y, w, h, 0, 0, 96, 96);
canvas.toDataURL('image/jpeg');
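If you want to fill in the two ...omitted spots above, here is one plausible implementation; it is a sketch of reasonable bookkeeping, not necessarily the exact original code:
// Inside the pixel loop: grow the rectangle to cover each non-blank pixel (assumed)
if (i < lOffset) lOffset = i;
if (i > rOffset) rOffset = i;
if (j < tOffset) tOffset = j;
if (j > bOffset) bOffset = j;

// After the loop: pad the shorter side symmetrically so the box becomes square (assumed)
const boxW = rOffset - lOffset;
const boxH = bOffset - tOffset;
const pad = Math.abs(boxW - boxH) / 2;
if (boxW > boxH) {
  tOffset = Math.max(0, tOffset - pad);
  bOffset = Math.min(height, bOffset + pad);
} else if (boxH > boxW) {
  lOffset = Math.max(0, lOffset - pad);
  rOffset = Math.min(width, rOffset + pad);
}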
That completes the logic for generating a single image. To get the full sample set, modify it to iterate over the different icons and font sizes:
const fontStep = 1;
const fontSize = [20, 96];
labels.forEach((labelName) => {
  // Draw the icon at every size
  for (let i = fontSize[0]; i <= fontSize[1]; i += fontStep) {
    // ...same setup as before
    offscreenCtx.font = `${i}px NextIcon`;
    // ...other logic
  }
});
Download the data as a JSON via Blob:
const resultData = /* the full generated data */;
const aLink = document.createElement('a');
const blob = new Blob([JSON.stringify(resultData, null, 2)], { type: 'application/json' });
aLink.download = 'icon.json';
aLink.href = URL.createObjectURL(blob);
aLink.click();
The result is one large JSON containing tens of thousands of sample images (350 icons, about 70 images per class), which looks something like this:
[{"name": "smile"."data": [{"url": "data:image/jpeg; base64,/9j/4AA... IkB//9k="."size": 20
},
{
"url": "data:image/jpeg; base64,/9j/4AA... JAf//Z"."size": 21},... ] },]Copy the code
Finally, write a simple Node program to split the samples of each class into image files: 70% training set, 20% validation set, and 10% test set (a sketch of the script follows the directory listing below).
--- train
|   |-- smile
|   |   |-- smile_3.jpg
|   |   |-- smile_7.jpg
|   |-- cry
|   |   |-- cry_2.jpg
|   |   |-- cry_8.jpg
|   ...
--- validation
|   |-- smile
|   |-- cry
|   ...
--- test
|   |-- smile
|   |-- cry
|   ...
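A minimal sketch of that split script, assuming the icon.json produced above and the file naming shown in the listing; the simple position-based bucketing rule is my own choice:
// split.js: read icon.json and write each sample into
// train/validation/test folders at a 70/20/10 ratio (assumed helper, not the original)
const fs = require('fs');
const path = require('path');

const samples = JSON.parse(fs.readFileSync('icon.json', 'utf8'));

// Assign a sample to a set based on its position within its class
function bucket(index, total) {
  const r = index / total;
  if (r < 0.7) return 'train';
  if (r < 0.9) return 'validation';
  return 'test';
}

samples.forEach(({ name, data }) => {
  data.forEach(({ url, size }, i) => {
    const dir = path.join(bucket(i, data.length), name);
    fs.mkdirSync(dir, { recursive: true });
    // Strip the data-URL prefix and decode the base64 payload
    const base64 = url.split(',')[1];
    fs.writeFileSync(path.join(dir, `${name}_${size}.jpg`), Buffer.from(base64, 'base64'));
  });
});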
With that we have a complete set of training samples, and generation is fast: a full run takes only about one minute. Then pack the three directories together into one zip file, because the training step that follows only supports the zip format.
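For example, from the directory containing the three folders (the archive name is up to you):
$ zip -r icon-dataset.zip train validation test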
Model training
There are a variety of machine learning tools out there; as a front-end developer, I ended up using Pipcook for training.
The Pipcook project is an open source toolset that enables Web developers to better use machine learning to unlock and accelerate the era of front-end intelligence!
Pipcook currently supports only Mac and Linux, not Windows (on Windows you can use Tensorflow.js directly).
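If you don't have Pipcook yet, the CLI is installed from npm; the package name below is the one given in the Pipcook docs at the time of writing:
$ npm install -g @pipcook/pipcook-cli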
Write a configuration file for Pipcook:
{
  "plugins": {
    "dataCollect": {
      "package": "@pipcook/plugins-image-classification-data-collect",
      "params": {
        "url": "file:///absolute/path/to/the/zip/packaged/above"
      }
    },
    "dataAccess": {
      "package": "@pipcook/plugins-pascalvoc-data-access"
    },
    "dataProcess": {
      "package": "@pipcook/plugins-tfjs-image-classification-process",
      "params": {
        "resize": [224, 224]
      }
    },
    "modelDefine": {
      "package": "@pipcook/plugins-tfjs-mobilenet-model-define",
      "params": {}
    },
    "modelTrain": {
      "package": "@pipcook/plugins-image-classification-tfjs-model-train",
      "params": {
        "batchSize": 64,
        "epochs": 12
      }
    },
    "modelEvaluate": {
      "package": "@pipcook/plugins-image-classification-tfjs-model-evaluate"
    }
  }
}
Start training with Pipcook's accompanying CLI tool:
$ pipcook run config.json  # the configuration JSON written above
Seeing the words Epochs and Iteration indicates that the training has started successfully.
...
ℹ [job] running modelTrain start
ℹ start loading plugin @pipcook/plugins-image-classification-tfjs-model-train
ℹ @pipcook/plugins-image-classification-tfjs-model-train plugin is loaded
ℹ Epoch 0/12 start
ℹ Iteration 0/303 result -- loss: 5.969481468200684 accuracy: 0
ℹ Iteration 30/303 result -- loss: 5.65574312210083 accuracy: 0.015625
ℹ Iteration 60/303 result -- loss: 5.293442726135254 accuracy: 0.0625
ℹ Iteration 90/303 result -- loss: 4.970404624938965 accuracy: 0.03125
...
Training with the above parameters took about two hours on my Mac, and it occupies the CPU the whole time, so pick a free stretch of time to train. (A running job can be stopped with pipcook job stop.)
The duration of training depends on the amount of sample data, epochs, and batchSize.
/* =============== two hours later... =============== */
When training is complete, you can see the final loss (the lower the better) and accuracy (the higher the better):
...
ℹ [job] running modelEvaluate start
ℹ start loading plugin @pipcook/plugins-image-classification-tfjs-model-evaluate
ℹ @pipcook/plugins-image-classification-tfjs-model-evaluate plugin is loaded
ℹ Evaluate Result: loss: 0.05339580587460659 accuracy: 0.9850694444444444
...
If the loss is greater than 0.2 or the accuracy is lower than 0.8, the training result is not good enough; adjust the parameters or the samples, then retrain.
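A simple first knob to turn is training longer: raise epochs in the modelTrain params of the configuration above and rerun (the value here is just an example):
"modelTrain": {
  "package": "@pipcook/plugins-image-classification-tfjs-model-train",
  "params": { "batchSize": 64, "epochs": 20 }
}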
Pipcook also creates an output folder next to the configuration JSON, which contains the model we need:
output
|-- logs            # training logs
|-- model           # model folder; the two files inside are the final product we need
|   |-- weights.bin
|   |-- model.json
|-- metadata.json   # meta information
|-- package.json    # project information
|-- index.js        # default entry file
|-- boapkg.js       # auxiliary file
Model use
Because the underlying Pipcook plugins call Tensorflow.js for training, the model can run directly in a front-end page.
Let's start by hosting model.json and weights.bin in the same directory. Then take the label array labelArray out of the output.dataset field of metadata.json:
// This order is generated randomly at training time; it is NOT the order the samples were generated in, so don't mix them up
const labelArray = ['col-before', 'h1', 'solidDown', 'add-test' /* ... */];
All that is left is to write a bit of Tensorflow.js code to do the recognition:
import * as tf from '@tensorflow/tfjs';

const modelUrl = 'the URL where model.json is hosted';
// Load the model
const model = await tf.loadLayersModel(modelUrl);

// Crop the input image
const { x, y, width: w, height: h } = getCutPosition(imgW, imgH, offscreenCtx.getImageData(0, 0, imgW, imgH).data, 'white');
ctx.drawImage(offscreenCanvas, x, y, w, h, 0, 0, cutSize, cutSize);

// Convert the image to a tensor
const imgTensor = tf.image
  .resizeBilinear(tf.browser.fromPixels(canvas), [224, 224])
  .reshape([1, 224, 224, 3]);

// Run the model
const pred = model.predict(imgTensor).arraySync()[0];

// Take the 5 labels with the highest scores
const result = pred
  .map((score, i) => ({ score, label: labelArray[i] }))
  .sort((a, b) => b.score - a.score)
  .slice(0, 5);
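To wire the recognition into a page, all that is missing is getting the user's screenshot onto a canvas. A minimal sketch using a file input, where the element id and the recognize() wrapper around the crop-and-predict code above are assumptions:
// Assumed wiring: read an uploaded screenshot into a canvas, then run the
// crop + predict pipeline above (wrapped here as a hypothetical recognize())
const input = document.querySelector('#icon-upload'); // hypothetical element
input.addEventListener('change', async (e) => {
  const file = e.target.files[0];
  if (!file) return;
  const img = new Image();
  img.src = URL.createObjectURL(file);
  await img.decode();
  const canvas = document.createElement('canvas');
  canvas.width = img.width;
  canvas.height = img.height;
  canvas.getContext('2d').drawImage(img, 0, 0);
  const top5 = await recognize(canvas); // returns [{ score, label }, ...]
  console.log(top5);
});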
You’re done
Now you can experience the power of icon recognition and enjoy the convenience that machine learning brings. Since this is a pure front-end tool that needs no extra back-end service, it can be deployed on a static website, which makes it ideal for the find-an-icon scenario on a component library site. It also works fine for a team with its own icon library: follow the steps above and you can train your own model.
The complete code is at: github.com/maplor/icon…
Conclusion
It took a weekend and two nights to get the model up and running, and a large proportion of the time was spent building the environment and training the model. While Pipcook is easy to use and saves a lot of work, getting started can be tricky: documentation is sparse, plug-in parameters can only be understood by looking at the source code, and there are some unwritten rules that require constant trial and error. Hopefully Pipcook documentation can be updated and maintained in a timely manner.
If you have any questions, feel free to point them out in the comments. Welcome to try it out and share your experience ~
Q&A
- What if icons in the icon library are added or modified? A: You need to retrain the model.
References
Stanford Machine Learning
Tensorflow.js: massive icons, millisecond-level recognition!
Tensorflow.js official website
Pipcook official website
Understanding machine learning
Understanding convolutional neural networks (CNN)
Join us
We are the TXD (Experience Technology) team of Aliyun, and we are sincerely looking for front-end engineers and designers. Internship recruitment for the class of 2022 is also in progress. Interested students can contact me for more information: [email protected]