Pet the cat with code! This article is an entry in the [Cat Essay Campaign].
ml5.js and p5.js are both meant to be out-of-the-box libraries for creative coding. ml5.js focuses on machine learning and lets front-end engineers play with AI without deep machine learning knowledge.
The library wraps commonly used machine learning algorithms and pre-trained models. It is built on TensorFlow.js and can be used on its own or together with p5.js.
What is object detection
I want to bring AI closer to you so that it no longer feels unfathomable or out of reach, so we will not go too deep here. Let's first introduce object detection. You may already know what a classification problem is: the model is given a picture that contains a single category of interest, say a dog or a cat, and it predicts which category the picture belongs to. A model that tells you whether a picture shows a dog or a cat is a classifier.
Object detection, today's topic, is different. The input image may contain several objects from several categories: not just one dog or one cat, but possibly several dogs, or cats and dogs together, and the model has to say both what each object is and where it is. Next we will use ml5.js to build an object detector and use it to find a cat.
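To make the difference concrete, the following sketch contrasts the shape of a classifier's answer with that of a detector's. The field names follow the detection output shown later in this article; the numbers are rounded placeholders, and `countByLabel` is a hypothetical helper for illustration, not part of ml5.js.

```javascript
// A classifier gives one answer for the whole picture (illustrative values).
const classificationResult = { label: 'cat', confidence: 0.97 };

// A detector gives one entry per object found, each with a bounding box.
const detectionResults = [
  { label: 'dog', confidence: 0.94, x: 253, y: 57, width: 301, height: 298 },
  { label: 'cat', confidence: 0.91, x: 67, y: 42, width: 244, height: 335 },
];

// Hypothetical helper: count how many objects of each category were found.
function countByLabel(results) {
  const counts = {};
  for (const r of results) {
    counts[r.label] = (counts[r.label] || 0) + 1;
  }
  return counts;
}

console.log(classificationResult.label);     // one label for the whole image
console.log(countByLabel(detectionResults)); // one count per detected category
```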
How to choose a pre-trained model
Data sets the ceiling on how good a model can be. In recent years, excellent network architectures such as ResNet, Inception, and GoogLeNet have emerged, along with ImageNet, a dataset whose widely used benchmark subset has more than a million images in 1000 categories. ml5.js usually works with pre-trained models, so there are two things to check when choosing one. First, which network is the model based on: MobileNet, ResNet, or VGG? This matters because different architectures suit different application scenarios and differ in capability. Second, which dataset was the model trained on? COCO is a classic dataset that provides rich annotations for different kinds of tasks, such as semantic segmentation, keypoint detection, and pose estimation.
ml5.js provides two pre-trained models for object detection, YOLO and CocoSsd. Here we choose CocoSsd. It is trained on COCO, whose 80 categories include today's hero, the cat.
Set up the environment
Here we simply set up the environment with p5.js.
function setup() {
  // Create a 640 x 480 canvas.
  createCanvas(640, 480);
}
function draw() {
  // Paint the background black on every frame.
  background(0);
}
Load the image
let img;
function preload() {
  // Load the image before setup() runs.
  img = loadImage('dog_and_cat_03.jpeg');
}
function setup() {
  createCanvas(640, 480);
  // Draw the image at the top-left corner of the canvas.
  image(img, 0, 0);
}
If you are not familiar with p5.js, take a look at the official documentation, which is fairly simple. In fact, even without reading it, the code above is clear at a glance: p5.js is well designed and its code is quite readable.
Creating a Detector
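let img;
let detector;
function preload() {
  img = loadImage('dog_and_cat_03.jpeg');
  // Create an object detector backed by the pre-trained CocoSsd model.
  detector = ml5.objectDetector('cocossd');
}
function setup() {
  createCanvas(640, 480);
  // Inspect the detector object in the console.
  console.log(detector);
  image(img, 0, 0);
}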
Create an object detector by passing in the name of the pre-trained model we want to use, here CocoSsd. The next step is to run the detector on the image above.
Running detection
let img;
let detector;
function preload() {
  img = loadImage('dog_and_cat_03.jpeg');
  detector = ml5.objectDetector('cocossd');
}
// Callback invoked when detection finishes.
function handleDetectResult(error, results) {
  if (error) {
    console.log(error);
    return; // No results to show when detection fails.
  }
  console.log(results);
}
function setup() {
  createCanvas(640, 480);
  // console.log(detector);
  image(img, 0, 0);
  // Run detection on the image; results arrive in the callback.
  detector.detect(img, handleDetectResult);
}
We now have the object detector, which is an object, and call its detect method. It takes two parameters: the image to examine and a callback function that receives the detection results. Because detection is time-consuming and may take a while, the results are delivered asynchronously through the callback rather than returned directly.
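If you prefer promises to callbacks, the same call can be wrapped as in the minimal sketch below. `detectAsync` is a hypothetical helper, not part of ml5.js, and it only assumes that the detector exposes `detect(input, callback)` with the `(error, results)` callback signature used above.

```javascript
// Hypothetical helper: wrap a callback-style detect(input, callback) call
// in a Promise. Assumes the callback receives (error, results).
function detectAsync(detector, input) {
  return new Promise((resolve, reject) => {
    detector.detect(input, (error, results) => {
      if (error) {
        reject(error);    // Detection failed.
      } else {
        resolve(results); // Array of detected objects.
      }
    });
  });
}
```

In a sketch this could be called as `detectAsync(detector, img).then(console.log).catch(console.error)`.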
Detection results
0:
  confidence: 0.9443305134773254
  height: 297.570499420166
  label: "dog"
  normalized: {x: 0.3297404646873474, y: 0.13195198774337769, width: 0.3925212621688843, height: 0.6888206005096436}
  width: 301.4563293457031
  x: 253.2406768798828
  y: 57.00325870513916
1:
  confidence: 0.9138157963752747
  height: 334.8041739463806
  label: "cat"
  normalized: {x: 0.08691723644733429, y: 0.09776231646537781, width: 0.31792615354061127, height: 0.7750096619129181}
  width: 244.16728591918945
  x: 66.75243759155273
  y: 42.23332071304321
The fields are explained briefly below.
- confidence: how confident the model is in the result; the higher the better. We can set a threshold to discard results the model is less sure about.
- height / width: the height and width of the object's bounding box.
- x, y: the upper-left corner of the bounding box. This depends on the library: different frameworks or models may report a different point, such as the center.
- label: the predicted category.
The next step is to use this data to draw the category and position information on the image.
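Two small helpers illustrate how these fields might be used. Both `filterByConfidence` and `toPixelBox` are hypothetical names introduced for this sketch. The second one assumes that `normalized` holds the same box expressed as fractions of the image size; the sample values above are consistent with that for an image of 768 by 432 pixels, which is an inference from the numbers rather than something stated by the library.

```javascript
// Hypothetical helper: keep only detections at or above a confidence threshold.
function filterByConfidence(results, threshold) {
  return results.filter((r) => r.confidence >= threshold);
}

// Hypothetical helper: scale a normalized box (assumed to be fractions in 0..1)
// to pixel coordinates for a target of the given width and height.
function toPixelBox(normalized, targetWidth, targetHeight) {
  return {
    x: normalized.x * targetWidth,
    y: normalized.y * targetHeight,
    width: normalized.width * targetWidth,
    height: normalized.height * targetHeight,
  };
}
```

For example, `filterByConfidence(results, 0.5)` would drop every detection the model scores below 0.5.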
for (let i = 0; i < results.length; i++) {
  let object = results[i];
  // Draw the bounding box as a green outline.
  stroke(0, 255, 0);
  strokeWeight(4);
  noFill();
  rect(object.x, object.y, object.width, object.height);
  // Write the label in white just inside the box's top-left corner.
  noStroke();
  fill(255);
  textSize(24);
  text(object.label, object.x + 10, object.y + 24);
}
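Since the goal is to find the cat, it can help to keep only the cat detections and show the confidence next to the label. The sketch below is one possible way to do that; `formatLabel` and `onlyLabel` are hypothetical helpers, not part of ml5.js or p5.js.

```javascript
// Hypothetical helper: build a caption such as "cat 91%" from a detection result.
function formatLabel(object) {
  const percent = Math.round(object.confidence * 100);
  return `${object.label} ${percent}%`;
}

// Hypothetical helper: keep only detections of one category, for example 'cat'.
function onlyLabel(results, label) {
  return results.filter((r) => r.label === label);
}
```

Inside the drawing loop, `text(formatLabel(object), object.x + 10, object.y + 24)` would print the caption, and looping over `onlyLabel(results, 'cat')` instead of `results` would outline only the cats.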