The Shape Detection API has been available for some time. Its main purpose is to give the front end a directly usable interface for feature detection (barcodes, faces, and text). This article briefly introduces the API and gives a general overview of face detection on the front end. (Algorithms are not covered here, so please go easy on me.)

1 Background and scenarios


Face detection is an old topic and is widely used in many industries, such as finance, security, e-commerce, smartphones, and entertainment imaging. The technologies involved keep evolving. Here are two common approaches:

  1. Feature-based face detection

For example, OpenCV has a built-in Haar cascade classifier based on the Viola-Jones object detection framework. You only need to load a configuration file (haarcascade_frontalface_alt.xml) and call detectObject to complete the detection. Detection of other features (nose, mouth, etc.) is also supported.

  2. Learning-based face detection

This approach also extracts local image features through operators, then applies classification, statistics, regression, and similar methods to obtain a classifier that is both more accurate and faster.
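
To make the feature-based approach a little more concrete, here is a minimal sketch of the two building blocks the Viola-Jones framework is known for: an integral image and a Haar-like feature computed from it. The function names (integralImage, rectSum, haarEdgeFeature) are made up for this illustration and are not OpenCV's API; a real cascade evaluates many such features at many positions and scales.

```javascript
// Build an integral image (summed-area table) for a grayscale image.
// `pixels` is a flat row-major array of length width * height.
// The table has one extra row and column of zeros to simplify lookups.
function integralImage(pixels, width, height) {
  const stride = width + 1;
  const table = new Float64Array(stride * (height + 1));
  for (let y = 1; y <= height; y++) {
    let rowSum = 0;
    for (let x = 1; x <= width; x++) {
      rowSum += pixels[(y - 1) * width + (x - 1)];
      table[y * stride + x] = table[(y - 1) * stride + x] + rowSum;
    }
  }
  return { table, stride };
}

// Sum of the pixels in the rectangle [x, x + rw) x [y, y + rh),
// obtained with four table lookups regardless of rectangle size.
function rectSum(ii, x, y, rw, rh) {
  const { table, stride } = ii;
  return table[(y + rh) * stride + (x + rw)]
       - table[y * stride + (x + rw)]
       - table[(y + rh) * stride + x]
       + table[y * stride + x];
}

// Two-rectangle (left/right) Haar-like feature: right half minus left half.
function haarEdgeFeature(ii, x, y, rw, rh) {
  const half = Math.floor(rw / 2);
  return rectSum(ii, x + half, y, half, rh) - rectSum(ii, x, y, half, rh);
}

// Usage: a 4x2 image that is dark on the left and bright on the right.
const ii = integralImage([0, 0, 9, 9,
                          0, 0, 9, 9], 4, 2);
console.log(rectSum(ii, 0, 0, 4, 2));         // 36 (total brightness)
console.log(haarEdgeFeature(ii, 0, 0, 4, 2)); // 36 (strong vertical edge)
```

The point of the integral image is that every rectangle sum costs the same four lookups, which is what makes evaluating thousands of features per window affordable.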

2 Common approaches


2.1 Back-end processing

The front end sends resources to the back end over the network, and the back end processes all the images or video streams that need detection in one place. This places certain demands on the back-end architecture, and network latency often makes real-time interaction impossible for users.

2.2 Client Processing

Thanks to OpenCV's cross-language and cross-platform advantages, the client can also provide face detection at low development cost and expose it to the web container via JsBridge and similar means. However, once outside the container, standalone pages lose this capability. Until one day…

2.3 Open Service

Somewhere along the way, concepts like cloud computing took off and the cost of computing dropped. Major R&D teams (such as Alibaba Cloud and Face++) seized the opportunity and launched face detection services, along with a range of specialized services such as face recognition, liveness detection, ID document OCR, and face comparison.

Although these services offer client-side SDKs as well as front-end and back-end APIs, I would still like to talk about a pure front-end solution.

3 What the era brings


Face recognition on the front end is still in its slash-and-burn early days. However, the infrastructure is starting to take shape, and I hope the introduction below offers some inspiration.

3.1 Shape Detection API

As client hardware becomes more powerful, the browser gains access to more and more capabilities. Image processing requires a large amount of computing resources, and the browser can now take on part of the image detection work, which is why the Shape Detection API was developed.

The following simple examples show the basic usage. Before editing and running the code, make sure the experimental feature is enabled in your version of Chrome, and note that the API is subject to the same-origin policy:

chrome://flags/#enable-experimental-web-platform-features
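
Because these detectors are experimental, it can help to check which ones the current environment actually exposes before calling them. The helper below is a small sketch; availableDetectors is a name invented for this example, and it only checks that the constructors exist, not that detection will succeed on a given platform.

```javascript
// Return the names of the Shape Detection constructors exposed on `scope`.
// `scope` defaults to the global object (window in a browser).
function availableDetectors(scope = globalThis) {
  return ['BarcodeDetector', 'FaceDetector', 'TextDetector']
    .filter((name) => typeof scope[name] === 'function');
}

const supported = availableDetectors();
if (supported.length === 0) {
  console.warn('Shape Detection API not available; consider a fallback.');
} else {
  console.log('Available detectors: ' + supported.join(', '));
}
```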

  • Barcode Detection (For Chrome 56+)



var barcodeDetector = new BarcodeDetector();
barcodeDetector.detect(image)
  .then(barcodes => {
    // Log the decoded value of each detected barcode.
    barcodes.forEach(barcode => console.log(barcode.rawValue))
  })
  .catch(err => console.error(err));

  • Face Detection (For Chrome 56+)



var faceDetector = new FaceDetector();
faceDetector.detect(image)
  .then(faces => faces.forEach(face => console.log(face)))
  .catch(err => console.error(err));

  • Text Detection (For Chrome 58+)



var textDetector = new TextDetector();
textDetector.detect(image)
  .then(boundingBoxes => {
    for (let box of boundingBoxes) {
      speechSynthesis.speak(new SpeechSynthesisUtterance(box.rawValue));
    }
  })
  .catch(err => console.error(err));

3.2 Face detection in images

Detecting faces in an image is relatively simple: pass in an image element and call the API directly. With the help of a canvas, we can then display the detection results.

The core code is as follows:



var image = document.querySelector('#image');
var canvas = document.querySelector('#canvas');

var ctx = canvas.getContext("2d");
var scale = 1;

image.onload = function () {
  // Draw the whole image, stretched to fill the canvas.
  ctx.drawImage(image,
    0, 0, image.width, image.height,
    0, 0, canvas.width, canvas.height);

  // Ratio used to map detected boxes from image to canvas coordinates.
  scale = canvas.width / image.width;
};

function detect() {
  if (window.FaceDetector == undefined) {
    console.error('Face Detection not supported');
    return;
  }

  var faceDetector = new FaceDetector();
  console.time('detect');
  return faceDetector.detect(image)
    .then(faces => {
      console.log(faces)
      // Draw the faces on the <canvas>.
      var ctx = canvas.getContext("2d");
      ctx.lineWidth = 2;
      ctx.strokeStyle = "red";
      for (var i = 0; i < faces.length; i++) {
        var item = faces[i].boundingBox;
        // Start a fresh path so earlier boxes are not stroked again.
        ctx.beginPath();
        ctx.rect(Math.floor(item.x * scale),
          Math.floor(item.y * scale),
          Math.floor(item.width * scale),
          Math.floor(item.height * scale));
        ctx.stroke();
      }
      console.timeEnd('detect');
    })
    .catch((e) => {
      console.error("Boo, Face Detection failed: " + e);
    });
}
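
The example in section 3.2 uses a single scale factor derived from the widths, which works when the canvas and the image share the same aspect ratio. Since drawImage stretches the image to the full canvas in both directions, a slightly more general sketch scales x and y separately. The helper name toCanvasRect and the size objects are assumptions made for this illustration.

```javascript
// Map a bounding box from image coordinates to canvas coordinates.
// `box` is any object with x, y, width, height (like face.boundingBox);
// `imageSize` and `canvasSize` are objects with width and height.
function toCanvasRect(box, imageSize, canvasSize) {
  const scaleX = canvasSize.width / imageSize.width;
  const scaleY = canvasSize.height / imageSize.height;
  return {
    x: Math.floor(box.x * scaleX),
    y: Math.floor(box.y * scaleY),
    width: Math.floor(box.width * scaleX),
    height: Math.floor(box.height * scaleY),
  };
}

// Usage: a 640x480 image drawn onto a 320x120 canvas.
const rect = toCanvasRect(
  { x: 100, y: 100, width: 200, height: 200 },
  { width: 640, height: 480 },
  { width: 320, height: 120 }
);
console.log(rect); // { x: 50, y: 25, width: 100, height: 50 }
```

In the browser, the result could be passed straight to ctx.strokeRect(rect.x, rect.y, rect.width, rect.height).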

3.3 Face detection in videos

Face detection in video is not much different from the image case. getUserMedia opens the camera (and optionally the microphone) and returns a media stream. By running detection on the video frames and displaying the results, we get face detection in video.

The core code is as follows:



navigator.mediaDevices.getUserMedia({
  video: true, // audio: true
})
  .then(function (mediaStream) {
    // Attach the stream directly; URL.createObjectURL(mediaStream) is no longer supported.
    video.srcObject = mediaStream;
    video.onloadedmetadata = function (e) {
      // Do something with the video here.
    };
  })
  .catch(function (error) {
    console.log(error.name);
  });

// Every 60 ms, copy the current video frame to the canvas and run detection on it.
setInterval(function () {
  ctx.clearRect(0, 0, canvas.width, canvas.height);
  ctx.drawImage(video, 0, 0);
  image.onload = function () { detect(); };
  image.src = canvas.toDataURL('image/png');
}, 60);
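
One thing to watch with a fixed 60 ms timer: this article later reports 300 ms+ per detection, so new detections can be requested while an earlier one is still running. A possible mitigation, sketched below, is to drop frames while a detection is in flight. createBusyGuard is a hypothetical helper written for this example, not part of any API used above.

```javascript
// Wrap an async task so that calls made while a previous call is still
// running are dropped (the wrapper returns null) instead of piling up.
function createBusyGuard(task) {
  let busy = false;
  return function run(...args) {
    if (busy) return null;          // a run is in flight: skip this frame
    busy = true;
    return Promise.resolve()
      .then(() => task(...args))
      .finally(() => { busy = false; });
  };
}

// Demo with a fake 50 ms "detection".
const slowTask = () => new Promise((resolve) => setTimeout(resolve, 50));
const guarded = createBusyGuard(slowTask);
console.log(guarded() !== null); // true: the first call starts the task
console.log(guarded() === null); // true: the second call is dropped while busy
```

In the page, the same wrapper could be applied to the detection call, for example const guardedDetect = createBusyGuard(detect), with guardedDetect called from the image's onload handler in place of detect.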

3.4 Looking back: the days before the API

In fact, many solutions existed long ago. Because of hardware constraints, the lack of hardware acceleration, and other limitations, they were never widely used in production.

  1. tracking.js

tracking.js is a JavaScript image processing library that brings a rich set of computer vision algorithms and techniques to the browser. It supports color tracking, face detection, and other features.

  2. jquery.facedetection

jquery.facedetection is a jQuery/Zepto face detection plugin based on the cross-platform CCV image classifier and detector.

3.5 Node.js & OpenCV

The node-opencv module has been around for a few years and works with OpenCV v2.4.x, although it is not yet fully compatible with v3.x and exposes a limited API. The arrival of N-API may bring more surprises.

Imagine an Electron or node-webkit container: could we achieve real-time face detection by running a local WebSocket service? A sketch of the idea in code:

  • Back-end processing logic



import cv from 'opencv';

const detectConfigFile = './node_modules/opencv/data/haarcascade_frontalface_alt2.xml';

// camera properties
const camWidth = 320;
const camHeight = 240;
const camFps = 10;
const camInterval = 1000 / camFps;

// face detection properties
const rectColor = [0, 255, 0];
const rectThickness = 2;

// initialize camera
const camera = new cv.VideoCapture(0);

camera.setWidth(camWidth);
camera.setHeight(camHeight);

const frameHandler = (err, im) => {
  return new Promise((resolve, reject) => {
    if (err) {
      return reject(err);
    }
    im.detectObject(detectConfigFile, {}, (error, faces) => {
      if (error) {
        return reject(error);
      }
      let face;
      for (let i = 0; i < faces.length; i++) {
        face = faces[i];
        im.rectangle([face.x, face.y], [face.width, face.height], rectColor, rectThickness);
      }
      return resolve(im);
    });
  });
};

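module.exports = function (socket) {
  const frameSocketHandler = (err, im) => {
    return frameHandler(err, im)
      .then((img) => {
        socket.emit('frame', {
          buffer: img.toBuffer(),
        });
      })
      .catch((error) => {
        // Log and skip this frame instead of leaving the rejection unhandled.
        console.error(error);
      });
  };
  const handler = () => {
    camera.read(frameSocketHandler);
  };
  // Read a frame from the camera at the configured frame rate.
  setInterval(handler, camInterval);
};

  • Front-end call interface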



socket.on('frame', function (data) {
  // Convert the received binary frame to a base64 data URL.
  var uint8Arr = new Uint8Array(data.buffer);
  var str = String.fromCharCode.apply(null, uint8Arr);
  var base64String = btoa(str);

  img.onload = function () {
    ctx.drawImage(this, 0, 0, canvas.width, canvas.height);
  };
  img.src = 'data:image/png;base64,' + base64String;
});
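
A caveat about the conversion in the front-end snippet: passing a whole frame to String.fromCharCode.apply in one call can exceed the engine's argument limit when frames are large. The sketch below converts the bytes in chunks instead. bytesToBase64 and the 0x8000 chunk size are assumptions chosen for illustration, and the Buffer branch is only there so the snippet also runs in Node.js environments without btoa.

```javascript
// Convert a Uint8Array to a base64 string, building the intermediate
// binary string in chunks to stay under the engine's argument limit.
function bytesToBase64(bytes, chunkSize = 0x8000) {
  let binary = '';
  for (let i = 0; i < bytes.length; i += chunkSize) {
    const chunk = bytes.subarray(i, i + chunkSize);
    binary += String.fromCharCode.apply(null, chunk);
  }
  // btoa exists in browsers and recent Node.js; fall back to Buffer otherwise.
  return typeof btoa === 'function'
    ? btoa(binary)
    : Buffer.from(binary, 'binary').toString('base64');
}

console.log(bytesToBase64(new Uint8Array([72, 101, 108, 108, 111]))); // "SGVsbG8="
```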

4 Summary


4.1 Future development

There is no doubt that these cutting-edge technologies will see wider adoption and support on the front end. Image work on the front end will also move from traditional image processing toward learning combined with image processing. All of this is thanks to the gradual strengthening of the infrastructure (hardware, browsers, tools, libraries, etc.), including but not limited to:

  • getUserMedia/Canvas => image/video manipulation

  • Shape Detection API => Image Detection

  • Web Workers => Parallel computing capability

  • ConvNetJS => Deep learning framework

4.2 In reality, not that optimistic

4.2.1 Accuracy

The detection rate for frontal faces (including multiple faces) is relatively high, but results are poor for profile faces and for faces that are partly occluded.

4.2.2 Processing speed

Taking the image face detection example in section 3.2, detection takes 300 ms+ (which in practice cannot support real-time processing of high-resolution video). That is roughly three times slower than the 100 ms achieved by calling OpenCV.
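
As a rough back-of-the-envelope check, the timings quoted in this section translate into an upper bound on frame rate, assuming detections run strictly one after another and nothing else costs time. The helper name is made up for this illustration.

```javascript
// Upper bound on detections per second when each one takes
// `msPerDetection` milliseconds and they run sequentially.
function maxDetectionsPerSecond(msPerDetection) {
  return 1000 / msPerDetection;
}

console.log(maxDetectionsPerSecond(300).toFixed(1)); // "3.3"  (Shape Detection API figure)
console.log(maxDetectionsPerSecond(100).toFixed(1)); // "10.0" (OpenCV figure)
```

By this estimate, 300 ms per detection cannot keep up with the 60 ms timer used in the video example, while 100 ms just matches the camFps value of 10 used in the Node.js example.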

4.2.3 Features

There is still a lot to be done. For example, the API does not support glasses status, gender, age estimation, face recognition, ethnicity, smile detection, blur detection, or other services offered by mainstream providers.

4.3 Closing remarks

The source code has not been tidied up yet because of a heavy workload; it will be shared later at github.com/x-cold. There is currently no data on how well face detection adapts to different scenarios or on detection time. Later, we will consider using PASCAL VOC and the samples provided by AT&T for a small-scale test.
