Andy, Simpleweb (link to original article)
Translation: Liu Tong
Exploring Facial Recognition with WebRTC
This article is reprinted from WebRTC Chinese Website
In recent years, facial recognition technology has been at the center of smartphone innovation. With Apple’s introduction of Face ID on the iPhone X, people are paying more attention to facial recognition than ever.
Our team has always enjoyed experimenting and exploring the potential of future technologies. So we worked with Cristiano to do some research and exploration on facial recognition using WebRTC.
In this article, we’ll share some of Cristiano’s work, what he’s learned, and his thoughts on the challenges and limitations of this technology.
What is WebRTC?
WebRTC is an open source web framework that enables real-time communication in the browser. It provides the building blocks for high-quality communication on the Web, such as the networking, audio, and video components used in voice and video chat applications.
In Cristiano’s words, it is “a way to access the microphone, audio, and camera from the browser.”
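A minimal sketch of that entry point: ask the browser for camera and microphone access with `getUserMedia`, then attach the resulting stream to a `<video>` element. The element id `preview` is an assumption for illustration, not something from the original prototype.

```javascript
// Assumed page markup: <video id="preview" autoplay playsinline></video>
const constraints = { video: true, audio: true };

async function startCamera() {
  // Prompts the user for permission, then resolves with a live MediaStream.
  const stream = await navigator.mediaDevices.getUserMedia(constraints);
  document.getElementById('preview').srcObject = stream;
  return stream;
}
```

Note that browsers only grant this permission on secure origins, which is why HTTPS comes up again in the limitations below.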
Cristiano used WebRTC, Canvas, Microsoft Cognitive Services, and Microsoft Emotion API to create a prototype tool that can detect emotions through facial expressions in a browser using a webcam.
Exploration
To detect whether a facial expression is happy or sad, Cristiano created a grid that maps different points on the face via the webcam. As the user moves their face, the dots and the grid move with it. This information is then sent to Microsoft’s Emotion API, which detects whether the expression is sad or happy.
Cristiano tried a number of different libraries for mapping key points on the face. He settled on Beyond Reality Face (BRFv4) because it runs on the client side, so it isn’t server-dependent, and it worked easily in his browser. BRFv4 examines your face, maps the key points and the grid, and returns the many dots it places on your face; it can detect 68 key points on a person’s face. Pretty cool, right?
You can view the different libraries by clicking on the following links.
- Beyond Reality Face
- clmtrackr
- Betaface API
While BRFv4 can do most of the hard work, Cristiano wanted to go one step further and customize the mesh and key points to gain better control over them.
To do this, Cristiano uses Canvas, an HTML5 element that lets him customize the selected grid and key points. With it he can easily change colors, remove lines, and even replace the dots with other geometric shapes. Canvas gives Cristiano far more options.
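This kind of customization might look like the sketch below: given an array of landmark points (standing in for what a tracker like BRFv4 returns; its exact API is not shown here), draw each one as a colored dot on a canvas overlay. The color and radius options illustrate the sort of control Canvas gives you.

```javascript
// points: [{x, y}, ...] — facial landmarks from whatever tracker is in use.
// ctx: a CanvasRenderingContext2D for the overlay canvas.
function drawLandmarks(ctx, points, { color = '#00e5ff', radius = 2 } = {}) {
  // Wipe the previous frame's dots before drawing the new ones.
  ctx.clearRect(0, 0, ctx.canvas.width, ctx.canvas.height);
  ctx.fillStyle = color;
  for (const { x, y } of points) {
    ctx.beginPath();
    ctx.arc(x, y, radius, 0, Math.PI * 2); // one filled circle per key point
    ctx.fill();
  }
}
```

Swapping `arc` for `rect` or a small path is all it takes to replace the dots with other geometric shapes, as the article describes.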
The final step in detecting facial expressions is Microsoft’s Emotion API. It sits on top of a very large database of emotion data and responds as soon as a video frame is sent to it. Cristiano sends each video frame to the API as a Base64-encoded image, and the API returns the facial expression it detects. Cristiano also looked at Affectiva’s API, which he believes is better than Microsoft’s because it provides more detailed information; however, limited browser support made it unsuitable for this project.
Challenges and Limitations
Cristiano also encountered some challenges and limitations, mainly around device and browser support. According to Cristiano, the main limitations are:
- Cross-browser support: not all browser versions support WebRTC; Safari, for example, only added it in version 11.0.
- Device support: some devices are not completely reliable at supporting WebRTC. The iPhone 6, for example, sometimes freezes or displays a blank background.
- API data load: each video frame sent to Microsoft’s API carries a lot of data, so you must limit the number of frames sent.
- Requires an HTTPS connection: if you want a shareable prototype, you’ll need to set up an HTTPS connection; most browsers only allow camera access on secure origins, with an exception for localhost.
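The API-data-load limitation above is typically handled by throttling: only forward a frame if enough time has passed since the last one went out. A small sketch of that idea, where `send` is whatever uploads the frame (the names and interval here are illustrative, not from the original prototype):

```javascript
// Returns a function that forwards at most one frame per intervalMs.
// `now` is injectable so the behavior can be tested with a fake clock.
function makeThrottledSender(send, intervalMs = 1000, now = Date.now) {
  let last = -Infinity; // timestamp of the last frame actually sent
  return function maybeSend(frame) {
    const t = now();
    if (t - last < intervalMs) return false; // too soon: drop this frame
    last = t;
    send(frame);
    return true;
  };
}
```

At one frame per second, a 30 fps webcam feed sends roughly 1/30th of the data it otherwise would, at the cost of slower emotion updates.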