This article walks through an example of using Dlib to detect facial landmark points on iOS. It covers compiling the Dlib library, face landmark detection on a live video stream, and face landmark detection on photos. The result is shown below
1. Introduction to Dlib
Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real-world problems. For the main project documentation and API reference, see dlib.net or GitHub github.com/davisking/d…
2. Compile Dlib with Xcode
In this step we compile Dlib into a static library with Xcode. First, download the Dlib source code. If you already have a Dlib library compiled by someone else, you can skip to the next step. This step has the following requirements:
- X11 (download and install it if it is not already installed)
- Xcode
- CMake (can be installed via Homebrew if not already installed)
2.1 Download Source Code
Download the source code from the Dlib GitHub repository: github.com/davisking/d…
2.2 Create an Xcode build project for Dlib. Go to the root directory of the downloaded Dlib source code and run the following commands
cd examples/
mkdir build
cd build
cmake -G Xcode ..
cmake --build . --config Release
The build directory will now contain examples.xcodeproj and a dlib_build folder, as shown in the figure
Go to the dlib_build directory, open dlib.xcodeproj, and change the dlib project settings to the ones shown below
Select the dlib target and check its settings, making sure the target settings match the project settings set above, as shown below
Select the dlib target and build the x86 and ARM static libraries respectively, as shown below
Build the x86 static library
Build the ARM static library
Select the compiled dlib static library under Products in the project navigator, right-click it, and choose Show in Finder to locate the folder
Going up one directory level, you can see the simulator static library folder and the device static library folder
The dlib static libraries we need are now compiled
3. Create an iOS App for face detection
3.1 Create an Xcode project named DlibDemo
Create a new folder named Dlib under the project root directory and copy the compiled Dlib static libraries into it: the ARM static library goes into the lib-iphoneOS directory and the x86 static library goes into the lib-iphonesimulator directory. Copy the dlib folder from the dlib-master source directory into this folder as well. Download the model shape_predictor_68_face_landmarks.dat and copy it into this folder too, as shown in the figure below
Right-click and choose Add Files to “DlibDemo”… to add this folder to the project.
Then right-click the dlib folder, select Delete, and choose Remove References (remove it from the project, do not delete the files). The dlib source directory does not need to be added to the project, because the header search path will point to it; if it is added, a build error will be reported. Remove lib-iphonesimulator and lib-iphoneOS from the project in the same way; the project will locate the static library libdlib.a through the library search path. In the end, only shape_predictor_68_face_landmarks.dat remains in the project's Dlib folder
3.2 Setting compilation options
Set HEADER_SEARCH_PATHS to $(PROJECT_DIR)/DlibDemo/Dlib/ so the Dlib header files can be found
Set LIBRARY_SEARCH_PATHS to $(SRCROOT)/DlibDemo/Dlib/Lib$(EFFECTIVE_PLATFORM_NAME). $(EFFECTIVE_PLATFORM_NAME) is an Xcode macro: when building for the simulator its value is -iphonesimulator, so Lib$(EFFECTIVE_PLATFORM_NAME) resolves to lib-iphonesimulator, which corresponds to Dlib's x86 static library folder; when building for a device its value is -iphoneos, so Lib$(EFFECTIVE_PLATFORM_NAME) resolves to lib-iphoneOS, which corresponds to Dlib's ARM static library folder.
Set OTHER_LDFLAGS and add -l"dlib"
Set OTHER_CFLAGS to -DNDEBUG -DDLIB_JPEG_SUPPORT -DDLIB_USE_BLAS -DDLIB_USE_LAPACK -DLAPACK_FORCE_UNDERSCORE
If the project does not already have a bridging header, create DlibDemo/DlibDemo-Bridging-Header.h and set SWIFT_OBJC_BRIDGING_HEADER = DlibDemo/DlibDemo-Bridging-Header.h
Set the Debug optimization level (GCC_OPTIMIZATION_LEVEL) to Fastest, Smallest [-Os]. I was stuck here for a long time: if this is not set, detection is very, very slow and takes a long time. Release builds already use Fastest, Smallest [-Os] by default, so they do not need to be changed. After debugging, you can switch the Debug optimization level back to None [-O0] so it does not affect debugging.
Add the dependent framework Accelerate.framework
4. Use Dlib in your project
4.1 Wrapping Dlib
Since Dlib is written in C++, we create DlibWrapper.h and DlibWrapper.mm to wrap it. DlibWrapper.h only exposes the method names and must not import any dlib-related headers; otherwise every file that imports DlibWrapper.h would itself have to be compiled as Objective-C++ (.mm). DlibWrapper.mm imports the dlib-related headers and implements the methods.
The code for DlibWrapper.h is as follows
#import <Foundation/Foundation.h>
#import <CoreMedia/CoreMedia.h>
@interface DlibWrapper : NSObject
- (instancetype)init;
- (void)prepare;
- (void)doWorkOnSampleBuffer:(CMSampleBufferRef)sampleBuffer inRects:(NSArray<NSValue *> *)rects;
- (void)doWorkOnImagePath:(NSString*)imagePath savePath:(NSString*)savePath;
@end
The code for DlibWrapper.mm is as follows
#import "DlibWrapper.h"
#import <UIKit/UIKit.h>
#include <dlib/image_processing/frontal_face_detector.h>
#include <dlib/image_processing.h>
#include <dlib/image_io.h>
#include <dlib/image_processing/render_face_detections.h>
@interface DlibWrapper ()
@property (assign) BOOL prepared;
+ (std::vector<dlib::rectangle>)convertCGRectValueArray:(NSArray<NSValue *> *)rects;
@end
@implementation DlibWrapper {
dlib::shape_predictor sp;
dlib::frontal_face_detector detector;
}
- (instancetype)init {
self = [super init];
if (self) {
_prepared = NO;
}
return self;
}
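// Load the 68-point landmark model from the app bundle and create dlib's frontal face detector.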
- (void)prepare {
NSString *modelFileName = [[NSBundle mainBundle] pathForResource:@"shape_predictor_68_face_landmarks" ofType:@"dat"];
std::string modelFileNameCString = [modelFileName UTF8String];
dlib::deserialize(modelFileNameCString) >> sp;
detector = dlib::get_frontal_face_detector();
// FIXME: test this stuff for memory leaks (cpp object destruction)
self.prepared = YES;
}
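// Run the shape predictor on a 32BGRA sample buffer for each supplied face rect and draw the detected landmarks back into the buffer.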
- (void)doWorkOnSampleBuffer:(CMSampleBufferRef)sampleBuffer inRects:(NSArray<NSValue *> *)rects {
if (!self.prepared) {
[self prepare];
}
dlib::array2d<dlib::bgr_pixel> img;
// MARK: magic
CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
CVPixelBufferLockBaseAddress(imageBuffer, kCVPixelBufferLock_ReadOnly);
size_t width = CVPixelBufferGetWidth(imageBuffer);
size_t height = CVPixelBufferGetHeight(imageBuffer);
char *baseBuffer = (char *)CVPixelBufferGetBaseAddress(imageBuffer);
// set_size expects rows, cols format
img.set_size(height, width);
// copy samplebuffer image data into dlib image format
img.reset();
long position = 0;
while (img.move_next()) {
dlib::bgr_pixel& pixel = img.element();
// assuming bgra format here
long bufferLocation = position * 4; //(row * width + column) * 4;
char b = baseBuffer[bufferLocation];
char g = baseBuffer[bufferLocation + 1];
char r = baseBuffer[bufferLocation + 2];
// we do not need this
// char a = baseBuffer[bufferLocation + 3];
dlib::bgr_pixel newpixel(b, g, r);
pixel = newpixel;
position++;
}
// unlock buffer again until we need it again
CVPixelBufferUnlockBaseAddress(imageBuffer, kCVPixelBufferLock_ReadOnly);
// convert the face bounds list to dlib format
std::vector<dlib::rectangle> convertedRectangles = [DlibWrapper convertCGRectValueArray:rects];
// for every detected face
for (unsigned long j = 0; j < convertedRectangles.size(); ++j)
{
dlib::rectangle oneFaceRect = convertedRectangles[j];
// detect all landmarks
dlib::full_object_detection shape = sp(img, oneFaceRect);
// and draw them into the image (samplebuffer)
for (unsigned long k = 0; k < shape.num_parts(); k++) {
dlib::point p = shape.part(k);
draw_solid_circle(img, p, 2, dlib::rgb_pixel(0, 255, 0));
}
}
// lets put everything back where it belongs
CVPixelBufferLockBaseAddress(imageBuffer, 0);
// copy dlib image data back into samplebuffer
img.reset();
position = 0;
while (img.move_next()) {
dlib::bgr_pixel& pixel = img.element();
// assuming bgra format here
long bufferLocation = position * 4; //(row * width + column) * 4;
baseBuffer[bufferLocation] = pixel.blue;
baseBuffer[bufferLocation + 1] = pixel.green;
baseBuffer[bufferLocation + 2] = pixel.red;
// we do not need this
// char a = baseBuffer[bufferLocation + 3];
position++;
}
CVPixelBufferUnlockBaseAddress(imageBuffer, 0);
}
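// Load an image from disk, run dlib's face detector and the shape predictor, draw the landmarks, and save the result as a JPEG.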
- (void)doWorkOnImagePath:(NSString*)imagePath savePath:(NSString*)savePath {
if (!self.prepared) {
return;
}
std::string fileName = [imagePath UTF8String];
// create the image
dlib::array2d<dlib::rgb_pixel> img;
// load the image from disk
dlib::load_image(img, fileName);
//dlib face recognition
std::vector<dlib::rectangle> dets = detector(img);
NSLog(@"Number of faces %lu",dets.size());// The number of faces detected
for (unsigned long j = 0; j < dets.size(); ++j) {
dlib::full_object_detection shape = sp(img, dets[j]);
// and draw them into the image
for (unsigned long k = 0; k < shape.num_parts(); k++) {
dlib::point p = shape.part(k);
// draw a solid circle at point p: radius 2, color rgb_pixel(0, 255, 0)
dlib::draw_solid_circle(img, p, 2, dlib::rgb_pixel(0, 255, 0));
}
}
dlib::save_jpeg(img, [savePath UTF8String]);
}
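// Convert an array of CGRect values (e.g. face bounds from AVMetadataFaceObject) into dlib rectangles (left, top, right, bottom).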
+ (std::vector<dlib::rectangle>)convertCGRectValueArray:(NSArray<NSValue *> *)rects {
std::vector<dlib::rectangle> myConvertedRects;
for (NSValue *rectValue in rects) {
CGRect rect = [rectValue CGRectValue];
long left = rect.origin.x;
long top = rect.origin.y;
long right = left + rect.size.width;
long bottom = top + rect.size.height;
dlib::rectangle dlibRect(left, top, right, bottom);
myConvertedRects.push_back(dlibRect);
}
return myConvertedRects;
}
@end
Add #import "DlibWrapper.h" to DlibDemo-Bridging-Header.h
#import "DlibWrapper.h"
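Once the bridging header is set up, DlibWrapper can be used directly from Swift, and its Objective-C selectors are imported with Swift-style names, as the view controller code later in this article shows. A minimal sketch (the variable names in the comments are only placeholders):
let wrapper = DlibWrapper()
wrapper?.prepare()
// -doWorkOnSampleBuffer:inRects: is imported as doWork(on:inRects:)
// wrapper?.doWork(on: sampleBuffer, inRects: boundsArray)
// -doWorkOnImagePath:savePath: is imported as doWork(onImagePath:savePath:)
// wrapper?.doWork(onImagePath: readPath, savePath: writePath)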
4.2 Writing the video stream face key point detection code
First, create SessionHandler.swift to capture the video stream and call DlibWrapper to detect face key points in each video frame, as shown below
import AVFoundation
class SessionHandler: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate, AVCaptureMetadataOutputObjectsDelegate {
    var session = AVCaptureSession()
    let layer = AVSampleBufferDisplayLayer()
    let sampleQueue = DispatchQueue(label: "com.zweigraf.DisplayLiveSamples.sampleQueue", attributes: [])
    let faceQueue = DispatchQueue(label: "com.zweigraf.DisplayLiveSamples.faceQueue", attributes: [])
    let wrapper = DlibWrapper()

    var currentMetadata: [AnyObject]

    override init() {
        currentMetadata = []
        super.init()
    }

    func openSession() {
        var device = AVCaptureDevice.devices(for: AVMediaType.video)
            .map { $0 }
            .filter { $0.position == .front }
            .first
        if device == nil {
            return
        }

        let input = try! AVCaptureDeviceInput(device: device!)

        let output = AVCaptureVideoDataOutput()
        output.setSampleBufferDelegate(self, queue: sampleQueue)

        let metaOutput = AVCaptureMetadataOutput()
        metaOutput.setMetadataObjectsDelegate(self, queue: faceQueue)

        session.beginConfiguration()

        if session.canAddInput(input) {
            session.addInput(input)
        }
        if session.canAddOutput(output) {
            session.addOutput(output)
        }
        if session.canAddOutput(metaOutput) {
            session.addOutput(metaOutput)
        }

        session.commitConfiguration()

        let settings: [AnyHashable: Any] = [kCVPixelBufferPixelFormatTypeKey as AnyHashable: Int(kCVPixelFormatType_32BGRA)]
        output.videoSettings = settings as! [String : Any]

        // availableMetadataObjectTypes change when output is added to session.
        // before it is added, availableMetadataObjectTypes is empty
        metaOutput.metadataObjectTypes = [AVMetadataObject.ObjectType.face]

        wrapper?.prepare()

        session.startRunning()

        for output in session.outputs {
            for av in output.connections {
                if av.isVideoMirroringSupported {
                    av.videoOrientation = .portrait
                    av.isVideoMirrored = true
                }
            }
        }

        layer.videoGravity = AVLayerVideoGravity.resizeAspectFill
    }

    // MARK: AVCaptureVideoDataOutputSampleBufferDelegate

    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        if !currentMetadata.isEmpty {
            let boundsArray = currentMetadata
                .flatMap { $0 as? AVMetadataFaceObject }
                .map { (faceObject) -> NSValue in
                    let convertedObject = output.transformedMetadataObject(for: faceObject, connection: connection)
                    return NSValue(cgRect: convertedObject!.bounds)
                }
            wrapper?.doWork(on: sampleBuffer, inRects: boundsArray)
        }
        layer.enqueue(sampleBuffer)
    }

    func captureOutput(_ output: AVCaptureOutput, didDrop sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        print("DidDropSampleBuffer")
    }

    // MARK: AVCaptureMetadataOutputObjectsDelegate

    func metadataOutput(_ output: AVCaptureMetadataOutput, didOutput metadataObjects: [AVMetadataObject], from connection: AVCaptureConnection) {
        currentMetadata = metadataObjects as [AnyObject]
    }
}
Next, create VideoScanViewController.swift, which uses SessionHandler. The code is as follows
import UIKit
class VideoScanViewController: UIViewController {
    let sessionHandler = SessionHandler()

    lazy var preview: UIView = {
        let view = UIView()
        return view
    }()

    override func viewDidLoad() {
        super.viewDidLoad()
        self.navigationItem.title = "Video stream detection of face feature points"
        self.view.backgroundColor = .white
        self.view.addSubview(preview)
        preview.frame = CGRect(x: 0, y: 0, width: self.view.frame.width, height: self.view.frame.height)
    }

    override func didReceiveMemoryWarning() {
        super.didReceiveMemoryWarning()
        // Dispose of any resources that can be recreated.
    }

    override func viewDidAppear(_ animated: Bool) {
        super.viewDidAppear(animated)
        sessionHandler.openSession()
        let layer = sessionHandler.layer
        layer.frame = preview.bounds
        preview.layer.addSublayer(layer)
        view.layoutIfNeeded()
    }
}
4.3 Writing the photo face key point detection code
Create AlbumViewController.swift, which uses DlibWrapper to detect face key points in photos. The code is as follows
import UIKit
class AlbumViewController: UIViewController {
    lazy var picker = UIImagePickerController()

    lazy var imageView: UIImageView = {
        let imageView = UIImageView()
        imageView.contentMode = UIView.ContentMode.scaleAspectFit
        return imageView
    }()

    lazy var wrapper = DlibWrapper()

    var filePath = ""
    var filePathWrite = ""

    override func viewDidLoad() {
        super.viewDidLoad()
        self.view.backgroundColor = .white
        self.navigationItem.rightBarButtonItem = UIBarButtonItem.init(title: "Album", style: .plain, target: self, action: #selector(albumClick(_:)))
        self.view.addSubview(imageView)
        imageView.frame = self.view.bounds

        let cachePath = NSSearchPathForDirectoriesInDomains(.cachesDirectory, .userDomainMask, true).first!
        filePath = (cachePath as NSString).appendingPathComponent("DlibCacheFileRead.jpg")
        filePathWrite = (cachePath as NSString).appendingPathComponent("DlibCacheFileWrite.jpg")

        wrapper?.prepare()
    }

    @objc func albumClick(_ button: UIButton) {
        let sourceType = UIImagePickerController.SourceType.photoLibrary
        picker.delegate = self
        picker.sourceType = sourceType
        self.present(picker, animated: true, completion: nil)
    }
}

extension AlbumViewController: UIImagePickerControllerDelegate, UINavigationControllerDelegate {
    func imagePickerController(_ picker: UIImagePickerController, didFinishPickingMediaWithInfo info: [UIImagePickerController.InfoKey : Any]) {
        let image = info[UIImagePickerController.InfoKey.originalImage]
        picker.dismiss(animated: true, completion: nil)
        DispatchQueue.main.async { [weak self] in
            if let image = image as? UIImage, let filePath = self?.filePath, let filePathWrite = self?.filePathWrite {
                let imageData = image.jpegData(compressionQuality: 1.0)
                try? imageData?.write(to: URL(fileURLWithPath: filePath))
                self?.wrapper?.doWork(onImagePath: filePath, savePath: filePathWrite)
                let detectImage = UIImage.init(contentsOfFile: filePathWrite)
                self?.imageView.image = detectImage
            }
        }
    }

    func imagePickerControllerDidCancel(_ picker: UIImagePickerController) {
        picker.dismiss(animated: true, completion: nil)
    }
}
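The article does not show how these two controllers are presented. One possible wiring, assuming the app's root view controller is embedded in a UINavigationController (this snippet is illustrative only and not part of the original project):
// Hypothetical presentation code (assumes a UINavigationController-based app):
navigationController?.pushViewController(VideoScanViewController(), animated: true)  // live camera demo
// or
navigationController?.pushViewController(AlbumViewController(), animated: true)      // photo album demo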
5. Running result
The code compiles and runs on both the simulator and a real device. However, the simulator has no camera, so video stream face key point detection can only be tried on a real device.
6. Summary
This article showed how to compile the Dlib library and use it in an Xcode project, with examples of video stream face key point detection and photo face key point detection. It touches on compiling and using static libraries, mixing Swift and C++, and AVFoundation, so there are quite a few details to watch out for. To use this in a real app, you would also need to consider model compression, video stream optimization, performance optimization, bitcode, and other issues. If you have any questions, follow my official account and leave a message so we can discuss and improve together. Reply "iOS" in the official account to get the source code.
You are welcome to scan the QR code to follow the official account for discussion and exchange