iOS Black Technology (AVFoundation): Dynamic Face Recognition
The previous article introduced static face recognition implemented with Core Image. This article introduces dynamic face recognition, one of the powerful features of AVFoundation.
I. Several approaches to face recognition
1. CoreImage
- Static face recognition; it can recognize faces in photos and other images
- See the previous blog post for details
2. Face++
- A vision service platform from Beijing Megvii Technology Co., Ltd., aiming to provide easy-to-use, powerful, and platform-agnostic vision services
- Face++ is a new-generation cloud vision service platform, providing a set of world-leading face detection, face recognition, and face analysis technologies
- Face++ Baidu Encyclopedia introduction
- Face++ website
3. OpenCV
- Composed of a series of C functions and a small number of C++ classes, it implements many general-purpose algorithms in image processing and computer vision; beyond that, I'm not very familiar with it
- (Description from Baidu Encyclopedia)
4. Vision
- Vision is an image recognition framework based on Core ML that Apple launched with iOS 11 at WWDC 2017
- According to the official Vision documentation, Vision itself offers Face Detection and Recognition, Machine Learning Image Analysis, Barcode Detection, Text Detection, and so on
- Interested readers can consult the relevant documentation; it will not be introduced here
5. AVFoundation
- A framework for creating and working with time-based audiovisual media
- The face recognition approach used here is also based on the AVFoundation framework
II. A brief introduction to the key classes
1. AVCaptureDevice: represents a hardware device
- From this class we can obtain the phone's hardware, such as the camera and the microphone
- When we need to change the properties of a hardware device in the application (such as switching cameras, changing the flash mode, or changing the camera focus), we must lock the device first and unlock it after the modification
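Note that for device-level properties the lock is taken on the device itself via lockForConfiguration(). A minimal sketch, assuming `device` is an AVCaptureDevice obtained elsewhere (the torch toggle is just an illustration, not part of the demo):

```swift
import AVFoundation

// Sketch: lock the device before changing one of its properties, unlock after
func setTorch(on device: AVCaptureDevice, enabled: Bool) {
    guard device.hasTorch else { return }
    do {
        try device.lockForConfiguration()       // lock first
        device.torchMode = enabled ? .on : .off // modify the property
        device.unlockForConfiguration()         // unlock when done
    } catch {
        print("Could not lock device: \(error)")
    }
}
```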
- Example: switching the camera (swapping inputs is a session-level change, so here the lock is taken on the session configuration instead):

```swift
//4. Remove the old input and add the new input
//4.1 Begin the configuration (locks the session)
session.beginConfiguration()
//4.2 Remove the old device input
session.removeInput(deviceIn)
//4.3 Add the new device input
session.addInput(newVideoInput)
//4.4 Commit the configuration (unlocks the session)
session.commitConfiguration()
```
2. AVCaptureDeviceInput: the device input data management object
- A corresponding AVCaptureDeviceInput object is created from an AVCaptureDevice
- This object is added to and managed by the AVCaptureSession; it represents an input device and configures the ports of the abstract hardware device. Common input devices are the microphone, the camera, and so on
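A minimal sketch of wrapping devices in input objects (the microphone line is just for illustration; this article only uses the camera):

```swift
import AVFoundation

// Sketch: wrap hardware devices in AVCaptureDeviceInput objects
if let camera = AVCaptureDevice.default(for: .video),
   let videoInput = try? AVCaptureDeviceInput(device: camera) {
    print("Camera input ready: \(videoInput)")
}
if let mic = AVCaptureDevice.default(for: .audio),
   let audioInput = try? AVCaptureDeviceInput(device: mic) {
    print("Microphone input ready: \(audioInput)")
}
```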
3. AVCaptureOutput: represents the output data
- The output can be a still image (AVCaptureStillImageOutput) or a movie file (AVCaptureMovieFileOutput)
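As a side note, AVCaptureStillImageOutput has been deprecated since iOS 10 in favor of AVCapturePhotoOutput. A sketch of the common output objects (this article only uses the metadata output):

```swift
import AVFoundation

// Still photos (AVCapturePhotoOutput replaces the deprecated AVCaptureStillImageOutput)
let photoOutput = AVCapturePhotoOutput()
// Movie files
let movieOutput = AVCaptureMovieFileOutput()
// Metadata such as faces, QR codes, and bar codes — the one used in this article
let metadataOutput = AVCaptureMetadataOutput()
```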
4. AVCaptureSession: the media (audio and video) capture session
- Responsible for capturing audio and video data and delivering it to the output objects
- A single AVCaptureSession can have multiple inputs and outputs
- It is the bridge connecting AVCaptureInput and AVCaptureOutput, and it coordinates the transfer of data between them
- It has two methods, startRunning and stopRunning, to start and stop a session
- If you need to change the session's configuration while the application is running (for example, switching cameras), you must call beginConfiguration first and commitConfiguration after the changes are complete
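A minimal sketch of wiring a session together, assuming camera permission has already been granted:

```swift
import AVFoundation

// Sketch: connect an input and an output through a session
let session = AVCaptureSession()
if let camera = AVCaptureDevice.default(for: .video),
   let input = try? AVCaptureDeviceInput(device: camera),
   session.canAddInput(input) {
    session.addInput(input)
}
let output = AVCaptureMetadataOutput()
if session.canAddOutput(output) {
    session.addOutput(output)
}
// startRunning() blocks, so start the session off the main thread
DispatchQueue.global().async { session.startRunning() }
```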
5. AVCaptureVideoPreviewLayer: the image preview layer
- How do our photos and videos show up on the phone? By adding this object to a UIView's layer
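A sketch of attaching the preview layer, assuming `session` and `view` come from the surrounding context:

```swift
// Sketch: show the camera feed by adding a preview layer to a view's layer
let previewLayer = AVCaptureVideoPreviewLayer(session: session)
previewLayer.videoGravity = .resizeAspectFill
previewLayer.frame = view.bounds
view.layer.insertSublayer(previewLayer, at: 0)
```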
Well, that's enough background. How is face recognition actually implemented? Here comes the real content.
III. Add the scanning device
- Get the device (camera)
- Create the device input
- Create the scan output
- Create the capture callback
1. The output device
- Here we use AVCaptureMetadataOutput, which can scan faces, QR codes, bar codes, and other information
- The delegate must be set; otherwise the scan results cannot be obtained
- You must also tell it what kind of data to output: faces, QR codes, etc.

```swift
//3. Create the metadata output object
let metadataOutput = AVCaptureMetadataOutput()
//4. Set the delegate to listen for the output data, delivered on the main queue
metadataOutput.setMetadataObjectsDelegate(self, queue: DispatchQueue.main)
//7. Tell the output object what kind of data to output: faces (up to 10 faces can be recognized)
metadataOutput.metadataObjectTypes = [.face]
```
The main code is as follows:
```swift
fileprivate func addScaningVideo() {
    //1. Get the input device (camera)
    guard let device = AVCaptureDevice.default(for: .video) else { return }
    //2. Create an input object from the input device
    guard let deviceIn = try? AVCaptureDeviceInput(device: device) else { return }
    deviceInput = deviceIn
    //3. Create the metadata output object
    let metadataOutput = AVCaptureMetadataOutput()
    //4. Set the delegate to listen for the output data, delivered on the main queue
    metadataOutput.setMetadataObjectsDelegate(self, queue: DispatchQueue.main)
    //4.2 Set the output delegate
    faceDelegate = previewView
    //5. Set the output quality (high-resolution output)
    session.sessionPreset = .high
    //6. Add the input and output to the session
    if session.canAddInput(deviceInput!) {
        session.addInput(deviceInput!)
    }
    if session.canAddOutput(metadataOutput) {
        session.addOutput(metadataOutput)
    }
    //7. Tell the output object what kind of data to output: faces (up to 10 faces can be recognized)
    metadataOutput.metadataObjectTypes = [.face]
    //8. Create the preview layer
    previewLayer = AVCaptureVideoPreviewLayer(session: session)
    previewLayer.videoGravity = .resizeAspectFill
    previewLayer.frame = view.bounds
    previewView.layer.insertSublayer(previewLayer, at: 0)
    //9. Set the valid scan area (the whole area by default; rectOfInterest takes
    //   normalized 0–1 coordinates, so convert the layer rect instead of assigning bounds)
    metadataOutput.rectOfInterest = previewLayer.metadataOutputRectConverted(fromLayerRect: previewView.bounds)
    //10. Start scanning (startRunning blocks, so do it off the main thread)
    if !session.isRunning {
        DispatchQueue.global().async {
            self.session.startRunning()
        }
    }
}
```
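One thing the code above assumes is camera permission. A real app needs an NSCameraUsageDescription entry in Info.plist and should check authorization before configuring the session. A sketch, meant to live in the same controller and call the method above:

```swift
// Sketch: request camera authorization before starting the scan
func checkCameraPermission() {
    switch AVCaptureDevice.authorizationStatus(for: .video) {
    case .authorized:
        addScaningVideo()
    case .notDetermined:
        AVCaptureDevice.requestAccess(for: .video) { granted in
            DispatchQueue.main.async {
                if granted { self.addScaningVideo() }
            }
        }
    default:
        print("Camera access denied or restricted")
    }
}
```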
2. Switch the camera
- Get the current camera position
- Create the new input
- Remove the old input from the capture session and add the new input to it
- The specific code is as follows:
```swift
@IBAction func switchCameraAction(_ sender: Any) {
    //1. Run the flip transition animation ("oglFlip" is an undocumented transition type)
    let anima = CATransition()
    anima.type = "oglFlip"
    anima.subtype = "fromLeft"
    anima.duration = 0.5
    view.layer.add(anima, forKey: nil)
    //2. Get the current camera
    guard let deviceIn = deviceInput else { return }
    let position: AVCaptureDevice.Position = deviceIn.device.position == .back ? .front : .back
    //3. Create the new input
    let deviceSession = AVCaptureDevice.DiscoverySession(deviceTypes: [.builtInWideAngleCamera], mediaType: .video, position: position)
    guard let newDevice = deviceSession.devices.filter({ $0.position == position }).first else { return }
    guard let newVideoInput = try? AVCaptureDeviceInput(device: newDevice) else { return }
    //4. Remove the old input and add the new input
    //4.1 Begin the configuration (locks the session)
    session.beginConfiguration()
    //4.2 Remove the old device input
    session.removeInput(deviceIn)
    //4.3 Add the new device input
    session.addInput(newVideoInput)
    //4.4 Commit the configuration (unlocks the session)
    session.commitConfiguration()
    //5. Save the latest input
    deviceInput = newVideoInput
}
```
3. Process the scan result
Implement the AVCaptureMetadataOutputObjectsDelegate protocol, which has only one method:

```swift
// `metadataObjects` contains the scanned metadata objects
public func metadataOutput(_ output: AVCaptureMetadataOutput, didOutput metadataObjects: [AVMetadataObject], from connection: AVCaptureConnection)
```
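A sketch of what an implementation might look like, using the transformedFaces helper from section 5.1 below to move into preview-layer coordinates:

```swift
// Sketch: keep only the face objects and hand them to the UI
public func metadataOutput(_ output: AVCaptureMetadataOutput,
                           didOutput metadataObjects: [AVMetadataObject],
                           from connection: AVCaptureConnection) {
    // Convert the raw metadata into preview-layer coordinates (see 5.1)
    let faces = transformedFaces(faceObjs: metadataObjects)
    for face in faces where face is AVMetadataFaceObject {
        // Position the red box over each face here (see 5.2)
        print("Face at \(face.bounds)")
    }
}
```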
4. Introduction to AVMetadataFaceObject
- faceID: the unique identifier of a face
  - Each face that is scanned gets a different faceID
  - The same person in different states (shaking the head, tilting the head, raising the head, etc.) will get different faceIDs
- hasRollAngle: whether there is a roll angle (head tilted left or right) (Bool)
- rollAngle: the roll angle (CGFloat)
- hasYawAngle: whether there is a yaw angle (head turned left or right)
- yawAngle: the yaw angle
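A sketch reading these properties off a detected face, assuming `metadataObjects` comes from the delegate callback above:

```swift
// Sketch: inspect the properties of each scanned face
for case let face as AVMetadataFaceObject in metadataObjects {
    print("faceID: \(face.faceID)")
    if face.hasRollAngle { print("rollAngle: \(face.rollAngle)") }
    if face.hasYawAngle  { print("yawAngle: \(face.yawAngle)")  }
}
```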
5. Process the scan result
5.1 Get the face array in the preview layer
- Traverse the scanned face array and convert it into an array of faces in the preview layer
- This is mainly a coordinate conversion of the face objects into the layer's coordinate space
- Return the converted array
```swift
fileprivate func transformedFaces(faceObjs: [AVMetadataObject]) -> [AVMetadataObject] {
    var faceArr = [AVMetadataObject]()
    for face in faceObjs {
        // Convert the scanned face object into a face object in the preview layer (mainly coordinate conversion)
        if let transFace = previewLayer.transformedMetadataObject(for: face) {
            faceArr.append(transFace)
        }
    }
    return faceArr
}
```
5.2 Add a red box at the position of the face
- Set the frame of the red box:

```swift
faceLayer?.frame = face.bounds
```
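The red box itself is just a plain CALayer with a border. A minimal sketch of creating one inside the controller (the demo's actual setup may differ):

```swift
// Sketch: a red border layer used to highlight a face
func makeFaceLayer() -> CALayer {
    let layer = CALayer()
    layer.borderColor = UIColor.red.cgColor
    layer.borderWidth = 2
    return layer
}
```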
- Build the CATransform3D from the yaw angle and the roll angle:

```swift
// Handle the yaw (shaking the head left and right)
fileprivate func transformDegress(yawAngle: CGFloat) -> CATransform3D {
    let yaw = degreesToRadians(degress: yawAngle)
    // Rotate around the Y axis
    let yawTran = CATransform3DMakeRotation(yaw, 0, -1, 0)
    // Concatenate so the red box rotates correctly
    return CATransform3DConcat(yawTran, CATransform3DIdentity)
}
// Handle the roll (tilting the head left and right)
fileprivate func transformDegress(rollAngle: CGFloat) -> CATransform3D {
    let roll = degreesToRadians(degress: rollAngle)
    // Rotate around the Z axis
    return CATransform3DMakeRotation(roll, 0, 0, 1)
}
// Degree-to-radian conversion
fileprivate func degreesToRadians(degress: CGFloat) -> CGFloat {
    return degress * CGFloat(Double.pi) / 180
}
```
- Rotate the red box according to the yaw angle and the roll angle:

```swift
//3.4 Set the yaw angle (shaking the head left and right)
if face.hasYawAngle {
    let tranform3D = transformDegress(yawAngle: face.yawAngle)
    // Matrix concatenation
    faceLayer?.transform = CATransform3DConcat(faceLayer!.transform, tranform3D)
}
//3.5 Set the roll angle (tilting the head left and right)
if face.hasRollAngle {
    let tranform3D = transformDegress(rollAngle: face.rollAngle)
    // Matrix concatenation
    faceLayer?.transform = CATransform3DConcat(faceLayer!.transform, tranform3D)
}
```
- At this point, dynamic face recognition is complete: a red box is displayed at the position of each face, and it adjusts in real time as the face moves
- Grab your camera and try it out
- GitHub Demo address
- Note:
  - Only the core code is listed here; for the full logic, please refer to the demo
  - If anything in this article is not explained in enough detail, or if you have better suggestions, feel free to contact me
Other related articles
- Generation, recognition and scanning of QR codes in Swift
- iOS Black Technology (Core Image) static face recognition (Part 1)
- Vision image recognition framework in Swift