Preface
The AVFoundation framework is very powerful, but it is also full of classes: sessions, inputs, outputs, and so on. I had never combed through the whole framework systematically before. Piecing the various classes together was enough to meet ordinary requirements, but it always made deep customization difficult and left plenty of pitfalls to step into. Recently, after reading the book “Learning AV Foundation”, I tried to sort out an overall picture and consolidated the points I understood into a demo rewritten in Swift, covering capture, real-time filters, real-time writing, custom export resolution, and more. This article covers the video capture part; the next one will cover video composition/editing and audio mixing.
The overall structure
The whole process can be divided into three parts: data collection, data processing, and data saving.
Data collection: In this scenario, data collection is not limited to the camera and microphone; data can also be read from an AVAsset instance via AVAssetReader. However it is collected, the output of this stage is CMSampleBuffer. It is worth mentioning that the camera natively delivers video frames in a YUV format; to get pixel data in a format you can process directly, configure the desired output format in the videoSettings property of AVCaptureVideoDataOutput.
Data processing: All processing based on the CMSampleBuffer, including applying filters, happens at this stage. Each sample buffer contains a CVPixelBuffer, a Core Video object holding the raw pixel data of a single video frame, which lets us work at the pixel level.
Data saving: In this stage the processed media is encoded and written into a container file, such as an MP4 or .mov file. AVAssetWriter is used here; it supports real-time writing, and the data it expects to receive is also in CMSampleBuffer format. Other kinds of data can be adapted into the required form through a pixel buffer adaptor: for example, in this article’s demo a CIImage is rendered into a CVPixelBuffer before being written, as in the sketch below.
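As a minimal sketch of that last step (not the demo’s exact code), rendering a CIImage into a freshly created CVPixelBuffer with Core Image could look like this; the pixel format and buffer attributes are assumptions:

```swift
import CoreImage
import CoreVideo

/// Renders a CIImage into a newly created BGRA CVPixelBuffer.
/// Pixel format and attributes here are illustrative assumptions.
func makePixelBuffer(from image: CIImage, context: CIContext) -> CVPixelBuffer? {
    let attributes: [String: Any] = [
        kCVPixelBufferCGImageCompatibilityKey as String: true,
        kCVPixelBufferCGBitmapContextCompatibilityKey as String: true
    ]
    var pixelBuffer: CVPixelBuffer?
    let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                     Int(image.extent.width),
                                     Int(image.extent.height),
                                     kCVPixelFormatType_32BGRA,
                                     attributes as CFDictionary,
                                     &pixelBuffer)
    guard status == kCVReturnSuccess, let buffer = pixelBuffer else { return nil }
    // Core Image draws the filtered image directly into the pixel buffer.
    context.render(image, to: buffer)
    return buffer
}
```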
That is the overall picture, and the demo in this article is structured around it. CaptureManager handles data collection and outputs CMSampleBuffer data via block callbacks. VideoWriteManager handles saving: it receives CMSampleBuffer data and reports the generated file path through a block callback. CameraViewController acts as the coordinator of the two; it processes the data and presents the processed frames to the user for preview.
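To make that division of labour concrete, here is a rough sketch of the shape these three classes might take; the property and method names are my assumptions, not necessarily the demo’s exact API:

```swift
import AVFoundation
import UIKit

// Collects camera/mic data and hands out CMSampleBuffers via closures.
final class CaptureManager: NSObject {
    var videoSampleBufferHandler: ((CMSampleBuffer) -> Void)?   // assumed name
    var audioSampleBufferHandler: ((CMSampleBuffer) -> Void)?   // assumed name
    func startRunning() { /* configure and start an AVCaptureSession */ }
    func stopRunning() { /* stop the session */ }
}

// Receives CMSampleBuffers, writes them with AVAssetWriter,
// and reports the finished file URL via a closure.
final class VideoWriteManager {
    var didFinishWriting: ((URL) -> Void)?                      // assumed name
    func append(videoSampleBuffer: CMSampleBuffer) { /* encode and write */ }
    func append(audioSampleBuffer: CMSampleBuffer) { /* encode and write */ }
    func finish() { /* finish writing and call didFinishWriting */ }
}

// Coordinates the two: processes frames, previews them, forwards them to the writer.
final class CameraViewController: UIViewController {
    private let captureManager = CaptureManager()
    private let writeManager = VideoWriteManager()
}
```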
Device capture
The core class for capturing from a device is AVCaptureSession. Its inputs are AVCaptureDeviceInput objects and its outputs are AVCaptureOutput objects; it manages the flow of data from the physical devices and, depending on how the outputs are configured, produces files or raw data. AVCaptureOutput is an abstract base class. If you do not need much customization, the high-level AVCaptureMovieFileOutput can write files directly. But if you need low-level data processing or custom configuration, use AVCaptureVideoDataOutput and AVCaptureAudioDataOutput to obtain the raw data in CMSampleBuffer format. Additionally, to avoid blocking the main thread, we typically assign a dedicated serial queue to each data output.
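A minimal session setup along these lines might look like the following sketch (the preset, pixel format, and queue labels are illustrative choices, not the demo’s exact code):

```swift
import AVFoundation

func makeCaptureSession(delegate: AVCaptureVideoDataOutputSampleBufferDelegate & AVCaptureAudioDataOutputSampleBufferDelegate) throws -> AVCaptureSession {
    let session = AVCaptureSession()
    session.sessionPreset = .high

    // Inputs: camera and microphone.
    if let camera = AVCaptureDevice.default(for: .video) {
        let cameraInput = try AVCaptureDeviceInput(device: camera)
        if session.canAddInput(cameraInput) { session.addInput(cameraInput) }
    }
    if let mic = AVCaptureDevice.default(for: .audio) {
        let micInput = try AVCaptureDeviceInput(device: mic)
        if session.canAddInput(micInput) { session.addInput(micInput) }
    }

    // Outputs: raw CMSampleBuffers, delivered on dedicated serial queues
    // so the main thread is never blocked.
    let videoOutput = AVCaptureVideoDataOutput()
    // Ask for BGRA frames instead of the camera's native YUV format.
    videoOutput.videoSettings = [kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA]
    videoOutput.setSampleBufferDelegate(delegate, queue: DispatchQueue(label: "capture.video.queue"))
    if session.canAddOutput(videoOutput) { session.addOutput(videoOutput) }

    let audioOutput = AVCaptureAudioDataOutput()
    audioOutput.setSampleBufferDelegate(delegate, queue: DispatchQueue(label: "capture.audio.queue"))
    if session.canAddOutput(audioOutput) { session.addOutput(audioOutput) }

    return session
}
```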
Data processing – Add filters
Most apps implement filters with the GPUImage framework, but that is not the focus of this article, so the demo uses only the Core Image framework for the filter effect. To achieve a real-time filter, the currently selected filter has to be applied to the image data of every frame inside the per-frame callback, so that the user can keep switching between filters while shooting.
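Inside the sample buffer delegate callback, this could look roughly like the sketch below; CIPhotoEffectNoir is just a stand-in for whichever filter the user currently has selected:

```swift
import AVFoundation
import CoreImage

final class FrameProcessor: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    var currentFilterName = "CIPhotoEffectNoir"               // assumed example filter

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Pull the raw pixel data of this frame out of the sample buffer.
        guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        let sourceImage = CIImage(cvPixelBuffer: pixelBuffer)

        // Apply the currently selected Core Image filter.
        guard let filter = CIFilter(name: currentFilterName) else { return }
        filter.setValue(sourceImage, forKey: kCIInputImageKey)
        guard let filteredImage = filter.outputImage else { return }

        // Hand filteredImage to the preview layer and the writer here (omitted in this sketch).
        _ = filteredImage
    }
}
```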
When I studied the demo in “Learning AV Foundation”, I noticed that it passes the original data separately to the preview interface and to the writer class, and each processes it on its own. To me that is doing the same work twice, which is not great for code maintenance or performance. So in my demo the data is processed once, and the processed result is then handed both to the preview interface for display and to the writer class for saving.
Saving the data
AVAssetWriter is configured with multiple AVAssetWriterInput objects (one per media type: audio, video, and so on). An AVAssetWriterInput is initialized with a mediaType and outputSettings, where you can configure the video bit rate, width and height, keyframe interval, and so on. This fine-grained control is an obvious advantage of AVAssetWriter over AVAssetExportSession. Each AVAssetWriterInput that data is appended to becomes a separate AVAssetTrack in the final output.
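A video writer input configured this way might look like the following sketch; the concrete codec, dimensions, bit rate, and keyframe interval values are illustrative, not recommendations:

```swift
import AVFoundation

func makeVideoWriterInput() -> AVAssetWriterInput {
    // Output settings control the encoder: codec, dimensions, bit rate, keyframe interval.
    let settings: [String: Any] = [
        AVVideoCodecKey: AVVideoCodecType.h264,
        AVVideoWidthKey: 1280,
        AVVideoHeightKey: 720,
        AVVideoCompressionPropertiesKey: [
            AVVideoAverageBitRateKey: 3_000_000,       // ~3 Mbit/s, illustrative
            AVVideoMaxKeyFrameIntervalKey: 30          // at most 30 frames between keyframes, illustrative
        ]
    ]
    let input = AVAssetWriterInput(mediaType: .video, outputSettings: settings)
    input.expectsMediaDataInRealTime = true            // needed when appending live capture data
    return input
}
```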
Here an AVAssetWriterInputPixelBufferAdaptor is used to append CVPixelBuffer data; it provides optimal performance when appending video samples that are CVPixelBuffer objects.
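Creating the adaptor and appending a rendered frame might look roughly like this; the presentation time is assumed to be taken from the original sample buffer (CMSampleBufferGetPresentationTimeStamp):

```swift
import AVFoundation

// Wraps a video AVAssetWriterInput so CVPixelBuffers can be appended directly.
func makePixelBufferAdaptor(for videoInput: AVAssetWriterInput) -> AVAssetWriterInputPixelBufferAdaptor {
    let attributes: [String: Any] = [
        kCVPixelBufferPixelFormatTypeKey as String: kCVPixelFormatType_32BGRA
    ]
    return AVAssetWriterInputPixelBufferAdaptor(assetWriterInput: videoInput,
                                                sourcePixelBufferAttributes: attributes)
}

// Appends one rendered frame at the frame's original presentation time.
func append(_ pixelBuffer: CVPixelBuffer,
            at time: CMTime,
            to adaptor: AVAssetWriterInputPixelBufferAdaptor) {
    guard adaptor.assetWriterInput.isReadyForMoreMediaData else { return }
    if !adaptor.append(pixelBuffer, withPresentationTime: time) {
        // On failure, inspect the owning AVAssetWriter's status and error.
    }
}
```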
The above is the overall idea and context of AVFoundation capture. For more details on the pitfalls, such as video rotation issues, please refer to the demo, which contains detailed comments.