“This is my fourth day participating in the First Challenge 2022. For details, see: First Challenge 2022.”

  • First, the demo project address for reference

VideoToolBox

  • At WWDC 2014, with iOS 8.0, Apple opened up the hardware codec API: VideoToolbox
  • VideoToolbox is a low-level video hardware encode/decode framework with a pure C API that gives direct access to the hardware codec. It provides very high-performance hardware encoding and decoding, as well as format conversion of images stored in CoreVideo pixel buffers.
  • This article documents a tool that wraps VideoToolbox hardware decoding of H264 video, the counterpart of the hardware-encoding tool from the previous article

Advantages of hard decoding

  • Fast: decoding time is significantly reduced
  • Low power: hardware decoding greatly reduces the app's power consumption

What are the input and output of VideoToolBox decoding?

  • Input: NALU data. Output: CVPixelBufferRef

Hard decoding steps

Approach

  • 1 Parse the data (NALU units) and determine whether each NALU is an I/P/B frame
  • 2 Initialize the decode session and configure its parameters (unlike the encoder, the decoder's initialization depends on SPS/PPS, so the decoder can only be created after that data has been received)
  • 3 Feed the parsed H264 NALUs into the decoder
  • 4 In the decode callback, receive the decoded data (which can then be displayed with OpenGL ES)
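Steps 1 and 3 hinge on one framing detail: VideoToolbox expects AVCC framing (a 4-byte big-endian length prefix) rather than the Annex-B start code (00 00 00 01) that raw H264 streams use. A minimal, self-contained C sketch of that conversion follows; the function name is illustrative, not from the demo:

```c
#include <stdint.h>
#include <string.h>

/* Replace the 4-byte Annex-B start code (00 00 00 01) at the head of a NALU
   with the big-endian payload length that VideoToolbox (AVCC framing) expects.
   Returns the NAL unit type, or -1 if the buffer does not start with a start code. */
int annexb_to_avcc(uint8_t *nalu, uint32_t frameSize) {
    static const uint8_t startCode[4] = {0x00, 0x00, 0x00, 0x01};
    if (frameSize < 5 || memcmp(nalu, startCode, 4) != 0) return -1;
    uint32_t naluSize = frameSize - 4;       /* payload length without the start code */
    nalu[0] = (uint8_t)(naluSize >> 24);     /* write the length big-endian */
    nalu[1] = (uint8_t)(naluSize >> 16);
    nalu[2] = (uint8_t)(naluSize >> 8);
    nalu[3] = (uint8_t)(naluSize);
    return nalu[4] & 0x1F;                   /* low 5 bits of the header = NAL type */
}
```

The in-place byte swap is exactly what the parsing code later in this article does by hand with `pNaluSize`.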

Core functions

  • 1 Create the decode session: VTDecompressionSessionCreate
  • 2 Decode a frame: VTDecompressionSessionDecodeFrame
  • 3 Destroy the decode session: VTDecompressionSessionInvalidate

Related knowledge points

  • I frame: key frame. It carries a complete picture and is the key to decoding!

  • P frame: forward-predicted frame. It carries only difference data, so decoding it depends on a preceding I frame

  • B frame: bi-directionally predicted frame. Decoding it depends on both I and P frames

  • If an I frame in an H264 stream is corrupted or lost, the error propagates: the P and B frames that depend on it cannot be decoded on their own, and the picture breaks up into visual artifacts

  • When hard-encoding with VideoToolBox, the first output is not an I frame but the manually prepended SPS/PPS

  • When decoding, the decoder must be initialized with the SPS/PPS
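The frame types above can be told apart directly from the NAL header byte (the byte right after the start code). A small C sketch of that mapping, using the standard H.264 type values; the helper name is mine, not from the demo:

```c
#include <stdint.h>

/* Map an H.264 NAL header byte to a human-readable name. The type is the
   low 5 bits: 1 = non-IDR slice (P/B), 5 = IDR slice (I frame),
   6 = SEI, 7 = SPS, 8 = PPS. */
const char *nal_type_name(uint8_t header) {
    switch (header & 0x1F) {
        case 1:  return "non-IDR slice (P/B)";
        case 5:  return "IDR slice (I frame)";
        case 6:  return "SEI";
        case 7:  return "SPS";
        case 8:  return "PPS";
        default: return "other";
    }
}
```

So a stream that begins 0x67, 0x68, 0x65, ... is SPS, then PPS, then the first I frame, matching the "first frame is SPS/PPS" point above.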

1. Receive and parse the data

  • Get the NALU type; SPS and PPS are saved into member variables used to initialize the decoder
  • I/P/B frames are passed on to the decoder for decoding
```objc
/// Parse one NALU
- (void)decodeNaluData:(uint8_t *)naluData withSize:(uint32_t)frameSize {
    // The data starts with the 4-byte start code 00 00 00 01.
    // The low 5 bits of the 5th byte are the NALU type:
    // 7 = SPS, 8 = PPS, 5 = I frame.
    int type = (naluData[4] & 0x1F);
    // Replace the start code with the NALU payload length, big-endian.
    uint32_t naluSize = frameSize - 4;
    uint8_t *pNaluSize = (uint8_t *)(&naluSize);
    naluData[0] = *(pNaluSize + 3);
    naluData[1] = *(pNaluSize + 2);
    naluData[2] = *(pNaluSize + 1);
    naluData[3] = *(pNaluSize);
    CVPixelBufferRef pixelBuffer = NULL;
    switch (type) {
        case 0x05: // I frame: make sure the session exists, then decode
            if ([self initDecoderSession]) {
                pixelBuffer = [self decode:naluData withSize:frameSize];
            }
            break;
        case 0x06: // SEI (supplemental enhancement information): ignored
            break;
        case 0x07: // SPS: copy the payload (from byte 4) into the member variable
            _spsSize = naluSize;
            _sps = malloc(_spsSize);
            memcpy(_sps, &naluData[4], _spsSize);
            break;
        case 0x08: // PPS: copy the payload (from byte 4) into the member variable
            _ppsSize = naluSize;
            _pps = malloc(_ppsSize);
            memcpy(_pps, &naluData[4], _ppsSize);
            break;
        default:   // P/B frames
            if ([self initDecoderSession]) {
                pixelBuffer = [self decode:naluData withSize:frameSize];
            }
            break;
    }
    // decode: returns a buffer retained in the decode callback; balance it here.
    if (pixelBuffer) CVPixelBufferRelease(pixelBuffer);
}
```

2. Initialize the decoder

```objc
/// Initialize the decode session
- (BOOL)initDecoderSession {
    if (self.decodeSession) return YES;
    const uint8_t * const parameterSetPointers[2] = {_sps, _pps};
    const size_t parameterSetSizes[2] = {_spsSize, _ppsSize};
    int naluHeaderLen = 4;
    /**
     Create the video format description from SPS/PPS:
     param kCFAllocatorDefault   allocator
     param 2                     number of parameter sets
     param parameterSetPointers  pointers to SPS/PPS
     param parameterSetSizes     sizes of SPS/PPS
     param naluHeaderLen         NALU start-code length, 4
     param &_videoDesc           out: the decoder format description
     returns status
     */
    OSStatus status = CMVideoFormatDescriptionCreateFromH264ParameterSets(kCFAllocatorDefault,
                                                                          2,
                                                                          parameterSetPointers,
                                                                          parameterSetSizes,
                                                                          naluHeaderLen,
                                                                          &_videoDesc);
    if (status != noErr) {
        NSLog(@"Video hard decodeSession create H264ParameterSets(sps, pps) failed status = %d", (int)status);
        return NO;
    }
    /**
     Output pixel-buffer attributes:
     * kCVPixelBufferPixelFormatTypeKey: output pixel format. Values verified to work:
       kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange (420v),
       kCVPixelFormatType_420YpCbCr8BiPlanarFullRange (420f), and
       kCVPixelFormatType_32BGRA (iOS converts YUV to BGRA internally).
       YUV420 is generally used for SD video and YUV422 for HD, but under the same
       conditions YUV420 costs less computation and transmission bandwidth than YUV422.
     * kCVPixelBufferWidthKey / kCVPixelBufferHeightKey: resolution of the video source.
     * kCVPixelBufferOpenGLCompatibilityKey: allows the decoded image to be drawn
       directly in an OpenGL context instead of being copied over the bus to the CPU.
       This is sometimes called the zero-copy path, because no decoded image is
       copied during drawing.
     */
    NSDictionary *destinationPixBufferAttrs = @{
        (id)kCVPixelBufferPixelFormatTypeKey: @(kCVPixelFormatType_420YpCbCr8BiPlanarFullRange), // iOS uses NV12 (UVUV layout), not NV21 (VUVU layout)
        (id)kCVPixelBufferWidthKey: @(_config.width),
        (id)kCVPixelBufferHeightKey: @(_config.height),
        (id)kCVPixelBufferOpenGLCompatibilityKey: @(YES)
    };
    /**
     Decode callback setup.
     VTDecompressionOutputCallbackRecord is a simple struct holding a pointer
     (decompressionOutputCallback) to the callback invoked after a frame is
     decompressed, plus a reference (decompressionOutputRefCon) the callback can
     use to find the instance. The callback takes seven parameters:
     1: the refCon passed here      2: the per-frame refCon
     3: a status code               4: flags (sync/async, or frame dropped)
     5: the actual image buffer     6: presentation timestamp
     7: presentation duration
     */
    VTDecompressionOutputCallbackRecord callbackRecord;
    callbackRecord.decompressionOutputCallback = videoDecoderCallBack;
    callbackRecord.decompressionOutputRefCon = (__bridge void * _Nullable)(self);
    /**
     @function VTDecompressionSessionCreate
     @abstract  Creates a session for decompressing video frames. Decompressed
                frames are emitted through the output callback.
     @param allocator                      kCFAllocatorDefault
     @param videoFormatDescription         describes the source video frames
     @param videoDecoderSpecification      require a specific decoder; NULL for default
     @param destinationImageBufferAttributes  requirements for the output pixel buffers
     @param outputCallback                 callback invoked with decompressed frames
     @param decompressionSessionOut        out: the new decompression session
     */
    status = VTDecompressionSessionCreate(kCFAllocatorDefault,
                                          _videoDesc,
                                          NULL,
                                          (__bridge CFDictionaryRef _Nullable)destinationPixBufferAttrs,
                                          &callbackRecord,
                                          &_decodeSession);
    if (status != noErr) {
        NSLog(@"Video hard decodeSession create failed status = %d", (int)status);
        return NO;
    }
    // Ask the decoder to run in real time.
    status = VTSessionSetProperty(self.decodeSession, kVTDecompressionPropertyKey_RealTime, kCFBooleanTrue);
    NSLog(@"Video hard decodeSession set property RealTime status = %d", (int)status);
    return YES;
}
```
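The attributes above request NV12 (420f) output: biplanar YUV 4:2:0 with a full-resolution Y plane followed by a half-height interleaved CbCr plane. A small C sketch of the nominal plane sizes (the struct and function are illustrative, and real CVPixelBuffers may add per-row alignment padding):

```c
#include <stddef.h>

/* Nominal plane sizes for an NV12 (biplanar 4:2:0) image, ignoring any
   bytes-per-row padding the real CVPixelBuffer may add. */
typedef struct { size_t y_size, uv_size, total; } NV12Layout;

NV12Layout nv12_layout(size_t width, size_t height) {
    NV12Layout l;
    l.y_size  = width * height;        /* one luma byte per pixel */
    l.uv_size = width * height / 2;    /* interleaved Cb/Cr, subsampled 2x2 */
    l.total   = l.y_size + l.uv_size;  /* 1.5 bytes per pixel overall */
    return l;
}
```

For a 1280x720 frame this gives a 921,600-byte Y plane and a 460,800-byte CbCr plane, which is why YUV420 carries less data than YUV422 or BGRA at the same resolution.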

3. Feed the I/P/B frames into the decoder

```objc
/// Decode one frame
- (CVPixelBufferRef)decode:(uint8_t *)frame withSize:(uint32_t)frameSize {
    CVPixelBufferRef outputPixelBuffer = NULL;
    CMBlockBufferRef blockBuffer = NULL;
    CMBlockBufferFlags flag0 = 0;
    /**
     Wrap the raw frame in a CMBlockBuffer:
     1: structureAllocator  kCFAllocatorDefault
     2: memoryBlock         frame
     3: blockLength         frameSize
     4: blockAllocator      kCFAllocatorNull (the memory is not owned by the buffer)
     5: customBlockSource   NULL
     6: offsetToData        data offset, 0
     7: dataLength          frameSize
     8: flags               feature and control flags
     9: blockBufferOut      out: the blockBuffer; must not be NULL
     */
    OSStatus status = CMBlockBufferCreateWithMemoryBlock(kCFAllocatorDefault,
                                                         frame,
                                                         frameSize,
                                                         kCFAllocatorNull,
                                                         NULL,
                                                         0,
                                                         frameSize,
                                                         flag0,
                                                         &blockBuffer);
    if (status != kCMBlockBufferNoErr) {
        NSLog(@"Video hard decode create blockBuffer error code = %d", (int)status);
        return outputPixelBuffer;
    }
    CMSampleBufferRef sampleBuffer = NULL;
    const size_t sampleSizeArray[] = {frameSize};
    /**
     Create the CMSampleBuffer:
     1: allocator               kCFAllocatorDefault
     2: dataBuffer              the blockBuffer holding the encoded data; must not be NULL
     3: formatDescription       the video format description
     4: numSamples              number of samples in the CMSampleBuffer
     5: numSampleTimingEntries  must be 0, 1, or numSamples
     6: sampleTimingArray       may be NULL
     7: numSampleSizeEntries    1 here
     8: sampleSizeArray         sizes of the samples
     9: sampleBufferOut         out: the sampleBuffer
     */
    status = CMSampleBufferCreateReady(kCFAllocatorDefault,
                                       blockBuffer,
                                       _videoDesc,
                                       1, 0, NULL, 1,
                                       sampleSizeArray,
                                       &sampleBuffer);
    if (status != noErr || !sampleBuffer) {
        NSLog(@"Video hard decode create sampleBuffer failed status = %d", (int)status);
        CFRelease(blockBuffer);
        return outputPixelBuffer;
    }
    VTDecodeFrameFlags flag1 = kVTDecodeFrame_1xRealTimePlayback;
    VTDecodeInfoFlags infoFlag = kVTDecodeInfo_Asynchronous;
    /**
     Decode the data:
     1: the decode session
     2: a CMSampleBuffer containing one or more video frames
     3: decode flags
     4: sourceFrameRefCon; the callback writes the decoded outputPixelBuffer here
     5: out: sync/async info flags
     */
    status = VTDecompressionSessionDecodeFrame(_decodeSession,
                                               sampleBuffer,
                                               flag1,
                                               &outputPixelBuffer,
                                               &infoFlag);
    if (status == kVTInvalidSessionErr) {
        NSLog(@"Video hard decode InvalidSessionErr status = %d", (int)status);
    } else if (status == kVTVideoDecoderBadDataErr) {
        NSLog(@"Video hard decode BadData status = %d", (int)status);
    } else if (status != noErr) {
        NSLog(@"Video hard decode failed status = %d", (int)status);
    }
    CFRelease(sampleBuffer);
    CFRelease(blockBuffer);
    return outputPixelBuffer;
}
```

4. Receive the decoded data in the VideoToolBox callback

```objc
void videoDecoderCallBack(void * CM_NULLABLE decompressionOutputRefCon,
                          void * CM_NULLABLE sourceFrameRefCon,
                          OSStatus status,
                          VTDecodeInfoFlags infoFlags,
                          CM_NULLABLE CVImageBufferRef imageBuffer,
                          CMTime presentationTimeStamp,
                          CMTime presentationDuration) {
    if (status != noErr) {
        NSLog(@"Video hard decode callback error status = %d", (int)status);
        return;
    }
    // sourceFrameRefCon is the &outputPixelBuffer passed to VTDecompressionSessionDecodeFrame.
    CVPixelBufferRef *outputPixelBuffer = (CVPixelBufferRef *)sourceFrameRefCon;
    *outputPixelBuffer = CVPixelBufferRetain(imageBuffer);
    // Recover the decoder instance from the refCon.
    CQVideoDecoder *decoder = (__bridge CQVideoDecoder *)(decompressionOutputRefCon);
    // Deliver the decoded frame to the delegate on the callback queue.
    dispatch_async(decoder.callBackQueue, ^{
        if (decoder.delegate && [decoder.delegate respondsToSelector:@selector(videoDecoder:didDecodeSuccessWithPixelBuffer:)]) {
            [decoder.delegate videoDecoder:decoder didDecodeSuccessWithPixelBuffer:imageBuffer];
        }
        CVPixelBufferRelease(imageBuffer);
    });
}
```