Requirements for this example: encode video data with H264 and H265, record the first 200 frames, and save them to a file.

Principle: in a live-streaming feature, for example, you need to send the client's video data to the server. If the resolution is large, such as 2K or 4K, the transmission pressure is too great, so the video data must be encoded before sending and decoded on the server. This makes it possible to transmit large volumes of video data, and using hardware encoding minimizes the load on the CPU.

This has traditionally meant the H264 encoder. Since iOS 11, iPhone 7 and newer devices also support the newer H265 encoder, which lets video of the same quality take up less storage space. So this example encodes video data in both ways.


The final result looks like this (H264 and H265 output screenshots omitted).


GitHub address (with code): H264,H265Encode

Jianshu address: H264,H265Encode

Blog address: H264,H265Encode

Juejin address: H264,H265Encode


Implementation method:

1. H264: H264 is the current mainstream coding standard, known for its high compression ratio, high quality, and support for streaming media transmission over a variety of networks

2. H265: the successor to H264; its main advantage is a higher compression ratio — video of the same quality takes roughly half the space of H264.


I. Background knowledge required by this article

Note: you can first read H264,H265 encoder introduction and H.264 Data Structure to learn the basics.

1. Concepts of soft and hard encoding

  • Soft encoding: encoding with the CPU.
  • Hard encoding: encoding with hardware other than the CPU, such as a GPU, DSP, FPGA, or ASIC chip.
    • Comparison
      • Soft encoding: direct and simple, with easy parameter tuning and easy upgrades, but it loads the CPU; performance is lower than hard encoding, though quality at low bit rates is usually better.
      • Hard encoding: high performance, but quality at low bit rates is usually lower than a soft encoder; some products have ported excellent soft-encoding algorithms (such as x264) to GPU hardware platforms, with quality basically on par with soft encoding.
      • Before iOS 8.0, Apple did not open up the system's hardware encode/decode capability, but macOS has long had a framework called Video ToolBox to handle hardware encoding and decoding. With iOS 8.0, Apple finally brought this framework to iOS.

2. H265 advantages

  • High compression ratio, double that of JPEG at the same image quality
  • Can carry auxiliary images such as depth information and alpha channels
  • Supports storing multiple pictures, similar to albums and collections (achieving a multiple-exposure effect)
  • Supports multiple images for GIF- and Live Photo-style animation effects
  • No maximum pixel limit like JPEG's
  • Transparent pixel support
  • Tiled loading mechanism
  • Thumbnail support

II. Code parsing

1. Implementation process

  • Initialize the camera parameters and set the camera delegate; here the orientation is fixed to portrait.
  • Initialize the encoder parameters and start the encoder
  • In the successful-encode callback, record 200 frames of video (the file size can be modified), save them to the sandbox, and extract the file (test0.asf) via iTunes by connecting the device to a computer with a data cable

2. Encoder implementation process

  • Create session required for encoder (H264, H265 or both)
  • Set session properties, such as real-time encoding, bit rate, FPS, the width and height of the encoded resolution, the maximum interval between I-frames, and so on
    • Note that H265 does not currently support bitrate limits
  • In the camera callback AVCaptureVideoDataOutputSampleBufferDelegate, pass each collected frame to the H264/H265 encoder.
  • If encoding succeeds, the callback is triggered. The callback first checks whether the frame is an I-frame and, if so, writes the SPS and PPS information; it then traverses the NALU stream and replaces each length field with the startCode {0x00, 0x00, 0x00, 0x01}.

3. Analysis of main methods

  • To initialize the encoder, first choose which approach to use: [XDXHardwareEncoder getInstance].enableH264 = YES, [XDXHardwareEncoder getInstance].enableH265 = YES, or both. If both are set, the writeFile call in one of the callbacks should be disabled, and only newer iPhones (iPhone 8 and later are stable) support running both sessions at once.

To determine whether the current device supports H265, ensure it is an iPhone 7 or later running iOS 11 or later

if (@available(iOS 11.0, *)) {
    BOOL hardwareDecodeSupported = VTIsHardwareDecodeSupported(kCMVideoCodecType_HEVC);
    if (hardwareDecodeSupported) {
        _deviceSupportH265 = YES;
        NSLog(@"XDXHardwareEncoder : Support H265 Encode/Decode!");
    }
} else {
    _deviceSupportH265 = NO;
    NSLog(@"XDXHardwareEncoder : Not support H265 Encode/Decode!");
}

VTIsHardwareDecodeSupported checks whether the current device supports H265 hardware decoding, which is used here as the indicator of H265 support.

Initialize the encoder operation

- (void)prepareForEncode {
    if (self.width == 0 || self.height == 0) {
        NSLog(@"XDXHardwareEncoder : VTSession needs width and height for init, width = %d, height = %d", self.width, self.height);
        return;
    }

    if (g_isSupportRealTimeEncoder) NSLog(@"XDXHardwareEncoder : Device processor is 64 bit");
    else                            NSLog(@"XDXHardwareEncoder : Device processor is not 64 bit");

    NSLog(@"XDXHardwareEncoder : Current h264 open state : %d, h265 open state : %d", self.enableH264, self.enableH265);

    OSStatus h264Status, h265Status;
    BOOL isRestart = NO;

    if (self.enableH264) {
        if (h264CompressionSession != NULL) {
            NSLog(@"XDXHardwareEncoder : H264 session not NULL");
            return;
        }
        [m_h264_lock lock];
        NSLog(@"XDXHardwareEncoder : Prepare H264 hardware encoder");

        //[self.delegate willEncoderStart];

        self.h264ErrCount = 0;

        h264Status = VTCompressionSessionCreate(NULL, self.width, self.height, kCMVideoCodecType_H264, NULL, NULL, NULL, vtCallBack, (__bridge void *)self, &h264CompressionSession);
        if (h264Status != noErr) {
            self.h264ErrCount++;
            NSLog(@"XDXHardwareEncoder : H264 VTCompressionSessionCreate Failed, status = %d", h264Status);
        }

        [self getSupportedPropertyFlags];

        [self applyAllSessionProperty:h264CompressionSession propertyArr:self.h264propertyFlags];

        h264Status = VTCompressionSessionPrepareToEncodeFrames(h264CompressionSession);
        if (h264Status != noErr) {
            NSLog(@"XDXHardwareEncoder : H264 VTCompressionSessionPrepareToEncodeFrames Failed, status = %d", h264Status);
        } else {
            initializedH264 = true;
            NSLog(@"XDXHardwareEncoder : H264 VTSession create success, width = %d, height = %d, framerate = %d", self.width, self.height, self.fps);
        }

        if (h264Status != noErr && self.h264ErrCount != 0) isRestart = YES;
        [m_h264_lock unlock];
    }

    if (self.enableH265) {
        if (h265CompressionSession != NULL) {
            NSLog(@"XDXHardwareEncoder : H265 session not NULL");
            return;
        }
        [m_h265_lock lock];
        NSLog(@"XDXHardwareEncoder : Prepare h265 hardware encoder");

        // [self.delegate willEncoderStart];

        self.h265ErrCount = 0;

        h265Status = VTCompressionSessionCreate(NULL, self.width, self.height, kCMVideoCodecType_HEVC, NULL, NULL, NULL, vtH265CallBack, (__bridge void *)self, &h265CompressionSession);
        if (h265Status != noErr) {
            self.h265ErrCount++;
            NSLog(@"XDXHardwareEncoder : H265 VTCompressionSessionCreate Failed, status = %d", h265Status);
        }

        [self getSupportedPropertyFlags];

        [self applyAllSessionProperty:h265CompressionSession propertyArr:self.h265PropertyFlags];

        h265Status = VTCompressionSessionPrepareToEncodeFrames(h265CompressionSession);
        if (h265Status != noErr) {
            NSLog(@"XDXHardwareEncoder : H265 VTCompressionSessionPrepareToEncodeFrames Failed, status = %d", h265Status);
        } else {
            initializedH265 = true;
            NSLog(@"XDXHardwareEncoder : H265 VTSession create success, width = %d, height = %d, framerate = %d", self.width, self.height, self.fps);
        }

        if (h265Status != noErr && self.h265ErrCount != 0) isRestart = YES;
        [m_h265_lock unlock];
    }

    if (isRestart) {
        NSLog(@"XDXHardwareEncoder : VTSession create failed!");
        static int count = 0;
        count++;
        if (count == 3) {
            NSLog(@"TVUEncoder : restart 3 times failed! exit!");
            return;
        }
        sleep(1);
        NSLog(@"TVUEncoder : try to restart after 1 second!");
        NSLog(@"TVUEncoder : vtsession error occurred! restart encoder width: %d, height %d, times %d", self.width, self.height, count);
        [self tearDownSession];
        [self prepareForEncode];
    }
}

1> g_isSupportRealTimeEncoder = (is64Bit == 8) ? true : false; is used to determine whether the current device has a 32-bit or 64-bit processor.

2> Creating the H264/H265 sessions differs only in the codec parameter: H264 uses kCMVideoCodecType_H264, while H265 uses kCMVideoCodecType_HEVC. After a session is created, the corresponding callback function is invoked each time a frame is encoded successfully.

3> [self getSupportedPropertyFlags]; queries which properties the session supports. Testing shows H265 does not support bit-rate limits; there is currently no solution other than waiting for Apple's follow-up.

4> After the related properties are set (described below), call VTCompressionSessionPrepareToEncodeFrames to prepare for encoding.

  • Set the attributes of the encoder
- (OSStatus)setSessionProperty:(VTCompressionSessionRef)session key:(CFStringRef)key value:(CFTypeRef)value {
    OSStatus status = VTSessionSetProperty(session, key, value);
    if (status != noErr) {
        NSString *sessionStr;
        if (session == h264CompressionSession) {
            sessionStr = @"h264 Session";
            self.h264ErrCount++;
        } else if (session == h265CompressionSession) {
            sessionStr = @"h265 Session";
            self.h265ErrCount++;
        }
        NSLog(@"XDXHardwareEncoder : Set %s of %s Failed, status = %d", CFStringGetCStringPtr(key, kCFStringEncodingUTF8), sessionStr.UTF8String, status);
    }
    return status;
}

- (void)applyAllSessionProperty:(VTCompressionSessionRef)session propertyArr:(NSArray *)propertyArr {
    OSStatus status;
    if (!g_isSupportRealTimeEncoder) {
        /* increase max frame delay from 3 to 6 to reduce encoder pressure */
        int value = 3;
        CFNumberRef ref = CFNumberCreate(NULL, kCFNumberSInt32Type, &value);
        [self setSessionProperty:session key:kVTCompressionPropertyKey_MaxFrameDelayCount value:ref];
        CFRelease(ref);
    }

    if (self.fps) {
        if ([self isSupportPropertyWithKey:Key_ExpectedFrameRate inArray:propertyArr]) {
            int         value = self.fps;
            CFNumberRef ref   = CFNumberCreate(NULL, kCFNumberSInt32Type, &value);
            [self setSessionProperty:session key:kVTCompressionPropertyKey_ExpectedFrameRate value:ref];
            CFRelease(ref);
        }
    } else {
        NSLog(@"XDXHardwareEncoder : Current fps is 0");
    }

    if (self.bitrate) {
        if ([self isSupportPropertyWithKey:Key_AverageBitRate inArray:propertyArr]) {
            int value = self.bitrate;
            if (session == h265CompressionSession) value = 2*1000;  // if current session is h265, set bitrate to 2M.
            CFNumberRef ref = CFNumberCreate(NULL, kCFNumberSInt32Type, &value);
            [self setSessionProperty:session key:kVTCompressionPropertyKey_AverageBitRate value:ref];
            CFRelease(ref);
        }
    } else {
        NSLog(@"XDXHardwareEncoder : Current bitrate is 0");
    }

    /* 2016-11-15, @gang: iPhone7/7plus do not support realtime encoding, so disable it,
       otherwise we can not control the encoding bit rate */
    if (![[self deviceVersion] isEqualToString:@"iPhone9,1"] && ![[self deviceVersion] isEqualToString:@"iPhone9,2"]) {
        if (g_isSupportRealTimeEncoder) {
            if ([self isSupportPropertyWithKey:Key_RealTime inArray:propertyArr]) {
                NSLog(@"use RealTimeEncoder");
                NSLog(@"XDXHardwareEncoder : use realTimeEncoder");
                [self setSessionProperty:session key:kVTCompressionPropertyKey_RealTime value:kCFBooleanTrue];
            }
        }
    }

    if ([self isSupportPropertyWithKey:Key_AllowFrameReordering inArray:propertyArr]) {
        [self setSessionProperty:session key:kVTCompressionPropertyKey_AllowFrameReordering value:kCFBooleanFalse];
    }

    if (g_isSupportRealTimeEncoder) {
        if ([self isSupportPropertyWithKey:Key_ProfileLevel inArray:propertyArr]) {
            [self setSessionProperty:session key:kVTCompressionPropertyKey_ProfileLevel value:self.enableH264 ? kVTProfileLevel_H264_Main_AutoLevel : kVTProfileLevel_HEVC_Main_AutoLevel];
        }
    } else {
        if ([self isSupportPropertyWithKey:Key_ProfileLevel inArray:propertyArr]) {
            [self setSessionProperty:session key:kVTCompressionPropertyKey_ProfileLevel value:self.enableH264 ? kVTProfileLevel_H264_Baseline_AutoLevel : kVTProfileLevel_HEVC_Main_AutoLevel];
        }

        if (self.enableH264) {
            if ([self isSupportPropertyWithKey:Key_H264EntropyMode inArray:propertyArr]) {
                [self setSessionProperty:session key:kVTCompressionPropertyKey_H264EntropyMode value:kVTH264EntropyMode_CAVLC];
            }
        }
    }

    if ([self isSupportPropertyWithKey:Key_MaxKeyFrameIntervalDuration inArray:propertyArr]) {
        int         value = 1;
        CFNumberRef ref   = CFNumberCreate(NULL, kCFNumberSInt32Type, &value);
        [self setSessionProperty:session key:kVTCompressionPropertyKey_MaxKeyFrameIntervalDuration value:ref];
        CFRelease(ref);
    }
}

The above method mainly sets the parameters required to start the encoder

1> kVTCompressionPropertyKey_MaxFrameDelayCount: the maximum number of frames the compressor is allowed to hold before outputting a compressed frame. For example, if the maximum frame delay is M, frame N-M must be emitted before the call encoding frame N returns.

2 > kVTCompressionPropertyKey_ExpectedFrameRate: set the FPS

3> kVTCompressionPropertyKey_AverageBitRate: this is not a hard limit; the instantaneous bit rate may peak above it.

4> kVTCompressionPropertyKey_RealTime: whether the encoder performs real-time encoding. If set to false, encoding is not real-time and the video quality is usually better.

5> kVTCompressionPropertyKey_AllowFrameReordering: whether frames may be reordered. To encode B-frames the encoder must reorder frames, which means the decoding order differs from the display order. Set it to false to prevent frame reordering.

6> kVTCompressionPropertyKey_ProfileLevel: specifies the profile and level of the encoded bitstream.

7> kVTCompressionPropertyKey_H264EntropyMode: for H264, this property selects CAVLC or CABAC entropy coding.

8> kVTCompressionPropertyKey_MaxKeyFrameIntervalDuration: the maximum duration between two I-frames; this property is especially useful when the frame rate is variable.

  • Each frame of data is encoded in the camera callback
- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection {
    if( !CMSampleBufferDataIsReady(sampleBuffer)) {
        NSLog( @"sample buffer is not ready. Skipping sample" );
        return;
    }
    
    if([XDXHardwareEncoder getInstance] != NULL) {
        [[XDXHardwareEncoder getInstance] encode:sampleBuffer];
    }
}

The above method will be called once after each frame of video data is collected, and we will encode each frame of data we get.

  • Coding concrete implementation
- (void)encode:(CMSampleBufferRef)sampleBuffer {
    if (self.enableH264) {
        [m_h264_lock lock];
        if (h264CompressionSession == NULL) {
            [m_h264_lock unlock];
            return;
        }

        if (initializedH264 == false) {
            NSLog(@"TVUEncoder : h264 encoder is not ready\n");
            return;
        }
    }

    if (self.enableH265) {
        [m_h265_lock lock];
        if (h265CompressionSession == NULL) {
            [m_h265_lock unlock];
            return;
        }

        if (initializedH265 == false) {
            NSLog(@"TVUEncoder : h265 encoder is not ready\n");
            return;
        }
    }

    CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    CMTime duration = CMSampleBufferGetOutputDuration(sampleBuffer);
    frameID++;
    CMTime presentationTimeStamp = CMTimeMake(frameID, 1000);

    [self doSetBitrate];

    OSStatus status;
    VTEncodeInfoFlags flags;

    if (self.enableH264) {
        status = VTCompressionSessionEncodeFrame(h264CompressionSession, imageBuffer, presentationTimeStamp, duration, NULL, imageBuffer, &flags);
        if (status != noErr) NSLog(@"TVUEncoder : H264 VTCompressionSessionEncodeFrame failed");
        [m_h264_lock unlock];

        if (status != noErr) {
            NSLog(@"TVUEncoder : VTCompressionSessionEncodeFrame failed");
            VTCompressionSessionCompleteFrames(h264CompressionSession, kCMTimeInvalid);
            VTCompressionSessionInvalidate(h264CompressionSession);
            CFRelease(h264CompressionSession);
            h264CompressionSession = NULL;
        } else {
            // NSLog(@"TVUEncoder : Success VTCompressionSessionCompleteFrames");
        }
    }

    if (self.enableH265) {
        status = VTCompressionSessionEncodeFrame(h265CompressionSession, imageBuffer, presentationTimeStamp, duration, NULL, imageBuffer, &flags);
        if (status != noErr) NSLog(@"TVUEncoder : H265 VTCompressionSessionEncodeFrame failed");
        [m_h265_lock unlock];

        if (status != noErr) {
            NSLog(@"TVUEncoder : VTCompressionSessionEncodeFrame failed");
            VTCompressionSessionCompleteFrames(h265CompressionSession, kCMTimeInvalid);
            VTCompressionSessionInvalidate(h265CompressionSession);
            CFRelease(h265CompressionSession);
            h265CompressionSession = NULL;
        } else {
            NSLog(@"TVUEncoder : Success VTCompressionSessionCompleteFrames");
        }
    }
}

1> Timestamps are constructed by incrementing frameID so that the encoded frames have continuous, increasing presentation times.

2> Set the maximum bit rate limit. Note: H265 currently does not support setting a bit rate limit; we must wait for official word. H264's bit rate can be limited.

3> kVTCompressionPropertyKey_DataRateLimits: package the byte count and duration into a CFMutableArrayRef and pass it to the API call.

4> VTCompressionSessionEncodeFrame: after this method is called successfully, the callback function that completes the encoding is triggered.

  • The header information is handled in the callback function
#pragma mark H264 Callback
static void vtCallBack(void *outputCallbackRefCon, void *souceFrameRefCon, OSStatus status, VTEncodeInfoFlags infoFlags, CMSampleBufferRef sampleBuffer) {
    XDXHardwareEncoder *encoder = (__bridge XDXHardwareEncoder *)outputCallbackRefCon;
    if (status != noErr) {
        NSError *error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
        NSLog(@"H264: vtCallBack failed with %@", error);
        NSLog(@"XDXHardwareEncoder : encode frame failed! %s", error.debugDescription.UTF8String);
        return;
    }

    if (!CMSampleBufferDataIsReady(sampleBuffer)) {
        NSLog(@"didCompressH264 data is not ready");
        return;
    }
    if (infoFlags == kVTEncodeInfo_FrameDropped) {
        NSLog(@"%s with frame dropped.", __FUNCTION__);
        return;
    }

    CMBlockBufferRef block = CMSampleBufferGetDataBuffer(sampleBuffer);
    BOOL isKeyframe = false;

    CFArrayRef attachments = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, false);

    if (attachments != NULL) {
        CFDictionaryRef attachment = (CFDictionaryRef)CFArrayGetValueAtIndex(attachments, 0);
        CFBooleanRef dependsOnOthers = (CFBooleanRef)CFDictionaryGetValue(attachment, kCMSampleAttachmentKey_DependsOnOthers);
        isKeyframe = (dependsOnOthers == kCFBooleanFalse);
    }

    if (isKeyframe) {
        CMFormatDescriptionRef format = CMSampleBufferGetFormatDescription(sampleBuffer);
        static uint8_t *spsppsNALBuff = NULL;
        static size_t  spsSize, ppsSize;

        size_t parmCount;
        const uint8_t *sps, *pps;
        int NALUnitHeaderLengthOut;
        CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 0, &sps, &spsSize, &parmCount, &NALUnitHeaderLengthOut);
        CMVideoFormatDescriptionGetH264ParameterSetAtIndex(format, 1, &pps, &ppsSize, &parmCount, &NALUnitHeaderLengthOut);

        spsppsNALBuff = (uint8_t *)malloc(spsSize+4+ppsSize+4);
        memcpy(spsppsNALBuff, "\x00\x00\x00\x01", 4);
        memcpy(&spsppsNALBuff[4], sps, spsSize);
        memcpy(&spsppsNALBuff[4+spsSize], "\x00\x00\x00\x01", 4);
        memcpy(&spsppsNALBuff[4+spsSize+4], pps, ppsSize);
        NSLog(@"XDXHardwareEncoder : H264 spsSize : %zu, ppsSize : %zu", spsSize, ppsSize);
        writeFile(spsppsNALBuff, spsSize+4+ppsSize+4, encoder->_videoFile, 200);
    }

    size_t blockBufferLength;
    uint8_t *bufferDataPointer = NULL;
    CMBlockBufferGetDataPointer(block, 0, NULL, &blockBufferLength, (char **)&bufferDataPointer);

    size_t bufferOffset = 0;
    while (bufferOffset < blockBufferLength - startCodeLength) {
        uint32_t NALUnitLength = 0;
        memcpy(&NALUnitLength, bufferDataPointer + bufferOffset, startCodeLength);
        NALUnitLength = CFSwapInt32BigToHost(NALUnitLength);
        memcpy(bufferDataPointer + bufferOffset, startCode, startCodeLength);
        bufferOffset += startCodeLength + NALUnitLength;
    }
    writeFile(bufferDataPointer, blockBufferLength, encoder->_videoFile, 200);
}

#pragma mark H265 Callback
static void vtH265CallBack(void *outputCallbackRefCon, void *souceFrameRefCon, OSStatus status, VTEncodeInfoFlags infoFlags, CMSampleBufferRef sampleBuffer) {
    XDXHardwareEncoder *encoder = (__bridge XDXHardwareEncoder *)outputCallbackRefCon;
    if (status != noErr) {
        NSError *error = [NSError errorWithDomain:NSOSStatusErrorDomain code:status userInfo:nil];
        NSLog(@"H265: vtH265CallBack failed with %@", error);
        NSLog(@"XDXHardwareEncoder : H265 encode frame failed! %s", error.debugDescription.UTF8String);
        return;
    }

    if (!CMSampleBufferDataIsReady(sampleBuffer)) {
        NSLog(@"didCompressH265 data is not ready");
        return;
    }
    if (infoFlags == kVTEncodeInfo_FrameDropped) {
        NSLog(@"%s with frame dropped.", __FUNCTION__);
        return;
    }

    CMBlockBufferRef block = CMSampleBufferGetDataBuffer(sampleBuffer);
    BOOL isKeyframe = false;

    CFArrayRef attachments = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer, false);

    if (attachments != NULL) {
        CFDictionaryRef attachment = (CFDictionaryRef)CFArrayGetValueAtIndex(attachments, 0);
        CFBooleanRef dependsOnOthers = (CFBooleanRef)CFDictionaryGetValue(attachment, kCMSampleAttachmentKey_DependsOnOthers);
        isKeyframe = (dependsOnOthers == kCFBooleanFalse);
    }

    if (isKeyframe) {
        CMFormatDescriptionRef format    = CMSampleBufferGetFormatDescription(sampleBuffer);
        static uint8_t *vpsspsppsNALBuff = NULL;
        static size_t  vpsSize, spsSize, ppsSize;

        size_t parmCount;
        const uint8_t *vps, *sps, *pps;

        if (encoder.deviceSupportH265) { // >= iPhone 7 && supports iOS 11
            CMVideoFormatDescriptionGetHEVCParameterSetAtIndex(format, 0, &vps, &vpsSize, &parmCount, 0);
            CMVideoFormatDescriptionGetHEVCParameterSetAtIndex(format, 1, &sps, &spsSize, &parmCount, 0);
            CMVideoFormatDescriptionGetHEVCParameterSetAtIndex(format, 2, &pps, &ppsSize, &parmCount, 0);

            vpsspsppsNALBuff = (uint8_t *)malloc(vpsSize+4+spsSize+4+ppsSize+4);
            memcpy(vpsspsppsNALBuff, "\x00\x00\x00\x01", 4);
            memcpy(&vpsspsppsNALBuff[4], vps, vpsSize);
            memcpy(&vpsspsppsNALBuff[4+vpsSize], "\x00\x00\x00\x01", 4);
            memcpy(&vpsspsppsNALBuff[4+vpsSize+4], sps, spsSize);
            memcpy(&vpsspsppsNALBuff[4+vpsSize+4+spsSize], "\x00\x00\x00\x01", 4);
            memcpy(&vpsspsppsNALBuff[4+vpsSize+4+spsSize+4], pps, ppsSize);
            NSLog(@"XDXHardwareEncoder : H265 vpsSize : %zu, spsSize : %zu, ppsSize : %zu", vpsSize, spsSize, ppsSize);
        }
        writeFile(vpsspsppsNALBuff, vpsSize+4+spsSize+4+ppsSize+4, encoder->_videoFile, 200);
    }

    size_t  blockBufferLength;
    uint8_t *bufferDataPointer = NULL;
    CMBlockBufferGetDataPointer(block, 0, NULL, &blockBufferLength, (char **)&bufferDataPointer);

    size_t bufferOffset = 0;
    while (bufferOffset < blockBufferLength - startCodeLength) {
        uint32_t NALUnitLength = 0;
        memcpy(&NALUnitLength, bufferDataPointer + bufferOffset, startCodeLength);
        NALUnitLength = CFSwapInt32BigToHost(NALUnitLength);
        memcpy(bufferDataPointer + bufferOffset, startCode, startCodeLength);
        bufferOffset += startCodeLength + NALUnitLength;
    }

    writeFile(bufferDataPointer, blockBufferLength, encoder->_videoFile, 200);
}

1> In the callback function, first pick out the I-frames and extract the SPS and PPS information from them (H265 adds the VPS) and write it to the file. 2> Traverse the remaining frames, write the 00 00 00 01 start code into each NALU header, and then write the data to the file.

Introduction to code stream data structure

Here we briefly introduce H264,H265 bit stream information

  1. H264 stream data consists of a series of NAL units.

  2. A NALU may contain a video frame; video frames are slices of video, specifically I-, P-, and B-frames


Note that VPS is added first in H265 stream data.

  • H.264 Attribute Collection -FormatDesc(including SPS and PPS)

In the stream data, the set of attributes might look like this:

After processing, in Format Description:

  • NALU Header: in stream data, a NALU may start with either 0x00 00 01 or 0x00 00 00 01 (both are possible; 0x00 00 01 is used as the example below). This prefix is therefore called the start code. So in the extracted data, we need to replace the length field with 0x00 00 00 01.

To sum up: the H264 stream consists of NALU units, which contain the video image data and the H264 parameter information. The video image data is the CMBlockBuffer, and the H264 parameter information can be combined into a FormatDesc. Specifically, the parameter information includes the SPS (Sequence Parameter Set) and PPS (Picture Parameter Set). The following figure shows an H.264 stream structure:

  • Extract SPS and PPS to generate FormatDesc

    • The start code of each NALU is 0x00 00 01; locate NALUs by the start code
    • Find the SPS and PPS by type information and extract them: the last 5 bits of the first byte after the start code give the type, where 7 means SPS and 8 means PPS
    • Use the CMVideoFormatDescriptionCreateFromH264ParameterSets function to construct a CMVideoFormatDescriptionRef
  • Extract video image data to generate CMBlockBuffer

    • Locate the NALU by its start code
    • Once the type is determined to be data, replace the start code with the length of the NALU (4 bytes)
    • Use the CMBlockBufferCreateWithMemoryBlock interface to construct a CMBlockBufferRef
  • Generate CMTime information as required. (In actual testing, the image was unstable when time information was added and fine without it; this needs further study, so it is suggested not to add time information here.)

From the above, given the CMVideoFormatDescriptionRef, the CMBlockBufferRef, and optional time information, use the CMSampleBufferCreate interface to obtain the CMSampleBuffer needed to decode the raw data. The diagram of the H264 data conversion is shown below.

For encoder background, refer to: H264,H265 encoder introduction