MediaPlus is a set of Android multimedia components built from scratch on top of FFmpeg. It covers capture, encoding, synchronization, stream pushing, filters, and other functions common to live streaming and short video. New features and the corresponding documentation will follow; thank you for your attention.

  • Android cameras support several video capture formats, such as NV21, NV12, and YV12. The difference between them is the ordering of the U and V components. For detailed background on YUV, see documents such as: [Summary] FFmpeg audio/video codec zero-based learning method.

What you need to know first: YUV sampling, data layout, and buffer size calculation. YUV sampling:

[Figure: YUV sampling]

The byte ordering of YUV420P is shown below:

[Figure: YUV420P byte ordering]

NV12, NV21, YV12, and I420 all belong to the YUV420 family, which is split into YUV420P (planar) and YUV420SP (semi-planar). The difference between P and SP is that YUV420P stores the U and V planes sequentially, while YUV420SP stores U and V interleaved. The layouts are:

I420: YYYYYYYY UU VV -> YUV420P
YV12: YYYYYYYY VV UU -> YUV420P
NV12: YYYYYYYY UVUV -> YUV420SP
NV21: YYYYYYYY VUVU -> YUV420SP
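As a quick sanity check on these layouts, the plane offsets and total size of a W×H YUV420 frame can be computed directly. The following is a small illustrative C++ snippet (not MediaPlus code; the 640×480 size is just an example):

#include <cstdio>

// Plane offsets and total size of a tightly packed W x H YUV420 frame.
// I420 (planar):      Y[W*H] | U[W*H/4] | V[W*H/4]
// NV21 (semi-planar): Y[W*H] | interleaved VU[W*H/2]
int main() {
    const int w = 640, h = 480;
    const int ySize  = w * h;
    const int uvSize = w * h / 4;          // each chroma plane is a quarter of the Y plane
    const int total  = ySize + 2 * uvSize; // == w * h * 3 / 2

    printf("total bytes: %d\n", total);
    printf("I420: Y at 0, U at %d, V at %d\n", ySize, ySize + uvSize);
    printf("NV21: Y at 0, VU (interleaved) at %d\n", ySize);
    return 0;
}

This is also why the capture buffer below is allocated as width × height × 3/2 bytes.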

Why is the NV21 data from the Android camera converted to YUV420P (I420) before H.264 encoding? These color formats were confusing at first, but the point is simple: the encoder here takes I420 input, so the color format must be converted first. MediaPlus captures video in NV21. The following shows how each frame captured by the Android Camera is obtained and how its color format is converted. The code is as follows:

  • Obtain camera data:

    mCamera = Camera.open(Camera.CameraInfo.CAMERA_FACING_BACK);
    mParams = mCamera.getParameters();
    setCameraDisplayOrientation(this, Camera.CameraInfo.CAMERA_FACING_BACK, mCamera);
    mParams.setPreviewSize(SRC_FRAME_WIDTH, SRC_FRAME_HEIGHT);
    mParams.setPreviewFormat(ImageFormat.NV21);                          // preview format: NV21
    mParams.setFocusMode(Camera.Parameters.FOCUS_MODE_CONTINUOUS_VIDEO); // continuous focus for video
    mCamera.setDisplayOrientation(90);
    mCamera.setParameters(mParams);                                      // apply the camera parameters
    mCamera.addCallbackBuffer(m_nv21);
    mCamera.setPreviewCallbackWithBuffer(this);
    mCamera.startPreview();

    @Override
    public void onPreviewFrame(byte[] data, Camera camera) {
        // "data" holds one NV21 preview frame
        mCamera.addCallbackBuffer(m_nv21); // re-add the buffer, otherwise onPreviewFrame may stop being called
    }

Because one NV21 frame takes width × height × 3/2 bytes (Y = W×H, U = W×H/4, V = W×H/4), a byte array of that size is created as the buffer for the captured video data. MediaPlus > > app.mobile.nativeapp.com.libmedia.core.streamer.RtmpPushStreamer takes the audio and video data and processes them; two threads handle audio and video respectively, AudioThread and VideoThread.

  • First, a look at VideoThread:

    /**
     * Video processing thread.
     */
    class VideoThread extends Thread {
        public volatile boolean m_bExit = false;
        byte[] m_nv21Data = new byte[mVideoSizeConfig.srcFrameWidth
                * mVideoSizeConfig.srcFrameHeight * 3 / 2];
        byte[] m_I420Data = new byte[mVideoSizeConfig.srcFrameWidth
                * mVideoSizeConfig.srcFrameHeight * 3 / 2];
        byte[] m_RotateData = new byte[mVideoSizeConfig.srcFrameWidth
                * mVideoSizeConfig.srcFrameHeight * 3 / 2];
        byte[] m_MirrorData = new byte[mVideoSizeConfig.srcFrameWidth
                * mVideoSizeConfig.srcFrameHeight * 3 / 2];

        @Override
        public void run() {
            // TODO Auto-generated method stub
            super.run();

            VideoCaptureInterface.GetFrameDataReturn ret;
            while (!m_bExit) {
                try {
                    Thread.sleep(1, 10);
                    if (m_bExit) {
                        break;
                    }
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                ret = mVideoCapture.GetFrameData(m_nv21Data,
                        m_nv21Data.length);
                if (ret == VideoCaptureInterface.GetFrameDataReturn.RET_SUCCESS) {
                    frameCount++;
                    LibJniVideoProcess.NV21TOI420(mVideoSizeConfig.srcFrameWidth, mVideoSizeConfig.srcFrameHeight, m_nv21Data, m_I420Data);
                    if (curCameraType == VideoCaptureInterface.CameraDeviceType.CAMERA_FACING_FRONT) {
                        LibJniVideoProcess.MirrorI420(mVideoSizeConfig.srcFrameWidth, mVideoSizeConfig.srcFrameHeight, m_I420Data, m_MirrorData);
                        LibJniVideoProcess.RotateI420(mVideoSizeConfig.srcFrameWidth, mVideoSizeConfig.srcFrameHeight, m_MirrorData, m_RotateData, 90);
                    } else if (curCameraType == VideoCaptureInterface.CameraDeviceType.CAMERA_FACING_BACK) {
                        LibJniVideoProcess.RotateI420(mVideoSizeConfig.srcFrameWidth, mVideoSizeConfig.srcFrameHeight, m_I420Data, m_RotateData, 90);
                    }
                    encodeVideo(m_RotateData, m_RotateData.length);
                }
            }
        }


        public void stopThread() {
            m_bExit = true;
        }
    }

Why rotate? When the Android camera captures frames, the output is landscape regardless of whether the phone is held vertically or horizontally, so holding the phone upright introduces an angle difference. The front camera image therefore needs a 270° rotation and the back camera image a 90° rotation so that the picture matches the orientation of the phone.

The front camera image is also mirrored by default, so it is mirrored once more to restore it. In MediaPlus, libyuv handles the conversion, rotation, mirroring, and so on. MediaPlus > > app.mobile.nativeapp.com.libmedia.core.jni.LibJniVideoProcess provides the application-layer interface:

package app.mobile.nativeapp.com.libmedia.core.jni;

import app.mobile.nativeapp.com.libmedia.core.config.MediaNativeInit;

/**
 * Created by Android on 11/16/17.
 */
public class LibJniVideoProcess {
    static {
        MediaNativeInit.InitMedia();
    }

    /**
     * Convert NV21 to I420.
     *
     * @param in_width  input width
     * @param in_height input height
     * @param srcData   source data
     * @param dstData   destination data
     */
    public static native int NV21TOI420(int in_width, int in_height,
                                        byte[] srcData, byte[] dstData);

    /**
     * Mirror I420.
     *
     * @param in_width  input width
     * @param in_height input height
     * @param srcData   source data
     * @param dstData   destination data
     */
    public static native int MirrorI420(int in_width, int in_height,
                                        byte[] srcData, byte[] dstData);

    /**
     * Rotate I420.
     *
     * @param in_width      input width
     * @param in_height     input height
     * @param srcData       source data
     * @param dstData       destination data
     * @param rotationValue rotation angle in degrees
     */
    public static native int RotateI420(int in_width, int in_height,
                                        byte[] srcData, byte[] dstData,
                                        int rotationValue);
}

libmedia/src/cpp/jni/jni_video_process.cpp is the JNI layer for image processing. libyuv is powerful and covers essentially all YUV conversions and related processing. A brief description of the function parameters, for example:

LIBYUV_API
int NV21ToI420(const uint8* src_y, int src_stride_y,
               const uint8* src_vu, int src_stride_vu,
               uint8* dst_y, int dst_stride_y,
               uint8* dst_u, int dst_stride_u,
               uint8* dst_v, int dst_stride_v,
               int width, int height);
  • src_y: source Y plane
  • src_stride_y: stride (bytes per row) of the source Y plane
  • src_vu: source interleaved VU plane
  • src_stride_vu: stride of the source VU plane
  • dst_y: destination Y plane
  • dst_u: destination U plane
  • dst_v: destination V plane
  • dst_stride_y: stride of the destination Y plane
  • dst_stride_u: stride of the destination U plane
  • dst_stride_v: stride of the destination V plane
  • width: video width
  • height: video height
  • For example, for an 8 (width) × 6 (height) image, the parameters would be set as follows:
int width = 8;
int height = 6;
uint8_t *srcNV21Data;   // source NV21 buffer
uint8_t *dstI420Data;   // destination I420 buffer

src_y  = srcNV21Data;
src_vu = srcNV21Data + width * height;
src_stride_y  = width;
src_stride_vu = width;          // interleaved VU rows: (width / 2) pairs = width bytes
dst_y = dstI420Data;
dst_u = dstI420Data + width * height;
dst_v = dstI420Data + width * height * 5 / 4;
dst_stride_y = width;
dst_stride_u = width / 2;
dst_stride_v = width / 2;
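Putting the parameters together, here is a minimal, self-contained sketch of a direct libyuv::NV21ToI420 call for a tightly packed frame (assuming libyuv is available; this only illustrates the parameter mapping and is not the MediaPlus wrapper itself):

#include <cstdint>
#include "libyuv.h"  // libyuv::NV21ToI420

// Convert one tightly packed NV21 frame to I420 (sketch).
bool ConvertNV21ToI420(const uint8_t *nv21, uint8_t *i420, int width, int height) {
    const uint8_t *src_y  = nv21;
    const uint8_t *src_vu = nv21 + width * height;       // interleaved VU plane
    uint8_t *dst_y = i420;
    uint8_t *dst_u = i420 + width * height;              // U plane follows Y
    uint8_t *dst_v = i420 + width * height * 5 / 4;      // V plane follows U
    int ret = libyuv::NV21ToI420(src_y, width,
                                 src_vu, width,          // VU rows are width bytes
                                 dst_y, width,
                                 dst_u, width / 2,
                                 dst_v, width / 2,
                                 width, height);
    return ret == 0;                                     // libyuv returns 0 on success
}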

Here is the code that calls libyuv to convert, rotate, and mirror images:

//
// Created by developer on 11/16/17.
//

#include "jni_Video_Process.h"

#ifdef __cplusplus
extern "C" {
#endif


JNIEXPORT jint JNICALL
Java_app_mobile_nativeapp_com_libmedia_core_jni_LibJniVideoProcess_NV21TOI420(JNIEnv *env,
                                                                              jclass type,
                                                                              jint in_width,
                                                                              jint in_height,
                                                                              jbyteArray srcData_,
                                                                              jbyteArray dstData_) {
    jbyte *srcData = env->GetByteArrayElements(srcData_, NULL);
    jbyte *dstData = env->GetByteArrayElements(dstData_, NULL);

    VideoProcess::NV21TOI420(in_width, in_height, (const uint8_t *) srcData,
                             (uint8_t *) dstData);

    return 0;
}


JNIEXPORT jint JNICALL
Java_app_mobile_nativeapp_com_libmedia_core_jni_LibJniVideoProcess_MirrorI420(JNIEnv *env,
                                                                              jclass type,
                                                                              jint in_width,
                                                                              jint in_height,
                                                                              jbyteArray srcData_,
                                                                              jbyteArray dstData_) {
    jbyte *srcData = env->GetByteArrayElements(srcData_, NULL);
    jbyte *dstData = env->GetByteArrayElements(dstData_, NULL);

    VideoProcess::MirrorI420(in_width, in_height, (const uint8_t *) srcData,
                             (uint8_t *) dstData);

    return 0;
}


JNIEXPORT jint JNICALL
Java_app_mobile_nativeapp_com_libmedia_core_jni_LibJniVideoProcess_RotateI420(JNIEnv *env,
                                                                              jclass type,
                                                                              jint in_width,
                                                                              jint in_height,
                                                                              jbyteArray srcData_,
                                                                              jbyteArray dstData_,
                                                                              jint rotationValue) {
    jbyte *srcData = env->GetByteArrayElements(srcData_, NULL);
    jbyte *dstData = env->GetByteArrayElements(dstData_, NULL);

    return VideoProcess::RotateI420(in_width, in_height, (const uint8_t *) srcData,
                                    (uint8_t *) dstData, rotationValue);
}


#ifdef __cplusplus
}
#endif
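The VideoProcess implementation itself is not listed in this article. Mirroring and rotation map naturally onto libyuv's I420Mirror and I420Rotate; the sketch below is written under that assumption (illustration only, the actual MediaPlus code may differ), using the same plane layout as above:

#include <cstdint>
#include "libyuv.h"  // libyuv::I420Mirror, libyuv::I420Rotate

// Illustration only: how MirrorI420/RotateI420 could be built on top of libyuv.
int MirrorI420Sketch(int w, int h, const uint8_t *src, uint8_t *dst) {
    const uint8_t *sy = src, *su = src + w * h, *sv = src + w * h * 5 / 4;
    uint8_t *dy = dst, *du = dst + w * h, *dv = dst + w * h * 5 / 4;
    return libyuv::I420Mirror(sy, w, su, w / 2, sv, w / 2,
                              dy, w, du, w / 2, dv, w / 2,
                              w, h);
}

int RotateI420Sketch(int w, int h, const uint8_t *src, uint8_t *dst, int degrees) {
    const uint8_t *sy = src, *su = src + w * h, *sv = src + w * h * 5 / 4;
    uint8_t *dy = dst, *du = dst + w * h, *dv = dst + w * h * 5 / 4;
    // A 90 or 270 degree rotation swaps width and height, so the destination
    // strides are based on the rotated width (equal to the source height).
    const int dw = (degrees == 90 || degrees == 270) ? h : w;
    return libyuv::I420Rotate(sy, w, su, w / 2, sv, w / 2,
                              dy, dw, du, dw / 2, dv, dw / 2,
                              w, h, static_cast<libyuv::RotationMode>(degrees));
}

After a 90° or 270° rotation the output frame is height × width, which is why a separate destination buffer is used.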

After the NV21-to-I420 conversion and the related processing, the data is passed down to the native layer, where FFmpeg encodes it to H.264. Below is the class diagram of the underlying C++ wrapper:


In MediaPlus > > app.mobile.nativeapp.com.libmedia.core.streamer.RtmpPushStreamer, initNative() calls initCapture(), which initializes the two classes that receive audio and video data, and initEncoder(), which initializes the audio and video encoders. When startPushStream is called, the JNI method LiveJniMediaManager.StartPush(pushUrl) starts the stream pushing in the native layer.

    /**
     * Initialize the native capture and encoder modules.
     */
    private boolean initNative() {
        if (!initCapture()) {
            return false;
        }
        if (!initEncoder()) {
            return false;
        }
        Log.d("initNative", "native init success!");
        nativeInt = true;
        return nativeInt;
    }

    /**
     * Start pushing the stream.
     *
     * @param pushUrl
     * @return
     */
    private boolean startPushStream(String pushUrl) {
        if (nativeInt) {
            int ret = 0;
            ret = LiveJniMediaManager.StartPush(pushUrl);
            if (ret < 0) {
                Log.d("initNative", "native push failed!");
                return false;
            }
            return true;
        }
        return false;
    }

Here is the JNI-layer call made when stream pushing is started:

/**
 * Start pushing the stream.
 */
JNIEXPORT jint JNICALL
Java_app_mobile_nativeapp_com_libmedia_core_jni_LiveJniMediaManager_StartPush(JNIEnv *env,
                                                                              jclass type,
                                                                              jstring url_) {
    mMutex.lock();
    if (videoCaptureInit && audioCaptureInit) {
        startStream = true;
        isClose = false;
        videoCapture->StartCapture();
        audioCapture->StartCapture();
        const char *url = env->GetStringUTFChars(url_, 0);
        rtmpStreamer = RtmpStreamer::Get(); // initialize the stream pusher
        if (rtmpStreamer->InitStreamer(url) != 0) {
            LOG_D(DEBUG, "jni initStreamer failed!");
            mMutex.unlock();
            return -1;
        }
        rtmpStreamer->SetVideoEncoder(videoEncoder);
        rtmpStreamer->SetAudioEncoder(audioEncoder);
        if (rtmpStreamer->StartPushStream() != 0) {
            LOG_D(DEBUG, "jni push stream failed!");
            videoCapture->CloseCapture();
            audioCapture->CloseCapture();
            rtmpStreamer->ClosePushStream();
            mMutex.unlock();
            return -1;
        }
        LOG_D(DEBUG, "jni push stream success!");
        env->ReleaseStringUTFChars(url_, url);
    }
    mMutex.unlock();
    return 0;
}

AudioCapture and VideoCapture receive the audio/video data and capture parameters passed in from the application layer (the video having been converted to I420 by libyuv). Once LiveJniMediaManager.StartPush(pushUrl) has been called and videoCapture->StartCapture() has run, videoCapture can receive the video data passed down from the upper layer:

 LiveJniMediaManager.EncodeH264(videoBuffer, length);


 JNIEXPORT jint JNICALL
Java_app_mobile_nativeapp_com_libmedia_core_jni_LiveJniMediaManager_EncodeH264(JNIEnv *env,
                                                                               jclass type,
                                                                               jbyteArray videoBuffer_,
                                                                               jint length) {
    if (videoCaptureInit && !isClose) {
        jbyte *videoSrc = env->GetByteArrayElements(videoBuffer_, 0);
        uint8_t *videoDstData = (uint8_t *) malloc(length);
        memcpy(videoDstData, videoSrc, length);
        OriginData *videoOriginData = new OriginData();
        videoOriginData->size = length;
        videoOriginData->data = videoDstData;
        videoCapture->PushVideoData(videoOriginData);
        env->ReleaseByteArrayElements(videoBuffer_, videoSrc, 0);
    }
    return 0;
}

VideoCapture receives the data and caches it in a synchronized queue:

/**
 * Add video data to the queue.
 */
int VideoCapture::PushVideoData(OriginData *originData) {
    if (ExitCapture) {
        return 0;
    }
    originData->pts = av_gettime();
    LOG_D(DEBUG,"video capture pts :%lld",originData->pts);
    videoCaputureframeQueue.push(originData);
    return originData->size;
}
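The videoCaputureframeQueue used above is the synchronized queue shared between the capture path (producer) and the encoding/pushing thread (consumer). Its implementation is not shown in this article; purely for illustration, a minimal thread-safe queue could look like the sketch below (the actual queue class in MediaPlus may expose a different interface, e.g. clear()):

#include <condition_variable>
#include <mutex>
#include <queue>

// Minimal blocking queue sketch: push() from the producer, pop() blocks the
// consumer until an element is available.
template <typename T>
class SafeQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push(std::move(value));
        }
        cond_.notify_one(); // wake one waiting consumer
    }

    T pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return !queue_.empty(); });
        T value = std::move(queue_.front());
        queue_.pop();
        return value;
    }

private:
    std::queue<T> queue_;
    std::mutex mutex_;
    std::condition_variable cond_;
};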

Libmedia/SRC/main/CPP/core/VideoEncoder CPP libmedia/SRC/main/CPP/core/RtmpStreamer CPP these two classes is the core, the former is responsible for video coding, RtmpStreamer ->SetVideoEncoder(videoEncoder) rtmpStreamer->SetVideoEncoder(videoEncoder) Next, I will explain how to complete coding and push flow between each other:

/**
 * Video encoding task.
 */
void *RtmpStreamer::PushVideoStreamTask(void *pObj) {
    RtmpStreamer *rtmpStreamer = (RtmpStreamer *) pObj;
    rtmpStreamer->isPushStream = true;

    if (NULL == rtmpStreamer->videoEncoder) {
        return 0;
    }
    VideoCapture *pVideoCapture = rtmpStreamer->videoEncoder->GetVideoCapture();
    AudioCapture *pAudioCapture = rtmpStreamer->audioEncoder->GetAudioCapture();

    if (NULL == pVideoCapture) {
        return 0;
    }
    int64_t beginTime = av_gettime();
    int64_t lastAudioPts = 0;
    while (true) {

        if (!rtmpStreamer->isPushStream || pVideoCapture->GetCaptureState()) {
            break;
        }

        OriginData *pVideoData = pVideoCapture->GetVideoData();
//        OriginData *pAudioData = pAudioCapture->GetAudioData();
        //h264 encode
        if (pVideoData != NULL && pVideoData->data) {
//            if (pAudioData && pAudioData->pts > pVideoData->pts) {
//                int64_t overValue=pAudioData->pts-pVideoData->pts;
//                pVideoData->pts+=overValue+1000;
//                LOG_D(DEBUG, "synchronized video audio pts videoPts:%lld audioPts:%lld", pVideoData->pts,pAudioData->pts);
//            }
            pVideoData->pts = pVideoData->pts - beginTime;
            LOG_D(DEBUG, "before video encode pts:%lld", pVideoData->pts);
            rtmpStreamer->videoEncoder->EncodeH264(&pVideoData);
            LOG_D(DEBUG, "after video encode pts:%lld", pVideoData->avPacket->pts);
        }

        if (pVideoData != NULL && pVideoData->avPacket->size > 0) {
            rtmpStreamer->SendFrame(pVideoData, rtmpStreamer->videoStreamIndex);
        }
    }
    return 0;
}


int RtmpStreamer::StartPushStream() {
    videoStreamIndex = AddStream(videoEncoder->videoCodecContext);
    audioStreamIndex = AddStream(audioEncoder->audioCodecContext);
    pthread_create(&t3, NULL, RtmpStreamer::WriteHead, this);
    pthread_join(t3, NULL);

    VideoCapture *pVideoCapture = videoEncoder->GetVideoCapture();
    AudioCapture *pAudioCapture = audioEncoder->GetAudioCapture();
    pVideoCapture->videoCaputureframeQueue.clear();
    pAudioCapture->audioCaputureframeQueue.clear();

    if(writeHeadFinish) {
        pthread_create(&t1, NULL, RtmpStreamer::PushAudioStreamTask, this);
        pthread_create(&t2, NULL, RtmpStreamer::PushVideoStreamTask, this);
    } else {
        return -1;
    }

//    pthread_create(&t2, NULL, RtmpStreamer::PushStreamTask, this);
    return 0;
}

When rtmpStreamer->StartPushStream() is called, RtmpStreamer::StartPushStream() runs; inside it, new threads are started:

    pthread_create(&t1, NULL, RtmpStreamer::PushAudioStreamTask, this);
    pthread_create(&t2, NULL, RtmpStreamer::PushVideoStreamTask, this);

The main steps in PushVideoStreamTask are:

  • Get cached data from the VideoCapture queue: pVideoCapture->GetVideoData().
  • Calculate the PTS: pVideoData->pts = pVideoData->pts - beginTime.
  • Encode the frame: rtmpStreamer->videoEncoder->EncodeH264(&pVideoData).
  • Push the stream: rtmpStreamer->SendFrame(pVideoData, rtmpStreamer->videoStreamIndex).

That completes the whole encode-and-push flow. So how is the encoding actually done? Since the encoders were initialized before pushing started, RtmpStreamer simply calls into VideoEncoder; VideoCapture and RtmpStreamer act as producer and consumer. VideoEncoder::EncodeH264() performs the video encoding, the key step before the stream is pushed.

int VideoEncoder::EncodeH264(OriginData **originData) {
    av_image_fill_arrays(outputYUVFrame->data,
                         outputYUVFrame->linesize, (*originData)->data,
                         AV_PIX_FMT_YUV420P, videoCodecContext->width,
                         videoCodecContext->height, 1);
    outputYUVFrame->pts = (*originData)->pts;
    int ret = 0;
    ret = avcodec_send_frame(videoCodecContext, outputYUVFrame);
    if (ret != 0) {
#ifdef SHOW_DEBUG_INFO
        LOG_D(DEBUG, "avcodec video send frame failed");
#endif
    }
    av_packet_unref(&videoPacket);
    ret = avcodec_receive_packet(videoCodecContext, &videoPacket);
    if (ret != 0) {
#ifdef SHOW_DEBUG_INFO
        LOG_D(DEBUG, "avcodec video receive packet failed");
#endif
    }
    (*originData)->Drop();
    (*originData)->avPacket = &videoPacket;
#ifdef SHOW_DEBUG_INFO
    LOG_D(DEBUG, "encode video packet size:%d pts:%lld", (*originData)->avPacket->size,
          (*originData)->avPacket->pts);
    LOG_D(DEBUG, "Video frame encode success!");
#endif
    (*originData)->avPacket->size;
    return videoPacket.size;
}
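Note that EncodeH264() above calls avcodec_receive_packet() exactly once per avcodec_send_frame(). For reference, the canonical FFmpeg (3.1+) encode loop drains every packet the encoder produces and treats AVERROR(EAGAIN) as "needs more input"; the sketch below is illustrative only, not the MediaPlus code:

extern "C" {
#include <libavcodec/avcodec.h>
}

// Canonical send/receive encode loop; onPacket hands each encoded packet to the muxer.
static int EncodeAndDrain(AVCodecContext *codecCtx, AVFrame *frame,
                          int (*onPacket)(AVPacket *pkt)) {
    int ret = avcodec_send_frame(codecCtx, frame); // frame == NULL flushes the encoder
    if (ret < 0) {
        return ret;
    }
    AVPacket *pkt = av_packet_alloc();
    while ((ret = avcodec_receive_packet(codecCtx, pkt)) == 0) {
        onPacket(pkt);
        av_packet_unref(pkt);
    }
    av_packet_free(&pkt);
    return (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF) ? 0 : ret;
}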

In EncodeH264(), the AVFrame is filled and then encoded: the AVFrame holds the raw data before encoding, and the AVPacket holds the encoded data. The encoded packet is sent out via RtmpStreamer::SendFrame(), which rescales the PTS/DTS from the encoder's time base to the AVStream's time base.
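To make the time-base conversion concrete: av_rescale_q(a, bq, cq) computes a·bq/cq with rounding. A tiny illustrative example (the time-base values are made up for demonstration, not taken from MediaPlus):

#include <cstdint>
#include <cstdio>

extern "C" {
#include <libavutil/rational.h>     // AVRational
#include <libavutil/mathematics.h>  // av_rescale_q
}

int main() {
    AVRational encTb = {1, 25};    // e.g. an encoder time base of 1/25
    AVRational strTb = {1, 1000};  // e.g. an FLV/RTMP stream time base of 1/1000
    int64_t ptsEnc = 50;           // the 50th tick at 1/25 -> 2 seconds
    int64_t ptsStr = av_rescale_q(ptsEnc, encTb, strTb);
    printf("pts %lld (1/25) -> %lld (1/1000)\n", (long long) ptsEnc, (long long) ptsStr);
    return 0;                      // prints: pts 50 (1/25) -> 2000 (1/1000)
}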


int RtmpStreamer::SendFrame(OriginData *pData, int streamIndex) {
    std::lock_guard<std::mutex> lk(mut1);
    AVRational stime;
    AVRational dtime;
    AVPacket *packet = pData->avPacket;
    packet->stream_index = streamIndex;
    LOG_D(DEBUG, "write packet index:%d index:%d pts:%lld", packet->stream_index, streamIndex, packet->pts); // Determine whether it is audio or videoif (packet->stream_index == videoStreamIndex) {
        stime = videoCodecContext->time_base;
        dtime = videoStream->time_base;
    }
    else if (packet->stream_index == audioStreamIndex) {
        stime = audioCodecContext->time_base;
        dtime = audioStream->time_base;
    }
    else {
        LOG_D(DEBUG, "unknow stream index");
        return- 1; } packet->pts = av_rescale_q(packet->pts, stime, dtime); packet->dts = av_rescale_q(packet->dts, stime, dtime); packet->duration = av_rescale_q(packet->duration, stime, dtime); int ret = av_interleaved_write_frame(iAvFormatContext, packet);if (ret == 0) {
        if (streamIndex == audioStreamIndex) {
            LOG_D(DEBUG, "---------->write @@@@@@@@@ frame success------->!");
        } else if (streamIndex == videoStreamIndex) {
            LOG_D(DEBUG, "---------->write ######### frame success------->!"); }}else {
        char buf[1024] = {0};
        av_strerror(ret, buf, sizeof(buf));
        LOG_D(DEBUG, "stream index %d writer frame failed! :%s", streamIndex, buf);
    }
    return 0;
}

That is the whole MediaPlus flow for H.264 encoding and RTMP stream pushing; related articles are to be continued. My ability is limited, so please point out any mistakes.

Copyright notice: This article is an original article, please indicate the source.

Code address: github.com/javandoc/Me…