Original article. Please do not reproduce without the author's permission.

The autumn wind is clear, the autumn moon is bright; fallen leaves drift, hard to gather back into dreams.

MediaLearn

Welcome to my project MediaLearn, which I created to learn and share audio and video knowledge. It is currently limited to the Android platform and will be expanded gradually. If you are interested in the audio and video field, you are welcome to learn together!

In the last article, Camera2 recording video (I): Audio recording and encoding, I covered the audio recording and encoding part of recording video with Camera2, MediaCodeC, and MediaMuxer. In this article, we move on to the video encoding done with MediaCodeC and the Mux module that uses MediaMuxer to compose the final video. I have also written about hard-encoding video with MediaCodeC before; you can review those articles via the links below.

  • MediaCodeC hard-encodes a set of photos into an Mp4 video file (hard-encoding video)
  • MediaCodeC decodes a video in its entirety and stores it as picture files, using two different methods (hard-decoding video)
  • MediaCodeC decodes specified frames of a video (hard-decoding specified frames)

Overview

The camera API used in the project is Camera2

Before the article begins, let's follow the usual routine: starting from the result, walk through the process. Take a look at how the flow works and what happens to the data during the video recording stage.

Walking through the process

When the device’s camera is running, the sensor converts the optical signal into an electrical signal, which is then converted into a digital signal. The sensor can output four image formats: YUV, RGB, RAW RGB DATA, and JPEG. YUV is the most commonly used: the brightness signal in YUV output is lossless, while RGB suffers some loss of the original information. RAW DATA is the raw sensor information; it takes more storage space and requires special software to open.

In the deprecated Camera API, the default preview callback data format is NV21. Camera2, by contrast, only provides a Surface as the bridge object; if you want YUV, JPEG, or RAW_SENSOR data, you can use the Surface provided by an ImageReader and listen for the image data. Both Camera and Camera2 support setting a Surface. Through a Surface, the data captured by the camera can be handed directly to the GPU and rendered with OpenGL, which saves a lot of time by avoiding any processing of camera frame data on the CPU.

Okay, let's use a flow chart to show how MediaCodeC encodes camera frame data.

  • 1. Since we need H264 data, configure MediaCodeC's MIME type as video/avc.
  • 2. After MediaCodeC is configured, call createInputSurface to create an input Surface as its input.
  • 3. Use that input Surface as the parameter to configure a window surface in the Android EGL environment.
  • 4. Create an OpenGL program and obtain an available texture ID to build a SurfaceTexture. This object can be supplied to the deprecated Camera API directly, or wrapped in a Surface and provided to the Camera2 API.

Through the steps above, the data collected by the camera can be sent directly to the GPU, without any laborious processing on the CPU.
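To make steps 1 and 2 concrete, here is a minimal sketch of configuring an H264 encoder whose input comes from a Surface. The function name and the size, bit-rate, and frame-rate values are placeholders of my own, not the project's actual settings.

import android.media.MediaCodec
import android.media.MediaCodecInfo
import android.media.MediaFormat
import android.view.Surface

// A minimal sketch: configure an H264 encoder that takes its input from a Surface.
// Width/height/bitrate/frame-rate values are placeholders, not the project's settings.
fun createVideoEncoder(width: Int = 1280, height: Int = 720): Pair<MediaCodec, Surface> {
    val format = MediaFormat.createVideoFormat(MediaFormat.MIMETYPE_VIDEO_AVC, width, height).apply {
        // Input comes from a Surface, so the color format must be COLOR_FormatSurface
        setInteger(MediaFormat.KEY_COLOR_FORMAT,
            MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface)
        setInteger(MediaFormat.KEY_BIT_RATE, 4_000_000)
        setInteger(MediaFormat.KEY_FRAME_RATE, 30)
        setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 1)
    }
    val codec = MediaCodec.createEncoderByType(MediaFormat.MIMETYPE_VIDEO_AVC)
    codec.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE)
    // createInputSurface must be called after configure and before start
    val inputSurface = codec.createInputSurface()
    return codec to inputSurface
}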

Code implementation

As I mentioned in the last article, every feature involved in video recording is modularized for future reuse. So the video encoding part is packaged into a Runnable, VideoRecorder. VideoRecorder's responsibility is to encapsulate the MediaCodeC + OpenGL encoding pipeline: it exposes a Surface backed by an OpenGL texture, and calls back the hard-encoded ByteBuffers, BufferInfo, and other video-frame-related information.

codec.configure(mediaFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE)
val s = codec.createInputSurface()
val surfaceTexture = encodeCore.buildEGLSurface(s)
val inputSurface = Surface(surfaceTexture)
// Hand out the Surface built from the OpenGL texture
readySurface.invoke(inputSurface)
// Start encoding
codec.start()
// Start timing
val startTime = System.nanoTime()
// Use a list to keep the video recording thread in sync with the audio recording thread and the Mux thread
while (isRecord.isNotEmpty()) {
    // Drain encoded data
    drainEncoder(false)
    frameCount++
    // OpenGL draw
    encodeCore.draw()
    val curFrameTime = System.nanoTime() - startTime
    encodeCore.swapData(curFrameTime)
}
drainEncoder(true)

The above pseudocode represents the entire video encoding flow. First, we need a properly configured MediaCodeC. MediaCodeC's createInputSurface function returns a Surface object, which I call the InputSurface. Then we configure the EGL environment and build the OpenGL program. All OpenGL-related code is encapsulated in the SurfaceEncodeCore class, whose responsibilities are: building the EGL environment, configuring the OpenGL program, and drawing the texture.

OpenGL itself is not responsible for window management or context management; that is left to each platform. On Android, EGL provides window and context management for OpenGL, and rendered output reaches the device screen through an EGLSurface. There are two ways to create an EGLSurface: one is eglCreateWindowSurface, which takes a Surface as an argument and creates a surface that can actually be displayed; the other is eglCreatePbufferSurface, which creates an off-screen surface. (A minimal sketch of this EGL setup appears after the encoder code below.)

At this point, MediaCodeC's input channel is set up; during recording it will continuously receive the camera's callback data, processed through the OpenGL program. All we need to do is keep extracting the encoded H264 stream from MediaCodeC and call back the video frame data. The drainEncoder function is implemented as follows:

fun MediaCodec.handleOutputBuffer(bufferInfo: MediaCodec.BufferInfo, defTimeOut: Long,
                                  formatChanged: () -> Unit = {},
                                  render: (bufferId: Int) -> Unit,
                                  needEnd: Boolean = true) {
    loopOut@ while (true) {
        val outputBufferId = dequeueOutputBuffer(bufferInfo, defTimeOut)
        Log.d("handleOutputBuffer", "output buffer id : $outputBufferId ")
        if (outputBufferId == MediaCodec.INFO_TRY_AGAIN_LATER) {
            if (needEnd) {
                break@loopOut
            }
        } else if (outputBufferId == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
            formatChanged.invoke()
        } else if (outputBufferId >= 0) {
            render.invoke(outputBufferId)
            if (bufferInfo.flags and MediaCodec.BUFFER_FLAG_END_OF_STREAM != 0) {
                break@loopOut
            }
        }
    }
}

private fun drainEncoder(isEnd: Boolean = false) {
    if (isEnd) {
        codec.signalEndOfInputStream()
    }
    codec.handleOutputBuffer(bufferInfo, 2500, {
        // Report the output format once, when the encoder first produces it
        if (!isFormatChanged) {
            outputFormatChanged.invoke(codec.outputFormat)
            isFormatChanged = true
        }
    }, {
        val encodedData = codec.getOutputBuffer(it)
        if (bufferInfo.flags and MediaCodec.BUFFER_FLAG_CODEC_CONFIG != 0) {
            // Codec config data is delivered via the output format, so skip it here
            bufferInfo.size = 0
        }
        if (bufferInfo.size != 0) {
            Log.d(TAG, "buffer info offset ${bufferInfo.offset} time is ${bufferInfo.presentationTimeUs} ")
            encodedData.position(bufferInfo.offset)
            encodedData.limit(bufferInfo.offset + bufferInfo.size)
            Log.d(TAG, "sent " + bufferInfo.size + " bytes to muxer")
            dataCallback.invoke(frameCount, bufferInfo.presentationTimeUs, bufferInfo, encodedData)
        }
        codec.releaseOutputBuffer(it, false)
    }, !isEnd)
}

This is familiar boilerplate for handling MediaCodeC's output: based on the ID returned by dequeueOutputBuffer, we determine the encoder's state, handle each case separately, and call back the encoded data to the outside.
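For reference, here is a minimal sketch of the eglCreateWindowSurface path described earlier, using the EGL14 API with the encoder's input Surface as the native window. In the project this logic lives inside SurfaceEncodeCore; the class and function names below are my own placeholders.

import android.opengl.EGL14
import android.opengl.EGLConfig
import android.opengl.EGLContext
import android.opengl.EGLDisplay
import android.opengl.EGLSurface
import android.view.Surface

// A minimal sketch of binding MediaCodeC's input Surface to an EGL window surface.
// Error handling is omitted; the project's SurfaceEncodeCore encapsulates the real version.
class EglCoreSketch {
    private val display: EGLDisplay = EGL14.eglGetDisplay(EGL14.EGL_DEFAULT_DISPLAY)
    private lateinit var config: EGLConfig
    private lateinit var eglContext: EGLContext

    init {
        val version = IntArray(2)
        EGL14.eglInitialize(display, version, 0, version, 1)
        val attribs = intArrayOf(
            EGL14.EGL_RED_SIZE, 8,
            EGL14.EGL_GREEN_SIZE, 8,
            EGL14.EGL_BLUE_SIZE, 8,
            EGL14.EGL_ALPHA_SIZE, 8,
            EGL14.EGL_RENDERABLE_TYPE, EGL14.EGL_OPENGL_ES2_BIT,
            0x3142, 1, // EGL_RECORDABLE_ANDROID: the surface feeds a video encoder
            EGL14.EGL_NONE
        )
        val configs = arrayOfNulls<EGLConfig>(1)
        val numConfigs = IntArray(1)
        EGL14.eglChooseConfig(display, attribs, 0, configs, 0, configs.size, numConfigs, 0)
        config = configs[0]!!
        val ctxAttribs = intArrayOf(EGL14.EGL_CONTEXT_CLIENT_VERSION, 2, EGL14.EGL_NONE)
        eglContext = EGL14.eglCreateContext(display, config, EGL14.EGL_NO_CONTEXT, ctxAttribs, 0)
    }

    // inputSurface is the Surface returned by MediaCodec.createInputSurface()
    fun makeWindowSurfaceCurrent(inputSurface: Surface): EGLSurface {
        val eglSurface = EGL14.eglCreateWindowSurface(
            display, config, inputSurface, intArrayOf(EGL14.EGL_NONE), 0)
        EGL14.eglMakeCurrent(display, eglSurface, eglSurface, eglContext)
        return eglSurface
    }
}

Once eglMakeCurrent has been called, everything the OpenGL program draws lands in the encoder's input Surface, and EGL14.eglSwapBuffers pushes each finished frame to MediaCodeC; the swapData(curFrameTime) call in the pseudocode presumably sets the frame timestamp (for example via EGLExt.eglPresentationTimeANDROID) before swapping.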

Mixer Mux module

So far we have completed the video recording/encoding module and the audio recording/encoding module; all that remains for the full recording feature is a Mux module, which strings together the data provided by the other two modules and outputs an Mp4 file. There is nothing technically fancy in the Mux module: its job is to maintain two data queues, one for video frames and one for audio frames. In an endless loop, it takes the head of each queue, compares the timestamps of the video frame and the audio frame, and writes the earlier one first. See Muxer for the code.

// p: the output file path
mediaMuxer = MediaMuxer(p, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4)
// Add the video track
val videoTrackId = mediaMuxer!!.addTrack(videoTrack)
// Add the audio track
val audioTrackId = mediaMuxer!!.addTrack(audioTrack)
mediaMuxer!!.start()
while (isRecord.isNotEmpty()) {
    val videoFrame = videoQueue.firstSafe
    val audioFrame = audioQueue.firstSafe
    val videoTime = videoFrame?.bufferInfo?.presentationTimeUs ?: -1L
    val audioTime = audioFrame?.bufferInfo?.presentationTimeUs ?: -1L
    // Compare the timestamps of the audio frame and the video frame
    if (videoTime == -1L && audioTime != -1L) {
        writeAudio(audioTrackId)
    } else if (audioTime == -1L && videoTime != -1L) {
        writeVideo(videoTrackId)
    } else if (audioTime != -1L && videoTime != -1L) {
        // Write the frame with the smaller timestamp first
        if (audioTime < videoTime) {
            writeAudio(audioTrackId)
        } else {
            writeVideo(videoTrackId)
        }
    } else {
        // do nothing
    }
}
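The writeAudio/writeVideo helpers are not shown in the excerpt above. As a rough illustration of what they might do, here is a minimal sketch that assumes each queued frame carries a ByteBuffer plus its MediaCodec.BufferInfo; the Frame class and queue types are my own assumptions, not the project's actual code.

import android.media.MediaCodec
import android.media.MediaMuxer
import java.nio.ByteBuffer
import java.util.concurrent.LinkedBlockingDeque

// Hypothetical frame container; the project's actual class may differ.
data class Frame(val buffer: ByteBuffer, val bufferInfo: MediaCodec.BufferInfo)

class MuxerSketch(private val mediaMuxer: MediaMuxer) {
    val videoQueue = LinkedBlockingDeque<Frame>()
    val audioQueue = LinkedBlockingDeque<Frame>()

    // Pop the head of the video queue and hand it to MediaMuxer
    fun writeVideo(videoTrackId: Int) {
        val frame = videoQueue.poll() ?: return
        mediaMuxer.writeSampleData(videoTrackId, frame.buffer, frame.bufferInfo)
    }

    // Pop the head of the audio queue and hand it to MediaMuxer
    fun writeAudio(audioTrackId: Int) {
        val frame = audioQueue.poll() ?: return
        mediaMuxer.writeSampleData(audioTrackId, frame.buffer, frame.bufferInfo)
    }
}

Note that drainEncoder already sets the buffer's position and limit to frame the encoded sample before invoking dataCallback, so the frames can be handed to MediaMuxer.writeSampleData as-is.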

Camera2 video size selection

All right, the whole video recording feature has now been covered. Next, let's look at a video-size selection problem, the point in Camera2 that confused me the most: how do you choose the video size? According to Android's official documentation, access to Camera2 camera data must go through a Surface: "To capture or stream images from a camera device, the application must first create a camera capture session with a set of output Surfaces for use with the camera device, with createCaptureSession(SessionConfiguration)." Camera2 then matches the size of your Surface: "Each Surface has to be pre-configured with an appropriate size and format (if applicable) to match the sizes and formats available from the camera device." In other words, every Surface must be configured in advance with a size that matches what the camera supports. Once Camera2 is set up, it can report a list of sizes representing everything the current camera on the device supports. While testing video recording, my test device returned a list of supported sizes.

OK, then. Now that I knew which sizes were supported, I could set MediaCodeC's width and height from that list and expect no further image processing to be required. With the first size in the list, the recorded video was not distorted. Other sizes, such as 720 X 960 and 720 X 1280, could be selected but were severely deformed; yet when I chose sizes with equal width and height, such as 1088 X 1088 or 960 X 960, the recorded image showed no distortion. I don't have a full explanation, because I don't know the mechanism by which the Surface matches sizes, nor how to crop when drawing in OpenGL, which remains a big open problem. If anyone can shed light on this, I would appreciate your suggestions. I also took some detours while doing the video image processing [manual dog head]. At first I misjudged where the Surface's size comes from, which led to incorrect logic around the video size. In fact, SurfaceTexture's setDefaultBufferSize function can be used to do the size matching; note, however, that Camera2 reports sizes with width and height swapped, so setDefaultBufferSize must be called with width and height swapped to match.
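As a final illustration, here is a hedged sketch of the setDefaultBufferSize approach: query the output sizes the camera supports and apply one to the SurfaceTexture with width and height swapped. The matching strategy (picking a size whose aspect ratio fits the target) is my own assumption, not necessarily how the project chooses a size.

import android.content.Context
import android.graphics.SurfaceTexture
import android.hardware.camera2.CameraCharacteristics
import android.hardware.camera2.CameraManager
import android.util.Size

// Query the sizes supported by a camera and apply one to a SurfaceTexture.
// The matching strategy is an illustrative assumption, not the project's actual logic.
fun configureBufferSize(context: Context, cameraId: String,
                        surfaceTexture: SurfaceTexture,
                        targetAspect: Float): Size? {
    val manager = context.getSystemService(Context.CAMERA_SERVICE) as CameraManager
    val characteristics = manager.getCameraCharacteristics(cameraId)
    val map = characteristics.get(CameraCharacteristics.SCALER_STREAM_CONFIGURATION_MAP) ?: return null
    // Sizes are reported in sensor (landscape) orientation, e.g. 1920x1080
    val sizes = map.getOutputSizes(SurfaceTexture::class.java)
    val chosen = sizes.firstOrNull {
        // Compare against the swapped ratio because the app runs in portrait
        Math.abs(it.height.toFloat() / it.width - targetAspect) < 0.01f
    } ?: sizes.first()
    // Swap width and height so the buffer matches the portrait-oriented output
    surfaceTexture.setDefaultBufferSize(chosen.height, chosen.width)
    return chosen
}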

That's all for this article.