Douyin is very popular right now. Are you interested in building a similar app?

1. Producing short video content

Producing high-quality short videos depends on video capture and editing. A Douyin-style app needs basic features such as beauty filters, audio mixing, filters, variable speed, mixed editing of images and video, and subtitles. On top of these features, preprocessing combined with OpenGL, AI, and AR techniques makes it possible to produce many interesting animated stickers and make short videos more creative.

The general recording flow is as follows. Camera and AudioRecord capture the raw pictures and sound, and the captured data is preprocessed with filters and noise reduction. MediaCodec then hardware-encodes the processed data, and finally MediaMuxer generates the MP4 file.

2. Short video processing and playback

Video processing and playback are mainly about perceived clarity and smooth viewing. Here, "Narrowband HD" technology can lower the bit rate while still giving a clearer picture; in testing, it saved 20-40% of bandwidth at the same video quality. Besides bandwidth, storage of the short video content and CDN optimization are also important. Short videos and their cover images usually need to be uploaded to a cloud storage server.

CDN optimization further improves the first-load and loop-playback experience on a short video platform. For example, to address slow first-frame loading, the Alibaba Cloud player supports the QUIC protocol and CDN-based scheduling, which can bring the share of first plays that start within one second to 98%. It can also cache while playing, so that users do not consume additional traffic when they watch the same short video repeatedly.

3. Video recording method

On Android, there are three mainstream ways to produce an MP4 video file on the device: MediaRecorder, MediaCodec+MediaMuxer, and FFmpeg.

MediaRecorder: This class is provided directly by Android for audio and video recording. It is simple and convenient: you do not need to care about the intermediate recording process, and the file can be played back as soon as recording ends. The recorded file is compressed, so you need to set an encoder, and the result can be played with the system's built-in player.

Advantages: Highly integrated; you call the relevant interfaces directly, so the amount of code is small, and it is simple and stable.

Disadvantages: Cannot process audio and video data in real time; few output formats are supported.

MediaCodec+MediaMuxer: The combination of MediaCodec and MediaMuxer can also implement recording. MediaCodec is the codec class provided by Android, and MediaMuxer is the multiplexer class that generates the video file. This combination is not as easy to use as MediaRecorder, but it allows more flexible operations, such as adding a watermark or other effects to the recorded video.

Advantages: Like MediaRecorder, it is fast and has low power consumption, and it is more flexible.

Disadvantages: Limited supported formats, compatibility issues

FFmpeg: FFmpeg (Fast Forward MPEG) is a free, open source, cross-platform audio and video solution that covers recording, encoding and decoding, conversion, and streaming. Its main functions are protocol handling, demuxing, decoding, and transcoding of multimedia data.

Advantages: format support is very strong, very flexible, powerful, good compatibility;

Disadvantages: It is a set of audio and video encoding and decoding programs written in C, so it is not very convenient to use from an Android app.

Although FFmpeg looks the most capable on paper, we rule it out first because it is the hardest to use. MediaRecorder is ruled out next because it cannot process data in real time, so the approach I recommend here is MediaCodec+MediaMuxer.

4. Encoder parameters

Bit rate: The number of bits transmitted per unit of time, measured in kbps (kilobits per second). A higher bit rate gives higher quality and also a proportionally larger file. Beyond a certain value, however, raising the bit rate further has little visible effect on image quality.

Frame rate: The number of frames displayed per second, measured in fps.

Key frame interval: The output of H.264 encoding contains several kinds of compressed frames, which can be divided simply into key frames and non-key frames. A key frame can be decoded independently and can be regarded as a complete compressed image. A non-key frame stores only its differences from other frames (its reference frames), so it can only be decoded when those frames are available. Non-key frames therefore compress better. The key frame interval sets how often a key frame is emitted.
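The three parameters above can be related with a little arithmetic. The following is a minimal sketch in plain Java, not code from the original project: the class name EncoderMath, its method names, and the sample figures (2000 kbps video, 128 kbps audio, 30 fps, a key frame every 2 seconds, a 15-second clip) are all assumptions chosen for illustration. It ignores container overhead and assumes a fixed key frame schedule, which a real encoder is not obliged to follow.

```java
/** Hypothetical helper relating bit rate, frame rate, and key frame interval. */
final class EncoderMath {

    /** Rough clip size in bytes; kbps means 1000 bits per second. Container overhead is ignored. */
    static long estimateBytes(int videoKbps, int audioKbps, int durationSeconds) {
        long bitsPerSecond = (long) (videoKbps + audioKbps) * 1000L;
        return bitsPerSecond * durationSeconds / 8L; // 8 bits per byte
    }

    /** Number of frames from one key frame to the next (the GOP length). */
    static int gopLength(int fps, int keyFrameIntervalSeconds) {
        return Math.max(1, fps * keyFrameIntervalSeconds);
    }

    /** True when the frame at this index is a key frame under a fixed schedule. */
    static boolean isKeyFrame(int frameIndex, int fps, int keyFrameIntervalSeconds) {
        return frameIndex % gopLength(fps, keyFrameIntervalSeconds) == 0;
    }

    /** Number of key frames among the first totalFrames frames. */
    static int countKeyFrames(int totalFrames, int fps, int keyFrameIntervalSeconds) {
        int gop = gopLength(fps, keyFrameIntervalSeconds);
        return (totalFrames + gop - 1) / gop; // ceiling division
    }

    public static void main(String[] args) {
        // Assumed example values, not recommendations.
        System.out.println(estimateBytes(2000, 128, 15)); // prints 3990000
        System.out.println(gopLength(30, 2));             // prints 60
        System.out.println(countKeyFrames(15 * 30, 30, 2)); // prints 8
    }
}
```

On Android, the frame rate and key frame interval are normally passed to the encoder through MediaFormat.KEY_FRAME_RATE and MediaFormat.KEY_I_FRAME_INTERVAL, and the bit rate through MediaFormat.KEY_BIT_RATE (in bits per second).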

5. Using MediaCodec+MediaMuxer

For the MediaMuxer and MediaCodec classes, see the reference articles … and …, which include a usage framework. This combination can implement many features, such as editing audio and video files (together with MediaExtractor), drawing to a Surface with OpenGL and generating an MP4 file, screen recording, and the video recording function of a camera app (although MediaRecorder is a better fit for that last case).

One of those examples generates video and the other generates audio, so here we combine them to generate audio and video together. The basic framework and process are as follows:

First comes the audio thread, which is mainly based on HWEncoderExperiments. It receives sampled data from the microphone through the AudioRecord class and passes it to the encoder:

AudioRecord audio_recorder;
audio_recorder = new AudioRecord(MediaRecorder.AudioSource.MIC,
// ...
while (is_recording) {
    byte[] this_buffer = new byte[frame_buffer_size];
    read_result = audio_recorder.read(this_buffer, 0, frame_buffer_size); // read raw audio data
    // ...
    presentationTimeStamp = System.nanoTime() / 1000; // PTS in microseconds
    audioEncoder.offerAudioEncoder(this_buffer.clone(), presentationTimeStamp); // feed to audio encoder
}


You can also set an AudioRecord callback (through setRecordPositionUpdateListener()) to trigger reading of the audio data. offerAudioEncoder() puts the audio samples into an input buffer of the audio MediaCodec for encoding:

ByteBuffer[] inputBuffers = mAudioEncoder.getInputBuffers();
int inputBufferIndex = mAudioEncoder.dequeueInputBuffer(-1); // -1: block until an input buffer is free
if (inputBufferIndex >= 0) {
    ByteBuffer inputBuffer = inputBuffers[inputBufferIndex];
    inputBuffer.clear();
    inputBuffer.put(this_buffer);
    // ...
    mAudioEncoder.queueInputBuffer(inputBufferIndex, 0, this_buffer.length, presentationTimeStamp, 0);
}
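One detail worth guarding against in the step above: ByteBuffer.put(byte[]) throws BufferOverflowException when the array is larger than the space left in the buffer. The sketch below shows one way to copy only what fits, using plain java.nio so that it runs without an Android device. The class name InputBufferFiller, the fill() method, and the buffer sizes are assumptions for illustration, not part of the original code.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper: copy as much PCM data as fits into one codec input buffer. */
final class InputBufferFiller {

    /**
     * Copies bytes from pcm, starting at offset, into inputBuffer without overflowing it.
     * Returns the number of bytes copied; the caller puts the remainder in the next buffer.
     */
    static int fill(ByteBuffer inputBuffer, byte[] pcm, int offset) {
        inputBuffer.clear();
        int length = Math.min(inputBuffer.remaining(), pcm.length - offset);
        inputBuffer.put(pcm, offset, length);
        return length;
    }

    public static void main(String[] args) {
        ByteBuffer codecBuffer = ByteBuffer.allocate(4096); // stand-in for a MediaCodec input buffer
        byte[] pcm = new byte[10000];                        // stand-in for one AudioRecord read
        int offset = 0;
        while (offset < pcm.length) {
            int copied = fill(codecBuffer, pcm, offset);
            // In the real encoder, queueInputBuffer(index, 0, copied, pts, 0) would be called here.
            System.out.println("queued " + copied + " bytes");
            offset += copied;
        }
    }
}
```

In practice the read size passed to AudioRecord is usually chosen to be no larger than the codec's input buffer, in which case a single put() is enough.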

Next, the video side follows Grafika's SoftInputSurfaceActivity, with audio processing added. The main loop can be divided into four parts:

try {
    // Part 1
    prepareEncoder(outputFile);
    // ...
    // Part 2
    for (int i = 0; i < NUM_FRAMES; i++) {
        // ...
    }
    // Part 3
    // ...
    drainVideoEncoder(true);
} catch (IOException ioe) {
    throw new RuntimeException(ioe);
} finally {
    // Part 4
    // ...
}

Part 1 is initialization. Besides the video MediaCodec, the audio MediaCodec is also initialized here:

MediaFormat audioFormat = new MediaFormat();
audioFormat.setInteger(MediaFormat.KEY_SAMPLE_RATE, 44100);
audioFormat.setInteger(MediaFormat.KEY_CHANNEL_COUNT, 1);
// ...
mAudioEncoder = MediaCodec.createEncoderByType(AUDIO_MIME_TYPE);
mAudioEncoder.configure(audioFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);

Part 2 enters the main loop. The app draws directly on the Surface. Because that Surface is obtained from MediaCodec with createInputSurface(), there is no explicit queueInputBuffer() call to the encoder after drawing. drainVideoEncoder() and drainAudioEncoder() pull the encoded video and audio out of the output buffers (via dequeueOutputBuffer()) and hand them to MediaMuxer for multiplexing (via writeSampleData()). Note that audio and video are synchronized through the PTS (presentation timestamp), which determines when a given frame of audio or video is displayed or played. The audio timestamp should be taken when AudioRecord captures the data from the microphone, so that it ends up in the corresponding bufferInfo. Because the video is drawn on the Surface, we use the bufferInfo filled in by dequeueOutputBuffer() directly.
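The article's code stamps each audio buffer with System.nanoTime() / 1000 at read time. A commonly used alternative, shown here only as a sketch and not as the method used above, derives the audio PTS from the number of samples already captured, which keeps the timestamps evenly spaced even if the recording thread is scheduled late. The class name AudioPts and its methods are hypothetical, and 16-bit PCM is assumed.

```java
/** Hypothetical helper: sample-count-based audio timestamps in microseconds. */
final class AudioPts {
    private static final int BYTES_PER_SAMPLE = 2; // assumes 16-bit PCM

    private final int sampleRate;
    private final int channelCount;
    private final long startUs;
    private long totalFrames = 0; // one frame = one sample for every channel

    AudioPts(int sampleRate, int channelCount, long startUs) {
        this.sampleRate = sampleRate;
        this.channelCount = channelCount;
        this.startUs = startUs;
    }

    /** Returns the PTS of the buffer about to be queued, then advances by its duration. */
    long next(int bytesRead) {
        long ptsUs = startUs + totalFrames * 1_000_000L / sampleRate;
        totalFrames += bytesRead / (BYTES_PER_SAMPLE * channelCount);
        return ptsUs;
    }

    public static void main(String[] args) {
        long startUs = System.nanoTime() / 1000; // same clock and unit as the article's code
        AudioPts pts = new AudioPts(44100, 1, startUs);
        // 88200 bytes of 16-bit mono audio at 44100 Hz is exactly one second.
        System.out.println(pts.next(88200) - startUs); // prints 0
        System.out.println(pts.next(88200) - startUs); // prints 1000000
    }
}
```

Whichever approach is used, the audio and video timestamps have to come from the same clock and use the same unit, otherwise the two tracks drift apart in the muxed file.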

Note that the muxer must not be started until both the audio track and the video track have been added. dequeueOutputBuffer() returns INFO_OUTPUT_FORMAT_CHANGED once, the first time it is called on a MediaCodec. At that point we only need to get the output format from the MediaCodec and register it with the MediaMuxer, then check whether both the audio track and the video track are ready and, if so, start the muxer.
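Because the audio and video encoders usually run on different threads, the "start only when every track has been added" rule is easy to get wrong. The following plain-Java model is a minimal sketch of that rule; the class MuxerStartGate and its methods are hypothetical names and do not wrap the real MediaMuxer. It only tracks the state that decides who calls start() and when writing is allowed.

```java
/** Hypothetical model of the rule: start the muxer only when all tracks are added. */
final class MuxerStartGate {
    private final int totalTracks;
    private int tracksAdded = 0;
    private boolean started = false;

    MuxerStartGate(int totalTracks) {
        this.totalTracks = totalTracks;
    }

    /**
     * Call once per encoder when it reports INFO_OUTPUT_FORMAT_CHANGED.
     * Returns true for exactly one caller: the one that should start the muxer.
     */
    synchronized boolean onTrackAdded() {
        if (started) {
            throw new IllegalStateException("track added after start");
        }
        tracksAdded++;
        if (tracksAdded == totalTracks) {
            started = true;
            return true;
        }
        return false;
    }

    /** Encoded samples may be written only after the muxer has started. */
    synchronized boolean canWrite() {
        return started;
    }

    public static void main(String[] args) {
        MuxerStartGate gate = new MuxerStartGate(2); // one audio track, one video track
        System.out.println(gate.onTrackAdded()); // prints false
        System.out.println(gate.canWrite());     // prints false
        System.out.println(gate.onTrackAdded()); // prints true
        System.out.println(gate.canWrite());     // prints true
    }
}
```

In the real drain methods, the encoder whose onTrackAdded() call returns true would call mMuxer.start(), and writeSampleData() would be attempted only once canWrite() returns true.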

To sum up, the main logic of drainVideoEncoder() is as follows. drainAudioEncoder() works the same way, with the video MediaCodec replaced by the audio MediaCodec.

while (true) {
    int encoderStatus = mVideoEncoder.dequeueOutputBuffer(mBufferInfo, TIMEOUT_USEC);
    if (encoderStatus == MediaCodec.INFO_TRY_AGAIN_LATER) {
        // ... (no output available yet)
    } else if (encoderStatus == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
        encoderOutputBuffers = mVideoEncoder.getOutputBuffers();
    } else if (encoderStatus == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
        MediaFormat newFormat = mVideoEncoder.getOutputFormat(); // video encoder, not audio
        mVideoTrackIndex = mMuxer.addTrack(newFormat);
        mNumTracksAdded++; // count this track so the check below can succeed
        if (mNumTracksAdded == TOTAL_NUM_TRACKS) { mMuxer.start(); }
    } else if (encoderStatus < 0) {
        // ...
    } else {
        ByteBuffer encodedData = encoderOutputBuffers[encoderStatus];
        // ...
        if (mBufferInfo.size != 0) {
            mMuxer.writeSampleData(mVideoTrackIndex, encodedData, mBufferInfo);
        }
        mVideoEncoder.releaseOutputBuffer(encoderStatus, false);
        if ((mBufferInfo.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) { break; } // EOS reached
    }
}
In Part 3, an EOS (end of stream) signal is sent so that drainVideoEncoder() and drainAudioEncoder() can exit their inner loops based on EOS. Part 4 is cleanup: release the audio and video MediaCodec objects, the Surface used by MediaCodec, and the MediaMuxer object.

A few final notes:

1. Add the audio recording permission to AndroidManifest.xml, otherwise creating the AudioRecord object will fail:

<uses-permission android:name="android.permission.RECORD_AUDIO"/>

2. Audio and video are synchronized through the PTS, so the two timestamps must use the same unit (MediaCodec expects microseconds).

3. Call MediaMuxer in this order: constructor -> addTrack -> start -> writeSampleData -> stop. If you have both audio and video, the samples of both must be written with writeSampleData() before stop.


That covers part of what goes into a Douyin-style app. I have tried the steps and processes described above myself, and they should all work. This article took a lot of time to write, and I hope everyone who reads it gets something out of it. More details on Android short video development can be found in the attached materials below:

