After a long detour through C/C++ fundamentals, JNI basics, advanced C/C++, data structures and algorithms, the Linux kernel, CMake syntax, and shell scripting, we can finally start writing FFmpeg code. For that background, see the previous articles:

  • “Audio and Video Technology Learning – Starter”
  • What Android Developers Need to Know about Linux
  • CMake Syntax – A detailed explanation of CMakeLists.txt
  • Shell Scripts – Build YOUR own FFmpeg

Today I will walk you through a fairly common feature: music playback, as seen in apps such as QQ Music, Kugou Music, and NetEase Cloud Music. First, let's introduce the functions we will need. At this stage it is enough to know how to call them and what they do; later we will gradually dig into the source code, and eventually hand-write the underlying algorithms. Let's start with a flow chart:

  • av_register_all(): initializes and registers all components. Muxers/demuxers and codecs can only be used after this function has been called.
  • avformat_open_input()/avformat_close_input(): avformat_open_input() reads the file header and, for MP4 files, parses all the boxes, saving the results in the corresponding data structures. At this point many fields in AVStream are still blank.
  • av_dump_format(): prints audio and video information.
  • avformat_find_stream_info(): reads a portion of the audio/video data to obtain further information, checking the important fields and trying to fill in any that are still blank. Since parsing the file header already yields a lot of information, this function mainly populates the remaining members; once all the important members are filled in, it returns, and in that case it is cheap. But for some files the header alone is not enough: for example, to determine a video stream's pix_fmt it may have to go as far as calling h264_decode_frame().
  • av_find_best_stream(): gets the stream_index of the audio, video, or subtitle stream. Before this function existed, we would normally write a for loop over the streams ourselves.
  • av_packet_free(): first decrements the reference count of the data buffer the AVPacket points to (the buffer is freed automatically when the count reaches zero), then frees the AVPacket itself.
  • av_packet_unref(): decrements the reference count of the data buffer; when the count reaches zero, the buffer is freed automatically.

1. Obtain audio meta information

A few days ago, a student asked me how to obtain the sampling rate of an AMR-format audio file and whether it is mono. With FFmpeg this would have been easy, but FFmpeg wasn't available in his project, so he had to fall back on other third-party libraries. Of course, if you understand how the format works, you can even parse the binary yourself.

extern "C"
JNIEXPORT void JNICALL
Java_com_darren_ndk_day03_NativeMedia_printAudioInfo(JNIEnv *env, jclass j_cls, jstring url_) {
    const char *url = env->GetStringUTFChars(url_, 0);

    av_register_all();

    AVFormatContext *avFormatContext = NULL;
    int audio_stream_idx;
    AVStream *audio_stream;

    int open_res = avformat_open_input(&avFormatContext, url, NULL, NULL);
    if (open_res != 0) {
        LOGE("Can't open file: %s", av_err2str(open_res));
        return;
    }

    int find_stream_info_res = avformat_find_stream_info(avFormatContext, NULL);
    if (find_stream_info_res < 0) {
        LOGE("Find stream info error: %s", av_err2str(find_stream_info_res));
        goto __avformat_close;
    }

    // Get the audio stream index; the sampling rate and channel count live in its codec parameters
    audio_stream_idx = av_find_best_stream(avFormatContext, AVMediaType::AVMEDIA_TYPE_AUDIO, -1, -1, NULL, 0);
    if (audio_stream_idx < 0) {
        LOGE("Find audio stream info error: %s", av_err2str(audio_stream_idx));
        goto __avformat_close;
    }
    audio_stream = avFormatContext->streams[audio_stream_idx];
    LOGE("Sampling rate: %d, number of channels: %d", audio_stream->codecpar->sample_rate, audio_stream->codecpar->channels);

    __avformat_close:
    avformat_close_input(&avFormatContext);
    env->ReleaseStringUTFChars(url_, url);
}

2. Decode audio data

The avcodec_decode_audio4() decoding function is deprecated; it has been replaced by avcodec_send_packet() and avcodec_receive_frame().

    // pCodecContext = av_format_context->streams[audio_stream_idx]->codec; // deprecated
    pCodecParameters = av_format_context->streams[audio_stream_idx]->codecpar;
    pCodec = avcodec_find_decoder(pCodecParameters->codec_id);
    if (!pCodec) {
        LOGE("Can't find audio decoder : %s", url);
        goto __avresource_close;
    }

    // Initialize the AVCodecContext
    pCodecContext = avcodec_alloc_context3(pCodec);
    codecContextParametersRes = avcodec_parameters_to_context(pCodecContext, pCodecParameters);
    if (codecContextParametersRes < 0) {
        LOGE("codec parameters to_context error : %s, %s", url, av_err2str(codecContextParametersRes));
        goto __avresource_close;
    }

    // Open the decoder
    codecOpenRes = avcodec_open2(pCodecContext, pCodec, NULL);
    if (codecOpenRes < 0) {
        LOGE("codec open error : %s, %s", url, av_err2str(codecOpenRes));
        goto __avresource_close;
    }

    avPacket = av_packet_alloc();
    avFrame = av_frame_alloc();
    while (av_read_frame(av_format_context, avPacket) >= 0) {
        if (audio_stream_idx == avPacket->stream_index) {
            // Here you could write to a file, clip, resample, decode, etc.
            sendPacketRes = avcodec_send_packet(pCodecContext, avPacket);
            if (sendPacketRes == 0) {
                receiveFrameRes = avcodec_receive_frame(pCodecContext, avFrame);
                if (receiveFrameRes == 0) {
                    LOGE("Decode frame %d", index);
                }
            }
            index++;
        }
        av_packet_unref(avPacket);
        av_frame_unref(avFrame);
    }

    // Release resources ============== start
    av_packet_free(&avPacket);
    av_frame_free(&avFrame);

    __avresource_close:
    if (pCodecContext != NULL) {
        avcodec_close(pCodecContext);
        avcodec_free_context(&pCodecContext);
    }
    if (av_format_context != NULL) {
        avformat_close_input(&av_format_context);
        avformat_free_context(av_format_context);
    }
    env->ReleaseStringUTFChars(url_, url);
    // Release resources ============== end

3. Play the audio

At present there are two popular ways to play PCM data: through Android's AudioTrack, or through the cross-platform OpenSL ES. I prefer the more efficient OpenSL ES; you can take a look at Google's official native-audio sample first, and later, when we write the music player, we will use OpenSL ES to play the audio. For now, though, we will stick with AudioTrack.


jobject initCreateAudioTrack(JNIEnv *env) {
    jclass jAudioTrackClass = env->FindClass("android/media/AudioTrack");
    jmethodID jAudioTrackCMid = env->GetMethodID(jAudioTrackClass, "<init>", "(IIIIII)V");

    //  public static final int STREAM_MUSIC = 3;
    int streamType = 3;
    int sampleRateInHz = 44100;
    // public static final int CHANNEL_OUT_STEREO = (CHANNEL_OUT_FRONT_LEFT | CHANNEL_OUT_FRONT_RIGHT);
    int channelConfig = (0x4 | 0x8);
    // public static final int ENCODING_PCM_16BIT = 2;
    int audioFormat = 2;
    // getMinBufferSize(int sampleRateInHz, int channelConfig, int audioFormat)
    jmethodID jGetMinBufferSizeMid = env->GetStaticMethodID(jAudioTrackClass, "getMinBufferSize", "(III)I");
    int bufferSizeInBytes = env->CallStaticIntMethod(jAudioTrackClass, jGetMinBufferSizeMid, sampleRateInHz, channelConfig, audioFormat);
    // public static final int MODE_STREAM = 1;
    int mode = 1;
    jobject jAudioTrack = env->NewObject(jAudioTrackClass, jAudioTrackCMid, streamType, sampleRateInHz, channelConfig, audioFormat, bufferSizeInBytes, mode);

    // play()
    jmethodID jPlayMid = env->GetMethodID(jAudioTrackClass, "play", "()V");
    env->CallVoidMethod(jAudioTrack, jPlayMid);

    return jAudioTrack;
}

{
    // Initialize resampling ====================== start
    swrContext = swr_alloc();
    // Input sample format
    inSampleFmt = pCodecContext->sample_fmt;
    // Output sample format: 16-bit PCM
    outSampleFmt = AV_SAMPLE_FMT_S16;
    // Input sample rate
    inSampleRate = pCodecContext->sample_rate;
    // Output sample rate
    outSampleRate = AUDIO_SAMPLE_RATE;
    // Input channel layout (the default layout for the channel count, e.g. stereo for 2 channels)
    inChLayout = pCodecContext->channel_layout;
    // Output channel layout
    outChLayout = AV_CH_LAYOUT_STEREO;
    swr_alloc_set_opts(swrContext, outChLayout, outSampleFmt, outSampleRate,
            inChLayout, inSampleFmt, inSampleRate, 0, NULL);

    resampleOutBuffer = (uint8_t *) av_malloc(AUDIO_SAMPLES_SIZE_PER_CHANNEL);
    outChannelNb = av_get_channel_layout_nb_channels(outChLayout);
    dataSize = av_samples_get_buffer_size(NULL, outChannelNb, pCodecContext->frame_size,
            outSampleFmt, 1);

    if (swr_init(swrContext) < 0) {
        LOGE("Failed to initialize the resampling context");
        return;
    }
    // Initialize resampling ====================== end

    // Read the audio stream packet by packet
    avPacket = av_packet_alloc();
    avFrame = av_frame_alloc();
    while (av_read_frame(av_format_context, avPacket) >= 0) {
        if (audio_stream_idx == avPacket->stream_index) {
            // Here you could also write to a file, clip, etc.
            sendPacketRes = avcodec_send_packet(pCodecContext, avPacket);
            if (sendPacketRes == 0) {
                receiveFrameRes = avcodec_receive_frame(pCodecContext, avFrame);
                if (receiveFrameRes == 0) {
                    // Resample, then hand the PCM data to the AudioTrack object
                    swr_convert(swrContext, &resampleOutBuffer, avFrame->nb_samples,
                            (const uint8_t **) avFrame->data, avFrame->nb_samples);
                    jbyteArray jPcmDataArray = env->NewByteArray(dataSize);
                    jbyte *jPcmData = env->GetByteArrayElements(jPcmDataArray, NULL);
                    memcpy(jPcmData, resampleOutBuffer, dataSize);
                    env->ReleaseByteArrayElements(jPcmDataArray, jPcmData, 0);
                    // Call AudioTrack.write() on the Java side
                    env->CallIntMethod(audioTrack, jWriteMid, jPcmDataArray, 0, dataSize);
                    env->DeleteLocalRef(jPcmDataArray);
                }
            }
            index++;
        }
        av_packet_unref(avPacket);
        av_frame_unref(avFrame);
    }
}
  • When first reading the source, we generally cannot understand every aspect of a function thoroughly, but analyzing the source still gives a much better sense of what a function does and how it does it. The parts you don't understand can be set aside for later study. For example, avformat_find_stream_info() computes start_time, bit rate, and other information; the time-related calculations are complicated, so if they don't make sense at first, come back to them later.
  • I looked at some examples online, and they were basically all the same set of code, with memory usage slowly creeping up at runtime; of course, I made similar mistakes early on too. So a reminder: online code is fine for understanding how things work, but code taken directly from the Internet may have problems, and so may the code I write, so stay a little skeptical.
  • **The above code is just an example and cannot be used as-is in real projects.** Topics such as decoding and playing on separate threads, playing PCM data with OpenSL ES, adding load/play/pause and other state handling, switching channels, changing speed and pitch, seeking, and so on will be built up together step by step later, and the code will be open-sourced to my personal GitHub for everyone's reference and study.

Video address: pan.baidu.com/s/1CbXdB9kz… Video password: CSE5