1. Principle of OpenSL ES
OpenSL ES (Open Sound Library for Embedded Systems) is a royalty-free, cross-platform audio API optimized for embedded systems, with support for hardware audio acceleration. It gives native application developers on embedded multimedia devices a standardized, high-performance, low-latency path for audio development and lets software/hardware audio features be deployed directly across platforms. It is widely used for 3D audio, audio playback, audio recording, and music-experience enhancement (bass boost and environmental reverb). On the Android platform we can use the OpenSL ES library to process audio data directly in the native layer, for example to record or play audio. The software/hardware hierarchy of OpenSL ES on embedded devices is shown in the following figure:
1.1 OpenSL ES core API description
Although OpenSL ES is written in C, its design is object-oriented. Unlike the FFmpeg framework, which exposes a flat set of functions, OpenSL ES exposes its APIs through Interfaces: apart from slCreateEngine, which creates the sound engine object, every operation is performed by calling member functions through an interface, much like JNI. For example, an OpenSL ES API call looks like this:
SLObjectItf pEngineObject = NULL;
(*pEngineObject)->Realize(pEngineObject, SL_BOOLEAN_FALSE);
1.1.1 Objects and Interfaces
Object and Interface are two central concepts in the OpenSL ES library; the entire framework, and all subsequent use of the library, is built on them. Their relationship is as follows:
1. Each Object may expose one or more Interfaces, and each Interface encapsulates a group of related functions. For example, once we obtain an Audio Player object, we can obtain its playback Interface, volume Interface, and buffer queue Interface through that object, and then call the functions of those interfaces to play audio, adjust the volume, and so on.
// OpenSL ES engine interface
SLEngineItf pEngineItf = NULL;
// Audio Player object
SLObjectItf pPlayerObject = NULL;
// Play interface
SLPlayItf pPlayerItf = NULL;
// Volume interface
SLVolumeItf pVolumeItf = NULL;
// Buffer queue interface
SLAndroidSimpleBufferQueueItf pBufferItf = NULL;

(*pEngineItf)->CreateAudioPlayer(pEngineItf, &pPlayerObject, ...);
(*pPlayerObject)->Realize(pPlayerObject, SL_BOOLEAN_FALSE);
(*pPlayerObject)->GetInterface(pPlayerObject, SL_IID_PLAY, &pPlayerItf);
(*pPlayerObject)->GetInterface(pPlayerObject, SL_IID_VOLUME, &pVolumeItf);
(*pPlayerObject)->GetInterface(pPlayerObject, SL_IID_BUFFERQUEUE, &pBufferItf);
2. Each Object provides a few basic "management" operations, such as Realize and Destroy for allocating and releasing resources, and Resume for leaving the SL_OBJECT_STATE_SUSPENDED state. To use the functionality an object supports, we first obtain the corresponding Interface through the object's GetInterface function, and then access that functionality through the Interface. Take adjusting the volume as an example:
// OpenSL ES engine interface
SLEngineItf pEngineItf = NULL;
SLObjectItf pPlayerObject = NULL;
SLVolumeItf pVolumeItf = NULL;
// First, create the Audio Player object
(*pEngineItf)->CreateAudioPlayer(pEngineItf, &pPlayerObject, ...);
// Next, initialize (Realize) the Audio Player object, which allocates its resources
(*pPlayerObject)->Realize(pPlayerObject, SL_BOOLEAN_FALSE);
// Get the Volume interface of the Audio Player object
(*pPlayerObject)->GetInterface(pPlayerObject, SL_IID_VOLUME, &pVolumeItf);
// Finally, call the Volume interface function to adjust the volume
(*pVolumeItf)->SetVolumeLevel(pVolumeItf, level);
Note: Because OpenSL ES is cross-platform, not every platform implements every interface defined for an Object, so it is best to be selective about which interfaces you request, and to check the result of each GetInterface call, to avoid unexpected errors.
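For example, a minimal sketch of that defensive pattern (assuming pPlayerObject has already been created and realized as above) might be:

SLVolumeItf pVolumeItf = NULL;
SLresult result = (*pPlayerObject)->GetInterface(pPlayerObject, SL_IID_VOLUME, &pVolumeItf);
if (result != SL_RESULT_SUCCESS) {
    // This platform does not expose the volume interface; skip volume control instead of crashing
    pVolumeItf = NULL;
}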
1.1.2 OpenSL ES state mechanism
OpenSL ES has an important concept: the state mechanism. Any OpenSL ES object, once created successfully, enters the SL_OBJECT_STATE_UNREALIZED state, in which no resources have been allocated to it. When the object's Realize() member function is called, the object enters the SL_OBJECT_STATE_REALIZED state, in which all of its functions and resources can be accessed normally. When certain system events occur, such as a system error or the audio device being preempted by another application, the object drops into the SL_OBJECT_STATE_SUSPENDED state; to return it to normal use we call the object's Resume() function. Finally, calling the object's Destroy() function releases its resources and returns the object to the SL_OBJECT_STATE_UNREALIZED state. The OpenSL ES state transitions are shown below:
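As a minimal sketch of these transitions (using the engine object and GetState() to observe the current state; error checking omitted):

SLObjectItf pEngineObject = NULL;
SLuint32 state = 0;
slCreateEngine(&pEngineObject, 0, NULL, 0, NULL, NULL);
(*pEngineObject)->GetState(pEngineObject, &state);   // state == SL_OBJECT_STATE_UNREALIZED
(*pEngineObject)->Realize(pEngineObject, SL_BOOLEAN_FALSE);
(*pEngineObject)->GetState(pEngineObject, &state);   // state == SL_OBJECT_STATE_REALIZED
// If a system event has suspended the object, Resume() brings it back to the realized state
if (state == SL_OBJECT_STATE_SUSPENDED) {
    (*pEngineObject)->Resume(pEngineObject, SL_BOOLEAN_FALSE);
}
(*pEngineObject)->Destroy(pEngineObject);             // back to unrealized; the handle is now invalid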
1.1.3 OpenSL ES Important Interfaces
(1) SLObjectItf: the Sound Library Object Interface. It is the generic object interface that represents every object in OpenSL ES, much like Object in Java; SLEngineItf, for example, is an interface of the Engine Object. Creating objects:
// Create an OpenSL Engine Object
SLObjectItf pEngineObject = NULL;
slCreateEngine(&pEngineObject, 0, NULL, 0, NULL, NULL);
// Create an OutputMix Object
SLObjectItf pOutputMixObject = NULL;
(*pEngineItf)->CreateOutputMix(pEngineItf, &pOutputMixObject, ...);
// Create a Player Object
SLObjectItf pPlayerObject = NULL;
(*pEngineItf)->CreateAudioPlayer(pEngineItf, &pPlayerObject, ...);
To destroy an object, call its Destroy member function:
if (pEngineObject) {
(*pEngineObject)->Destroy(pEngineObject);
}
if (pOutputMixObject) {
(*pOutputMixObject)->Destroy(pOutputMixObject);
}
if (pPlayerObject) {
(*pPlayerObject)->Destroy(pPlayerObject);
}
The Engine Object is the entry point to the OpenSL ES API. It is the core object of OpenSL ES: it manages the audio engine's lifecycle and creates every other object in OpenSL ES. The Engine Object is created with the slCreateEngine function; from it we obtain the engine interface SLEngineItf, which encapsulates the functions that create the various other objects. The SLObjectItf structure is defined in the .../SLES/OpenSLES.h header as follows:
struct SLObjectItf_;
typedef const struct SLObjectItf_ * const * SLObjectItf;
struct SLObjectItf_ {
// Initialize the object, that is, allocate resources to the object
SLresult (*Realize) (
SLObjectItf self,
SLboolean async
);
// Restore the object to normal use, i.e. leave the SL_OBJECT_STATE_SUSPENDED state
SLresult (*Resume) (
SLObjectItf self,
SLboolean async
);
// Get the state of the object
SLresult (*GetState) (
SLObjectItf self,
SLuint32 * pState
);
// Get the interface whose ID is iID
SLresult (*GetInterface) (
SLObjectItf self,
const SLInterfaceID iid,
void * pInterface
);
// Register the callback interface
SLresult (*RegisterCallback) (
SLObjectItf self,
slObjectCallback callback,
void * pContext
);
void (*AbortAsyncOperation) (
SLObjectItf self
);
// Destroy the current object to free resources
void (*Destroy) (
SLObjectItf self
);
// Set the priority
SLresult (*SetPriority) (
SLObjectItf self,
SLint32 priority,
SLboolean preemptable
);
SLresult (*GetPriority) (
SLObjectItf self,
SLint32 *pPriority,
SLboolean *pPreemptable
);
SLresult (*SetLossOfControlInterfaces) (
SLObjectItf self,
SLint16 numInterfaces,
SLInterfaceID * pInterfaceIDs,
SLboolean enabled
);
};
Note: besides the SLEngineItf interface, the Engine Object also provides interfaces such as SLEngineCapabilitiesItf and SLAudioIODeviceCapabilitiesItf for querying device capabilities and attribute information.

(2) SLEngineItf: short for Sound Library Engine Interface. It is the management interface provided by the Engine Object and is used to create all other objects. The SLEngineItf structure is defined in the .../SLES/OpenSLES.h header as follows:
extern SL_API const SLInterfaceID SL_IID_ENGINE;
struct SLEngineItf_;
typedef const struct SLEngineItf_ * const * SLEngineItf;
struct SLEngineItf_ {
SLresult (*CreateLEDDevice) (
SLEngineItf self,
SLObjectItf * pDevice,
SLuint32 deviceID,
SLuint32 numInterfaces,
const SLInterfaceID * pInterfaceIds,
const SLboolean * pInterfaceRequired
);
SLresult (*CreateVibraDevice) (
SLEngineItf self,
SLObjectItf * pDevice,
SLuint32 deviceID,
SLuint32 numInterfaces,
const SLInterfaceID * pInterfaceIds,
const SLboolean * pInterfaceRequired
);
// Create a player object
SLresult (*CreateAudioPlayer) (
SLEngineItf self, // Engine Interface
SLObjectItf * pPlayer, // Player Object
SLDataSource *pAudioSrc, // Audio data source: can be a URI, an asset, or PCM
SLDataSink *pAudioSnk, // Audio output
SLuint32 numInterfaces, // Number of interfaces to request on the Player Object
const SLInterfaceID * pInterfaceIds, // IDs of the interfaces the Player Object should support
// Array of flags marking each requested interface as required or optional; if a required
// interface is not implemented, creation fails with error code SL_RESULT_FEATURE_UNSUPPORTED
const SLboolean * pInterfaceRequired
);
// Create an audio recording object
SLresult (*CreateAudioRecorder) (
SLEngineItf self,
SLObjectItf * pRecorder,
SLDataSource *pAudioSrc,
SLDataSink *pAudioSnk,
SLuint32 numInterfaces,
const SLInterfaceID * pInterfaceIds,
const SLboolean * pInterfaceRequired
);
SLresult (*CreateMidiPlayer) (
SLEngineItf self,
SLObjectItf * pPlayer,
SLDataSource *pMIDISrc,
SLDataSource *pBankSrc,
SLDataSink *pAudioOutput,
SLDataSink *pVibra,
SLDataSink *pLEDArray,
SLuint32 numInterfaces,
const SLInterfaceID * pInterfaceIds,
const SLboolean * pInterfaceRequired
);
SLresult (*CreateListener) (
SLEngineItf self,
SLObjectItf * pListener,
SLuint32 numInterfaces,
const SLInterfaceID * pInterfaceIds,
const SLboolean * pInterfaceRequired
);
SLresult (*Create3DGroup) (
SLEngineItf self,
SLObjectItf * pGroup,
SLuint32 numInterfaces,
const SLInterfaceID * pInterfaceIds,
const SLboolean * pInterfaceRequired
);
// Create a mix output object
SLresult (*CreateOutputMix) (
SLEngineItf self, // Engine Interface
SLObjectItf * pMix,// OutputMix Object
SLuint32 numInterfaces, // Number of interfaces to request on the OutputMix Object
const SLInterfaceID * pInterfaceIds, // IDs of the interfaces the OutputMix Object should support
// Array of flags marking each requested interface as required or optional; if a required
// interface is not implemented, creation fails with error code SL_RESULT_FEATURE_UNSUPPORTED
const SLboolean * pInterfaceRequired
);
SLresult (*CreateMetadataExtractor) (
SLEngineItf self,
SLObjectItf * pMetadataExtractor,
SLDataSource * pDataSource,
SLuint32 numInterfaces,
const SLInterfaceID * pInterfaceIds,
const SLboolean * pInterfaceRequired
);
SLresult (*CreateExtensionObject) (
SLEngineItf self,
SLObjectItf * pObject,
void * pParameters,
SLuint32 objectID,
SLuint32 numInterfaces,
const SLInterfaceID * pInterfaceIds,
const SLboolean * pInterfaceRequired
);
SLresult (*QueryNumSupportedInterfaces) (
SLEngineItf self,
SLuint32 objectID,
SLuint32 * pNumSupportedInterfaces
);
SLresult (*QuerySupportedInterfaces) (
SLEngineItf self,
SLuint32 objectID,
SLuint32 index,
SLInterfaceID * pInterfaceId
);
SLresult (*QueryNumSupportedExtensions) (
SLEngineItf self,
SLuint32 * pNumExtensions
);
SLresult (*QuerySupportedExtension) (
SLEngineItf self,
SLuint32 index,
SLchar * pExtensionName,
SLint16 * pNameLength
);
SLresult (*IsExtensionSupported) (
SLEngineItf self,
const SLchar * pExtensionName,
SLboolean * pSupported
);
};
(3) SLEnvironmentalReverbItf: the environmental reverb interface. It specifies the reverberation effect applied to the audio output, such as room, theater, or auditorium effects. The SLEnvironmentalReverbItf structure is defined in the .../SLES/OpenSLES.h header as follows:
// Interface ID
extern SL_API const SLInterfaceID SL_IID_ENVIRONMENTALREVERB;
struct SLEnvironmentalReverbItf_ {
SLresult (*SetRoomLevel) (
SLEnvironmentalReverbItf self,
SLmillibel room
);
SLresult (*GetRoomLevel) (
SLEnvironmentalReverbItf self,
SLmillibel *pRoom
);
...
// Set the environmental reverb properties
SLresult (*SetEnvironmentalReverbProperties) (
SLEnvironmentalReverbItf self,
const SLEnvironmentalReverbSettings *pProperties
);
SLresult (*GetEnvironmentalReverbProperties) (
SLEnvironmentalReverbItf self,
SLEnvironmentalReverbSettings *pProperties
);
};
OpenSL ES provides a set of preset environmental reverb settings, for example:
// Cave effect
#define SL_I3DL2_ENVIRONMENT_PRESET_CAVE \
    {-1000, 0, 2910, 1300, -602, 15, -302, 22, 1000, 1000}
// Arena effect
#define SL_I3DL2_ENVIRONMENT_PRESET_ARENA \
    {-1000, -698, 7240, 330, -1166, 20, 16, 30, 1000, 1000}
// Hangar effect
#define SL_I3DL2_ENVIRONMENT_PRESET_HANGAR \
    {-1000, -1000, 10050, 230, -602, 20, 198, 30, 1000, 1000}
// Carpeted hallway effect
#define SL_I3DL2_ENVIRONMENT_PRESET_CARPETEDHALLWAY \
    {-1000, -4000, 300, 100, -1831, 2, -1630, 30, 1000, 1000}
// Hallway effect
#define SL_I3DL2_ENVIRONMENT_PRESET_HALLWAY \
    {-1000, -300, 1490, 590, -1219, 7, 441, 11, 1000, 1000}
// Stone corridor effect
#define SL_I3DL2_ENVIRONMENT_PRESET_STONECORRIDOR \
    {-1000, -237, 2700, 790, -1214, 13, 395, 20, 1000, 1000}
// ... (other presets omitted)
(4) SLPlayItf: the playback interface, which controls the playback state. For example, when its SetPlayState member function is called with SL_PLAYSTATE_PLAYING, OpenSL ES starts playing audio. The SLPlayItf structure is defined in the .../SLES/OpenSLES.h header as follows:
// ID of the playback interface
extern SL_API const SLInterfaceID SL_IID_PLAY;
struct SLPlayItf_ {
// Set the playback state
// #define SL_PLAYSTATE_STOPPED ((SLuint32) 0x00000001)
// #define SL_PLAYSTATE_PAUSED ((SLuint32) 0x00000002)
// #define SL_PLAYSTATE_PLAYING ((SLuint32) 0x00000003)
SLresult (*SetPlayState) (
SLPlayItf self,
SLuint32 state
);
SLresult (*GetPlayState) (
SLPlayItf self,
SLuint32 *pState
);
// Get the playback duration
SLresult (*GetDuration) (
SLPlayItf self,
SLmillisecond *pMsec
);
// Get the playback position
SLresult (*GetPosition) (
SLPlayItf self,
SLmillisecond *pMsec
);
...
};
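A minimal usage sketch of this interface (assuming pPlayerObject has been created and realized as in the earlier examples):

SLPlayItf pPlayerItf = NULL;
(*pPlayerObject)->GetInterface(pPlayerObject, SL_IID_PLAY, &pPlayerItf);
// Start playback
(*pPlayerItf)->SetPlayState(pPlayerItf, SL_PLAYSTATE_PLAYING);
// Query the current playback position (in milliseconds), then pause
SLmillisecond posMs = 0;
(*pPlayerItf)->GetPosition(pPlayerItf, &posMs);
(*pPlayerItf)->SetPlayState(pPlayerItf, SL_PLAYSTATE_PAUSED);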
(5) SLAndroidSimpleBufferQueueItf: the buffer queue interface. This is an Android-specific interface provided by the Android NDK. Its structure is defined in the .../SLES/OpenSLES_Android.h header as follows:
// Cache queue interface ID
extern SL_API const SLInterfaceID SL_IID_ANDROIDSIMPLEBUFFERQUEUE;
struct SLAndroidSimpleBufferQueueItf_ {
// Insert a buffer of data into the buffer queue;
// when that buffer finishes playing, the registered callback is invoked to supply the next one
SLresult (*Enqueue) (
SLAndroidSimpleBufferQueueItf self,
const void *pBuffer,
SLuint32 size
);
// Clear the buffer queue
SLresult (*Clear) (
SLAndroidSimpleBufferQueueItf self
);
// Get the buffer queue status
SLresult (*GetState) (
SLAndroidSimpleBufferQueueItf self,
SLAndroidSimpleBufferQueueState *pState
);
// Register the callback interface callback
SLresult (*RegisterCallback) (
SLAndroidSimpleBufferQueueItf self,
slAndroidSimpleBufferQueueCallback callback,
void* pContext
);
};
Of course, we have only introduced a few common objects and interfaces here, but there are many other objects and interfaces in OpenSL ES that will be covered in more detail in the future.
1.2 OpenSL ES Usage Procedure
OpenSL ES covers a wide range of scenarios; this article only covers playing audio with an Audio Player object. First, create the OpenSL ES Engine Object and obtain the Engine Interface. Then use the engine interface SLEngineItf to create the Output Mix object and the Audio Player object, and associate the player's audio output with the Output Mix. Taking a URI data source as an example, the Output Mix is by default associated with the system's default output device. The schematic is as follows:
Because the NDK already ships the OpenSL ES library natively, it is very easy to use OpenSL ES to handle audio in Android development. Simply modify the CMakeLists.txt file as follows:
...
# Find the NDK native library OpenSLES
find_library(openSLES-lib OpenSLES)
# Link all libraries to avstream
target_link_libraries(
        avstream
        ...
        ${openSLES-lib})
...
OpenSL ES Audio Playback steps:
1. Create the OpenSL ES Engine, that is, initialize the Engine Object and Engine Interface
SLObjectItf pEngineObject = NULL;
SLEngineItf pEngineItf = NULL;
// Create an Engine object
slCreateEngine(&pEngineObject, 0, NULL, 0, NULL, NULL);
// Initialize the Engine object
(*pEngineObject)->Realize(pEngineObject, SL_BOOLEAN_FALSE);
// Get the Engine Interface of the Engine object.
(*pEngineObject)->GetInterface(pEngineObject, SL_IID_ENGINE,&pEngineItf);
2. Create the output mix object, specify the environmental reverb effect, and configure the audio output
// Create an output mix object
SLInterfaceID effect[1] = {SL_IID_ENVIRONMENTALREVERB};
SLboolean boolValue[1] = {SL_BOOLEAN_FALSE};
(*pEngineItf)->CreateOutputMix(pEngineItf,&pOutputMixObject, 1, effect,
boolValue);
// Initialize the output mix object
(*pOutputMixObject)->Realize(pOutputMixObject,SL_BOOLEAN_FALSE);
// Get the Environmental Reverb interface
(*pOutputMixObject)->GetInterface(pOutputMixObject,SL_IID_ENVIRONMENTALREVERB,
&outputMixEnvironmentalReverb);
// Set ambient reverb to STONECORRIDOR
SLEnvironmentalReverbSettings reverbSettings = SL_I3DL2_ENVIRONMENT_PRESET_STONECORRIDOR;
(*outputMixEnvironmentalReverb)->SetEnvironmentalReverbProperties(
outputMixEnvironmentalReverb, &reverbSettings);
// Configure the audio output
SLDataLocator_OutputMix outputMix = {SL_DATALOCATOR_OUTPUTMIX, pOutputMixObject};
SLDataSink audioSnk = {&outputMix, NULL};
Here SLEnvironmentalReverbSettings, SLDataLocator_OutputMix, and SLDataSink are structures defined in the .../SLES/OpenSLES.h header as follows:
// Environmental reverb settings structure
typedef struct SLEnvironmentalReverbSettings_ {
    SLmillibel roomLevel;
    SLmillibel roomHFLevel;
    SLmillisecond decayTime;
    SLpermille decayHFRatio;
    SLmillibel reflectionsLevel;
    SLmillisecond reflectionsDelay;
    SLmillibel reverbLevel;
    SLmillisecond reverbDelay;
    SLpermille diffusion;
    SLpermille density;
} SLEnvironmentalReverbSettings;
#define SL_I3DL2_ENVIRONMENT_PRESET_STONECORRIDOR {-1000, -237, 2700, 790, -1214, 13, 395, 20, 1000,1000}
// Output mix locator structure
typedef struct SLDataLocator_OutputMix {
    SLuint32 locatorType;    // Fixed value: SL_DATALOCATOR_OUTPUTMIX
    SLObjectItf outputMix;   // The output mix object
} SLDataLocator_OutputMix;
// Audio data sink structure
typedef struct SLDataSink_ {
    void *pLocator;   // Points to an SLDataLocator_OutputMix, which specifies the mixed output
    void *pFormat;    // The format; may be NULL
} SLDataSink;
3. Create an Audio Player object and obtain the playback interface of the object to control the playback status
SLDataLocator_AndroidSimpleBufferQueue android_queue = {SL_DATALOCATOR_ANDROIDSIMPLEBUFFERQUEUE, 2};
SLDataFormat_PCM format_pcm = {
        SL_DATAFORMAT_PCM,                     // Play PCM data
        2,                                     // Channel count: stereo
        SL_SAMPLINGRATE_44_1,                  // Sampling rate
        SL_PCMSAMPLEFORMAT_FIXED_16,           // Sample depth (bits per sample)
        SL_PCMSAMPLEFORMAT_FIXED_16,           // Container size
        getChannelMask((SLuint32) nbChannels), // Channel mask (helper defined elsewhere in the project)
        SL_BYTEORDER_LITTLEENDIAN              // Byte order
};
// Specify the data source: buffer queue locator plus PCM format
SLDataSource pAudioSrc = {&android_queue, &format_pcm};
SLuint32 numInterfaces_audio = 2;
const SLInterfaceID ids_audio[2] = {SL_IID_BUFFERQUEUE, SL_IID_EFFECTSEND};
const SLboolean requireds_audio[2] = {SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE};
// Create the Audio Player object pPlayerObject
SLObjectItf pPlayerObject = NULL;
(*pEngineItf)->CreateAudioPlayer(
pEngineItf, // Engine Interface
&pPlayerObject, // Audio Player Object
&pAudioSrc, // The audio source is PCM
&audioSnk, // Audio output
numInterfaces_audio, // Number of interfaces the object should support: 2
ids_audio, // IDs of the interfaces the object should support
requireds_audio); // Required flags for each interface
// Initialize the pPlayerObject
(*pPlayerObject)->Realize(pPlayerObject,SL_BOOLEAN_FALSE);
// Get SLPlayItf of pPlayerObject
SLPlayItf pPlayerItf = NULL;
(*pPlayerObject)->GetInterface(pPlayerObject, SL_IID_PLAY,&pPlayerItf);
SLDataLocator_AndroidSimpleBufferQueue, SLDataFormat_PCM, and SLDataSource are structures defined in the .../SLES/OpenSLES_Android.h and .../SLES/OpenSLES.h headers as follows:
// BufferQueue-based data locator definition
typedef struct SLDataLocator_AndroidSimpleBufferQueue {
SLuint32 locatorType; // Fixed value SL_DATALOCATOR_ANDROIDSIMPLEBUFFERQUEUE
SLuint32 numBuffers; // Number of buffers in the queue
} SLDataLocator_AndroidSimpleBufferQueue;
// PCM data format structure
typedef struct SLDataFormat_PCM_ {
SLuint32 formatType;
SLuint32 numChannels;
SLuint32 samplesPerSec;
SLuint32 bitsPerSample;
SLuint32 containerSize;
SLuint32 channelMask;
SLuint32 endianness;
} SLDataFormat_PCM;
// Audio data source structure
typedef struct SLDataSource_ {
void *pLocator; // Data source locator
void *pFormat; // Data format
} SLDataSource;
4. Obtain the buffer queue interface of the Audio Player object and register the callback function
SLAndroidSimpleBufferQueueItf pBufferItf = NULL;
// Get the buffer interface
(*pPlayerObject)->GetInterface(pPlayerObject,SL_IID_BUFFERQUEUE,&pBufferItf);
// Register the callback interface. PcmBufferCallBack is a custom callback function
(*pBufferItf)->RegisterCallback(pBufferItf, pcmBufferCallBack,NULL);
5. Implement the callback function to write PCM data to the buffer queue
void pcmBufferCallBack(SLAndroidSimpleBufferQueueItf bf, void *context) {
    PCMData pcmData;
    PCMData *data = &pcmData;
    SLresult result;
    // If Enqueue fails, OpenSL ES will not invoke this callback again,
    // so keep trying until a buffer has been enqueued successfully
    for (;;) {
        int ret = playQueueGet(&global_openSL.queue_play, data);
        if (!isExit && ret > 0) {
            LOGI("start to play......");
            // Insert PCM data into the buffer queue;
            // the callback is invoked again automatically after this buffer finishes playing
            result = (*pBufferItf)->Enqueue(pBufferItf, data->pcm, data->size);
            if (SL_RESULT_SUCCESS == result) {
                LOGI("success to play......");
                break;
            }
        }
        if (isExit) {
            LOGI("stop to play......");
            break;
        }
    }
}
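Note that the callback chain has to be started once. A minimal sketch of how this is commonly done (an assumption, not taken verbatim from the example project): set the play state to playing and prime the buffer queue by invoking the callback manually.

// Start playback, then prime the buffer queue once; afterwards pcmBufferCallBack
// is invoked automatically each time an enqueued buffer finishes playing
(*pPlayerItf)->SetPlayState(pPlayerItf, SL_PLAYSTATE_PLAYING);
pcmBufferCallBack(pBufferItf, NULL);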
6. Destroy objects to release resources
if (pPlayerObject) {
(*pPlayerItf)->SetPlayState(pPlayerItf, SL_PLAYSTATE_STOPPED);
(*pPlayerObject)->Destroy(pPlayerObject);
pPlayerObject = NULL;
pPlayerItf = NULL;
pVolumeItf = NULL;
}
if (pOutputMixObject) {
(*pOutputMixObject)->Destroy(pOutputMixObject);
pOutputMixObject = NULL;
pBufferItf = NULL;
outputMixEnvironmentalReverb = NULL;
}
if (pEngineObject) {
(*pEngineObject)->Destroy(pEngineObject);
pEngineObject = NULL;
pEngineItf = NULL;
}
2. FFmpeg audio decoding principle
2.1 FFmpeg related functions analysis
2.1.1 av_register_all ()/avcodec_register_all ()
void av_register_all(void);
void avcodec_register_all(void);
av_register_all() is declared in the .../libavformat/avformat.h header; it initializes the libavformat module and registers all muxers, demuxers, and protocols. We can also register a specific input or output format with av_register_input_format() and av_register_output_format(). avcodec_register_all() is declared in the .../libavcodec/avcodec.h header; it registers all codecs, parsers, and bitstream filters. Likewise, we can register them selectively with avcodec_register(), av_register_codec_parser(), and av_register_bitstream_filter().
2.1.2 avformat_alloc_context/avformat_free_context
AVFormatContext *avformat_alloc_context(void);
void avformat_free_context(AVFormatContext *s);
The avformat_alloc_context() and avformat_free_context() functions are declared in .../libavformat/avformat.h. The former creates an AVFormatContext and allocates memory for it; the latter frees the AVFormatContext structure and all the resources it holds.
2.1.3 avformat_open_input/avformat_close_input ()
/**
 * Open an input stream and read its header.
 *
 * @param ps      Pointer to a pointer to the AVFormatContext used to store data about the input stream
 * @param url     URL of the input stream
 * @param fmt     If non-NULL, forces the input format; if NULL, the format is detected automatically
 * @param options AVFormatContext metadata options
 * @return 0 on success, < 0 on failure
 */
int avformat_open_input(AVFormatContext **ps, const char *url, AVInputFormat *fmt, AVDictionary **options);
/**
 * Close an opened input AVFormatContext and free all its resources.
 */
void avformat_close_input(AVFormatContext **s);
The avformat_open_input() and avformat_close_input() functions are declared in the .../libavformat/avformat.h header. The former opens an input stream and reads its header; note that the codec is not opened at this point. The latter closes the input stream and releases all resources associated with it. The options parameter of avformat_open_input is used to configure metadata; it is of type AVDictionary, which is defined in .../libavutil/dict.c:
struct AVDictionary {
    int count;                  // Number of entries
    AVDictionaryEntry *elems;   // Entries
};
typedef struct AVDictionaryEntry {
    char *key;      // Key
    char *value;    // Value
} AVDictionaryEntry;
We can set an entry in an AVDictionary with av_dict_set() and read one back with av_dict_get(). For example, when opening an input stream with avformat_open_input() on a particularly poor network, the call may block for a very long time without returning an error, the Android upper layer has no way of knowing what the FFmpeg code is doing, and the user experience is very poor. If instead we set a timeout so that avformat_open_input() fails when it cannot open the stream within that time, the native layer can notify the Android layer to handle the exception, and the experience is very different. Example code:
AVDictionary *openOpts = NULL;
av_dict_set(&openOpts, "stimeout"."15000000".0); // Set the timeout option to 15s
avformat_open_input(&pFmtCtx, url, NULL, &openOpts);
Of course, besides "stimeout", FFmpeg provides many other configurable options, such as "buffer_size" to set the input buffer size (in bytes), "rtsp_transport" to set the RTSP transport protocol (UDP by default), "b" to set the bit rate, and so on.
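For instance, a minimal sketch forcing RTSP over TCP, which is often more reliable than the default UDP on lossy networks (pFmtCtx and url are assumed to be the same variables as in the example above):

AVDictionary *opts = NULL;
av_dict_set(&opts, "rtsp_transport", "tcp", 0);   // carry RTP over TCP instead of UDP
av_dict_set(&opts, "stimeout", "15000000", 0);    // 15 s socket timeout, in microseconds
avformat_open_input(&pFmtCtx, url, NULL, &opts);
av_dict_free(&opts);                              // release the dictionary (unconsumed entries stay in it until freed)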
2.1.4 avformat_find_stream_info
/**
 * Read packets of a media file to get stream information. This is useful for
 * file formats with no headers, such as MPEG.
 *
 * @param ic      The media format context
 * @param options Dictionary array of per-stream options
 * @return >= 0 if OK, AVERROR_xxx on error
 */
int avformat_find_stream_info(AVFormatContext *ic, AVDictionary **options);
Explanation: avformat_find_stream_info() is declared in the .../libavformat/avformat.h header. It reads a number of packets from the opened input to probe the streams and fills in the stream information in the AVFormatContext (codec parameters, bit rate, frame rate, and so on), which is needed before a matching decoder can be looked up.
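Once the streams have been probed, the audio stream still has to be located. Besides iterating over pFmtCtx->streams as the project does later, FFmpeg also provides av_find_best_stream(); a minimal sketch (LOGE is assumed to be the project's logging macro):

if (avformat_find_stream_info(pFmtCtx, NULL) < 0) {
    LOGE("find stream info failed");
    return -1;
}
int index_audio = av_find_best_stream(pFmtCtx, AVMEDIA_TYPE_AUDIO, -1, -1, NULL, 0);
if (index_audio < 0) {
    LOGE("no audio stream found");
    return -1;
}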
2.1.5 avcodec_find_decoder/avcodec_find_decoder_by_name
/**
 * Find a registered decoder by codec ID.
 *
 * @param id Decoder ID
 * @return The decoder, or NULL if there is no decoder for that ID
 */
AVCodec *avcodec_find_decoder(enum AVCodecID id);
/**
 * Find a registered decoder by name.
 *
 * @param name Decoder name, such as "h264"
 * @return The decoder, or NULL if there is no decoder with that name
 */
AVCodec *avcodec_find_decoder_by_name(const char *name);
The avcodec_find_decoder() and avcodec_find_decoder_by_name() functions are declared in the .../libavcodec/avcodec.h header; the former finds a decoder by its codec ID, the latter by its name. The available codec IDs are defined by the AVCodecID enum:
enum AVCodecID {
// video codecs
AV_CODEC_ID_MJPEG,
AV_CODEC_ID_H264,
...
// pcm
AV_CODEC_ID_PCM_S8,
AV_CODEC_ID_PCM_U8,
AV_CODEC_ID_PCM_ALAW,
...
// audio codecs
AV_CODEC_ID_AAC,
AV_CODEC_ID_MP3,
...
//subtitle codecs
AV_CODEC_ID_MOV_TEXT,
AV_CODEC_ID_HDMV_PGS_SUBTITLE,
AV_CODEC_ID_DVB_TELETEXT,
AV_CODEC_ID_SRT,
...
}
2.1.6 avcodec_alloc_context3 / avcodec_free_context
/**
 * Allocate an AVCodecContext for the given codec and set its fields to default values.
 *
 * @param codec The codec
 * @return An AVCodecContext filled with default values, or NULL on failure
 */
AVCodecContext *avcodec_alloc_context3(const AVCodec *codec);
/**
 * Free the AVCodecContext and all resources associated with it, and set the pointer to NULL.
 */
void avcodec_free_context(AVCodecContext **avctx);
The avcodec_alloc_context3() and avcodec_free_context() functions are declared in the .../libavcodec/avcodec.h header. The former creates an AVCodecContext for a codec; the latter releases the AVCodecContext reference and all resources associated with it.
2.1.7 avcodec_open2
/**
 * Initialize the AVCodecContext to use the given AVCodec. The AVCodecContext should have been
 * allocated with avcodec_alloc_context3(). Note: this function is not thread-safe.
 *
 * @param avctx   The AVCodecContext to initialize
 * @param codec   The codec
 * @param options AVCodecContext metadata options
 * @return 0 on success
 */
int avcodec_open2(AVCodecContext *avctx, const AVCodec *codec, AVDictionary **options);
The avcodec_open2() function is declared in the .../libavcodec/avcodec.h header. It initializes an AVCodecContext with the given codec. The AVCodecContext must have been obtained from avcodec_alloc_context3() for that same codec, otherwise avcodec_open2 will fail. The core code below shows how to open a specific decoder:
// 1. Register all codecs
avcodec_register_all();
// 2. Find the AVCodec by ID
codec = avcodec_find_decoder(AV_CODEC_ID_H264);
if (!codec)
    exit(1);
// 3. Create the AVCodecContext for the decoder
context = avcodec_alloc_context3(codec);
// 4. Initialize the AVCodecContext, i.e. open the decoder
AVDictionary *opts = NULL;
av_dict_set(&opts, "b", "2.5M", 0);
if (avcodec_open2(context, codec, &opts) < 0)
    exit(1);
2.1.8 av_read_frame
/**
 * Return the next frame of a stream.
 *
 * @param s   The format context
 * @param pkt The packet used to store the frame of data that was read
 * @return 0 on success, < 0 on error or end of file
 */
int av_read_frame(AVFormatContext *s, AVPacket *pkt);
The av_read_frame() function is declared in the .../libavformat/avformat.h header; it reads one frame of encoded data from the stream into an AVPacket structure. For video, pkt contains exactly one complete frame of video data. For audio, if each frame has a fixed size (such as PCM or ADPCM data), pkt may contain an integer number of frames; if the frame size is variable (such as MPEG audio), pkt contains exactly one complete frame. Note that if pkt->buf == NULL, pkt is only valid until the next call to av_read_frame() or avformat_close_input(); otherwise it remains valid indefinitely. In either case, once we are done with the data in pkt we must release it by calling av_packet_unref().
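A minimal read loop illustrating that lifetime rule (assuming pFmtCtx has been opened and index_audio located, as in section 2.2):

AVPacket *pkt = av_packet_alloc();
while (av_read_frame(pFmtCtx, pkt) >= 0) {
    if (pkt->stream_index == index_audio) {
        // ... hand the packet to the decoder here ...
    }
    av_packet_unref(pkt);   // release the packet's buffers after every iteration
}
av_packet_free(&pkt);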
2.1.9 avcodec_send_packet/avcodec_receive_frame
/**
 * Supply the decoder with a raw packet to decode.
 *
 * @param avctx     Codec context
 * @param[in] avpkt Input AVPacket, usually from av_read_frame(); it normally contains
 *                  one video frame or several complete audio frames
 * @return 0 on success, otherwise:
 *         AVERROR(EAGAIN): the current state does not accept input
 *         AVERROR_EOF:     the decoder has been flushed and no new packets can be sent to it
 *         AVERROR(EINVAL): the decoder is not opened, or it is an encoder, or it requires a flush
 *         AVERROR(ENOMEM): failed to add the packet to the internal decoding queue
 *         other errors:    other exceptions
 */
int avcodec_send_packet(AVCodecContext *avctx, const AVPacket *avpkt);
/**
 * Return decoded output data from the decoder.
 *
 * @param avctx Codec context
 * @param frame Used to store the audio or video frame data output by the decoder
 * @return
 *         0:               success, a decoded frame was returned
 *         AVERROR(EAGAIN): no output is available in this state; try sending new input to the decoder
 *         AVERROR_EOF:     the decoder has been fully flushed; decoding has ended
 *         AVERROR(EINVAL): the decoder is not opened, or it is an encoder
 *         other negative values: other errors
 */
int avcodec_receive_frame(AVCodecContext *avctx, AVFrame *frame);
The avcodec_send_packet() and avcodec_receive_frame() functions are declared in the .../libavcodec/avcodec.h header. They form the new API that FFmpeg provides to replace the decoding functions avcodec_decode_audio4()/avcodec_decode_video2(). The former feeds one frame of encoded data (obtained with av_read_frame()) to the decoder as input; the latter reads one frame of decoded data from the decoder. Note that for some decoders a single packet can yield several frames, so avcodec_receive_frame() should be called in a loop until it no longer returns 0. When the AVFrame data has been consumed, av_frame_unref() must be called to release the buffers referenced by the AVFrame and reset its fields.
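A minimal send/receive sketch of that loop (error handling trimmed; avctx, pkt, and frame are assumed to be an opened codec context, an input packet, and an allocated AVFrame, and consume_frame is a hypothetical handler for each decoded frame):

int ret = avcodec_send_packet(avctx, pkt);
while (ret >= 0) {
    ret = avcodec_receive_frame(avctx, frame);
    if (ret == AVERROR(EAGAIN) || ret == AVERROR_EOF)
        break;                 // need more input, or the decoder is fully flushed
    if (ret < 0)
        break;                 // a real decoding error
    consume_frame(frame);      // hand the decoded frame to the caller
    av_frame_unref(frame);     // release the frame's buffers before reusing it
}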
2.1.10 swr_alloc_set_opts/swr_init
/**
 * Allocate a SwrContext (if needed) and set/reset its common parameters.
 *
 * @param s               Existing Swr context, or NULL to allocate a new one
 * @param out_ch_layout   Output channel layout (AV_CH_LAYOUT_*)
 * @param out_sample_fmt  Output sample format (AV_SAMPLE_FMT_*)
 * @param out_sample_rate Output sample rate (frequency in Hz)
 * @param in_ch_layout    Input channel layout (AV_CH_LAYOUT_*)
 * @param in_sample_fmt   Input sample format (AV_SAMPLE_FMT_*)
 * @param in_sample_rate  Input sample rate (frequency in Hz)
 * @param log_offset      Usually 0
 * @param log_ctx         Usually NULL
 * @return NULL on failure, otherwise the allocated and configured context
 */
struct SwrContext *swr_alloc_set_opts(struct SwrContext *s,
int64_t out_ch_layout,
enum AVSampleFormat out_sample_fmt,
int out_sample_rate,
int64_t in_ch_layout,
enum AVSampleFormat in_sample_fmt,
int in_sample_rate,
int log_offset, void *log_ctx);
/**
 * Initialize the SwrContext after its parameters have been set.
 *
 * @param[in,out] s Swr context to initialize
 * @return AVERROR error code in case of failure
 */
int swr_init(struct SwrContext *s);
/**
 * Free the given SwrContext and set the pointer to NULL.
 *
 * @param[in] s A pointer to a pointer to a Swr context
 */
void swr_free(struct SwrContext **s);
swr_alloc_set_opts() and swr_init() are declared in the .../libswresample/swresample.h header. swr_alloc_set_opts() creates the SwrContext structure needed for resampling and configures its parameters at the same time; swr_init() initializes the created SwrContext. Together they initialize the resampling module. The same result can also be achieved with the AVOptions API; example code:
SwrContext *swr = swr_alloc();
av_opt_set_channel_layout(swr, "in_channel_layout", AV_CH_LAYOUT_5POINT1, 0);
av_opt_set_channel_layout(swr, "out_channel_layout", AV_CH_LAYOUT_STEREO, 0);
av_opt_set_int(swr, "in_sample_rate".48000.0);
av_opt_set_int(swr, "out_sample_rate".44100.0);
av_opt_set_sample_fmt(swr, "in_sample_fmt", AV_SAMPLE_FMT_FLTP, 0);
av_opt_set_sample_fmt(swr, "out_sample_fmt", AV_SAMPLE_FMT_S16, 0);
swr_init(swr);
Note: When SwrContext is used, you need to call swr_free() to free resources.
2.1.11 swr_convert
/**
 * Convert (resample) audio.
 *
 * @param s         Swr context
 * @param out       Output buffers
 * @param out_count Amount of space available for output, in samples per channel
 * @param in        Input buffers
 * @param in_count  Number of input samples available in each channel
 * @return Number of samples output per channel, or a negative value on failure
 */
int swr_convert(struct SwrContext *s,
uint8_t **out, int out_count,
const uint8_t **in , int in_count);
swr_convert() is declared in the .../libswresample/swresample.h header. It resamples audio data so that it meets the required output format. For example, when using FFmpeg to encode PCM audio recorded by AudioRecord into AAC, the ENCODING_PCM_16BIT data (AV_SAMPLE_FMT_S16, signed 16-bit integers) has to be converted to AV_SAMPLE_FMT_FLTP (planar float), because that is the sample format the FFmpeg AAC encoder works with. Conversely, when we want to play PCM data decoded by FFmpeg, the decoder typically outputs AV_SAMPLE_FMT_FLTP while the player usually expects AV_SAMPLE_FMT_S16, so swr_convert() is needed again. Beyond these two cases, the FFmpeg framework supports conversion between all of the following sample formats:
// Defined in the .../libavutil/samplefmt.h header
enum AVSampleFormat {
AV_SAMPLE_FMT_NONE = - 1,
AV_SAMPLE_FMT_U8, ///< unsigned 8 bits
AV_SAMPLE_FMT_S16, ///< signed 16 bits
AV_SAMPLE_FMT_S32, ///< signed 32 bits
AV_SAMPLE_FMT_FLT, ///< float
AV_SAMPLE_FMT_DBL, ///< double
AV_SAMPLE_FMT_U8P, ///< unsigned 8 bits, planar
AV_SAMPLE_FMT_S16P, ///< signed 16 bits, planar
AV_SAMPLE_FMT_S32P, ///< signed 32 bits, planar
AV_SAMPLE_FMT_FLTP, ///< float, planar
AV_SAMPLE_FMT_DBLP, ///< double, planar
AV_SAMPLE_FMT_S64, ///< signed 64 bits
AV_SAMPLE_FMT_S64P, ///< signed 64 bits, planar
AV_SAMPLE_FMT_NB ///< Number of sample formats.
};
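When calling swr_convert(), the output buffer must be large enough for the converted samples. A minimal sizing sketch (assuming swr is the context configured above, frame is a decoded AVFrame, and in_sample_rate/out_sample_rate are the rates the context was configured with):

// Worst case: samples still buffered inside the resampler plus the rescaled input count, rounded up
int64_t out_samples = av_rescale_rnd(swr_get_delay(swr, in_sample_rate) + frame->nb_samples,
                                     out_sample_rate, in_sample_rate, AV_ROUND_UP);
uint8_t *out_buf = NULL;
int out_linesize = 0;
av_samples_alloc(&out_buf, &out_linesize, 2 /* stereo */, (int) out_samples,
                 AV_SAMPLE_FMT_S16, 0);
int converted = swr_convert(swr, &out_buf, (int) out_samples,
                            (const uint8_t **) frame->data, frame->nb_samples);
// 'converted' is the number of samples actually written per channel
av_freep(&out_buf);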
2.2 FFmpeg audio decoding steps
Decoding AAC data in RTSP stream with FFmpeg is relatively simple, which can be summarized as the following steps:
2.2.1 Initialization Operations
1. Initialize FFmpeg engine. Register muxers, Demuxers, networks, codecs, etc
AVFormatContext *pFmtCtx = NULL;    // Global (format) context
AVCodecContext *pACodecCtx = NULL;  // Codec context
AVCodec *pACodec = NULL;            // Codec
int index_audio = -1;               // Audio stream index
AVFrame *pAFrame;
AVPacket *pPacket;
SwrContext *swrContext = NULL;
av_register_all();
avformat_network_init();
avcodec_register_all();
2. Open the input (protocol layer). That is, open the input "file" in the broad sense, create and initialize the corresponding AVFormatContext structure, and obtain the audio/video stream information, including bit rate, frame rate, and the audio stream index.
// Create the AVFormatContext
pFmtCtx = avformat_alloc_context();
if (pFmtCtx == NULL) {
    return -1;
}
LOG_I("create ffmpeg for url = %s", url);
AVDictionary *openOpts = NULL;
av_dict_set(&openOpts, "stimeout", "15000000", 0);    // 15 s timeout, then treat the connection as down
av_dict_set(&openOpts, "buffer_size", "1024000", 0);  // Enlarge the input buffer to avoid artifacts when the bit rate gets large
// Open the input (protocol layer), i.e. initialize the AVFormatContext
int ret = avformat_open_input(&pFmtCtx, url, NULL, &openOpts);
if (ret < 0) {
    LOGE("open input failed in PlayAudio, timed out.");
    releaseFFmpegEngine();
    return -1;
}
// Read packets from the input to probe stream information such as bit rate and frame rate
ret = avformat_find_stream_info(pFmtCtx, NULL);
if (ret < 0) {
    LOGE("find stream info failed in PlayAudio.");
    releaseFFmpegEngine();
    return -1;
}
// Find the audio stream index
for (int i = 0; i < pFmtCtx->nb_streams; i++) {
    if (pFmtCtx->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_AUDIO) {
        index_audio = i;
        break;
    }
}
if (index_audio == -1) {
    LOGE("url has no audio stream in PlayAudio.");
    releaseFFmpegEngine();
    return -1;
}
3. Obtain and open the decoder. Based on the audio encoding information in the AVFormatContext, find the matching decoder, create the corresponding AVCodecContext structure, and then open the decoder.
// Get the decoder
AVCodecParameters *avCodecParameters = NULL;
avCodecParameters = pFmtCtx->streams[index_audio]->codecpar;
if (avCodecParameters == NULL) {
    LOGE("get audio codec's AVCodecParameters failed.");
    return -1;
}
pACodec = avcodec_find_decoder(avCodecParameters->codec_id);
if (!pACodec) {
    LOG_E("do not find matched codec for %s", pFmtCtx->audio_codec->name);
    releaseFFmpegEngine();
    return -1;
}
// Get the AVCodecContext corresponding to the decoder
pACodecCtx = avcodec_alloc_context3(pACodec);
if (!pACodecCtx) {
    LOGE("alloc AVCodecContext failed.");
    return -1;
}
avcodec_parameters_to_context(pACodecCtx, avCodecParameters);
// Open the decoder
ret = avcodec_open2(pACodecCtx, pACodec, NULL);
if (ret < 0) {
    LOG_E("open %s codec failed.", pFmtCtx->audio_codec->name);
    releaseFFmpegEngine();
    return -1;
}
4. Initialize the audio resampling module. Since FFmpeg decoded PCM data is AV_SAMPLE_FMT_FLTP, the audio format needs to be resampled to AV_SAMPLE_FMT_S16 for playback. The following code is used to create and initialize the resampling engine.
// Create the SwrContext and set its parameters
swrContext = swr_alloc_set_opts(NULL,
                                AV_CH_LAYOUT_STEREO, AV_SAMPLE_FMT_S16, pACodecCtx->sample_rate,
                                channel_layout, sample_fmt, pACodecCtx->sample_rate,
                                0, NULL);
// Initialize the SwrContext
swr_init(swrContext);
2.2.2 Reading Stream Data
Read a frame of data from a stream and store it in the AVPacket structure, which stores AAC format audio data.
int readAVPacket() {
if (pACodecCtx == NULL) {
return - 1;
}
return av_read_frame(pFmtCtx, pPacket);
}
2.2.3 Decoding and resampling
The AAC audio is sent to the decoder, and then a frame of complete PCM data is obtained from the decoder and stored in the AVFrame structure. Finally, the PCM data (AV_SAMPLE_FMT_FLTP) is resampled to obtain playable PCM data (AV_SAMPLE_FMT_S16).
int decodeAudio(uint8_t **data) {
    if (pPacket->stream_index != index_audio) {
        return -1;
    }
    // Send an AVPacket (AAC) to the decoder
    int ret = avcodec_send_packet(pACodecCtx, pPacket);
    if (ret != 0) {
        return -1;
    }
    // Get one complete frame of PCM audio data
    while (avcodec_receive_frame(pACodecCtx, pAFrame) == 0) {
        LOG_D("Read a frame of audio data, frameSize=%d", pAFrame->nb_samples);
        break;
    }
    // Resample
    uint8_t *a_out_buffer = (uint8_t *) av_malloc(2 * sample_rate);
    swr_convert(swrContext, &a_out_buffer, sample_rate * 2,
                (const uint8_t **) pAFrame->data,
                pAFrame->nb_samples);
    int outbuffer_size = av_samples_get_buffer_size(NULL,
                                                    av_get_channel_layout_nb_channels(AV_CH_LAYOUT_STEREO),
                                                    pAFrame->nb_samples,
                                                    AV_SAMPLE_FMT_S16P, 1);
    *data = (uint8_t *) malloc(outbuffer_size);
    memset(*data, 0, outbuffer_size);
    memcpy(*data, a_out_buffer, outbuffer_size);
    av_free(a_out_buffer);
    // Reset the AVFrame
    av_frame_unref(pAFrame);
    // Reset the AVPacket
    av_packet_unref(pPacket);
    return outbuffer_size;
}
2.2.4 Releasing Resources
// Close all streams and release AVFormatContext
if (pFmtCtx) {
avformat_close_input(&pFmtCtx);
avformat_free_context(pFmtCtx);
}
// Release the AVCodecContext
if (pACodecCtx) {
avcodec_free_context(&pACodecCtx);
}
if (pPacket) {
av_free(pPacket);
}
if (pAFrame) {
av_frame_free(&pAFrame);
}
if(swrContext) {
swr_free(&swrContext);
}
avformat_network_deinit();
3. Android play PCM audio project practice
The example project uses FFmpeg and the OpenSL ES engine to decode and play AAC audio: FFmpeg parses the RTSP network stream to obtain AAC data, then decodes and resamples it into PCM data, and the OpenSL ES engine plays the PCM audio stream in the native layer. The core idea of the project is to create two worker threads, one for decoding and one for playback, and two linked-list queues, one holding decoded data and one holding data to be played. The flow chart is as follows:
3.1 Decoding Thread
1. Initialize the FFmpeg engine, initialize the linked-list queue, and start the audio player thread;
2. Use FFmpeg to read AVPackets from the network stream in a loop (their data is AAC, or some other compressed format), then decode and resample them to obtain PCM data;
3. Insert the decoded PCM data into the linked list, from which the player thread reads and plays it;
4. Stop decoding and release resources.
The decode_audio_thread function source code is as follows:
void *decode_audio_thread(void *argv) {
quit = 0;
// Initialize the FFmpeg engine
int ret = createFFmpegEngine((const char *) argv);
if (ret < 0) {
LOGE("create FFmpeg Engine failed in decode_audio_thread");
return NULL;
}
// Initialize the linked list
queue_pcm_init(&pcm_queue);
PCMPacket *pcmPkt = (PCMPacket *) malloc(sizeof(PCMPacket));
// Start the audio player thread
pthread_t threadId_play;
pthread_create(&threadId_play, NULL, play_audio_thread, NULL);
while (readAVPacket() >= 0) {
// Thread termination flag
if (quit) {
break;
}
// Decode
uint8_t *data = NULL;
int nb_samples = decodeAudio(&data);
// Insert into the play queue
if (nb_samples > 0 && data != NULL) {
pcmPkt->pcm = (char *) data;
pcmPkt->size = nb_samples;
queue_pcm_put(&pcm_queue, pcmPkt);
}
}
releaseFFmpegEngine();
free(pcmPkt);
return NULL;
}
3.2 Player Thread
1. Initialize the play queue;
2. Start the thread that initializes the OpenSL ES engine. OpenSL ES then plays audio through the registered callback function, which keeps reading PCM data from the play queue;
3. Loop over the PCM list and move the data read from it into the play queue.

The play_audio_thread function source code is as follows:
pthread_t threadId_open_opensl;
void *play_audio_thread(void *argv) {
PCMPacket pcmPacket;
PCMPacket *pkt = &pcmPacket;
// Initializes the playlist
playQueueInit(&global_openSL.queue_play);
// Start initializing the OpenSL ES thread
pthread_create(&threadId_open_opensl,NULL,thread_open_opensl,NULL);
// Loop over the PCM list and move the data read from it into the play queue
for (;;) {
    if (quit) {
        break;
    }
    if (queue_pcm_get(&pcm_queue, pkt) > 0) {
        // Copy the data into the play queue
        PCMData *pcmData = (PCMData *) malloc(sizeof(PCMData));
        pcmData->pcm = pkt->pcm;
        pcmData->size = pkt->size;
        playQueuePut(&global_openSL.queue_play, pcmData);
    }
}
return NULL;
}

void *thread_open_opensl(void *argv) {
    // Initialize the OpenSL ES engine
    int ret = createOpenSLEngine(channels, sample_rate, sample_fmt);
    if (ret < 0) {
        quit = 1;
        LOGE("create OpenSL Engine failed in play_audio_thread");
        return NULL;
    }
    pthread_exit(&threadId_open_opensl);
}
GitHub source: DemoOpenSLES. Stars and issues are welcome.