Android provides four levels of audio APIs:
- Java layer: the MediaRecorder & MediaPlayer series;
- Java layer: the AudioTrack & AudioRecord series;
- JNI layer: OpenSL ES;
- JNI layer: AAudio (introduced in Android O).
The usage and features of these APIs are described below.
1. MediaRecorder & MediaPlayer
MediaRecorder and MediaPlayer are not complete low-level audio APIs; they are wrappers around the system audio APIs. Besides capture/playback, they integrate encoding/decoding, muxing/demuxing, and other capabilities, and they still call AudioRecord and AudioTrack underneath. Their main configuration items are described below.
1.1 MediaRecorder
Because MediaRecorder integrates recording, compression encoding, muxing, and other functions, it is relatively simple to use. The main settings are the following:
- setAudioSource (audio source):
- AudioSource.DEFAULT: the default audio source;
- AudioSource.MIC: microphone;
- AudioSource.VOICE_UPLINK: uplink voice-call audio; requires android.Manifest.permission#CAPTURE_AUDIO_OUTPUT;
- AudioSource.VOICE_DOWNLINK: downlink voice-call audio; requires android.Manifest.permission#CAPTURE_AUDIO_OUTPUT;
- AudioSource.VOICE_CALL: both uplink and downlink voice-call audio; requires android.Manifest.permission#CAPTURE_AUDIO_OUTPUT;
- AudioSource.CAMCORDER: a microphone facing the same direction as the camera; if no such microphone exists or it cannot be recognized, the default microphone is used;
- AudioSource.VOICE_RECOGNITION: tuned for speech recognition;
- AudioSource.VOICE_COMMUNICATION: tuned for voice calls;
- AudioSource.UNPROCESSED: unprocessed (raw) capture;
- AudioSource.VOICE_PERFORMANCE: low-latency capture for real-time audio processing;
- AudioSource.REMOTE_SUBMIX: a submix of the audio streams to be transmitted to a remote system; requires android.Manifest.permission#CAPTURE_AUDIO_OUTPUT;
- AudioSource.ECHO_REFERENCE: the reference signal for echo suppression; @SystemApi, requires android.Manifest.permission#CAPTURE_AUDIO_OUTPUT;
- AudioSource.RADIO_TUNER: radio broadcast audio; @SystemApi;
- AudioSource.HOTWORD: preemptible hotword detection; @SystemApi.
- setAudioEncoder (encoder):
- AudioEncoder.DEFAULT
- AudioEncoder.AMR_NB
- AudioEncoder.AMR_WB
- AudioEncoder.AAC
- AudioEncoder.HE_AAC
- AudioEncoder.AAC_ELD (Android 4.1+)
- AudioEncoder.VORBIS
- AudioEncoder.OPUS (Android 10+)
- setOutputFormat (muxer/container):
- OutputFormat.DEFAULT
- OutputFormat.THREE_GPP: 3gp;
- OutputFormat.MPEG_4: mp4;
- OutputFormat.RAW_AMR: raw AMR stream (deprecated in favor of AMR_NB);
- OutputFormat.AMR_NB: AMR-NB;
- OutputFormat.AMR_WB: AMR-WB;
- OutputFormat.AAC_ADIF: AAC ADIF;
- OutputFormat.AAC_ADTS: AAC ADTS;
- OutputFormat.OUTPUT_FORMAT_RTP_AVP: RTP/AVP network stream;
- OutputFormat.MPEG_2_TS: MPEG-2 TS;
- OutputFormat.WEBM: WebM container;
- OutputFormat.HEIF: HEIF container;
- OutputFormat.OGG: Ogg container.
- setOutputFile: the output file path.
Sample code:
MediaRecorder recorder = new MediaRecorder();
recorder.setAudioSource(MediaRecorder.AudioSource.MIC);
recorder.setOutputFormat(MediaRecorder.OutputFormat.THREE_GPP);
recorder.setAudioEncoder(MediaRecorder.AudioEncoder.AMR_NB);
recorder.setOutputFile(PATH_NAME);
recorder.prepare();
recorder.start(); // Recording is now started
...
recorder.stop();
recorder.reset(); // You can reuse the object by going back to setAudioSource() step
recorder.release(); // Now the object cannot be reused
The code above shows only basic usage; a real project needs logic built around its specific requirements. Because MediaRecorder must be instantiated, remember to release it promptly when it is no longer needed to avoid memory leaks.
MediaRecorder state diagram:
Summary: recording with MediaRecorder is relatively simple and requires little code, but it has shortcomings, such as a limited choice of output file formats and no way to pause recording.
1.2 MediaPlayer
Examples of MediaPlayer usage:
MediaPlayer mediaPlayer = new MediaPlayer();
if (android.os.Build.VERSION.SDK_INT >= android.os.Build.VERSION_CODES.LOLLIPOP) {
AudioAttributes audioAttributes = new AudioAttributes.Builder()
.setUsage(AudioAttributes.USAGE_ALARM)
.setContentType(AudioAttributes.CONTENT_TYPE_SPEECH)
.build();
mediaPlayer.setAudioAttributes(audioAttributes);
} else {
mediaPlayer.setAudioStreamType(AudioManager.STREAM_ALARM);
}
mediaPlayer.reset();
mediaPlayer.setOnCompletionListener(new MediaPlayer.OnCompletionListener() {
    @Override
    public void onCompletion(MediaPlayer mp) {
        // Playback finished
    }
});
mediaPlayer.setDataSource(path);
mediaPlayer.prepare();
mediaPlayer.start();
// Or call mediaPlayer.prepareAsync() and start playback in the
// setOnPreparedListener(android.media.MediaPlayer.OnPreparedListener) callback.
1.2.1 State Diagram
1.2.2 setAudioStreamType options:
- AudioSystem.STREAM_VOICE_CALL: telephone calls;
- AudioSystem.STREAM_SYSTEM: system sounds;
- AudioSystem.STREAM_RING: phone ring;
- AudioSystem.STREAM_MUSIC: music playback;
- AudioSystem.STREAM_ALARM: alarms;
- AudioSystem.STREAM_NOTIFICATION: notifications.
AudioAttributes replaces stream types and can express more attributes than a stream type can. There are three aspects:
- Usage (why): why is this sound being played?
- Content type (what): what is being played? This is optional; some usages imply it, e.g. CONTENT_TYPE_MOVIE for movie playback.
- Flags (how): how playback should be affected.
1.2.3 setDataSource instructions
setDataSource sets the playback source, which can be a local file or a network resource. For a network source or a large local file, calling prepare() on the main thread may cause an ANR, so prefer prepareAsync() and call start() in the onPrepared callback.
2. AudioRecord & AudioTrack
2.1 AudioRecord
The main function of the AudioRecord class is to let Java applications manage audio input resources and capture sound through the platform's audio hardware. Capture is done by "reading" audio data from the AudioRecord object: during recording, all the application needs to do is fetch the recorded data from the AudioRecord object in time using one of three read methods: read(byte[], int, int), read(short[], int, int), or read(ByteBuffer, int). Whichever method you choose, the audio format must be configured in advance so the data is stored in a form the application can use.
When recording begins, an AudioRecord initializes its associated audio buffer, which stores newly captured audio data. The size of this buffer, specified at construction time, determines how long an AudioRecord can record before its data is read, i.e. how much audio can be captured at once. Audio data is read from the hardware in chunks that should be smaller than the total buffer capacity, and it can be read multiple times.
Capturing is simple: construct an AudioRecord object and pass in the configuration parameters. In general, the recording configuration is as follows:
- Audio source: typically the microphone; the available values are the same as for MediaRecorder.
- Sample rate: the number of samples taken per second; the higher the sample rate, the better the sound quality.
- Channel configuration: mono, stereo, etc.
- Audio format: usually PCM, i.e. raw audio samples.
- Buffer size: the size of the buffer that receives the audio data; the minimum can be obtained via AudioRecord.getMinBufferSize(). (Audio is captured into this buffer and then read from it.)
The code implementation is as follows:
public class AudioRecorder {
    // Audio input: microphone
    private final static int AUDIO_INPUT = MediaRecorder.AudioSource.MIC;
    // 44100 Hz is the standard rate; some devices also support 22050, 16000, etc.
    private final static int AUDIO_SAMPLE_RATE = 16000;
    // Mono channel
    private final static int AUDIO_CHANNEL = AudioFormat.CHANNEL_IN_MONO;
    // 16-bit PCM encoding
    private final static int AUDIO_ENCODING = AudioFormat.ENCODING_PCM_16BIT;

    public void createDefaultAudio() {
        // Receive buffer (minimum size)
        final int bufferSizeInBytes = AudioRecord.getMinBufferSize(AUDIO_SAMPLE_RATE, AUDIO_CHANNEL, AUDIO_ENCODING);
        final AudioRecord audioRecord = new AudioRecord(AUDIO_INPUT, AUDIO_SAMPLE_RATE, AUDIO_CHANNEL, AUDIO_ENCODING, bufferSizeInBytes);
        final byte[] audioData = new byte[bufferSizeInBytes];
        audioRecord.startRecording();
        new Thread(new Runnable() {
            @Override
            public void run() {
                // Typically done in a loop while recording
                int readSize = audioRecord.read(audioData, 0, bufferSizeInBytes);
                if (AudioRecord.ERROR_INVALID_OPERATION != readSize) {
                    // handle audio data
                }
            }
        }).start();
    }
}
2.2 AudioTrack
AudioTrack is an interface provided by the Java layer to manage and play a single audio resource; it plays raw PCM data. Playback is done by writing data with one of write(byte[], int, int), write(short[], int, int), or write(float[], int, int, int). An AudioTrack instance runs in one of two modes: static or streaming.
- In streaming mode, the application writes a continuous stream of data to the AudioTrack using one of the write() methods. The calls block and return once the data has been transferred from the Java layer to the native layer and queued for playback. Streaming mode is most useful for audio that is too long to fit in memory, whose characteristics (high sample rate, bits per sample, ...) make it too large for memory, or that is received or generated while previously queued audio is still playing.
- Static mode is for short sounds that fit in memory and need to be played with minimal latency, so it is best suited to UI and game sound effects, with minimal overhead.
When an AudioTrack object is created, it initializes its associated audio buffer. The buffer size specified at construction determines how long an AudioTrack can play before running out of data. For static mode, this size is the maximum amount of audio it can play; for streaming mode, data is written to the audio sink in chunks smaller than or equal to the total buffer size. AudioTrack is not final, so subclassing is allowed but not recommended.
Static mode playback example:
AudioAttributes audioAttributes = new AudioAttributes.Builder()
        .setUsage(AudioAttributes.USAGE_MEDIA)
        .setContentType(AudioAttributes.CONTENT_TYPE_MUSIC)
        .build();
AudioFormat audioFormat = new AudioFormat.Builder()
        .setSampleRate(22050)
        .setEncoding(AudioFormat.ENCODING_PCM_8BIT)
        .setChannelMask(AudioFormat.CHANNEL_OUT_MONO)
        .build();
AudioTrack audioTrack = new AudioTrack(audioAttributes, audioFormat, audioData.length,
        AudioTrack.MODE_STATIC, AudioManager.AUDIO_SESSION_ID_GENERATE);
if (audioTrack.getState() == AudioTrack.STATE_UNINITIALIZED) {
    Toast.makeText(this, "AudioTrack initialization failed!", Toast.LENGTH_SHORT).show();
    return;
}
audioTrack.write(audioData, 0, audioData.length);
audioTrack.play();
Example of stream mode playback:
// ************ Streaming playback ************
final int minBufferSize = AudioTrack.getMinBufferSize(SAMPLE_RATE_INHZ,
        AudioFormat.CHANNEL_OUT_MONO, AUDIO_FORMAT);
AudioAttributes audioAttributes = new AudioAttributes.Builder()
        .setUsage(AudioAttributes.USAGE_MEDIA)
        .setContentType(AudioAttributes.CONTENT_TYPE_MUSIC)
        .build();
AudioFormat audioFormat = new AudioFormat.Builder()
        .setSampleRate(22050)
        .setEncoding(AudioFormat.ENCODING_PCM_8BIT)
        .setChannelMask(AudioFormat.CHANNEL_OUT_MONO)
        .build();
audioTrack = new AudioTrack(audioAttributes, audioFormat, minBufferSize,
        AudioTrack.MODE_STREAM, AudioManager.AUDIO_SESSION_ID_GENERATE);
if (audioTrack.getState() == AudioTrack.STATE_UNINITIALIZED) {
    Toast.makeText(this, "AudioTrack initialization failed!", Toast.LENGTH_SHORT).show();
    return;
}
// Start playback
audioTrack.play();
// Feed PCM data from a file on a background thread (the receiver of post() is elided in the original)
post(new Runnable() {
    @Override
    public void run() {
        try {
            final File file = new File(getExternalFilesDir(Environment.DIRECTORY_MUSIC), "test.pcm");
            FileInputStream fileInputStream = new FileInputStream(file);
            byte[] tempBuffer = new byte[minBufferSize];
            while (fileInputStream.available() > 0) {
                int readCount = fileInputStream.read(tempBuffer);
                if (readCount == AudioTrack.ERROR_INVALID_OPERATION || readCount == AudioTrack.ERROR_BAD_VALUE) {
                    continue;
                }
                if (readCount != 0 && readCount != -1) {
                    audioTrack.write(tempBuffer, 0, readCount);
                }
            }
            fileInputStream.close();
        } catch (IOException ioe) {
            ioe.printStackTrace();
        }
    }
});
3. OpenSL ES
OpenSL ES (Open Sound Library for Embedded Systems) is a royalty-free, cross-platform, carefully optimized hardware-accelerated audio API for embedded systems. It gives native application developers on embedded mobile multimedia devices a standardized, high-performance, low-latency way to implement audio functionality, enables direct cross-platform deployment of software/hardware audio capabilities, reduces implementation effort, and promotes the development of the advanced-audio market. iOS does not expose an OpenSL ES interface; on Android, the OpenSL ES library lives in the NDK under the platforms folder, in the directory for the corresponding Android platform and CPU architecture:
$ ls ~/Library/Android/ndk/ndk-bundle/android-ndk-r17c/platforms/android-28/arch-arm/usr/lib/
crtbegin_dynamic.o libGLESv3.so libcompiler_rt-extras.a libnativewindow.so
crtbegin_so.o libOpenMAXAL.so libdl.a libneuralnetworks.so
crtbegin_static.o libOpenSLES.so libdl.so libstdc++.a
crtend_android.o libaaudio.so libjnigraphics.so libstdc++.so
crtend_so.o libandroid.so liblog.so libsync.so
libEGL.so libc.a libm.a libvulkan.so
libGLESv1_CM.so libc.so libm.so libz.a
libGLESv2.so libcamera2ndk.so libmediandk.so libz.so
Android has supported the OpenSL ES standard since Android 2.3 (API 9), with the corresponding development interface provided through the NDK (reference: source.android.com/devices/aud…). Android's implementation is a subset of OpenSL ES 1.0.1 with some extensions, so pay special attention to which parts of the OpenSL ES API Android does and does not support. The relevant documents are in the NDK docs directory:
NDKroot/docs/Additional_library_docs/opensles/index.html
NDKroot/docs/Additional_library_docs/opensles/OpenSL_ES_Specification_1.0.1.pdf
OpenSL ES has two concepts that must be understood: Object and Interface. An Object is roughly analogous to a Java Object and an Interface to a Java interface, but they are not exactly the same; they are a way of exposing object-oriented interfaces in a procedural language. Their relationship, in more detail:
(1) Each Object may expose one or more Interfaces, and the specification defines a set of Interfaces for each kind of Object.
(2) Each Object provides some basic operations, such as Realize, Resume, GetState, and Destroy. To use a feature the Object supports, you must first obtain the corresponding Interface via the GetInterface function.
(3) Not every Interface that OpenSL ES defines for an Object is implemented on every system, so you need to check and make decisions when acquiring Interfaces.
3.1 Common objects and structures
In OpenSL ES, all API access and control are completed through the Interface, and even the Object in OpenSL ES is accessed and used through the SLObjectItf Interface.
3.1.1 Engine Object and SLEngineItf Interface
The Engine Object is the core object of OpenSL ES. It mainly provides the following functions:
(1) Manage the life cycle of the audio engine.
(2) Provide the management interface SLEngineItf, which is used to create all other Objects.
(3) Provide the device-capability query interfaces SLEngineCapabilitiesItf and SLAudioIODeviceCapabilitiesItf, through which some device attributes can be queried.
Create an Engine Object as follows:
SLObjectItf engineObject;
slCreateEngine(&engineObject, 0, nullptr, 0, nullptr, nullptr);
Initialize/destroy:
(*engineObject)->Realize(engineObject, SL_BOOLEAN_FALSE);
(*engineObject)->Destroy(engineObject);
Obtain the management interface:
SLEngineItf engineEngine;
(*engineObject)->GetInterface(engineObject, SL_IID_ENGINE, &(engineEngine));
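As noted in point (3) above, an interface you request is not guaranteed to exist on every implementation, so the return value of GetInterface should always be checked. A minimal sketch of this defensive pattern, using SLEngineCapabilitiesItf purely as an illustration (it may or may not be available on a given implementation):

SLEngineCapabilitiesItf capsItf;
SLresult result = (*engineObject)->GetInterface(engineObject, SL_IID_ENGINECAPABILITIES, &capsItf);
if (result != SL_RESULT_SUCCESS) {
    // This implementation does not expose the interface; skip the optional feature.
}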
SLObjectItf structure definition:
struct SLObjectItf_ {
    SLresult (*Realize) (SLObjectItf self, SLboolean async);
    SLresult (*Resume) (SLObjectItf self, SLboolean async);
    SLresult (*GetState) (SLObjectItf self, SLuint32 *pState);
    SLresult (*GetInterface) (SLObjectItf self, const SLInterfaceID iid, void *pInterface);
    SLresult (*RegisterCallback) (SLObjectItf self, slObjectCallback callback, void *pContext);
    void (*AbortAsyncOperation) (SLObjectItf self);
    void (*Destroy) (SLObjectItf self);
    SLresult (*SetPriority) (SLObjectItf self, SLint32 priority, SLboolean preemptable);
    SLresult (*GetPriority) (SLObjectItf self, SLint32 *pPriority, SLboolean *pPreemptable);
    SLresult (*SetLossOfControlInterfaces) (SLObjectItf self, SLint16 numInterfaces, SLInterfaceID *pInterfaceIDs, SLboolean enabled);
};
typedef const struct SLObjectItf_ * const * SLObjectItf;
Here SLObjectItf is a second level pointer, it points to a structure pointer. Any created Object must be initialized by calling the Realize method, and the Destroy method can be used to release resources when they are not needed.
Now we can use engineEngine to create all the other objects in OpenSL ES.
3.1.2 Media Object
Another important group of objects in OpenSL ES is the Media Objects, which are abstractions of multimedia functions such as players and recorders.
We can create a player instance with the CreateAudioPlayer method provided by SLEngineItf, and a recorder instance with its CreateAudioRecorder method.
3.1.3 Data Source and Data Sink
In OpenSL ES, these two structures are used as parameters when creating a Media Object. The data source describes the input: where the data comes from and with what parameters. The data sink describes the output: where the data goes and with what parameters.
A Data Source is defined as follows:
typedef struct SLDataSource_ {
    void *pLocator;
    void *pFormat;
} SLDataSource;
Data Sink is defined as follows:
typedef struct SLDataSink_ {
    void *pLocator;
    void *pFormat;
} SLDataSink;
pLocator mainly has the following types:
SLDataLocator_Address
SLDataLocator_BufferQueue
SLDataLocator_IODevice
SLDataLocator_MIDIBufferQueue
SLDataLocator_URI
In other words, the input/output source of a Media Object can be a URL, a device, a buffer queue, and so on, depending on the Media Object's type and the application scenario. A sketch that puts the pieces together follows.
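The following is a minimal sketch, with error checking omitted and parameter values chosen only for illustration, that combines sections 3.1.1 to 3.1.3: it creates an output mix and an audio player whose data source is an Android simple buffer queue delivering 16-bit mono PCM, reusing the engineEngine obtained above:

#include <SLES/OpenSLES.h>
#include <SLES/OpenSLES_Android.h>

// Output mix: the sink the player will render into
SLObjectItf outputMixObject;
(*engineEngine)->CreateOutputMix(engineEngine, &outputMixObject, 0, nullptr, nullptr);
(*outputMixObject)->Realize(outputMixObject, SL_BOOLEAN_FALSE);

// Data source: a buffer queue carrying 16-bit mono PCM at 44.1 kHz
SLDataLocator_AndroidSimpleBufferQueue locBufq = {
        SL_DATALOCATOR_ANDROIDSIMPLEBUFFERQUEUE, 2 /* number of buffers */};
SLDataFormat_PCM formatPcm = {
        SL_DATAFORMAT_PCM, 1, SL_SAMPLINGRATE_44_1,
        SL_PCMSAMPLEFORMAT_FIXED_16, SL_PCMSAMPLEFORMAT_FIXED_16,
        SL_SPEAKER_FRONT_CENTER, SL_BYTEORDER_LITTLEENDIAN};
SLDataSource audioSrc = {&locBufq, &formatPcm};

// Data sink: the output mix
SLDataLocator_OutputMix locOutmix = {SL_DATALOCATOR_OUTPUTMIX, outputMixObject};
SLDataSink audioSnk = {&locOutmix, nullptr};

// Create and realize the player, requesting the buffer-queue interface
SLObjectItf playerObject;
const SLInterfaceID ids[] = {SL_IID_BUFFERQUEUE};
const SLboolean req[] = {SL_BOOLEAN_TRUE};
(*engineEngine)->CreateAudioPlayer(engineEngine, &playerObject, &audioSrc, &audioSnk, 1, ids, req);
(*playerObject)->Realize(playerObject, SL_BOOLEAN_FALSE);

// Fetch the Play and buffer-queue interfaces, then start playback
SLPlayItf playItf;
SLAndroidSimpleBufferQueueItf bufferQueueItf;
(*playerObject)->GetInterface(playerObject, SL_IID_PLAY, &playItf);
(*playerObject)->GetInterface(playerObject, SL_IID_BUFFERQUEUE, &bufferQueueItf);
(*playItf)->SetPlayState(playItf, SL_PLAYSTATE_PLAYING);
// PCM buffers are then supplied with (*bufferQueueItf)->Enqueue(...) from a callback or a feeder thread.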
3.2 State Mechanism
Another important concept of OpenSL ES is its state mechanism, as shown in the figure below:
After an OpenSL ES object is created, it enters the SL_OBJECT_STATE_UNREALIZED state. In this state, the system does not allocate any resources to it until the Realize function is called.
After being realized, an object enters the SL_OBJECT_STATE_REALIZED state, which is the "usable" state; only in this state can its functions and resources be accessed properly.
When some system event occurs, such as an error or the Audio device is preempted by another application, the OpenSL ES object will enter the SL_OBJECT_STATE_SUSPENDED state. If you want to restore normal use, you need to call the Resume function.
When the object’s Destroy function is called, the resource is released and the state is returned to SL_OBJECT_STATE_UNREALIZED.
In short, the life cycle of an OpenSL ES object, from Create to Destroy, is controlled through explicit calls by the developer.
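For instance, a minimal sketch of reacting to the suspended state, reusing the playerObject from the sketch in section 3.1.3 (whether and when this state occurs depends on the implementation):

SLuint32 state;
(*playerObject)->GetState(playerObject, &state);
if (state == SL_OBJECT_STATE_SUSPENDED) {
    // The audio device was preempted; try to resume synchronously.
    (*playerObject)->Resume(playerObject, SL_BOOLEAN_FALSE);
}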
3.3 API Call Process Summary
Player:
Recorder:
For more details on the API, refer to OpenSL_ES_Specification_1.0.1.pdf. For playback logic, see the audio output code in oarPlayer: github.com/qingkouwei/…
4. AAudio & Oboe
AAudio is a new Android C API introduced in Android O. It is designed for high-performance audio applications that require low latency. Applications communicate with AAudio by reading from and writing to audio streams.
The AAudio API has a minimal design and does not perform the following functions:
- Audio device enumeration
- Automatic routing between audio endpoints
- File I/O
- Decoding of compressed audio
- Automatic presentation of all input/output streams in a single callback
Each stream is attached to a single audio device. An audio device is a hardware interface or virtual endpoint that acts as a source or sink for a continuous stream of digital audio data. Don't confuse an audio device (a built-in microphone or a Bluetooth headset) with the Android device (the phone or smartwatch) that runs your app. You can use the AudioManager method getDevices() to discover the audio devices available on an Android device; the method returns information about the type of each device. Every audio device has a unique ID on a given Android device, and you can use that ID to bind an audio stream to a specific audio device. In most cases, however, you can let AAudio choose the default primary device rather than specifying one yourself. The audio device attached to a stream determines whether the stream is for input or output; a stream can only move data in one direction. When you define a stream you also set its direction, and when you open the stream Android checks that the audio device and the stream direction agree.
4.1 API specification
4.1.1 AAudio Create audio stream flow
The AAudio library uses the builder design pattern and provides AAudioStreamBuilder.
- Create an AAudioStreamBuilder:
AAudioStreamBuilder *builder;
aaudio_result_t result = AAudio_createStreamBuilder(&builder);
- Set the audio stream configuration in the builder using the builder functions corresponding to the stream parameters. The available optional setter functions are:
AAudioStreamBuilder_setDeviceId(builder, deviceId);
AAudioStreamBuilder_setDirection(builder, direction);
AAudioStreamBuilder_setSharingMode(builder, mode);
AAudioStreamBuilder_setSampleRate(builder, sampleRate);
AAudioStreamBuilder_setChannelCount(builder, channelCount);
AAudioStreamBuilder_setFormat(builder, format);
AAudioStreamBuilder_setBufferCapacityInFrames(builder, frames);
Note: These methods do not report errors such as undefined constants or values out of range.
If deviceId is not specified, it defaults to the primary output device. If you do not specify the stream direction, it defaults to an output stream. For all other parameters, you can set a value explicitly or leave them unspecified (AAUDIO_UNSPECIFIED) and let the system choose an optimal value. To be safe, check the state and configuration of the audio stream after it is created.
- After configuring the AAudioStreamBuilder, use it to create the stream:
AAudioStream *stream;
result = AAudioStreamBuilder_openStream(builder, &stream);
- After the stream is created, verify its configuration. If you specified the sample format, sample rate, or samples per frame, those settings will not change. If you specified the sharing mode or buffer capacity, those settings may change depending on the capabilities of the stream's audio device and of the Android device it runs on. As good defensive programming practice, check the stream's configuration before using it. For each builder setting there is a corresponding function to retrieve the actual stream setting (a short verification sketch follows the list below):
AAudioStreamBuilder_setDeviceId() → AAudioStream_getDeviceId()
AAudioStreamBuilder_setDirection() → AAudioStream_getDirection()
AAudioStreamBuilder_setSharingMode() → AAudioStream_getSharingMode()
AAudioStreamBuilder_setSampleRate() → AAudioStream_getSampleRate()
AAudioStreamBuilder_setChannelCount() → AAudioStream_getChannelCount()
AAudioStreamBuilder_setFormat() → AAudioStream_getFormat()
AAudioStreamBuilder_setBufferCapacityInFrames() → AAudioStream_getBufferCapacityInFrames()
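For example, a minimal verification sketch (requestedSampleRate is an assumed application constant, not part of the AAudio API):

int32_t actualSampleRate = AAudioStream_getSampleRate(stream);
if (actualSampleRate != requestedSampleRate) {
    // Resample in the app, or rebuild the audio pipeline around the actual rate.
}
aaudio_sharing_mode_t actualSharingMode = AAudioStream_getSharingMode(stream);
if (actualSharingMode != AAUDIO_SHARING_MODE_EXCLUSIVE) {
    // The system fell back to shared mode; expect somewhat higher latency.
}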
- You can keep the builder and use it to create more streams in the future. If you no longer need it, delete it:
AAudioStreamBuilder_delete(builder);
4.1.2 AAudio Uses audio streams
Data flows through a stream only when the stream is in the Started state. Request state transitions using the following functions:
aaudio_result_t result;
result = AAudioStream_requestStart(stream);
result = AAudioStream_requestStop(stream);
result = AAudioStream_requestPause(stream);
result = AAudioStream_requestFlush(stream);
4.1.3 AAudio Reads and writes audio streams
After a stream is started, data in the stream can be processed in two ways:
- Use high-priority callbacks.
- Use the functions AAudioStream_read(stream, buffer, numFrames, timeoutNanos) and AAudioStream_write(stream, buffer, numFrames, timeoutNanos) to read from or write to the stream.
For blocking read or write operations that transmit a specified number of frames, set timeoutNanos to greater than zero. For non-blocking calls, set timeoutNanos to zero. In this case, the result will be the actual number of frames transmitted.
When reading input, verify that the correct number of frames was read; otherwise the buffer may contain unknown data and cause audio glitches. You can pad the buffer with zeros to create a silent dropout:
aaudio_result_t result = AAudioStream_read(stream, audioData, numFrames, timeout);
if (result < 0) {
    // Error!
}
if (result != numFrames) {
    // pad the buffer with zeros
    memset(static_cast<sample_type*>(audioData) + result * samplesPerFrame,
           0, sizeof(sample_type) * (numFrames - result) * samplesPerFrame);
}
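The write path is symmetric. A minimal blocking-write sketch (the 100 ms timeout is an arbitrary assumed value):

int64_t timeoutNanos = 100 * 1000000L; // 100 ms
aaudio_result_t written = AAudioStream_write(stream, audioData, numFrames, timeoutNanos);
if (written < 0) {
    // Error, e.g. AAUDIO_ERROR_DISCONNECTED: stop and close the stream.
} else if (written < numFrames) {
    // Timed out before all frames were accepted; write the remaining frames later.
}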
4.1.4 AAudio Closes the audio stream
When you are done with the stream, close it:
AAudioStream_close(stream);
Once the stream is closed, it cannot be used with any stream-based AAudio functions.
4.1.5 AAudio Disconnected audio streams
The audio stream may be disconnected at any time if any of the following events occur:
- The associated audio device is no longer connected (for example, the headphones are unplugged).
- An internal error occurred.
- An audio device is no longer the primary audio device.
When a stream is disconnected, its status is disconnected, and all attempts to execute AAudioStream_write or other functions return an error. Regardless of the error code, you must always stop and close the disconnected stream.
If you use a data callback (rather than a direct read/write method), you will not receive any return code when the stream disconnects. To receive notifications when this happens, write the AAudioStream_errorCallback function and then register it with AAudioStreamBuilder_setErrorCallback().
If you are notified of the disconnect in the error callback thread, the stopping and closing of the stream must be done from another thread; otherwise a deadlock may occur.
Note that if you open a new stream, its Settings may be different from the original stream (for example, framesPerBurst) :
void errorCallback(AAudioStream *stream, void *userData, aaudio_result_t error) {
    // Launch a new thread to handle the disconnect.
    std::thread myThread(my_error_thread_proc, stream, userData);
    myThread.detach(); // Don't wait for the thread to finish.
}
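A minimal sketch of what that worker thread might do (my_error_thread_proc is the hypothetical function referenced above; building and opening a replacement stream is omitted):

void my_error_thread_proc(AAudioStream *stream, void *userData) {
    // Always stop and close the disconnected stream, then optionally
    // create a new builder, open a replacement stream, and restart.
    AAudioStream_requestStop(stream);
    AAudioStream_close(stream);
}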
4.2 State Transition
AAudio streams generally have the following five stable states (the error state is described at the end of this section):
- Open
- Started
- Paused
- Flushed
- Stopped
State-changing functions are asynchronous, so the state does not change immediately. When you request a state change, the stream enters the corresponding transient state:
- Starting
- Pausing
- Flushing
- Stopping
- Closing
The following state diagram shows the stable states as rounded rectangles and the transient states as dashed rectangles. Although not shown, you can call close() from any state.
AAudio does not provide a callback to alert you to state changes. You can use the special function AAudioStream_waitForStateChange(stream, inputState, nextState, timeout) to wait for a state change.
This function itself does not detect state changes and does not wait for a specific state, but for the current state to deviate from your specified inputState.
For example, after a pause is requested, the stream should immediately enter the transient Pausing state and, at some later point, the Paused state, though there is no guarantee it will. Since you cannot wait for the Paused state, use waitForStateChange() to wait for any state other than Pausing. Here is how:
aaudio_stream_state_t inputState = AAUDIO_STREAM_STATE_PAUSING;
aaudio_stream_state_t nextState = AAUDIO_STREAM_STATE_UNINITIALIZED;
int64_t timeoutNanos = 100 * AAUDIO_NANOS_PER_MILLISECOND;
result = AAudioStream_requestPause(stream);
result = AAudioStream_waitForStateChange(stream, inputState, &nextState, timeoutNanos);
If the stream's state is not Pausing (the inputState, which we assume is the current state at the time of the call), the function returns immediately. Otherwise, it blocks until the state is no longer Pausing or the timeout expires. When the function returns, the nextState parameter holds the current state of the stream.
You can use the same technique after calling request start, stop, or flush, using the corresponding transient state as the inputState. Do not call waitForStateChange() after AAudioStream_close(), because the stream is deleted as soon as it is closed, and do not call AAudioStream_close() while another thread is running waitForStateChange().
4.3 Optimizing Performance
We can optimize the performance of audio applications by tweaking the internal buffers and using special high-priority threads.
4.3.1 Adjust buffers to minimize latency
AAudio passes data into and out of an internal buffer it maintains (one for each audio device).
Note: Do not confuse the internal buffer of AAudio with the buffer parameters of the AAudio stream reading or writing functions.
The capacity of a buffer is the total amount of data it can hold. We can call AAudioStreamBuilder_setBufferCapacityInFrames() to set the capacity; this method limits the capacity you can allocate to the maximum the device allows. Use AAudioStream_getBufferCapacityInFrames() to verify the buffer's actual capacity.
The application does not have to use the buffer's full capacity. You can set an upper limit, the buffer size, on how much of the buffer AAudio will fill. The buffer size must not exceed the capacity and is usually smaller. By controlling the buffer size you determine how many bursts are needed to fill it, and thereby control latency. The buffer size is handled with the AAudioStream_setBufferSizeInFrames() and AAudioStream_getBufferSizeInFrames() methods.
When an application plays audio, it writes data into the buffer and blocks until the write completes. AAudio reads data from the buffer in discrete bursts. Each burst contains multiple audio frames and is usually smaller than the buffer size being read. The burst size and rate are controlled by the system and are usually determined by the audio device's circuitry. Although you cannot change the burst size or rate, you can set the internal buffer size in terms of the number of bursts it contains. In general, latency is lowest when the AAudioStream's buffer size is a multiple of the reported burst size.
One way to optimize the buffer size is to start with a large buffer and gradually lower it until underruns begin, then nudge it back up slightly. Alternatively, start with a small buffer size and, if underruns occur, increase it until the output flows cleanly again. This process can happen very quickly, possibly before the user plays the first sound. The initial buffer sizing can be done with silence so that the user hears no audio glitches. System performance may change over time (for example, the user might turn off airplane mode). Because buffer tuning has very little overhead, an application can tune the buffer continuously while reading or writing data.
Here is an example of a buffer optimization loop:
int32_t previousUnderrunCount = 0;
int32_t framesPerBurst = AAudioStream_getFramesPerBurst(stream);
int32_t bufferSize = AAudioStream_getBufferSizeInFrames(stream);
int32_t bufferCapacity = AAudioStream_getBufferCapacityInFrames(stream);

while (go) {
    result = writeSomeData();
    if (result < 0) break;
    // Are we getting underruns?
    if (bufferSize < bufferCapacity) {
        int32_t underrunCount = AAudioStream_getXRunCount(stream);
        if (underrunCount > previousUnderrunCount) {
            previousUnderrunCount = underrunCount;
            // Try increasing the buffer size by one burst
            bufferSize += framesPerBurst;
            bufferSize = AAudioStream_setBufferSizeInFrames(stream, bufferSize);
        }
    }
}
Optimizing buffer sizes in this way is not beneficial for input streams. The input stream runs as fast as possible to try to keep the amount of cached data to a minimum, and then fills the buffer when the application is preempted.
4.3.2 Using high-priority callbacks
If the application reads or writes audio data from an ordinary thread, it may be preempted or experience timing jitter, which can cause audio glitches. Using a larger buffer helps avoid such glitches, but a larger buffer also means longer audio latency. For applications that require low latency, an audio stream can use an asynchronous callback function to transfer data to and from the application. AAudio executes the callback in a higher-priority thread, which gives better performance.
The prototype of this callback function looks like this:
typedef aaudio_data_callback_result_t (*AAudioStream_dataCallback)( AAudioStream *stream, void *userData, void *audioData, int32_t numFrames);
Register the callback using the stream builder:
AAudioStreamBuilder_setDataCallback(builder, myCallback, myUserData);
In the simplest case, the stream periodically executes this callback function to obtain the data for its next burst.
The callback function should not perform read or write operations on the stream calling it. If the callback belongs to an input stream, your code should process the data provided in the audioData buffer (specified as the third parameter). If the callback belongs to an output stream, your code should put data into that buffer.
For example, a callback can be used to continuously generate sinusoidal output, as follows:
aaudio_data_callback_result_t myCallback(
        AAudioStream *stream,
        void *userData,
        void *audioData,
        int32_t numFrames) {
    int64_t timeout = 0;
    // Write samples directly into the audioData array.
    generateSineWave(static_cast<float *>(audioData), numFrames);
    return AAUDIO_CALLBACK_RESULT_CONTINUE;
}
AAudio can handle multiple streams: use one stream as the primary stream and pass pointers to the other streams in the user data. Register a callback for the primary stream, then use non-blocking I/O on the other streams. Below is an example of a round-trip callback that passes an input stream to an output stream; the primary (calling) stream is the output stream, and the input stream is carried in the user data.
This callback performs a non-blocking read from the input stream to put data into the buffer of the output stream:
aaudio_data_callback_result_t myCallback(
        AAudioStream *stream,
        void *userData,
        void *audioData,
        int32_t numFrames) {
    AAudioStream *inputStream = (AAudioStream *) userData;
    int64_t timeout = 0;
    aaudio_result_t result = AAudioStream_read(inputStream, audioData, numFrames, timeout);

    if (result == numFrames)
        return AAUDIO_CALLBACK_RESULT_CONTINUE;

    if (result >= 0) {
        // Not enough input frames: pad the rest of the output buffer with silence.
        memset(static_cast<sample_type*>(audioData) + result * samplesPerFrame,
               0, sizeof(sample_type) * (numFrames - result) * samplesPerFrame);
        return AAUDIO_CALLBACK_RESULT_CONTINUE;
    }
    return AAUDIO_CALLBACK_RESULT_STOP;
}
Note that in this example, the input and output streams are assumed to have the same number of channels, format, and sampling rate. The format of the stream can be mismatched, as long as the code handles the transformation correctly.
4.3.3 Setting the Performance Mode
Each AAudioStream has a performance mode, which has a significant effect on the application's behavior. There are three modes:
- AAUDIO_PERFORMANCE_MODE_NONE: the default mode; it uses a basic stream that balances latency against power savings.
- AAUDIO_PERFORMANCE_MODE_LOW_LATENCY: uses smaller buffers and an optimized data path to reduce latency.
- AAUDIO_PERFORMANCE_MODE_POWER_SAVING: uses larger internal buffers and a data path that trades latency for power savings.
You can select a performance pattern by calling setPerformanceMode() and discover the current pattern by calling getPerformanceMode().
If reducing latency is more important than saving energy in your application, use AAUDIO_PERFORMANCE_MODE_LOW_LATENCY. This is useful for highly interactive applications, such as games or keyboard synthesizers.
If saving energy is more important than reducing latency in your application, use AAUDIO_PERFORMANCE_MODE_POWER_SAVING. This is often the case for applications that play back previously generated music, such as streaming audio or MIDI file players.
In the current version of AAudio, to minimize latency, the AAUDIO_PERFORMANCE_MODE_LOW_LATENCY performance mode must be used in conjunction with high-priority callbacks. See the following example:
// Create a stream builder
AAudioStreamBuilder *streamBuilder;
AAudio_createStreamBuilder(&streamBuilder);
AAudioStreamBuilder_setDataCallback(streamBuilder, dataCallback, nullptr);
AAudioStreamBuilder_setPerformanceMode(streamBuilder, AAUDIO_PERFORMANCE_MODE_LOW_LATENCY);

// Use it to create the stream
AAudioStream *stream;
AAudioStreamBuilder_openStream(streamBuilder, &stream);
Official documentation: developer.android.google.cn/ndk/guides/…
4.4 Oboe
Google has open-sourced the Oboe library, a C++ wrapper that provides an API very similar to AAudio's. It calls AAudio when AAudio is available and falls back to OpenSL ES when it is not.
Taking recording as an example:
class Callback : public oboe::AudioStreamCallback {
public:
    oboe::DataCallbackResult onAudioReady(oboe::AudioStream *oboeStream,
                                          void *audioData, int32_t numFrames) override {
        uint32_t size = 2 * numFrames * channels; // 16-bit samples
        // handle audio data
        return oboe::DataCallbackResult::Continue;
    }
};

oboe::AudioStreamBuilder builder;
builder.setAudioApi(mAudioApi)
        ->setFormat(mFormat)
        ->setSharingMode(oboe::SharingMode::Exclusive)
        ->setPerformanceMode(oboe::PerformanceMode::LowLatency)
        ->setDeviceId(mRecordingDeviceId)
        ->setFramesPerCallback(framesPerBuffer)
        ->setDirection(oboe::Direction::Input)
        ->setDataCallback(new Callback())
        ->setSampleRate(samplerate)
        ->setChannelCount(channels);

oboe::AudioStream *mRecordingStream = nullptr;
oboe::Result result = builder.openStream(&mRecordingStream);
if (result == oboe::Result::OK && mRecordingStream) {
    mRealSampleRate = mRecordingStream->getSampleRate();
    mFormat = mRecordingStream->getFormat();
    result = mRecordingStream->requestStart();
    oboe::StreamState state;
    mRecordingStream->waitForStateChange(oboe::StreamState::Starting, &state,
                                         10 * kNanosPerMillisecond);
    if (result != oboe::Result::OK) {
        Logg("Error starting stream. %s", oboe::convertToText(result));
    }
} else {
    Logg("Failed to create recording stream. Error: %s", oboe::convertToText(result));
    if (mRecordingStream) {
        mRecordingStream->close();
    }
}
The Oboe project provides many samples for reference, which can be viewed directly in the Oboe GitHub repository, such as this drum kit:
www.bilibili.com/video/BV12q…
5. Summary
The Java layer can use AudioRecord/AudioTrack, and at the JNI layer Oboe is recommended. Whichever API is used, it ultimately reaches the hardware through the framework layer and the HAL layer. In the application process there is an "AudioRecord" thread that interacts with the framework's services. How can we record while bypassing the system APIs? That is introduced next.