Overview

Apple's audio session API manages audio behavior between your app, other apps, and external audio hardware. An audio session tells the system how you intend to use audio, acting as an intermediary between your app and the system. This lets you shape the hardware's behavior without talking to the hardware directly.

  • Configure the audio session category and mode to tell the system how your app intends to use audio
  • Activate the audio session to put the configured category and mode into effect
  • Respond to important audio session notifications, such as interruptions and route changes
  • Configure the hardware sample rate and channel count

1. Configure Audio Session

1.1. Audio sessions manage audio behavior

The audio session is the intermediary between your app and the system for configuring audio behavior. When the app launches, it automatically gets a singleton audio session; configure and activate it so audio behaves as you expect.

1.2. Categories express audio roles

The audio session category expresses the primary audio behavior of your app. By setting the category, you indicate whether the app uses the current input or output device, whether other apps' audio is silenced when your app plays, whether it mixes with your audio, and so on.

AVFoundation defines a number of audio session categories, so you can tailor audio behavior to your needs. Several categories support playback, recording, or simultaneous recording and playback, and the system also ensures that audio from other apps behaves in a way that suits your app.

Some categories can be customized further with a mode, which specializes the category's behavior. For example, with the video-recording mode the system may choose a microphone other than the default built-in one, and may tune the microphone's signal processing for video capture.
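
For example, here is a minimal sketch of a category-plus-mode configuration, using the same pre-iOS 12 string-constant API as the rest of this article (the specific category and mode are illustrative):

import AVFoundation

let session = AVAudioSession.sharedInstance()
do {
    // Play-and-record category with the video-recording mode, so the
    // system may pick a microphone suited to capturing video.
    try session.setCategory(AVAudioSessionCategoryPlayAndRecord,
                            mode: AVAudioSessionModeVideoRecording,
                            options: [])
} catch let error as NSError {
    print("Unable to set category and mode: \(error.localizedDescription)")
}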

1.3. Interruption handling

If audio is interrupted unexpectedly, the system deactivates your audio session and audio stops immediately. An interruption occurs when another app's audio session is activated and its category is not mixable with your app's. When your app receives an interruption notification, it should save state, update the user interface, and perform other related actions. Register for AVAudioSessionInterruptionNotification to observe the beginning and end of interruptions.

1.4. Audio route changes

When the user connects or disconnects an audio input or output device (for example, plugging in headphones), the audio route changes. Register for AVAudioSessionRouteChangeNotification to respond when the route changes.

1.5. Audio Sessions control device configuration

Your app cannot control the device's hardware directly, but the audio session provides interfaces for getting and setting high-level audio settings such as the sample rate and channel count.
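
For example, you can read the current hardware values from the shared session (a minimal sketch using standard AVAudioSession properties):

let session = AVAudioSession.sharedInstance()
print("Hardware sample rate: \(session.sampleRate) Hz")
print("Input channels: \(session.inputNumberOfChannels)")
print("Output channels: \(session.outputNumberOfChannels)")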

1.6. Audio Sessions protect user privacy

If the app wants to record audio, it must request the user's authorization; otherwise, recording is unavailable.

2. Activate Audio Session

After setting the audio session's category, options, and mode, activate it to begin using audio.

2.1. How the system resolves audio competition

When your app launches, built-in services (Messages, Music, the browser, the Phone app, and so on) may also be running in the background, and any of them may produce audio at any moment, such as an incoming call or a message alert.

2.2. Activate and deactivate the Audio Session

Although playback and recording APIs in AVFoundation can activate your audio session automatically, you can also activate it manually and check whether activation succeeded.

The system deactivates your audio session when a phone call comes in, an alarm goes off, or a calendar alert fires. After the user dismisses the interruption, the system allows you to manually reactivate the audio session.

let session = AVAudioSession.sharedInstance()
do {
    // 1) Configure your audio session category, options, and mode
    // 2) Activate your audio session to enable your custom configuration
    try session.setActive(true)
} catch let error as NSError {
    print("Unable to activate audio session: \(error.localizedDescription)")}Copy the code

If you use AVFoundation objects (AVPlayer, AVAudioRecorder, and so on), the system takes care of reactivating the audio session when an interruption ends. If you also register for notifications, though, you can verify that reactivation succeeded and update the user interface accordingly.

  • A VoIP app running in the background should keep its audio session active only while handling a call; when no call is in progress, the session should be inactive.
  • An app using a recording category should keep its audio session active only while recording; deactivate the session before recording starts and after it stops so other sounds, such as system sounds, can play.
  • If the app supports background playback or recording, deactivate its audio session on entering the background when the app is not actively using (or preparing to use) audio. Doing so lets the system free audio resources for other processes, as in the sketch below.
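
A minimal deactivation sketch; passing .notifyOthersOnDeactivation asks the system to tell other (previously interrupted) apps that they may resume their audio:

let session = AVAudioSession.sharedInstance()
do {
    // Deactivate and let other apps resume their audio.
    try session.setActive(false, with: .notifyOthersOnDeactivation)
} catch let error as NSError {
    print("Unable to deactivate audio session: \(error.localizedDescription)")
}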

2.3. Check whether other Audio is playing

When your app becomes active, the device may already be playing other audio. If your app is a game, it matters where that audio comes from: many games let the user keep their own music playing to enhance the experience.

When your app is in the foreground, you can check the session's secondaryAudioShouldBeSilencedHint property (for example, in the applicationDidBecomeActive(_:) delegate method) to determine whether another app is playing audio. The value is true when another app's audio session is active with a non-mixable configuration. Your app can use this property to decide whether to silence its own secondary audio.

func setupNotifications() {
    NotificationCenter.default.addObserver(self,
                                           selector: #selector(handleSecondaryAudio),
                                           name: .AVAudioSessionSilenceSecondaryAudioHint,
                                           object: AVAudioSession.sharedInstance())
}
 
func handleSecondaryAudio(notification: Notification) {
    // Determine hint type
    guard let userInfo = notification.userInfo,
        let typeValue = userInfo[AVAudioSessionSilenceSecondaryAudioHintTypeKey] as? UInt,
        let type = AVAudioSessionSilenceSecondaryAudioHintType(rawValue: typeValue) else {
            return
    }
 
    if type == .begin {
        // Other app audio started playing - mute secondary audio
    } else {
        // Other app audio stopped playing - restart secondary audio
    }
}

3. Responding to interruptions

You respond to interruptions in code. An audio interruption deactivates your audio session, and audio in your app stops immediately. An interruption occurs when a competing audio session from another app is activated and that session's category does not mix with your app's. Register for the notification so your app can act when an interruption occurs.

An interruption suspends your app's audio. When the user receives a call, an alarm fires, or another system event triggers, your app continues running after the interruption, but you must manually reactivate the audio session.

3.1. Interruption lifecycle

The following figure shows, in simplified form, how the active and inactive states of your app's audio session and the system's audio session change when a FaceTime call arrives.

3.2. Interruption handling

By registering for interruption notifications, you can handle interruptions as they arrive. How you handle one depends on what you are doing at the time: playing, recording, converting audio formats, reading audio packets, and so on. In general, minimize the audible disruption and recover from the interruption as quickly as possible.

Before the interruption

  • Save state and context
  • Update the user interface

After the interruption

  • Restore state and context
  • Update the user interface
  • Reactivate the audio session

How interruptions work, by audio technology:

  • AVFoundation framework: the system automatically pauses playback and recording when the interruption begins, and reactivates the audio session when the interruption ends so playback and recording can resume.
  • Audio Queue Services, I/O audio unit: an interruption notification is sent; the developer can save playback and recording state and reactivate the audio session when the interruption ends.
  • System Sound Services: sounds are silenced while the interruption is in progress and resume playing automatically when it ends.

3.3. Handling Siri

Siri interruptions differ from others: you need to track what Siri was asked to do during the interruption. For example, suppose the user asks Siri to pause your app's audio playback while the interruption is in progress. When your app receives the interruption-ended notification, it should not automatically resume playback, and the user interface should stay consistent with what Siri did.

3.4. Listening for interruptions

Register for AVAudioSessionInterruptionNotification to observe interruptions.

func registerForNotifications() {
    NotificationCenter.default.addObserver(self,
                                           selector: #selector(handleInterruption),
                                           name: .AVAudioSessionInterruption,
                                           object: AVAudioSession.sharedInstance())
}
 
func handleInterruption(_ notification: Notification) {
    guard let info = notification.userInfo,
        let typeValue = info[AVAudioSessionInterruptionTypeKey] as? UInt,
        let type = AVAudioSessionInterruptionType(rawValue: typeValue) else {
            return
    }
    if type == .began {
        // Interruption began, take appropriate actions (save state, update user interface)
    }
    else if type == .ended {
        guard let optionsValue =
            info[AVAudioSessionInterruptionOptionKey] as? UInt else {
                return
        }
        let options = AVAudioSessionInterruptionOptions(rawValue: optionsValue)
        if options.contains(.shouldResume) {
            // Interruption Ended - playback should resume
        }
    }
}


Note: an interruption-began notification is not guaranteed to be followed by an interruption-ended notification, so when your app resumes playing audio it should always check whether the audio session is active.
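
For example, you might defensively reactivate the session before resuming (a sketch; `player` stands in for a hypothetical AVAudioPlayer owned by your app):

func resumePlayback() {
    let session = AVAudioSession.sharedInstance()
    do {
        // A began notification may never be followed by an ended one,
        // so reactivate explicitly before resuming.
        try session.setActive(true)
        player.play() // `player` is a hypothetical AVAudioPlayer
    } catch let error as NSError {
        print("Unable to reactivate audio session: \(error.localizedDescription)")
    }
}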

3.5. Respond to a media server reset

The media server provides audio and other multimedia functionality through a shared server process. Although it is rare, the media server can reset while your app is running; register for AVAudioSessionMediaServicesWereResetNotification to monitor for resets. After receiving the notification, do the following.

  • Destroy existing audio objects and create new ones (players, recorders, converters, audio queues)
  • Reset all audio state, including all AVAudioSession properties
  • Reactivate the AVAudioSession object when appropriate

Register for AVAudioSessionMediaServicesWereLostNotification to be notified when the media server becomes unavailable.

If your app needs a reset feature, such as a reset option in its settings, these notifications give you a convenient hook for implementing it.
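
A registration sketch in the same style as the other observers in this article (the handler body is an outline, not a full implementation):

func registerForMediaServerNotifications() {
    NotificationCenter.default.addObserver(self,
                                           selector: #selector(handleMediaServicesReset),
                                           name: .AVAudioSessionMediaServicesWereReset,
                                           object: AVAudioSession.sharedInstance())
}
 
func handleMediaServicesReset(_ notification: Notification) {
    // Destroy and rebuild audio objects, reset all audio state,
    // and reactivate the session when appropriate.
}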

4. Route changes

The audio hardware route is the device's current path for audio input and output. When the user plugs in or unplugs headphones, the system changes the hardware route automatically. Register for AVAudioSessionRouteChangeNotification to adjust accordingly when the route changes.

As shown above, the system determines an initial audio route when your app launches, and keeps monitoring the active route while the app runs. During recording, if the user unplugs the headphones, the system posts a route-change notification and stops the audio; your code can then decide whether to reactivate the session.

Playback behaves slightly differently from recording: if the user unplugs the headphones during playback, the audio pauses by default; if the user plugs headphones in, playback continues by default.

4.1. Monitoring audio route changes

Common causes of route changes:

  • Plugging in or unplugging headphones
  • Connecting or disconnecting a Bluetooth headset
  • Plugging in or unplugging a USB audio device

func setupNotifications() {
    NotificationCenter.default.addObserver(self,
                                           selector: #selector(handleRouteChange),
                                           name: .AVAudioSessionRouteChange,
                                           object: AVAudioSession.sharedInstance())
}
 
func handleRouteChange(_ notification: Notification) {
    // Handle route change (full implementation shown below)
}

The notification's userInfo dictionary provides details about the route change. Query the AVAudioSessionRouteChangeReasonKey entry to learn the reason: for example, the reason is AVAudioSessionRouteChangeReason.newDeviceAvailable when a new device is connected, and .oldDeviceUnavailable when a device is removed.

func handleRouteChange(_ notification: Notification) {
    guard let userInfo = notification.userInfo,
        let reasonValue = userInfo[AVAudioSessionRouteChangeReasonKey] as? UInt,
        let reason = AVAudioSessionRouteChangeReason(rawValue:reasonValue) else {
            return
    }
    switch reason {
    case .newDeviceAvailable:
        // Handle new device available.
    case .oldDeviceUnavailable:
        // Handle old device removed.
    default: ()
    }
}


When audio hardware is plugged in, you can query the audio session's currentRoute property to determine where audio is currently routed. It returns an AVAudioSessionRouteDescription object describing all of the session's inputs and outputs. When hardware is removed, you can obtain the previous route from the notification's userInfo dictionary. In both cases, query the outputs property, which returns AVAudioSessionPortDescription objects describing each audio output.

func handleRouteChange(notification: NSNotification) {
    guard let userInfo = notification.userInfo,
        let reasonValue = userInfo[AVAudioSessionRouteChangeReasonKey] as? UInt,
        let reason = AVAudioSessionRouteChangeReason(rawValue:reasonValue) else {
            return
    }
    switch reason {
    case .newDeviceAvailable:
        let session = AVAudioSession.sharedInstance()
        for output in session.currentRoute.outputs where output.portType == AVAudioSessionPortHeadphones {
            headphonesConnected = true
        }
    case .oldDeviceUnavailable:
        if let previousRoute =
            userInfo[AVAudioSessionRouteChangePreviousRouteKey] as? AVAudioSessionRouteDescription {
            for output in previousRoute.outputs where output.portType == AVAudioSessionPortHeadphones {
                headphonesConnected = false
            }
        }
    default: ()
    }
}

5. Configure the hardware

Using audio session properties, you can optimize hardware audio behavior at run time. This lets your code adapt to the characteristics of the device it is running on, and to the user's changes to the audio hardware.

5.1. Set initial audio parameters

Use the audio session to specify audio device settings such as the sample rate and the I/O buffer duration.

Preferred sample rate

  • High value (e.g. 48 kHz): higher audio quality; larger files and buffers
  • Low value (e.g. 8 kHz): smaller files and buffers; lower audio quality

Preferred I/O buffer duration

  • High value (e.g. 500 ms): less frequent file access; longer latency
  • Low value (e.g. 5 ms): lower latency; more frequent file access

Note: the default I/O buffer duration provides sufficient responsiveness for most apps; for 44.1 kHz audio it is approximately 20 ms. You can request a lower latency, but each buffer will then carry correspondingly less data.
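
As a quick sanity check, the buffer size in frames is roughly the sample rate multiplied by the buffer duration:

let sampleRate = 44_100.0    // Hz
let ioBufferDuration = 0.005 // seconds
let framesPerBuffer = sampleRate * ioBufferDuration
print("~\(Int(framesPerBuffer)) frames per buffer") // ~220 frames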

5.2. Apply the settings

Configure the audio session before activating it. If the audio session is already active, deactivate it first, change the settings, and then reactivate it.

let session = AVAudioSession.sharedInstance()
 
// Configure category and mode
do {
    try session.setCategory(AVAudioSessionCategoryRecord, mode: AVAudioSessionModeDefault)
} catch let error as NSError {
    print("Unable to set category: \(error.localizedDescription)")
}
 
// Set preferred sample rate
do {
    try session.setPreferredSampleRate(44_100)
} catch let error as NSError {
    print("Unable to set preferred sample rate: \(error.localizedDescription)")
}
 
// Set preferred I/O buffer duration
do {
    try session.setPreferredIOBufferDuration(0.005)
} catch let error as NSError {
    print("Unable to set preferred I/O buffer duration: \(error.localizedDescription)")
}
 
// Activate the audio session
do {
    try session.setActive(true)
} catch let error as NSError {
    print("Unable to activate session. \(error.localizedDescription)")
}
 
// Query the audio session's ioBufferDuration and sampleRate properties
// to determine if the preferred values were set
print("Audio Session ioBufferDuration: \(session.ioBufferDuration), sampleRate: \(session.sampleRate)")

5.3. Select and configure the microphone

A device may have multiple microphones (built-in and external), and iOS selects one automatically based on the audio session mode in use. The mode specifies the input's digital signal processing (DSP) and the possible routes; input routes are optimized for each mode's use case, and setting the mode may also change the route in use.

Developers can select the microphone manually and, if the hardware supports it, even set its polar pattern.

Before using any audio device, set the audio session category and mode for your application, and then activate the audio session.

  • Set the Preferred Input

To find the audio input ports connected to the device, use the audio session's availableInputs property. It returns an array of AVAudioSessionPortDescription objects describing the currently available input ports; each port is identified by its portType. Use setPreferredInput(_:) to select one of the available inputs.
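
A sketch that lists the available inputs and prefers a wired headset microphone when one is present (AVAudioSessionPortHeadsetMic is the standard port-type constant for a wired headset mic):

let session = AVAudioSession.sharedInstance()
if let inputs = session.availableInputs {
    for port in inputs {
        print("Available input: \(port.portName) (\(port.portType))")
    }
    if let headsetMic = inputs.first(where: { $0.portType == AVAudioSessionPortHeadsetMic }) {
        do {
            try session.setPreferredInput(headsetMic)
        } catch let error as NSError {
            print("Unable to set preferred input: \(error.localizedDescription)")
        }
    }
}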

  • Set the Preferred Data Source

Some ports, such as the built-in microphone and USB, support data sources. Apps can find the available data sources by querying the port's dataSources property. For the built-in microphone, each returned data source description represents an individual microphone, and different devices return different values. For example, the iPhone 4 and iPhone 4S have two microphones (bottom and top), while the iPhone 5 has three (bottom, front, and back).

Each built-in microphone is identified by the combination of the data source's location property (upper, lower) and orientation property (front, back, and so on). An app can set the preferred data source with the AVAudioSessionPortDescription's setPreferredDataSource(_:) method.

  • Set the Preferred Polar Pattern

Some iOS devices support configuring the polar pattern of certain built-in microphones. A microphone's polar pattern defines its sensitivity to sound relative to the direction of the source. Query the data source's supportedPolarPatterns property to learn which patterns it supports: it returns an array of polar patterns (such as cardioid or omnidirectional), or nil if no alternative patterns are available. If the data source supports several polar patterns, set the preferred one with the data source description's setPreferredPolarPattern(_:) method.

  • Select a specific microphone and set the Polar Pattern.
// Preferred Mic = Front, Preferred Polar Pattern = Cardioid
let preferredMicOrientation = AVAudioSessionOrientationFront
let preferredPolarPattern = AVAudioSessionPolarPatternCardioid
 
// Retrieve your configured and activated audio session
let session = AVAudioSession.sharedInstance()
 
// Get available inputs
guard let inputs = session.availableInputs else { return }
 
// Find built-in mic
guard let builtInMic = inputs.first(where: {
    $0.portType == AVAudioSessionPortBuiltInMic
}) else { return }
 
// Find the data source at the specified orientation
guard let dataSource = builtInMic.dataSources?.first(where: {
    $0.orientation == preferredMicOrientation
}) else { return }
 
// Set the data source's polar pattern
do {
    try dataSource.setPreferredPolarPattern(preferredPolarPattern)
} catch let error as NSError {
    print("Unable to set preferred polar pattern: \(error.localizedDescription)")
}
 
// Set the data source as the input's preferred data source
do {
    try builtInMic.setPreferredDataSource(dataSource)
} catch let error as NSError {
    print("Unable to preferred dataSource: \(error.localizedDescription)")
}
 
// Set the built-in mic as the preferred input
// This call will be a no-op if already selected
do {
    try session.setPreferredInput(builtInMic)
} catch let error as NSError {
    print("Unable to preferred input: \(error.localizedDescription)")
}
 
// Print Active Configuration
session.currentRoute.inputs.forEach { portDesc in
    print("Port: \(portDesc.portType)")
    if let ds = portDesc.selectedDataSource {
        print("Name: \(ds.dataSourceName)")
        print("Polar Pattern: \(ds.selectedPolarPattern ?? "[none]")")
    }
}
Running this code on an iPhone 6s produces the following console output:

Port: MicrophoneBuiltIn
Name: Front
Polar Pattern: Cardioid


5.4. Running in the Simulator

You can run your app in the Simulator or on a device. However, the Simulator does not simulate most interactions between audio sessions in different processes, or audio route changes. When running in the Simulator, you cannot:

  • Trigger an interruption
  • Simulate plugging in or unplugging headphones
  • Change the silent-switch setting
  • Simulate locking the screen
  • Test audio mixing behavior, i.e. playing your audio alongside audio from another app, such as Music
#if arch(i386) || arch(x86_64)
    // Execute subset of code that works in the Simulator
#else
    // Execute device-only code as well as the other code
#endif

6. Protect user privacy

To protect user privacy, an app must ask for and obtain the user's permission before recording audio. If the user does not grant permission, only silence is recorded. When your app uses a category that supports recording and it attempts to use an input route, the system automatically prompts the user for permission.

Instead of waiting for the system prompt, you can request permission yourself with the requestRecordPermission(_:) method. This lets your app obtain permission without interrupting its natural flow, resulting in a better user experience.

AVAudioSession.sharedInstance().requestRecordPermission { granted in
    if granted {
        // User granted access. Present recording interface.
    } else {
        // Present message to user indicating that recording
        // can't be performed until they change their preference
        // under Settings -> Privacy -> Microphone
    }
}

Starting with iOS 10, every app that accesses the microphone must statically declare its intent. The app must include the NSMicrophoneUsageDescription key in its Info.plist file and provide a purpose string for that key. The system displays this string as part of the alert when it prompts the user for access. If the app attempts to access a microphone without this key and value, the app terminates.
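
The corresponding Info.plist entry looks like this (the purpose string below is only an example; write one that describes your app's actual use):

<key>NSMicrophoneUsageDescription</key>
<string>This app uses the microphone to record your voice notes.</string>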


Apple Official Documents