# iOS Audio and Video Communication Based on WebRTC: A Summary (2020 Update)

Attached is my Swift project, which includes a complete Swift application framework: the network request framework, DSBridge native/H5 interaction, reflection techniques, encapsulation and use of the WCDB database, a WebRTC audio and video live demo, and socket protocol encapsulation and usage. The Swift project will continue to be updated and improved in 2020.

The company wanted to use WebRTC for audio and video communication. After consulting many blogs and demos at home and abroad, I summarize my experience here. See also the WebRTC official website's instructions for using WebRTC on iOS.

A complete WebRTC setup is divided into two major parts: the server side and the client side.

  • Server side (a configuration sketch follows this list):
    • STUN server: obtains the device's public (external) network address.
    • TURN server: relays the communication when the point-to-point connection fails.
    • Signaling server: responsible for end-to-end connection setup. At the beginning of a connection, the two ends must exchange signaling, such as the SDP and candidates, through the signaling server.
  • Client side has four platforms: Android, iOS, PC, Browser.
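To make the server roles concrete, here is a minimal sketch of how the STUN/TURN addresses end up in the client's RTCConfiguration. The URLs and credentials below are placeholders, not servers from my demo:

```swift
import WebRTC

// Placeholder endpoints -- substitute your own servers.
let config = RTCConfiguration()
config.iceServers = [
    // STUN: only used to discover the device's public address
    RTCIceServer(urlStrings: ["stun:stun.example.com:3478"]),
    // TURN: relays media when a direct P2P path cannot be established
    RTCIceServer(urlStrings: ["turn:turn.example.com:3478"],
                 username: "user",
                 credential: "pass")
]
```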

## WebRTC's three main APIs and the process for implementing a point-to-point connection

  1. MediaStream: the MediaStream API provides synchronized streams of video and audio from the device's camera and microphone.
  2. RTCPeerConnection: WebRTC's component for building a stable, efficient stream between two peers.
  3. RTCDataChannel: RTCDataChannel lets the two peers establish a high-throughput, low-latency channel for transmitting arbitrary data. Among the three, RTCPeerConnection is the core component of WebRTC.
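To make the three concrete, here is a compressed preview of how each one is created. This is simplified from the WebRTCClient code later in this post:

```swift
import WebRTC

let factory = RTCPeerConnectionFactory()
let constraints = RTCMediaConstraints(mandatoryConstraints: nil, optionalConstraints: nil)

// RTCPeerConnection: the core connection object
let peerConnection = factory.peerConnection(with: RTCConfiguration(), constraints: constraints, delegate: nil)

// Media tracks captured from the microphone/camera, added under a stream id
let audioTrack = factory.audioTrack(with: factory.audioSource(with: constraints), trackId: "audio0")
peerConnection.add(audioTrack, streamIds: ["stream"])

// RTCDataChannel: arbitrary data over the same connection
let dataChannel = peerConnection.dataChannel(forLabel: "WebRTCData",
                                             configuration: RTCDataChannelConfiguration())
```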

## WebRTC connection flow chart

The main flow of a WebRTC connection is shown in the figure above. The specific process is as follows:

  1. The client establishes a persistent TCP connection with the server through a socket. WebRTC does not provide an API for this part, so we can use a third-party framework. For Objective-C, I suggest CocoaAsyncSocket (github.com/robbiehanso…); for WebSocket, Starscream (github.com/daltoniam/S).

  2. The client sends an offer SDP handshake through the signaling server.

SDP (Session Description Protocol): the SDP is created by the PeerConnection in the WebRTC framework. For details, please refer to my demo.

  3. The client uses the signaling server to initiate the candidate handshake.

Candidate: mainly contains the IP information of the parties involved, including the LAN IP, public IP, TURN server IP, STUN server IP, etc. Candidates are created through the PeerConnection in the WebRTC framework. For details, please refer to my demo.

  4. After the client completes both the SDP and candidate handshakes, a P2P end-to-end link is established, and the media stream can be transmitted directly without going through the server. (A sketch of the signaling message envelope follows this list.)
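The demo wraps every signaling message in a small envelope that the signaling server routes from source to destination. Below is a hedged sketch of what such models might look like; the type and field names mirror how SocketInfo and SDPSocket are used in the SocketClient code later in this post, but the exact definitions are my assumption:

```swift
import Foundation

// Assumed message kinds on the signaling channel
enum SigType: String, Codable {
    case sdp
    case icecandidate
}

// Envelope the signaling server routes from `source` to `destination`
struct SocketInfo: Codable {
    let type: SigType
    let source: Int                 // local peer_id
    let destination: Int            // remote peer_id
    let params: [String: String]    // payload, e.g. ["sdp": "..."]
}

// SDP payload: the session description plus "offer" / "answer"
struct SDPSocket: Codable {
    let sdp: String
    let type: String
}
```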

The SDP handshake process is similar to the candidate handshake process, only a little more complicated. The following is a brief description of the SDP handshake:

The following figure shows how WebRTC establishes the SDP handshake through signaling. Only through the SDP handshake do the two parties learn each other's information, which is the basis for establishing the P2P channel.

  1. The anchor generates its SDP description through createOffer.
  2. The anchor sets the local description through setLocalDescription.
  3. The anchor sends the offer SDP to the user through the signaling server.
  4. The user sets the remote description through setRemoteDescription.
  5. The user creates his own SDP description with createAnswer.
  6. The user sets the local description with setLocalDescription.
  7. The user sends the answer SDP to the anchor.
  8. The anchor sets the remote description through setRemoteDescription.
  9. After the SDP handshake, the two ends have established an end-to-end direct communication channel.
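Mapped onto the iOS API, the two sides' calls look roughly like this. This is a condensed sketch; the full versions live in the WebRTCClient class below, and `signaling` stands in for the SocketClient:

```swift
import WebRTC

// Anchor (caller) side -- steps 1-3
func makeOffer(_ pc: RTCPeerConnection, _ constraints: RTCMediaConstraints, _ signaling: SocketClient) {
    pc.offer(for: constraints) { sdp, _ in
        guard let sdp = sdp else { return }
        pc.setLocalDescription(sdp) { _ in
            signaling.send(sdp: sdp)    // deliver the offer through the signaling server
        }
    }
}

// User (callee) side -- steps 4-7, once the remote offer arrives
func makeAnswer(_ pc: RTCPeerConnection, _ remoteOffer: RTCSessionDescription,
                _ constraints: RTCMediaConstraints, _ signaling: SocketClient) {
    pc.setRemoteDescription(remoteOffer) { _ in
        pc.answer(for: constraints) { sdp, _ in
            guard let sdp = sdp else { return }
            pc.setLocalDescription(sdp) { _ in
                signaling.send(sdp: sdp)    // deliver the answer through the signaling server
            }
        }
    }
}
// The anchor finishes with setRemoteDescription(answer) -- step 8.
```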

Because of the complexity of real-world network environments, users may be inside a private intranet. P2P transmission then runs into obstacles such as NAT and firewalls, so during the SDP handshake we need STUN/TURN/ICE NAT traversal techniques to ensure the P2P link can be established.
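You can tell which path ICE actually found by inspecting the `typ` field of each gathered candidate: `host` (LAN), `srflx` (via STUN), or `relay` (via TURN). A small illustrative helper, not part of the demo:

```swift
import WebRTC

// Illustrative only: extract the candidate type from the candidate SDP string,
// e.g. "candidate:... 192.168.1.10 54321 typ srflx raddr ..."
func candidatePathType(_ candidate: RTCIceCandidate) -> String {
    let parts = candidate.sdp.components(separatedBy: " ")
    if let i = parts.firstIndex(of: "typ"), i + 1 < parts.count {
        return parts[i + 1]    // "host", "srflx", "prflx" or "relay"
    }
    return "unknown"
}
```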

## 1. Establish a persistent socket connection to pave the way for signaling communication

To establish the persistent connection with the server, we choose a raw socket connection and use CocoaAsyncSocket as the third-party framework. In fact, WebSocket could also be used; it depends on your team's choice of scheme.

  • We use CocoaAsyncSocket to create the socket connection that carries the WebRTC signaling. When exchanging data with the server, note that you may need a packet splitting and reassembly strategy, since TCP is a stream protocol and messages can stick together or be split. (A framing sketch follows this list.)
  • Most of the code available online is Objective-C, and much of it is outdated and scattered. The OC version is relatively simple, so below I share a Swift version. Before reading the code, please be sure to review the two logical sequence diagrams mentioned above.
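The packData/unpackData calls in the SocketClient below are where that packet-splitting strategy lives. I assume a conventional 4-byte length prefix here; this is a minimal sketch, not the demo's exact implementation:

```swift
import Foundation

// Prepend a 4-byte big-endian length so the receiver can split the TCP stream
func frame(_ payload: Data) -> Data {
    var length = UInt32(payload.count).bigEndian
    var packet = Data(bytes: &length, count: 4)
    packet.append(payload)
    return packet
}

// Pull complete messages out of the receive buffer; incomplete bytes stay buffered
func unframe(buffer: inout Data) -> [Data] {
    var messages: [Data] = []
    while buffer.count >= 4 {
        let len = [UInt8](buffer.prefix(4))
        let length = Int(UInt32(len[0]) << 24 | UInt32(len[1]) << 16 | UInt32(len[2]) << 8 | UInt32(len[3]))
        guard buffer.count >= 4 + length else { break }    // wait for the rest of the packet
        messages.append(buffer.subdata(in: 4..<(4 + length)))
        buffer.removeSubrange(0..<(4 + length))
    }
    return messages
}
```

With framing out of the way, here is the Swift SocketClient: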
```swift
import Foundation
import CocoaAsyncSocket
import WebRTC

// Helper types (SocketInfo, SDPSocket, IceCandidate_Socket) and the pack/unpack,
// login and heartbeat methods live elsewhere in the demo project.

// MARK: - SocketClient delegate
protocol SocketClientDelegate: class {
    func signalClientDidConnect(_ signalClient: SocketClient)
    func signalClientDidDisconnect(_ signalClient: SocketClient)
    func signalClient(_ signalClient: SocketClient, didReceiveRemoteSdp sdp: RTCSessionDescription)
    func signalClient(_ signalClient: SocketClient, didReceiveCandidate candidate: RTCIceCandidate)
}

final class SocketClient: NSObject {
    // Socket
    var socket: GCDAsyncSocket = {
        return GCDAsyncSocket.init()
    }()
    private var host: String?                    // Host
    private var port: UInt16?                    // Port
    weak var delegate: SocketClientDelegate?     // Delegate
    var receiveHeartBeatDuation = 0              // Heartbeat timer count
    let heartBeatOverTime = 10                   // Heartbeat timeout
    var sendHeartbeatTimer: Timer?               // Timer for sending heartbeats
    var receiveHeartbearTimer: Timer?            // Timer for receiving heartbeats
    var dataBuffer: Data = Data.init()           // Buffer for received data
    var peer_id = 0                              // Local peer_id
    var remote_peer_id = 0                       // Remote device peer_id

    // MARK: - Initialization
    init(hostStr: String, port: UInt16) {
        super.init()
        self.socket.delegate = self
        self.socket.delegateQueue = DispatchQueue.main
        self.host = hostStr
        self.port = port
        // Start the socket connection
        connect()
    }

    // MARK: - Start the connection
    func connect() {
        do {
            try self.socket.connect(toHost: self.host ?? "", onPort: self.port ?? 6868, withTimeout: -1)
        } catch {
            print(error)
        }
    }

    // MARK: - Write raw data to the socket
    func sendMessage(_ data: Data) {
        self.socket.write(data, withTimeout: -1, tag: 0)
    }

    // MARK: - Send an SDP offer/answer
    func send(sdp rtcSdp: RTCSessionDescription) {
        // Convert to our own SDP model
        let type = rtcSdp.type
        var typeStr = ""
        switch type {
        case .answer:
            typeStr = "answer"
        case .offer:
            typeStr = "offer"
        default:
            print("SdpType error")
        }
        let newSDP: SDPSocket = SDPSocket.init(sdp: rtcSdp.sdp, type: typeStr)
        let jsonInfo = newSDP.toJSON()
        let dic = ["sdp": jsonInfo]
        let info: SocketInfo = SocketInfo.init(type: .sdp, source: self.peer_id, destination: self.remote_peer_id, params: dic as Dictionary<String, Any>)
        let data = self.packData(info: info)
        //print(data)
        self.sendMessage(data)
        print("Send the SDP")
    }

    // MARK: - Send an ICE candidate
    func send(candidate rtcIceCandidate: RTCIceCandidate) {
        let iceCandidateMessage = IceCandidate_Socket(from: rtcIceCandidate)
        let jsonInfo = iceCandidateMessage.toJSON()
        let dic = ["icecandidate": jsonInfo]
        let info: SocketInfo = SocketInfo.init(type: .icecandidate, source: self.peer_id, destination: self.remote_peer_id, params: dic as Dictionary<String, Any>)
        let data = self.packData(info: info)
        //print(data)
        self.sendMessage(data)
        print("Send ICE")
    }
}

extension SocketClient: GCDAsyncSocketDelegate {
    // MARK: - Socket connected successfully
    func socket(_ sock: GCDAsyncSocket, didConnectToHost host: String, port: UInt16) {
        debugPrint("Socket connection successful")
        self.delegate?.signalClientDidConnect(self)
        // Log in to obtain the peer_id
        login()
        // Start sending heartbeats
        startHeartbeatTimer()
        // Start the timer that watches for incoming heartbeats
        startReceiveHeartbeatTimer()
        // Keep receiving data
        sock.readData(withTimeout: -1, tag: 0)
    }

    // MARK: - Received data
    func socket(_ sock: GCDAsyncSocket, didRead data: Data, withTag tag: Int) {
        //debugPrint("Socket received a packet")
        let _: SocketInfo? = self.unpackData(data)
        //let type: SigType = SigType(rawValue: socketInfo?.type ?? "")!
        //print(socketInfo ?? "")
        //print(type)
        sock.readData(withTimeout: -1, tag: 0)
    }

    // MARK: - Socket disconnected
    func socketDidDisconnect(_ sock: GCDAsyncSocket, withError err: Error?) {
        debugPrint("Socket disconnected")
        print(err ?? "")

        self.disconnectSocket()

        // Try to reconnect after five seconds
        DispatchQueue.global().asyncAfter(deadline: .now() + 5) {
            debugPrint("Trying to reconnect to signaling server...")
            self.connect()
        }
    }
}
```

## 2. Carry out signaling communication and establish the end-to-end connection

```swift
import Foundation
import WebRTC

// MARK: - WebRTC connection state delegate
protocol WebRTCClientDelegate: class {
    func webRTCClient(_ client: WebRTCClient, didDiscoverLocalCandidate candidate: RTCIceCandidate)
    func webRTCClient(_ client: WebRTCClient, didChangeConnectionState state: RTCIceConnectionState)
    func webRTCClient(_ client: WebRTCClient, didReceiveData data: Data)
}

final class WebRTCClient: NSObject {
    // MARK: - Lazily loaded peer connection factory
    private static let factory: RTCPeerConnectionFactory = {
        RTCInitializeSSL()
        let videoEncoderFactory = RTCVideoEncoderFactoryH264()
        let videoDecoderFactory = RTCVideoDecoderFactoryH264()
        let factory = RTCPeerConnectionFactory(encoderFactory: videoEncoderFactory, decoderFactory: videoDecoderFactory)

//        let options = RTCPeerConnectionFactoryOptions()
//        options.ignoreVPNNetworkAdapter = true
//        options.ignoreWiFiNetworkAdapter = true
//        options.ignoreCellularNetworkAdapter = true
//        options.ignoreEthernetNetworkAdapter = true
//        options.ignoreLoopbackNetworkAdapter = true
//        factory.setOptions(options)
        return factory
    }()
    
    weak var delegate: WebRTCClientDelegate?
    
    private let peerConnection: RTCPeerConnection
    private let rtcAudioSession =  RTCAudioSession.sharedInstance()
    private let audioQueue = DispatchQueue(label: "audio")
    private let mediaConstrains = [kRTCMediaConstraintsOfferToReceiveAudio: kRTCMediaConstraintsValueTrue,
                                   kRTCMediaConstraintsOfferToReceiveVideo: kRTCMediaConstraintsValueTrue]    
    private var videoCapturer: RTCVideoCapturer?
    private var localVideoTrack: RTCVideoTrack?
    private var remoteVideoTrack: RTCVideoTrack?
    private var localDataChannel: RTCDataChannel?
    private var remoteDataChannel: RTCDataChannel?

    @available(*, unavailable)
    override init() {
        fatalError("WebRTCClient:init is unavailable")
    }
    
    required init(iceServers: [String]) {

//        // gatherContinually will let WebRTC listen to any network changes and send any new candidates to the other client
//        config.continualGatheringPolicy = .gatherContinually
//        config.iceTransportPolicy = .all
//        // Constraints control the MediaStream content (media type, resolution, frame rate)
//        let constraints = RTCMediaConstraints(mandatoryConstraints: nil,
//                                              optionalConstraints: ["DtlsSrtpKeyAgreement": kRTCMediaConstraintsValueTrue])

        let config = RTCConfiguration()
        config.iceServers = [RTCIceServer(urlStrings: iceServers)]
        // Unified Plan is preferred over Plan B
        config.sdpSemantics = .unifiedPlan
        // Constraints: control the MediaStream content (media type, resolution, frame rate)
        let mediaConstraints = RTCMediaConstraints.init(mandatoryConstraints: nil, optionalConstraints: nil)
        self.peerConnection = WebRTCClient.factory.peerConnection(with: config, constraints: mediaConstraints, delegate: nil)
        super.init()
        self.createMediaSenders()
        self.configureAudioSession()
        self.peerConnection.delegate = self
    }

    // MARK: - Hang up
    func disconnect() {
        self.peerConnection.close()
    }

    // MARK: - Signaling: create the local offer SDP to send to the socket server
    func offer(completion: @escaping (_ sdp: RTCSessionDescription) -> Void) {
        let constrains = RTCMediaConstraints(mandatoryConstraints: self.mediaConstrains,
                                             optionalConstraints: nil)
        self.peerConnection.offer(for: constrains) { (sdp, error) in
            guard let sdp = sdp else {
                return
            }
            
            self.peerConnection.setLocalDescription(sdp, completionHandler: { (error) in
                completion(sdp)
            })
        }
    }

    // MARK: - Reply to the socket server with an SDP answer
    func answer(completion: @escaping (_ sdp: RTCSessionDescription) -> Void) {
        let constrains = RTCMediaConstraints(mandatoryConstraints: self.mediaConstrains,
                                             optionalConstraints: nil)
        // Get the local SDP
        self.peerConnection.answer(for: constrains) { (sdp, error) in
            guard let sdp = sdp else {
                return
            }
            // Set the local SDP
            self.peerConnection.setLocalDescription(sdp, completionHandler: { (error) in
                completion(sdp)
            })
        }
    }

    // MARK: - Set the remote SDP
    func set(remoteSdp: RTCSessionDescription, completion: @escaping (Error?) -> ()) {
        self.peerConnection.setRemoteDescription(remoteSdp, completionHandler: completion)
    }

    // MARK: - Add a remote candidate
    func set(remoteCandidate: RTCIceCandidate) {
        self.peerConnection.add(remoteCandidate)
    }
    
    // MARK: Media
    func startCaptureLocalVideo(renderer: RTCVideoRenderer) {
        guard let capturer = self.videoCapturer as? RTCCameraVideoCapturer else {
            return
        }
        guard
            // Get the front camera
            let frontCamera = (RTCCameraVideoCapturer.captureDevices().first { $0.position == .front }),
            // choose highest res
            let format = (RTCCameraVideoCapturer.supportedFormats(for: frontCamera).sorted { (f1, f2) -> Bool in
                let width1 = CMVideoFormatDescriptionGetDimensions(f1.formatDescription).width
                let width2 = CMVideoFormatDescriptionGetDimensions(f2.formatDescription).width
                return width1 < width2
            }).last,
        
            // choose highest fps
            let fps = (format.videoSupportedFrameRateRanges.sorted { return $0.maxFrameRate < $1.maxFrameRate }.last) else {
            return
        }
        capturer.startCapture(with: frontCamera, format: format, fps: Int(fps.maxFrameRate))
        self.localVideoTrack?.add(renderer)
    }

    func renderRemoteVideo(to renderer: RTCVideoRenderer) {
        self.remoteVideoTrack?.add(renderer)
    }

    private func configureAudioSession() {
        self.rtcAudioSession.lockForConfiguration()
        do {
            try self.rtcAudioSession.setCategory(AVAudioSession.Category.playAndRecord.rawValue)
            try self.rtcAudioSession.setMode(AVAudioSession.Mode.voiceChat.rawValue)
        } catch let error {
            debugPrint("Error changeing AVAudioSession category: \(error)")} self. RtcAudioSession. UnlockForConfiguration ()} / / MARK: - create media stream private funccreateMediaSenders() {
        let streamId = "stream"
        
        // Audio
        let audioTrack = self.createAudioTrack()
        self.peerConnection.add(audioTrack, streamIds: [streamId])
        
        // Video
        let videoTrack = self.createVideoTrack()
        self.localVideoTrack = videoTrack
        self.peerConnection.add(videoTrack, streamIds: [streamId])
        self.remoteVideoTrack = self.peerConnection.transceivers.first { $0.mediaType == .video }?.receiver.track as? RTCVideoTrack
        //self.remoteVideoTrack?.source.adaptOutputFormat(toWidth: 960, height: 480, fps: 30)

        // Data
        if let dataChannel = createDataChannel() {
            dataChannel.delegate = self
            self.localDataChannel = dataChannel
        }
    }

    // MARK: - Create the audio track
    private func createAudioTrack() -> RTCAudioTrack {
        let audioConstrains = RTCMediaConstraints(mandatoryConstraints: nil, optionalConstraints: nil)
        let audioSource = WebRTCClient.factory.audioSource(with: audioConstrains)
        let audioTrack = WebRTCClient.factory.audioTrack(with: audioSource, trackId: "audio0")
        return audioTrack
    }

    // MARK: - Create the video track
    private func createVideoTrack() -> RTCVideoTrack {
        let videoSource = WebRTCClient.factory.videoSource()
        
        #if TARGET_OS_SIMULATOR
        self.videoCapturer = RTCFileVideoCapturer(delegate: videoSource)
        #else
        self.videoCapturer = RTCCameraVideoCapturer(delegate: videoSource)
        #endif
        
        let videoTrack = WebRTCClient.factory.videoTrack(with: videoSource, trackId: "video0")
        return videoTrack
    }

    // MARK: - Data channel
    private func createDataChannel() -> RTCDataChannel? {
        let config = RTCDataChannelConfiguration()
        guard let dataChannel = self.peerConnection.dataChannel(forLabel: "WebRTCData", configuration: config) else {
            debugPrint("Warning: Couldn't create data channel.")
            return nil
        }
        return dataChannel
    }

    // MARK: - Send data over the data channel
    func sendData(_ data: Data) {
        let buffer = RTCDataBuffer(data: data, isBinary: true)
        self.remoteDataChannel?.sendData(buffer)
    }
}

// MARK: - Audio control
extension WebRTCClient {
    func muteAudio() {
        self.setAudioEnabled(false)
    }
    
    func unmuteAudio() {
        self.setAudioEnabled(true)
    }
    
    // Fallback to the default playing device: headphones/bluetooth/ear speaker
    func speakerOff() {
        self.audioQueue.async { [weak self] in
            guard let self = self else {
                return
            }
            
            self.rtcAudioSession.lockForConfiguration()
            do {
                try self.rtcAudioSession.setCategory(AVAudioSession.Category.playAndRecord.rawValue)
                try self.rtcAudioSession.overrideOutputAudioPort(.none)
            } catch let error {
                debugPrint("Error setting AVAudioSession category: \(error)")
            }
            self.rtcAudioSession.unlockForConfiguration()
        }
    }
    
    // Force speaker
    func speakerOn() {
        self.audioQueue.async { [weak self] in
            guard let self = self else {
                return
            }
            
            self.rtcAudioSession.lockForConfiguration()
            do {
                try self.rtcAudioSession.setCategory(AVAudioSession.Category.playAndRecord.rawValue)
                try self.rtcAudioSession.overrideOutputAudioPort(.speaker)
                try self.rtcAudioSession.setActive(true)
            } catch let error {
                debugPrint("Couldn't force audio to speaker: \(error)")
            }
            self.rtcAudioSession.unlockForConfiguration()
        }
    }
    
    private func setAudioEnabled(_ isEnabled: Bool) {
        let audioTracks = self.peerConnection.transceivers.compactMap { return $0.sender.track as? RTCAudioTrack }
        audioTracks.forEach { $0.isEnabled = isEnabled }
    }
}

extension WebRTCClient: RTCDataChannelDelegate {
    func dataChannelDidChangeState(_ dataChannel: RTCDataChannel) {
        debugPrint("dataChannel did change state: \(dataChannel.readyState)")
    }
    
    func dataChannel(_ dataChannel: RTCDataChannel, didReceiveMessageWith buffer: RTCDataBuffer) {
        self.delegate?.webRTCClient(self, didReceiveData: buffer.data)
    }
}

```
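With the client in place, wiring video into the UI is a matter of handing it renderers. A short usage sketch (illustrative only; RTCMTLVideoView is the Metal-backed renderer shipped with the WebRTC pod):

```swift
import UIKit
import WebRTC

// Illustrative only: attach renderers inside a view controller once webRTCClient is set up
func attachVideo(to view: UIView, using webRTCClient: WebRTCClient) {
    let localRenderer = RTCMTLVideoView(frame: view.bounds)     // our camera preview
    let remoteRenderer = RTCMTLVideoView(frame: view.bounds)    // the other peer's video

    localRenderer.videoContentMode = .scaleAspectFill
    remoteRenderer.videoContentMode = .scaleAspectFill

    view.addSubview(remoteRenderer)
    view.addSubview(localRenderer)

    webRTCClient.startCaptureLocalVideo(renderer: localRenderer)
    webRTCClient.renderRemoteVideo(to: remoteRenderer)
}
```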

## 3. Encapsulate and manage the WebRTC module

```swift
import Foundation
import AVFoundation
import WebRTC

// MARK: - RTC connection state
public enum RtcConnectedState {
    case sucessed     // Connected successfully
    case falure       // Connection failed
    case disconnect   // Disconnected
}

protocol WebRTCManagerDelegate: class {
    // Socket connection state
    func webRTCManager(_ manager: WebRTCManager, socketConnectState isSucessed: Bool)
    // WebRTC connection state
    func webRTCManager(_ manager: WebRTCManager, didChangeConnectionState state: RTCIceConnectionState)
}

class WebRTCManager {
    static let shareInstance: WebRTCManager = WebRTCManager()

    //private let signalClient: SignalingClient
    var signalClient: SocketClient?
    var webRTCClient: WebRTCClient?
    var sockitConfig: SocketConfig = SocketConfig.default   // Socket configuration
    weak var delegate: WebRTCManagerDelegate?               // Delegate
    var remoteCandidate: Int = 0                            // Number of remote candidates received
    var feedbackConnectedBlock: ((_ webClient: WebRTCClient) -> ())?

    // MARK: - Disconnect the socket connection
    public func disconnect() {
        self.signalClient?.disconnectSocket()
        self.webRTCClient?.disconnect()
        self.signalClient?.delegate = nil
        self.webRTCClient?.delegate = nil
        self.signalClient = nil
        self.webRTCClient = nil
        remoteCandidate = 0
    }

    // MARK: - Start the socket connection
    public func connect() {
        // Print the RTC log
        //RTCSetMinDebugLogLevel(.verbose)
        signalClient = SocketClient.init(hostStr: sockitConfig.host, port: sockitConfig.port)
        webRTCClient = WebRTCClient(iceServers: sockitConfig.webRTCIceServers)
        webRTCClient?.delegate = self
        signalClient?.delegate = self
        self.signalClient?.connect()
    }
}

extension WebRTCManager: SocketClientDelegate {
    // Socket login succeeded
    func signalClientdidLogin(_ signalClient: SocketClient) {
        logger.info("******** socket login successful ********")
    }

    // MARK: - Socket connected
    func signalClientDidConnect(_ signalClient: SocketClient) {
        self.delegate?.webRTCManager(self, socketConnectState: true)
    }

    // MARK: - Socket disconnected
    func signalClientDidDisconnect(_ signalClient: SocketClient) {
        self.delegate?.webRTCManager(self, socketConnectState: false)
    }

    // MARK: - Received the remote SDP
    func signalClient(_ signalClient: SocketClient, didReceiveRemoteSdp sdp: RTCSessionDescription) {
        logger.info("******** received SDP ********")
        // Set the remote SDP, then answer with our local SDP
        self.webRTCClient?.set(remoteSdp: sdp) { (error) in
            self.webRTCClient?.answer { (localSdp) in
                self.signalClient?.send(sdp: localSdp)
            }
            logger.error(error.debugDescription)
        }
    }

    // MARK: - Received a remote ICE candidate
    func signalClient(_ signalClient: SocketClient, didReceiveCandidate candidate: RTCIceCandidate) {
        logger.info("******** received ice ********")
        self.remoteCandidate += 1
        // Set the remote ICE candidate
        self.webRTCClient?.set(remoteCandidate: candidate)
    }
}

extension WebRTCManager: WebRTCClientDelegate {
    // MARK: - Discovered a local ICE candidate
    func webRTCClient(_ client: WebRTCClient, didDiscoverLocalCandidate candidate: RTCIceCandidate) {
        logger.info("******** discovered a local ice candidate ********")
        self.signalClient?.send(candidate: candidate)
    }

    // MARK: - RTC connection state changed
    func webRTCClient(_ client: WebRTCClient, didChangeConnectionState state: RTCIceConnectionState) {
        self.delegate?.webRTCManager(self, didChangeConnectionState: state)
        switch state {
        case .connected, .completed:
            logger.info("******** RTC connected successfully ********")
            if let block = feedbackConnectedBlock {
                block(client)
            }
        case .disconnected:
            logger.info("******** RTC lost the connection ********")
        case .failed, .closed:
            logger.info("******** RTC connection failed or closed ********")
        case .new, .checking, .count:
            break
        @unknown default:
            break
        }
    }

    // MARK: - Received data over the data channel
    func webRTCClient(_ client: WebRTCClient, didReceiveData data: Data) {
//        DispatchQueue.main.async {
//            let message = String(data: data, encoding: .utf8) ?? "(Binary: \(data.count) bytes)"
//            let alert = UIAlertController(title: "Message from WebRTC", message: message, preferredStyle: .alert)
//            alert.addAction(UIAlertAction(title: "OK", style: .cancel, handler: nil))
//            self.present(alert, animated: true, completion: nil)
//        }
    }
}
```

To be continued and updated. If you have any questions, you can reach me on QQ: 506299396.