In the post-pandemic era, online learning, work, communication, and entertainment have become the norm, driven by continuous innovation in real-time audio and video technology. To give enterprises and developers the best possible audio and video experience, the Bileyun technology team not only deploys widely distributed data centers to bring its service within the last mile of users, but also switches technical schemes according to the application scenario. When only two terminals take part in a session, a direct media connection is chosen to reduce server cost. Today's computers and devices usually sit behind firewalls and NATs, so establishing a direct connection is not easy, and firewall traversal technology emerged to solve this. There are several technical schemes for firewall traversal. This article introduces one framework in detail, the ICE protocol, to help you master the basic flow of firewall traversal.

# 01

What is direct connection?

Direct connection is also called peer-to-peer (P2P). In P2P mode, the first step is to establish a connection between the two devices. Each device usually sits behind its own NAT [RFC 3489], so getting them connected, and designing the system to maximize the P2P connection success rate, is very important. Next, we analyze a standard protocol, ICE (Interactive Connectivity Establishment) [RFC 8445], to explore how devices behind NATs attempt peer-to-peer communication.

(Classic System Architecture Diagram)

# 02

ICE connection process

The core of ICE's NAT traversal consists of gathering candidate addresses, sorting them, pairing them, and running connectivity checks on the resulting pairs.

A terminal has multiple candidate transport addresses (combinations of IP address and port for a specific transport protocol) with which to communicate with other terminals. They may include:

O A transport address on a directly attached network interface (host address)

O A translated transport address on the public side of a NAT (server reflexive address)

O A relayed address allocated by a TURN server (relayed address)

01

Terminology

Transport address: an IP address, a port, and a transport protocol.

Candidate: a transport address together with its type, priority, foundation, and base.

Base: the transport address a candidate is sent from. For a server reflexive candidate, the base is the host candidate it was derived from.
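The terms above can be sketched as two small data structures. This is a minimal illustration only: the class and field names are assumptions made for this example, not identifiers from any ICE library, and the addresses are sample values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransportAddress:
    ip: str
    port: int
    protocol: str  # "udp" or "tcp"


@dataclass(frozen=True)
class Candidate:
    address: TransportAddress
    type: str               # "host", "srflx", "prflx" or "relay"
    priority: int
    foundation: str
    base: TransportAddress  # address the candidate is actually sent from
    component_id: int = 1


# A host candidate is its own base.
host_addr = TransportAddress("10.0.1.1", 8998, "udp")
host = Candidate(host_addr, "host", 2130706431, "1", base=host_addr)

# A server reflexive candidate keeps the host address it was derived from as its base.
srflx = Candidate(
    TransportAddress("192.0.2.3", 45664, "udp"),
    "srflx", 1694498815, "2", base=host_addr,
)
```

Keeping the base on every candidate makes it easy to tell later which local socket a check for that candidate must be sent from.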

02

Candidate Address Collection

The terminal must first determine all of its candidate addresses. These include the addresses of its local network interfaces and all other addresses derived from them. Local addresses include physical network adapter addresses, VPN addresses, and MIP (Mobile IP) addresses. Derived addresses are obtained by sending requests from a local address to a STUN or TURN server, and fall into two categories. The first is the address learned through a STUN Binding request, called a server reflexive candidate (sometimes rendered as server reverse address). The second is the relayed candidate, allocated by a TURN server.

From each host candidate address, an endpoint sends a Binding request to the STUN server and an Allocate request to the TURN server, and in this way obtains its server reflexive and relayed addresses.

The following figure shows how both server reflexive and relayed candidates are discovered using the TURN service:

(Sequence diagram: obtaining the server reflexive and relayed addresses)

When an endpoint sends a TURN Allocate request through a NAT, the NAT creates a binding with public IP address IP1 and port Port1. This binding is the server reflexive candidate, and it maps back to the host candidate. A packet sent from the host candidate address is rewritten by the NAT so that its source becomes the server reflexive address, and is then forwarded to its destination. A packet addressed to the server reflexive address is translated by the NAT back to the host candidate address and forwarded to the terminal.

When multiple NATs exist between the terminal and the STUN server, the STUN request creates a binding on every NAT, but the terminal discovers only the outermost server reflexive address. If the terminal is not behind any NAT, the server reflexive address is the same as its base (the host candidate address), so it is redundant and can be discarded.

When the Allocate request reaches the TURN server, the server allocates a relayed address with IP address IPy and port Porty and places it in the response. The server reflexive address (IP1, Port1) is also added to the response sent back to the terminal. The TURN server then relays traffic between the terminal and the peer through the relayed address. After gathering, each terminal holds its own local candidates, and once the candidates are exchanged it also holds those of the peer, which provides the address information for the connectivity checks that follow.
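To make the gathering step more concrete, the sketch below builds a STUN Binding request header and decodes the XOR-MAPPED-ADDRESS attribute in which a STUN server reports the server reflexive address. It follows the STUN wire format (RFC 5389), handles IPv4 only, and does no network I/O. The function names are hypothetical, and `encode_xor_mapped_address` exists only so the decoder can be exercised locally.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """20-byte STUN header: type, length 0, magic cookie, 12-byte transaction ID."""
    tid = transaction_id or os.urandom(12)
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + tid


def encode_xor_mapped_address(ip, port):
    """Encode an IPv4 XOR-MAPPED-ADDRESS attribute (what a server would send)."""
    x_port = port ^ (MAGIC_COOKIE >> 16)
    x_addr = struct.unpack("!I", socket.inet_aton(ip))[0] ^ MAGIC_COOKIE
    value = struct.pack("!BBHI", 0, 0x01, x_port, x_addr)
    return struct.pack("!HH", XOR_MAPPED_ADDRESS, len(value)) + value


def parse_xor_mapped_address(attrs):
    """Walk the attribute list and return (ip, port), or None if absent."""
    offset = 0
    while offset + 4 <= len(attrs):
        attr_type, length = struct.unpack_from("!HH", attrs, offset)
        value = attrs[offset + 4:offset + 4 + length]
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            x_port, x_addr = struct.unpack_from("!HI", value, 2)
            ip = socket.inet_ntoa(struct.pack("!I", x_addr ^ MAGIC_COOKIE))
            return ip, x_port ^ (MAGIC_COOKIE >> 16)
        offset += 4 + length + (-length % 4)  # attributes are padded to 4 bytes
    return None
```

In a real client, the request would be sent over UDP from each host candidate, and the attributes following the 20-byte header of the response would be passed to `parse_xor_mapped_address`.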

03

Sort and match candidate addresses

After gathering all of its candidate addresses, terminal 1 sorts them by priority and sends them to terminal 2 over the signaling channel, carried as attributes in an SDP offer. When terminal 2 receives the offer, it performs the same gathering process and returns its own candidate addresses in the answer. Each endpoint then holds the complete candidate lists of both parties and is ready to perform connectivity checks.

The basic principles of connectivity checking are as follows:

O Sort the candidate address pairs in priority order;

O Send a check packet on each candidate address pair;

O Answer the check packets received from the other terminal.

The following figure shows connection detection:

(Filtering of valid address pairs)

The terminal pairs every address in its local set with every address in the remote set. If the local set has two addresses and the peer's set has three, the terminal creates six address pairs (2 x 3 = 6). Terminal A picks a local candidate address and sends a STUN request to terminal B's server reflexive address; if it receives a STUN response, the address pair is said to be receivable. When the local address of that pair on terminal A receives a STUN request from the pair's remote address and answers it successfully, the pair is said to be sendable. A pair that is both receivable and sendable is valid, that is, it has passed the connectivity check. This describes the filtering of a single address pair; by repeating the exchange over all pairs, terminals A and B identify every valid address pair.
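The pairing and filtering described above can be sketched in a few lines. The addresses are made-up sample values, and the two sets that record check results are stand-ins for real STUN transactions; only the 2 x 3 pairing and the "both directions succeeded" rule come from the text.

```python
from itertools import product

local_candidates = ["10.0.1.1:8998", "192.0.2.3:45664"]
remote_candidates = ["10.0.2.7:5000", "198.51.100.9:6000", "203.0.113.5:7000"]

# Every local address is paired with every remote address: 2 x 3 = 6 pairs.
pairs = list(product(local_candidates, remote_candidates))


def valid_pairs(pairs, got_response, answered_request):
    """A pair is valid only if our request was answered AND we answered the peer's."""
    return [p for p in pairs if p in got_response and p in answered_request]


# Hypothetical check results for illustration.
got_response = {pairs[1], pairs[4]}      # our STUN request received a response
answered_request = {pairs[4], pairs[5]}  # we answered the peer's STUN request
valid = valid_pairs(pairs, got_response, answered_request)
```

Here only the pair that appears in both sets survives the filter.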

04

Sort the candidate addresses

Because every candidate address is gathered, the number of combinations can be large, so the pairs must be sorted in order to find a working pair faster and to prefer the better ones. Sorting follows two basic principles:

O The terminal assigns a numeric priority to each of its candidate addresses and sends it to the peer together with the candidate address.

O The priority of a candidate address pair is calculated by combining the priorities of its local and remote candidates. Both parties therefore compute the same priority for the same pair, and so arrive at the same ordering.

For the detailed algorithm, see the implementation specification in section 03.

05

Freeze the candidate

Each candidate has a property called its foundation. Two candidates have the same foundation when they are of the same type, are derived from the same host candidate, and were obtained from the same STUN server using the same protocol; otherwise their foundations differ. A candidate address pair also has a foundation, which is simply the combination of the foundations of its two candidates. In the initial stage, only one pair per foundation is checked and the other pairs are frozen. When a connectivity check succeeds, the other pairs with the same foundation are unfrozen. This avoids spending checks on pairs that look attractive but are likely to fail.

# 03

ICE implementation specification

01

Calculating foundations

Two candidates must have the same foundation ID when all of the following hold:

O They have the same type (host, server reflexive, relayed, or peer reflexive).

O Their bases have the same IP address (the ports may differ).

O They use the same transport protocol.

O For server reflexive and relayed candidates, the STUN or TURN servers they were obtained from have the same IP address.

Otherwise, they must have different foundation IDs.
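One simple way to satisfy these rules is to derive the foundation from exactly the inputs the rules mention. The sketch below hashes them into a short string. Using SHA-1 and truncating to eight characters is an arbitrary choice for this illustration, and `compute_foundation` is a hypothetical helper, not part of any specification or library.

```python
import hashlib


def compute_foundation(cand_type, base_ip, protocol, server_ip=""):
    """Same type + base IP + protocol + STUN/TURN server IP -> same foundation.

    The port is deliberately not an input, so candidates that differ only
    in port share a foundation.
    """
    key = "|".join([cand_type, base_ip, protocol.lower(), server_ip])
    return hashlib.sha1(key.encode("ascii")).hexdigest()[:8]


f_rtp = compute_foundation("srflx", "10.0.1.1", "UDP", "203.0.113.10")
f_rtcp = compute_foundation("srflx", "10.0.1.1", "udp", "203.0.113.10")
f_host = compute_foundation("host", "10.0.1.1", "udp")
```

Two server reflexive candidates gathered from the same base and the same server get the same foundation, while the host candidate on the same interface gets a different one.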

02

Candidate address priority

The priority of a candidate address is calculated as:

priority = (2^24)*(type preference) + (2^8)*(local preference) + (2^0)*(256 - component ID)

The type preference must be an integer between 0 and 126 inclusive, with 0 being the lowest priority and 126 the highest priority.

The local preference must be an integer between 0 and 65535 inclusive. 0 indicates the lowest priority, and 65535 indicates the highest priority.

The component ID must be an integer between 1 and 256 inclusive.

Selection criteria for type preferences and local preferences:

The first criterion is whether a media intermediary, such as a TURN server, VPN server, or NAT, is involved. Relayed addresses fall into this category, as do host addresses that belong to a VPN interface. The type preference of a relayed address must be lower than that of a host address. The recommended values are 126 for host candidates, 110 for peer reflexive candidates, 100 for server reflexive candidates, and 0 for relayed candidates.

Other criteria are based on the IP address family, security, and network topology (not detailed here for reasons of space).
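The priority formula and the recommended type preferences translate directly into code. The sketch below is a minimal illustration: the function name and the default local preference of 65535 are assumptions made for this example.

```python
# Recommended type preferences: host, peer reflexive, server reflexive, relayed.
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type, local_preference=65535, component_id=1):
    """priority = 2^24 * type pref + 2^8 * local pref + (256 - component ID)."""
    if not 0 <= local_preference <= 65535:
        raise ValueError("local preference must be in 0..65535")
    if not 1 <= component_id <= 256:
        raise ValueError("component ID must be in 1..256")
    return (
        (TYPE_PREFERENCE[cand_type] << 24)
        + (local_preference << 8)
        + (256 - component_id)
    )


host_priority = candidate_priority("host")    # 2130706431
srflx_priority = candidate_priority("srflx")  # 1694498815
```

With these inputs, the results equal the priority values that appear in the SDP candidate lines in the section on transmitting address information.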

03

Role selection

Each endpoint takes a role in every ICE session. There are two roles: controlling and controlled. The controlling terminal is responsible for the final selection of the candidate pairs used for communication, which means nominating a candidate pair for each media stream. The controlled terminal is told which candidate pairs are used for each media stream.

The rules for determining roles are as follows:

O Both ends are full implementations: the end that initiates the request is controlling and the other end is controlled. Both ends run the ICE state machine and perform connectivity checks.

O One end is a full implementation and the other is a lite implementation: the full implementation is controlling, runs the ICE state machine, and performs the connectivity checks; the lite implementation is controlled.

O Both ends are lite implementations: the end that initiates the request is controlling and the other end is controlled.

04

Resolution of role conflicts

In some call flows, both sides of the session may believe they are controlling, or both may believe they are controlled, as shown below:

(Conflict of roles)

The controller here is a B2BUA (back-to-back user agent) acting as the intermediary for the call between A and B. A and B both act as offerers, so each believes it plays the controlling role in ICE.

To resolve role conflicts, every Binding request sent during the connectivity check phase carries the STUN attribute that matches the sender's role, either ICE-CONTROLLED or ICE-CONTROLLING. Both attributes carry a tie-breaker field (0 to 2^64 - 1) holding a locally generated random value. The party that receives the Binding request inspects the attribute; if it conflicts with the local role, the receiver compares its own tie-breaker with the one carried in the message to decide the correct local role. The rule is that the party with the larger tie-breaker is controlling. If the comparison shows that the local end must change role, it switches immediately. If it shows that the peer must change role, the receiver answers the Binding request with a 487 error response, and the end that receives the error response switches role.

STUN attributes and error code:

0x8029 ICE-CONTROLLED

0x802A ICE-CONTROLLING

487 Role Conflict: the role specified by the client conflicts with the role of the server.
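The tie-breaker rule can be sketched as a single decision function run by the party that receives the Binding request. The function name, the string role labels, and the tuple return value are assumptions for this illustration; giving the receiver the win when the two values are equal is a choice made here so that the outcome is always defined.

```python
import secrets


def resolve_role_conflict(local_role, local_tie_breaker, remote_role, remote_tie_breaker):
    """Return (new_local_role, send_487).

    remote_role is the role claimed in the received Binding request.
    send_487 is True when the peer must switch, so we answer with a
    487 (Role Conflict) error response.
    """
    if local_role != remote_role:
        return local_role, False  # roles already differ, no conflict
    local_wins = local_tie_breaker >= remote_tie_breaker
    if local_role == "controlling":
        # Both claim to be controlling: the larger tie-breaker keeps the role.
        return ("controlling", True) if local_wins else ("controlled", False)
    # Both claim to be controlled: the larger tie-breaker becomes controlling.
    return ("controlling", False) if local_wins else ("controlled", True)


# Each agent generates its 64-bit tie-breaker once per session.
tie_breaker = secrets.randbits(64)
```

In every conflicting case the agent with the larger tie-breaker ends up controlling, and a 487 response is sent only when it is the peer that has to change.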

05

Candidate address pair priority calculation and sorting

Once the candidate address pairs have been formed, they need to be prioritized. Let G be the priority of the controlling terminal's candidate and D the priority of the controlled terminal's candidate.

The priority calculation formula of the candidate address pair is as follows:

pair priority = 2^32*MIN(G,D) + 2*MAX(G,D) + (G>D ? 1 : 0)

Once each candidate pair has been assigned a priority, the pairs are sorted in descending order of priority. If two pairs have the same priority, the order between them is arbitrary.
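The pair priority formula and the descending sort can be sketched as follows. The function name and the sample priorities are illustrative only.

```python
def pair_priority(g, d):
    """g: priority of the controlling side's candidate,
    d: priority of the controlled side's candidate."""
    return (1 << 32) * min(g, d) + 2 * max(g, d) + (1 if g > d else 0)


# (controlling candidate priority, controlled candidate priority)
candidate_pairs = [
    (1694498815, 2130706431),  # srflx <-> host
    (2130706431, 2130706431),  # host  <-> host
    (16777215, 1694498815),    # relay <-> srflx
]

# Highest pair priority first.
checklist = sorted(candidate_pairs, key=lambda p: pair_priority(*p), reverse=True)
```

Because G and D are defined by role rather than by which side is local, both terminals compute the same value for the same pair and arrive at the same order. The MIN term dominates, so a pair is ranked mainly by its weaker candidate.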

06

Candidate address pair state machine

(Candidate address pair state transition diagram)

O At the beginning, the terminal sets the status of all candidate address pairs to Frozen.

O The terminal then examines the checklist of the first media stream:

For all candidate pairs with the same foundation, the pair with the lowest component ID is set to the Waiting state. If there is more than one such pair, the one with the highest priority is set to Waiting.

O Each check ends in either success or failure.
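The initial state assignment can be sketched as below. Candidate pairs are represented as plain dictionaries, and the keys (`foundation`, `component_id`, `priority`, `state`) are assumptions made for this example; the state names follow the RFC spelling (Frozen, Waiting).

```python
from collections import defaultdict


def init_pair_states(pairs):
    """Freeze every pair, then move one pair per foundation to Waiting."""
    for pair in pairs:
        pair["state"] = "Frozen"

    by_foundation = defaultdict(list)
    for pair in pairs:
        by_foundation[pair["foundation"]].append(pair)

    for group in by_foundation.values():
        # Lowest component ID first; ties are broken by highest priority.
        first = min(group, key=lambda p: (p["component_id"], -p["priority"]))
        first["state"] = "Waiting"
    return pairs


pairs = init_pair_states([
    {"foundation": "1:1", "component_id": 1, "priority": 300},
    {"foundation": "1:1", "component_id": 2, "priority": 400},
    {"foundation": "1:1", "component_id": 1, "priority": 500},
    {"foundation": "2:1", "component_id": 2, "priority": 100},
])
```

Only one pair per foundation starts out as Waiting; the rest stay Frozen until a check on a pair with the same foundation succeeds.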

07

Connectivity checks

This process applies only to full implementations. It is divided into ordinary checks and triggered checks, both of which are timer driven.

The terminal maintains a first-in, first-out queue, called the triggered-check queue, holding the candidate pairs to be checked at the next opportunity. When the timer fires, the terminal takes the pair at the head of the triggered-check queue, performs a connectivity check on it, and sets its state to In Progress. If the queue is empty, an ordinary check is performed instead.

Once the terminal has built its checklists, it sets a timer for each active checklist. With N active checklists there are N timers, each firing every Ta*N seconds.

When the timer fires and there is no triggered check to send, the terminal must perform an ordinary check: it checks the highest-priority pair in the Waiting state or, if there is none, unfreezes and checks the highest-priority pair in the Frozen state.
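What happens on each timer tick can be sketched as one function. Two assumptions are made here: the ordinary check picks the highest-priority Waiting pair and, failing that, the highest-priority Frozen pair (the order described in the ICE RFCs), and pairs are plain dictionaries with hypothetical `priority` and `state` keys. Actually sending the STUN check is left out.

```python
from collections import deque


def next_check(triggered_queue, checklist):
    """Pick the pair to check on this timer tick, or None if nothing is left."""
    if triggered_queue:
        pair = triggered_queue.popleft()  # triggered checks go first (FIFO)
    else:
        waiting = [p for p in checklist if p["state"] == "Waiting"]
        frozen = [p for p in checklist if p["state"] == "Frozen"]
        candidates = waiting or frozen    # ordinary check: Waiting, then Frozen
        if not candidates:
            return None                   # nothing to check; the timer can stop
        pair = max(candidates, key=lambda p: p["priority"])
    pair["state"] = "In Progress"
    return pair


checklist = [
    {"name": "a", "priority": 10, "state": "Frozen"},
    {"name": "b", "priority": 5, "state": "Waiting"},
    {"name": "c", "priority": 7, "state": "Waiting"},
]
triggered = deque([checklist[0]])
order = [next_check(triggered, checklist)["name"] for _ in range(3)]
```

In this run, pair "a" is served first because it sits in the triggered queue, then "c" and "b" follow in priority order, and a fourth call returns None.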

08

ICE Restart

The terminal may restart ICE for a media stream, which resets the ICE state of that stream to its initial state. The difference from a brand-new session is that, during the restart, media can still be sent on the previously validated address pair. An endpoint must restart ICE when one of the following conditions occurs:

O An offer is generated in order to change the target of the media stream. In other words, the endpoint wants to generate an updated offer that, had ICE not been in use, would change the destination of a media component.

O An endpoint changes its implementation level, which usually only happens in third-party call control use cases.

These rules mean that setting the IP address in the c= line to 0.0.0.0 causes an ICE restart. An ICE implementation therefore cannot use this mechanism for call hold and must instead use a=inactive and a=sendonly in SDP. To restart ICE, the agent must change the ice-pwd and ice-ufrag of the media stream in the offer.

09

Transmission of address information

ICE itself is only responsible for gathering candidate addresses, sorting them, and checking connectivity, so that both sides learn about the network environment they are in and the addresses, including relayed ones, through which they can communicate. The candidate information still has to be carried to the peer. Multimedia sessions are generally described with SDP [RFC 4566], and the following are the SDP attribute extensions defined for ICE.

a=ice-pwd:asd88fgpdd777uzjYhagZg

a=ice-ufrag:8hhY

a=candidate:1 1 UDP 2130706431 10.0.1.1 8998 typ host 0 network-id 1 network-cost 10

a=candidate:2 1 UDP 1694498815 192.0.2.3 45664 typ srflx raddr 1 network-id 2 network-cost 50

The ice-ufrag and ice-pwd attributes are transmitted in the SDP and are used to authenticate STUN messages and verify their integrity. The candidate attribute carries the communication address information: foundation, component ID (starting from 1), transport protocol (TCP or UDP), priority, IP address, port, type, related address, network-id, and network-cost.
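A candidate attribute can be split into its fields with a few lines of string handling. The parser below is a minimal sketch rather than a complete SDP implementation: the function name and dictionary keys are assumptions, and the sample line is an illustrative, well-formed candidate written for this example, not a line taken from the listing above.

```python
def parse_candidate(line):
    """Parse 'a=candidate:<foundation> <component> <transport> <priority>
    <ip> <port> typ <type> [<name> <value> ...]' into a dictionary."""
    body = line.strip()
    if body.startswith("a="):
        body = body[2:]
    if not body.startswith("candidate:"):
        raise ValueError("not a candidate attribute")
    tokens = body[len("candidate:"):].split()
    if len(tokens) < 8 or tokens[6].lower() != "typ":
        raise ValueError("malformed candidate attribute")
    foundation, component, transport, priority, ip, port = tokens[:6]
    rest = tokens[8:]  # trailing name/value pairs such as raddr, rport
    return {
        "foundation": foundation,
        "component_id": int(component),
        "transport": transport.lower(),
        "priority": int(priority),
        "ip": ip,
        "port": int(port),
        "type": tokens[7].lower(),
        "extensions": dict(zip(rest[0::2], rest[1::2])),
    }


sample = "a=candidate:2 1 UDP 1694498815 192.0.2.3 45664 typ srflx raddr 10.0.1.1 rport 8998"
cand = parse_candidate(sample)
```

The related address and port (`raddr`, `rport`) land in `extensions`, together with any other trailing name/value pairs such as network-id or network-cost.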

# 04

Conclusion

This article has introduced the important basics of the ICE protocol. ICE only provides the framework for P2P connectivity and must work together with other protocols, such as STUN, TURN, and SDP, to achieve a direct connection. ICE itself also covers many more topics and implementation details, including the split between lite and full implementations. If you are interested in the complete procedures, the RFC documents are worth reading carefully.

References:

STUN: datatracker.ietf.org/doc/rfc5389…

SDP: datatracker.ietf.org/doc/rfc4566…

ICE: datatracker.ietf.org/doc/rfc8445…