When WebRTC sets up end-to-end audio and video communication, one or both peers are often on intranet addresses hidden behind NAT. ICE is used to find a path over which data can be transmitted. This article introduces the basics and main steps of NAT traversal and the ICE framework.

WebRTC's real-time end-to-end audio and video communication is based on peer-to-peer connections. The most convenient arrangement is for the two sides of a call to connect directly by IP address, instead of relaying everything through a media server as in the traditional model.

If both parties have public IP addresses, they can reach each other directly and establish a connection. However, in most application scenarios, one or both parties do not have public addresses but intranet addresses hidden behind Network Address Translation (NAT).

For example, 192.168.2.232 on LAN A cannot communicate directly with 192.168.2.161 on LAN B. Most personal devices that need to be connected are hidden in their own intranets. Therefore, they cannot learn each other's reachable IP addresses or set up P2P audio and video communication directly.

To solve this problem, WebRTC adopts the ICE framework to implement NAT traversal.

1. NAT (Network Address Translation)

1. What is NAT

You have probably heard reports of IPv4 addresses running out. IPv4 addresses are only 32 bits long, which gives a theoretical maximum of about 4.29 billion addresses. Around 1994, the Network Address Translation (NAT) RFC specification was proposed as a temporary solution to the problem of IPv4 address exhaustion.

The solution is to reuse IP addresses: NAT devices are introduced at the network edge, and each one maintains a mapping from local IP addresses and ports to public IP addresses and ports. The local IP address space behind a NAT can then be reused by many different sub-networks, which relieves address exhaustion.
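The mapping table described above can be illustrated with a small sketch. This is a toy model rather than how any real NAT device is implemented: the class name `SimpleNat`, the starting port 40000, and the public addresses (taken from the documentation ranges) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (ip, port)


@dataclass
class SimpleNat:
    """Toy NAT (hypothetical): maps private endpoints to ports on one public IP."""
    public_ip: str
    next_port: int = 40000  # assumed start of the external port pool
    table: Dict[Endpoint, Endpoint] = field(default_factory=dict)

    def outbound(self, private: Endpoint) -> Endpoint:
        """Return the public endpoint for a private one, creating it on first use."""
        if private not in self.table:
            self.table[private] = (self.public_ip, self.next_port)
            self.next_port += 1
        return self.table[private]

    def inbound(self, public: Endpoint) -> Optional[Endpoint]:
        """Translate a public endpoint back to the private one, or None if unmapped."""
        for private, mapped in self.table.items():
            if mapped == public:
                return private
        return None


# The same private address can be reused behind two different NATs.
nat_a = SimpleNat("203.0.113.10")
nat_b = SimpleNat("198.51.100.20")
mapped_a = nat_a.outbound(("192.168.2.232", 5000))
mapped_b = nat_b.outbound(("192.168.2.232", 5000))
```

Both LANs use the same private address here, yet each NAT exposes it under its own public address, which is the reuse the paragraph above refers to.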

However, this interim solution quickly became a permanent one and is now part of the Internet infrastructure. It is no longer used only to address IP address exhaustion: routers, firewalls, and proxy devices all commonly include NAT functionality.

2. NAT types

NAT behavior has been studied and grouped into two general types: cone and symmetric. Cone NAT can be subdivided into three types.

So in summary, there are four types: full cone, IP-restricted cone, port-restricted cone, and symmetric.

a. Full cone

When an intranet host communicates with the external network, it punches a hole in the NAT. That is, the NAT creates a mapping table entry that records the mapping between the intranet IP address and port and the external IP address and port. Any packet that an external machine sends to the mapped external IP address and port is forwarded by the NAT to the corresponding intranet address and port, so any external machine can communicate with the intranet host.

b. IP-restricted cone

The IP-restricted cone is stricter. On top of full-cone behavior, it lets only the IP addresses that the host has already contacted pass through the hole. (The mapping table additionally records the external IP addresses that were accessed.) As shown in the figure, other external hosts such as A and C may try to use the IP address and port of the hole opened toward computer B, but they still fail to reach the host.

c. Port-restricted cone

The port-restricted cone is stricter than the IP-restricted cone, which does not restrict ports. In IP-restricted mode, any port on an external host that the internal host has contacted can communicate with the internal host. With port restriction, only the specific port on the external machine that the host contacted is allowed through the hole. (The mapping table records the IP address and port of the external host that was accessed.)

d. Symmetric

Symmetric NAT is the strictest type. It has the same IP and port restrictions as the port-restricted cone. The biggest difference is that for each new destination the NAT opens a new external port, or a new external IP address and port, to communicate with the external network. This means that every hole in the NAT is different, so a mapping learned through one server cannot be reused by another peer. Symmetric NAT is essentially impossible to traverse by hole punching.
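To make the differences between the four types concrete, here is a minimal simulation of the mapping and filtering rules described above. It is a simplified model under stated assumptions, not real NAT code: the `Nat` class, its method names, and the port numbering are hypothetical, and real devices have timeouts and many other behaviors that are ignored here.

```python
from enum import Enum
from typing import Dict, Set, Tuple

Endpoint = Tuple[str, int]  # (ip, port)


class NatType(Enum):
    FULL_CONE = "full cone"
    IP_RESTRICTED = "IP-restricted cone"
    PORT_RESTRICTED = "port-restricted cone"
    SYMMETRIC = "symmetric"


class Nat:
    """Toy model (hypothetical) of the four NAT behaviors."""

    def __init__(self, nat_type: NatType) -> None:
        self.nat_type = nat_type
        self.next_port = 40000  # assumed start of the external port pool
        self.mappings: Dict[tuple, int] = {}
        self.contacted: Dict[int, Set[Endpoint]] = {}

    def send(self, internal: Endpoint, dest: Endpoint) -> int:
        """Send from an internal endpoint; return the external port that was used."""
        # Cone NATs keep one mapping per internal endpoint;
        # symmetric NAT creates a separate mapping per destination.
        if self.nat_type is NatType.SYMMETRIC:
            key = (internal, dest)
        else:
            key = (internal,)
        if key not in self.mappings:
            self.mappings[key] = self.next_port
            self.next_port += 1
        port = self.mappings[key]
        self.contacted.setdefault(port, set()).add(dest)
        return port

    def accepts(self, external_port: int, sender: Endpoint) -> bool:
        """Would a packet from `sender` to this external port be forwarded inside?"""
        peers = self.contacted.get(external_port)
        if peers is None:
            return False  # no hole has been opened on this port
        if self.nat_type is NatType.FULL_CONE:
            return True
        if self.nat_type is NatType.IP_RESTRICTED:
            return any(ip == sender[0] for ip, _ in peers)
        return sender in peers  # port-restricted and symmetric


host = ("192.168.2.232", 5000)
b = ("198.51.100.2", 3478)
b_other_port = ("198.51.100.2", 9999)
c = ("203.0.113.7", 3478)
```

In this model, a port-restricted cone and a symmetric NAT filter inbound packets the same way; what sets symmetric NAT apart is that `send` returns a different external port for every destination.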

3. Check the NAT type

From the preceding sections we can see that, in real networks, different devices sit in different network environments. It is therefore important to determine the NAT type of a device before the devices try to communicate.

The following figure shows the detection flow chart of Session Traversal Utilities for NAT (STUN). A path that ends in red indicates that traversal failed and UDP communication is unavailable. Communication is possible only when the path ends in green (public address) or yellow (cone NAT).

The whole process involves about five tests:

test1

  • The host sends a request to the server's IP address and port, and the server replies from the same IP address and port. Is a response received? Yes: continue. No: UDP is unavailable.
  • Is the address reported in the response the same as the host's own address? Yes: there is no NAT, go to test2. No: the host is behind NAT, go to test3.

test2

  • With no NAT in the path, the host asks the server to reply from a different IP address and port. Is a response received? Yes: the host has a public IP address. No: a symmetric firewall exists.

test3

  • Behind NAT, the host sends a request to the server and asks it to reply from another NIC IP address and a different port. Is a response received? Yes: full cone. No: restricted cone or symmetric => test4.

test4

  • The host sends a request to another server and checks whether the public address it gets back is the same as the one reported by the first server. Yes: not symmetric (some kind of restricted cone) => test5. No: symmetric.

test5

  • The host sends another request to the server used in test4 and asks it to reply from the same IP address but a different port. Is a response received? Yes: IP-restricted cone. No: port-restricted cone.
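The decision flow of test1 to test5 can be sketched as a pure function over the test results. This is an illustrative sketch that assumes the classic STUN discovery flow (RFC 3489); the function name, parameter names, and returned labels are hypothetical, and no network traffic is sent here.

```python
from typing import Optional, Tuple

Endpoint = Tuple[str, int]  # (ip, port)


def classify_nat(
    local: Endpoint,
    mapped1: Optional[Endpoint],
    reply_from_other_ip_and_port: bool,
    mapped2: Optional[Endpoint] = None,
    reply_from_other_port: bool = False,
) -> str:
    """Classify the network environment from the results of the five tests.

    local: the host's own address and port.
    mapped1: address reported by the first server (test1), None if no reply.
    reply_from_other_ip_and_port: a reply arrived from a different server
        IP and port (test2 without NAT, test3 behind NAT).
    mapped2: address reported by a second server (test4).
    reply_from_other_port: a reply arrived from the same server IP but a
        different port (test5).
    """
    if mapped1 is None:
        return "UDP blocked"
    if mapped1 == local:
        # No NAT in the path: test2 decides between open and firewalled.
        if reply_from_other_ip_and_port:
            return "public address"
        return "symmetric UDP firewall"
    if reply_from_other_ip_and_port:
        return "full cone NAT"
    if mapped2 != mapped1:
        # A different mapping per destination means symmetric NAT.
        return "symmetric NAT"
    if reply_from_other_port:
        return "IP-restricted cone NAT"
    return "port-restricted cone NAT"


local_addr = ("192.168.2.232", 5000)
public_addr = ("203.0.113.10", 40000)
example = classify_nat(local_addr, public_addr, False, public_addr, True)
```

In a real client, each boolean would come from sending a STUN request and waiting for a reply with a timeout; here they are plain inputs so the branching itself is easy to follow.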

That concludes the introduction to NAT and the principles of hole punching and traversal. Actual traversal is more detailed and complex than this; for the details, refer to the corresponding RFC specifications (RFC3489, which describes this detection flow, and RFC5389, which revises STUN). Even so, the whole detection process completes quite quickly.

2. STUN/TURN protocol

As mentioned earlier, client hosts are usually behind a firewall or NAT. An incoming UDP packet is addressed only to the NAT's public address; if the NAT has no mapping entry for the target machine, the packet is not forwarded to it. This is not a problem in the client-server case, because the client starts the exchange and thereby creates the entry. In peer-to-peer mode, however, the transmission fails. Therefore, we need STUN/TURN for NAT traversal.

WebRTC uses Interactive Connectivity Establishment (ICE) to establish end-to-end data channels. ICE relies on two utility protocols: STUN (Session Traversal Utilities for NAT) and TURN (Traversal Using Relays around NAT).

1. STUN

A. Standard specification definition

STUN was first defined in RFC3489 as a complete NAT traversal solution. At that time the name stood for Simple Traversal of UDP Through NATs.

In RFC5389, STUN is redefined as a tool that supports NAT traversal rather than a complete solution, and the name became Session Traversal Utilities for NAT.

B. STUN purposes

  • Lets a session participant discover the public IP address and port that the NAT has allocated to it, so that the participants can exchange them
  • Checks connectivity between two endpoints
  • Keeps NAT bindings alive

C. Simple process of STUN service

The intranet host uses a STUN server to obtain its public IP address and port after NAT mapping. The following is a simple procedure.

  1. First, set up a STUN server. coturn is currently one of the more popular STUN servers.
  2. The intranet host sends a STUN Binding Request message to the STUN server.
  3. After receiving the Binding Request, the STUN server fills the source IP address and port of the request, as seen by the server, into a Binding Response message and returns it to the intranet host.
  4. The intranet host parses the Binding Response it receives and obtains its own external IP address and port from it.
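The message exchange in steps 2 to 4 can be sketched at the byte level. The sketch below follows the RFC5389 wire format (20-byte header with the magic cookie 0x2112A442, and an IPv4 XOR-MAPPED-ADDRESS attribute); the function names are hypothetical, and `build_binding_response` only imitates what a server would send so that the example runs without a network.

```python
import os
import socket
import struct
from typing import Optional, Tuple

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id: bytes) -> bytes:
    """Step 2: a 20-byte STUN header with no attributes."""
    assert len(transaction_id) == 12
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + transaction_id


def build_binding_response(transaction_id: bytes, ip: str, port: int) -> bytes:
    """Step 3 (server side, simulated): XOR-encode the request's source address."""
    xport = port ^ (MAGIC_COOKIE >> 16)
    xaddr = struct.unpack("!I", socket.inet_aton(ip))[0] ^ MAGIC_COOKIE
    attr = struct.pack("!HHBBHI", XOR_MAPPED_ADDRESS, 8, 0, 0x01, xport, xaddr)
    header = struct.pack("!HHI", BINDING_SUCCESS, len(attr), MAGIC_COOKIE)
    return header + transaction_id + attr


def parse_xor_mapped_address(message: bytes) -> Optional[Tuple[str, int]]:
    """Step 4: return (ip, port) from a Binding Response, or None."""
    if len(message) < 20:
        return None
    msg_type, length, cookie = struct.unpack("!HHI", message[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    pos, end = 20, 20 + length
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", message[pos:pos + 4])
        value = message[pos + 4:pos + 4 + attr_len]
        # Family 0x01 is IPv4; IPv6 is not handled in this sketch.
        if attr_type == XOR_MAPPED_ADDRESS and attr_len >= 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        pos += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    return None


transaction_id = os.urandom(12)
request = build_binding_request(transaction_id)
response = build_binding_response(transaction_id, "203.0.113.10", 40000)
mapped = parse_xor_mapped_address(response)
```

To use this against a real server, the request bytes would be sent over a UDP socket and the reply passed to `parse_xor_mapped_address`; a complete client would also verify that the transaction ID in the reply matches the one it sent.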

2. TURN

A. Standard specification definition

TURN is defined in RFC5766, "Traversal Using Relays around NAT (TURN): Relay Extensions to Session Traversal Utilities for NAT (STUN)". It is a relay extension of STUN. Simply put, TURN achieves traversal by placing a "middleman" between the two communicating parties. Note: "relay" and "TURN" refer to the same concept here, namely relayed transmission.

B. Use

When the STUN-based checks show that a direct peer-to-peer connection cannot be established, TURN mode is adopted and the relay service provided by an intermediate node is used. The TURN protocol allows the host to control the operation of the relay and to exchange data with its peers through the relay. TURN differs from some other relay control protocols in that it allows a client to communicate with multiple peers using a single relay address.

3. ICE connection mechanism

1. Collect ICE candidates

When two WebRTC endpoints need to connect, each endpoint provides multiple candidates. For example, if an endpoint has two network adapters, each port in use on each adapter corresponds to one candidate.

There are three main types of ICE candidates:

  • Host type: the IP address and port on the local intranet
  • SRFLX (server reflexive) type: the external IP address and port after NAT mapping
  • Relay type: the IP address and port allocated on the relay server

Generally, a candidate consists of the following fields:

  • IP: xxx.xxx.xxx.xxx (IP address)
  • port: number (port)
  • type: host/srflx/relay (type)
  • priority: number (priority)
  • protocol: UDP/TCP (transport protocol)
  • usernameFragment: string (access service username)
  • ...

2. ICE connection process

A. Connectivity check

After the ICE candidates are collected, the two parties exchange them through the signaling channel. Once each side has the other's ICE candidates, WebRTC starts connectivity checks in priority order.

In general, the host candidate has the highest priority, followed by the SRFLX candidate, followed by the relay candidate.
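The ordering of host, SRFLX, and relay candidates follows from the candidate priority formula in RFC5245: priority = 2^24 × type preference + 2^8 × local preference + (256 − component ID). The sketch below uses the type preferences recommended by that RFC (126 for host, 100 for server reflexive, 0 for relay); the `Candidate` class and its field names are hypothetical and only mirror the fields listed earlier.

```python
from dataclasses import dataclass

# Type preferences recommended in RFC5245.
TYPE_PREFERENCE = {"host": 126, "srflx": 100, "relay": 0}


def candidate_priority(cand_type: str, local_preference: int = 65535,
                       component_id: int = 1) -> int:
    """priority = 2^24 * type pref + 2^8 * local pref + (256 - component ID)."""
    return ((TYPE_PREFERENCE[cand_type] << 24)
            + (local_preference << 8)
            + (256 - component_id))


@dataclass
class Candidate:
    """Hypothetical container for the candidate fields described above."""
    ip: str
    port: int
    type: str  # "host", "srflx", or "relay"
    protocol: str = "UDP"

    @property
    def priority(self) -> int:
        return candidate_priority(self.type)


candidates = [
    Candidate("198.51.100.20", 50000, "relay"),
    Candidate("192.168.2.232", 5000, "host"),
    Candidate("203.0.113.10", 40000, "srflx"),
]
ordered = sorted(candidates, key=lambda c: c.priority, reverse=True)
ordered_types = [c.type for c in ordered]
```

With the default local preference and component ID 1, the three priorities come out as 2130706431 (host), 1694498815 (srflx), and 16777215 (relay), which are values often seen in candidate lines produced by browsers.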

At this stage, the ICE agent answers any authenticated STUN Binding request from the peer with a STUN response. Candidate pairs are checked in the priority order described above. When a response is received, the check is considered successful. If no response arrives before the check times out, the candidate pair fails.

The principles of connectivity detection are summarized as follows:

  • Sort the candidate address pairs by priority
  • Send check requests in this priority order
  • Receive the check acknowledgements from the other endpoint
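The first two points can be sketched with the pair priority formula from RFC5245: pair priority = 2^32 × MIN(G, D) + 2 × MAX(G, D) + (1 if G > D else 0), where G is the priority of the controlling agent's candidate and D that of the controlled agent's. The helper `check_order` and the assumption that the local side is the controlling agent are made for illustration; real ICE also prunes and filters pairs, which is left out here.

```python
from itertools import product
from typing import Dict, List, Tuple


def pair_priority(g: int, d: int) -> int:
    """RFC5245 pair priority; g is the controlling side, d the controlled side."""
    return (min(g, d) << 32) + 2 * max(g, d) + (1 if g > d else 0)


def check_order(local: Dict[str, int], remote: Dict[str, int]) -> List[Tuple[str, str]]:
    """Return (local, remote) candidate names, highest pair priority first.

    Both arguments map a candidate name to its candidate priority.
    The local agent is assumed to be the controlling one.
    """
    pairs = [
        ((local_name, remote_name), pair_priority(local_prio, remote_prio))
        for (local_name, local_prio), (remote_name, remote_prio)
        in product(local.items(), remote.items())
    ]
    pairs.sort(key=lambda item: item[1], reverse=True)
    return [names for names, _ in pairs]


# Candidate priorities computed with the RFC5245 candidate formula.
priorities = {"host": 2130706431, "srflx": 1694498815, "relay": 16777215}
order = check_order(priorities, priorities)
```

Because the formula is dominated by the lower of the two candidate priorities, any pair that involves a relay candidate sorts after every pair that does not, so relayed paths are only tried once the more direct ones have been checked.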

B. Select the candidate

In WebRTC, P2PTransportChannel maintains a connection state table and keeps its records sorted (SortConnectionsAndUpdateState). Sorting means calculating the connection "cost" of each record and ranking the least expensive first. How is the cost calculated? Many factors are involved. One example is the time between sending a STUN request and receiving the reply: the shorter the time, the lower the "cost". When one end has video RTP data to send, the first record in the state table is checked. If it is judged to be in a ready state, that Connection is used to send the data. Otherwise, the send is abandoned.
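The selection idea described above can be sketched as follows. This is not the logic of the WebRTC source code: the `ConnectionRecord` class and both functions are hypothetical, and round-trip time is used as the only cost factor, whereas the real implementation weighs many factors.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ConnectionRecord:
    """Hypothetical entry in a connection state table."""
    name: str
    rtt_ms: float  # time from sending a STUN request to receiving the reply
    ready: bool    # whether the connection can currently carry data


def sort_by_cost(records: List[ConnectionRecord]) -> List[ConnectionRecord]:
    """Rank records by cost, cheapest first (cost is just the RTT here)."""
    return sorted(records, key=lambda record: record.rtt_ms)


def pick_for_send(records: List[ConnectionRecord]) -> Optional[ConnectionRecord]:
    """Return the first record if it is ready; otherwise abandon the send."""
    ordered = sort_by_cost(records)
    if ordered and ordered[0].ready:
        return ordered[0]
    return None


table = [
    ConnectionRecord("relay-relay", rtt_ms=80.0, ready=True),
    ConnectionRecord("host-host", rtt_ms=2.5, ready=True),
    ConnectionRecord("srflx-srflx", rtt_ms=30.0, ready=True),
]
chosen = pick_for_send(table)
```

Note that, as in the description above, only the first record is examined: if the cheapest connection is not ready, the function returns None instead of falling back to the next one.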

C. ICE keepalive and restart

To ensure that NAT mappings and filtering rules do not time out during an audio or video call, ICE keeps checking the candidate pair (channel) in use, once every 15 seconds. This ensures that packets keep flowing even when no media is being sent, for example while the audio and video streams are paused.

When the ICE agent detects that the transport address in use has changed or that the connection has been lost, an ICE restart is triggered. The process then goes back to collecting ICE candidates and repeats the steps that follow.

4. Summary

In short, WebRTC's ICE is a framework that uses the STUN and TURN protocols to find an available and optimal channel for data transmission. Understanding the basics of NAT traversal and the ICE framework makes it easier to understand how WebRTC sets up a connection and transfers data. The actual ICE process is of course much more complicated, and this article only briefly introduces its main steps. Interested readers can refer to RFC5245.
