Author: Li Chao. This article was first published in the RTC developer community, where questions can be put to the author directly.

Preface

This article will introduce two aspects of knowledge:

  • WebRTC signaling control
  • Setup of STUN/TURN server

Previous articles showed how to build a signaling server. But how does that signaling server actually work? Which messages does it need to control and relay? These questions were not covered before, so this article discusses them in detail.

In addition, how does WebRTC perform NAT traversal on a real network? If traversal fails, how can we still make sure the user is served? This article covers that as well.

Signaling

The architecture diagram of WebRTC signaling control is as follows:

Signaling servers are used to exchange three types of information:

  • Session control messages: session initialization and shutdown, various business logic messages, and error reports.
  • Network information: the externally visible IP address and port of each client.
  • Media capabilities: the codecs and resolutions a client supports, and the peer it wants to communicate with.

Let’s discuss these three types of messages in more detail:

Session control message

Session control messages are simple: create room, destroy room, join room, leave room, audio on/off, video on/off, and so on.

A real commercial WebRTC signaling server has many more session control messages, for example getting the number of people in a room, mute/unmute, changing the presenter, video polling, and drawing brush strokes and shapes on a whiteboard. These are all still relatively simple messages.

In our previous example, the server handled only one session message, 'create or join', which both creates and joins a room. The code is as follows:

socket.on('create or join', function(room) {

    // Look up the room and count the clients already in it
    var clientsInRoom = io.sockets.adapter.rooms[room];
    var numClients = clientsInRoom ? Object.keys(clientsInRoom.sockets).length : 0;

    if (numClients === 0) {
      // First client: create the room
      socket.join(room);
      logger.debug('Client ID ' + socket.id + ' created room ' + room);
      socket.emit('created', room, socket.id);

    } else if (numClients === 1) {
      // Second client: notify the room, then join it
      io.sockets.in(room).emit('join', room);
      socket.join(room);
      socket.emit('joined', room, socket.id);
      io.sockets.in(room).emit('ready');
    } else { // max two clients
      socket.emit('full', room);
    }
});

The logic of this code is very simple. When the server receives the 'create or join' message, it checks how many people are currently in the room. If the number is 0, the first person has arrived, so the room is created and the created message is sent to the client. If the number is 1, the second person has arrived, so the joined message is sent to the client. Otherwise the full message is sent, indicating that the room is full, because only two people are currently allowed in a room.
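The branching described above can be separated from Socket.IO and tried out on its own. This is only a minimal sketch: `decideRoomAction` is a hypothetical helper that does not exist in the original server, and the loop simply simulates three clients asking for the same room.

```javascript
// Hypothetical helper: decide how the server answers a 'create or join'
// request, given how many clients are already in the room.
function decideRoomAction(numClients) {
  if (numClients === 0) {
    return 'created';   // first client: the room is created
  } else if (numClients === 1) {
    return 'joined';    // second client: joins the existing room
  }
  return 'full';        // at most two clients per room
}

// Simulate three clients asking for the same room one after another.
var occupants = 0;
var results = [];
for (var i = 0; i < 3; i++) {
  var action = decideRoomAction(occupants);
  if (action !== 'full') {
    occupants += 1;     // only admitted clients occupy the room
  }
  results.push(action);
}
console.log(results.join(', ')); // prints "created, joined, full"
```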

Network information message

Network information messages are used to exchange network information between two clients. WebRTC uses the ICE mechanism to establish the network connection.

On each WebRTC endpoint, once the RTCPeerConnection object has been created and the setLocalDescription method has been called, the gathering of ICE candidates begins.
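As a rough sketch of what happens with the gathered candidates, each one can be handed to the signaling channel as it arrives. `forwardLocalCandidates` and `sendMessage` are hypothetical names introduced here for illustration. The message fields (type, label, candidate) are chosen to match the receiving code shown later in this article.

```javascript
// Hypothetical helper: forward every locally gathered ICE candidate to the
// remote peer through a caller-supplied sendMessage function.
function forwardLocalCandidates(pc, sendMessage) {
  pc.onicecandidate = function (event) {
    // event.candidate is null once gathering has finished.
    if (event.candidate) {
      sendMessage({
        type: 'candidate',
        label: event.candidate.sdpMLineIndex,
        candidate: event.candidate.candidate
      });
    }
  };
}
```

In the browser this might be wired up as `forwardLocalCandidates(pc, function (m) { socket.emit('message', m); })`, assuming `pc` is the RTCPeerConnection and `socket` is the Socket.IO client connection.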

There are three types of candidates in WebRTC. They are:

  • Host candidate
  • Reflexive candidate
  • Relay candidate

The host candidate represents the IP address and port on the local area network. It has the highest priority of the three, meaning that the WebRTC stack first tries to establish the connection over the local area network (LAN).

The reflexive candidate represents the external IP address and port of a host that sits behind a NAT. It has a lower priority than the host candidate. That is, when WebRTC cannot connect through the host candidates, it tries the IP address and port given by the reflexive candidate.

Its structure is shown in the figure below:

This is what we usually call P2P NAT traversal. WebRTC handles the traversal automatically, adapting to the kind of NAT each user is behind. However, if both peers are behind symmetric NATs, P2P NAT traversal generally cannot succeed. In this case, only the relay can be used.

The relay candidate represents the IP address and port of the relay server through which media data is forwarded. When the two WebRTC clients cannot connect through P2P NAT traversal, forwarding through a server is the only way to keep the communication working.

So the relay candidate has the lowest priority and is used only if neither of the other two candidate types can connect.
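To make the three types concrete: the candidate string carries a `typ` field whose value is `host`, `srflx` (reflexive) or `relay`. The sketch below reads that field and sorts a few samples in the preference order described above. The helper name, the `PREFERENCE` table and the sample strings are made up for illustration, and a real ICE agent ranks candidates by their numeric priority rather than by this table.

```javascript
// Hypothetical helper: read the "typ" field of an ICE candidate string.
function candidateType(candidateLine) {
  var match = /\btyp\s+(\w+)/.exec(candidateLine);
  return match ? match[1] : 'unknown';
}

// Preference order described above: host, then reflexive, then relay.
var PREFERENCE = { host: 0, srflx: 1, relay: 2 };

// Made-up sample candidates (addresses come from documentation ranges).
var samples = [
  'candidate:3 1 udp 41885439 198.51.100.7 50000 typ relay raddr 203.0.113.5 rport 46154',
  'candidate:1 1 udp 2122260223 192.168.1.2 46154 typ host',
  'candidate:2 1 udp 1686052607 203.0.113.5 46154 typ srflx raddr 192.168.1.2 rport 46154'
];

var ordered = samples.map(candidateType).sort(function (a, b) {
  return PREFERENCE[a] - PREFERENCE[b];
});
console.log(ordered.join(' > ')); // prints "host > srflx > relay"
```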

When the WebRTC signaling server receives network signaling, which arrives as a message event, it forwards it directly without any processing. The code is as follows:

socket.on('message', function(message) {
     // Relay the message unchanged to every other connected client
     socket.broadcast.emit('message', message);
});
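Note that `socket.broadcast.emit` reaches every other connected client, not only the peer in the same room. That is fine for a two-person demo. As a hedged sketch of a room-scoped variant, Socket.IO's `socket.to(room)` can be used instead. `relayToRoom` is a hypothetical helper, and it assumes the server knows which room the sender belongs to, which the original handler does not track.

```javascript
// Hypothetical variant of the relay handler: forward a signaling message
// only to the other clients in one room.
function relayToRoom(socket, room, message) {
  // socket.to(room) targets everyone in `room` except the sender.
  socket.to(room).emit('message', message);
}
```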

After receiving a message, the client inspects it further. If the message type is candidate, that is, network signaling, the client creates an RTCIceCandidate object and adds it to the RTCPeerConnection object, which lets WebRTC establish the connection automatically at the lower level. The code is as follows:

socket.on('message', function(message) {
  ...
  } else if (message.type === 'candidate') {
    // Rebuild the remote candidate and hand it to the peer connection
    var candidate = new RTCIceCandidate({
      sdpMLineIndex: message.label,
      candidate: message.candidate
    });
    pc.addIceCandidate(candidate);
  } else if (...) {
    ...
  }
});

Exchange media capability messages

In WebRTC, media capabilities are ultimately expressed as SDP. Before any media data is transmitted, the two sides must first negotiate media capabilities to find out which codecs, resolutions and so on both of them support. The negotiation is done by exchanging media capability information through the signaling server.

The WebRTC media negotiation process is shown in the figure above.

  • Step 1: Amy calls the createOffer method to create the offer message. The content of the offer message is Amy's SDP information.
  • Step 2: Amy calls the setLocalDescription method to save the local SDP information.
  • Step 3: Amy sends the offer message to Bob through the signaling server.
  • Step 4: After Bob receives the offer message, he calls the setRemoteDescription method to store it.
  • Step 5: Bob calls the createAnswer method to create the answer message. Again, the content of the answer message is Bob's SDP information.
  • Step 6: Bob calls the setLocalDescription method to save the local SDP information.
  • Step 7: Bob sends the answer message to Amy through the signaling server.
  • Step 8: After Amy receives the answer message, she calls the setRemoteDescription method to save it.

These steps complete the exchange of media capabilities between the two sides.
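The eight steps map onto the promise-based RTCPeerConnection methods roughly as follows. This is a sketch under assumptions, not the code from the earlier articles: the function names and the `sendMessage` callback are hypothetical, and `sendMessage` stands for whatever hands a message to the signaling server.

```javascript
// Caller side (Amy): steps 1 to 3.
async function sendOffer(pc, sendMessage) {
  var offer = await pc.createOffer();                  // step 1
  await pc.setLocalDescription(offer);                 // step 2
  sendMessage({ type: offer.type, sdp: offer.sdp });   // step 3
}

// Callee side (Bob): steps 4 to 7.
async function handleOffer(pc, offer, sendMessage) {
  await pc.setRemoteDescription(offer);                // step 4
  var answer = await pc.createAnswer();                // step 5
  await pc.setLocalDescription(answer);                // step 6
  sendMessage({ type: answer.type, sdp: answer.sdp }); // step 7
}

// Caller side (Amy): step 8.
async function handleAnswer(pc, answer) {
  await pc.setRemoteDescription(answer);
}
```

In the client's message handler, `handleOffer` would be called when `message.type === 'offer'` and `handleAnswer` when `message.type === 'answer'`, alongside the candidate branch shown earlier.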

The above are all the messages that the signaling server should process. These messages form the basic signaling of the signaling server. Each of them is essential, otherwise the two parties will not be able to communicate.

In WebRTC communication, signaling alone is not enough. What WebRTC really transmits is media data, and signaling is only one part of the picture. WebRTC tries to transfer the data over P2P first, but what happens when P2P traversal fails?

You need to forward media data through a media relay server. Let’s take a look at how to set up a media relay server.

Build STUN/TURN

Building a STUN/TURN service on the public network is not difficult. First you need a cloud host. I will not cover that here; simply buy one from a cloud provider.

At present the most popular STUN/TURN server is coturn, which makes it very convenient to set up a STUN/TURN service.

Let’s take a look at the basic steps:

  • Get the coturn source code

    git clone https://github.com/coturn/coturn.git
  • Compile the installation

    cd coturn
    ./configure --prefix=/usr/local/coturn
    make -j 4 && sudo make install
  • Configure coturn

    There are many articles about coturn configuration on the Internet, and they make it look very complicated. Most of them are copied and reposted from one another, mistakes included. In fact, the default settings of coturn are enough, plus the few items I have put together here:

    listening-port=3478          # Specifies the port to listen on
    external-ip=39.105.185.198   # Specifies the public IP address of the cloud host
    user=aaaaaa:bbbbbb           # User name and password for accessing the STUN/TURN service
    realm=stun.xxx.cn            # This must be set

    So, just write the four configuration lines above into the /usr/local/coturn/etc/turnserver.conf configuration file, and your STUN/TURN service is configured.

  • Start the STUN/TURN service

    cd /usr/local/coturn/bin
    ./turnserver -c ../etc/turnserver.conf
  • Test the STUN/TURN service

    Open the Trickle ICE test page and enter the STUN/TURN address, user name and password as prompted to check whether the STUN/TURN service works.

    For example, in our configuration, the input information is:

    • The STUN or TURN URI is turn:stun.xxx.cn
    • The user name is aaaaaa
    • The password is bbbbbb

    The test results are shown in the figure below:


Once STUN/TURN is deployed, we can use it to relay multimedia data, and we no longer need to fear communication failures caused by NAT and firewalls.
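On the client side, the deployed server is used by listing it in the `iceServers` option of RTCPeerConnection. The sketch below builds such a configuration from the example values used in this article. `buildRtcConfig` is a hypothetical helper, and the host name, user name and password are the placeholders from the coturn configuration above, not real credentials.

```javascript
// Hypothetical helper: build an RTCPeerConnection configuration that points
// at the coturn server configured above, for both STUN and TURN.
function buildRtcConfig(host, port, username, credential) {
  return {
    iceServers: [
      { urls: 'stun:' + host + ':' + port },
      {
        urls: 'turn:' + host + ':' + port,
        username: username,
        credential: credential
      }
    ]
  };
}

// Placeholder values taken from the example configuration in this article.
var config = buildRtcConfig('stun.xxx.cn', 3478, 'aaaaaa', 'bbbbbb');
console.log(JSON.stringify(config, null, 2));
// In a browser: var pc = new RTCPeerConnection(config);
```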

Summary

This article first described in detail how the three types of WebRTC signaling messages are controlled and exchanged. It then covered the deployment, configuration and testing of a STUN/TURN server.

Note that although deploying STUN/TURN is very simple, the principles behind it, like those of WebRTC itself, are quite complex. For reasons of space I have not covered them in detail here; interested readers can use this article as a starting point for further study.