“This is the 29th day of my participation in the First Challenge 2022. For details: First Challenge 2022”

One, Foreword

To deliver messages reliably over a long-lived connection, the connection itself has to be kept alive.

The “heartbeat mechanism” can solve the following three problems:

  1. The server no longer pays the overhead of maintaining invalid connections
  2. The client can quickly identify an invalid connection, then disconnect and reconnect automatically
  3. The connection is kept alive, so the carrier's NAT timeout does not drop it

In an IM system, the heartbeat mechanism combines TCP KeepAlive with an application-layer heartbeat:

  • TCP KeepAlive: probes connection availability at the transport layer, to rule out an unavailable network
  • Application-layer heartbeat: probes at the application layer, to rule out an unavailable application service

Two, Heartbeat detection modes

Heartbeat detection modes:

  • TCP Keepalive
  • Application layer heartbeat
  • Intelligent heartbeat

(1) TCP Keepalive

TCP Keepalive is implemented as part of the operating system's TCP/IP protocol stack. While a TCP connection is idle, the stack automatically sends probe packets that carry no data to check whether the peer is still alive.

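Do not confuse this with HTTP keep-alive (persistent connections), which is the default since HTTP/1.1. TCP Keepalive is off by default and has to be enabled per socket with the SO_KEEPALIVE option.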

There are three configuration items (Linux defaults shown):

  • tcp_keepalive_time: idle time before the first probe is sent. The default value is 7200 seconds (2 hours)
  • tcp_keepalive_intvl: interval between successive probe packets. The default value is 75 seconds
  • tcp_keepalive_probes: number of probe packets sent without an acknowledgement before the connection is considered dead. The default value is 9

# View the values in the current environment
$ cat /proc/sys/net/ipv4/tcp_keepalive_time
$ cat /proc/sys/net/ipv4/tcp_keepalive_intvl
$ cat /proc/sys/net/ipv4/tcp_keepalive_probes

# To change them, edit /etc/sysctl.conf
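
With these defaults, a dead peer is noticed only after the idle time plus every unanswered probe interval has elapsed. The following is a minimal Java sketch of that arithmetic, together with how a client might request keepalive on its own socket. The class and method names are hypothetical; `Socket.setKeepAlive` is the standard JDK call that sets SO_KEEPALIVE.

```java
import java.io.IOException;
import java.net.Socket;

final class KeepAliveSketch {
    /** Idle time before the first probe plus one interval per unanswered probe, in seconds. */
    static long worstCaseDetectionSeconds(long keepaliveTime, long keepaliveIntvl, int keepaliveProbes) {
        return keepaliveTime + keepaliveIntvl * keepaliveProbes;
    }

    public static void main(String[] args) throws IOException {
        // Linux defaults: 7200 + 75 * 9 = 7875 s, roughly 2 h 11 min
        System.out.println(worstCaseDetectionSeconds(7200, 75, 9));

        // Keepalive is a per-socket setting, so request it explicitly.
        try (Socket socket = new Socket()) {
            socket.setKeepAlive(true);
            System.out.println("SO_KEEPALIVE enabled: " + socket.getKeepAlive());
        }
    }
}
```

A detection delay of more than two hours is far too slow for an IM client, which is one reason the kernel defaults are usually tuned or supplemented by a heartbeat at the application layer.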

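A related setting in nginx's nginx.conf is shown below. Note that keepalive_timeout controls how long an idle HTTP keep-alive connection stays open; TCP keepalive probes are configured separately, through the so_keepalive parameter of the listen directive.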

http {
    sendfile            on;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   75s;
    keepalive_requests  1000;
    types_hash_max_size 2048;

    client_header_buffer_size  32k;
    large_client_header_buffers  4  32k;
    # (rest of the http block omitted in the original)
}

TCP Keepalive cannot detect whether the application service is available.

  • TCP works at the transport layer: it only tests whether the IP address and port are reachable
  • If the application service code is deadlocked, hung, and so on, TCP Keepalive cannot detect it

(2) Application-layer heartbeat

To address some of the limitations of TCP Keepalive, many IM services use application-layer heartbeat to improve the flexibility and accuracy of probing.

Compared with TCP Keepalive, an application-layer heartbeat is sent and received by the application process itself, so it reflects the availability of the application rather than only the availability of the network.

Principle: with an application-layer heartbeat, the client sends a business-layer data packet to the IM server at regular intervals to tell the server that the client is still alive.

A flexible, configurable heartbeat interval has clear advantages for saving network traffic and keeping the connection alive.

A slightly more sophisticated policy is for the client to send heartbeat packets only when the connection has been idle, that is, when no data has been sent for a full interval.

Application-layer heartbeat intervals used by some well-known apps:

  • WhatsApp: 30 seconds and 1 minute
  • WeChat: four and a half minutes
  • Weibo: 2 minutes
  • QQ: 45 seconds
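
The idle-only policy mentioned above can be sketched roughly as follows. This is an illustration under assumptions, not any particular app's implementation: `IdleHeartbeatPolicy` and its methods are hypothetical names, and the caller passes in the current time so the logic stays independent of any timer or network library.

```java
import java.time.Duration;

/** Hypothetical helper: decides when an idle client should send a heartbeat. */
final class IdleHeartbeatPolicy {
    private final long intervalMillis;
    private long lastSendMillis;

    IdleHeartbeatPolicy(Duration interval, long nowMillis) {
        this.intervalMillis = interval.toMillis();
        this.lastSendMillis = nowMillis;
    }

    /** Call whenever any packet (business data or heartbeat) is sent. */
    void onPacketSent(long nowMillis) {
        lastSendMillis = nowMillis;
    }

    /** True once nothing has been sent for a full interval. */
    boolean heartbeatDue(long nowMillis) {
        return nowMillis - lastSendMillis >= intervalMillis;
    }
}
```

Because ordinary business traffic also resets the clock, a busy client sends no extra heartbeat packets at all; heartbeats only appear during quiet periods.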

The client uses the heartbeat mechanism to implement “reconnect on disconnect”, and the server uses it to implement “resource cleanup”.
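
On the server side, resource cleanup amounts to remembering when each client was last heard from and evicting those that have been silent for too long. A minimal sketch, assuming clients are identified by a string ID and that some scheduler periodically calls `evictExpired`; the class name and the timeout value are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical server-side registry of the last heartbeat time per client. */
final class HeartbeatRegistry {
    private final Map<String, Long> lastSeenMillis = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    HeartbeatRegistry(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Record a heartbeat (or any other message) from the client. */
    void onHeartbeat(String clientId, long nowMillis) {
        lastSeenMillis.put(clientId, nowMillis);
    }

    /** Removes and returns the clients whose last heartbeat is older than the timeout. */
    List<String> evictExpired(long nowMillis) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastSeenMillis.entrySet()) {
            boolean stale = nowMillis - e.getValue() > timeoutMillis;
            // remove(key, value) only succeeds if no newer heartbeat arrived in the meantime
            if (stale && lastSeenMillis.remove(e.getKey(), e.getValue())) {
                expired.add(e.getKey());
            }
        }
        return expired;
    }
}
```

The caller would then close the connections of the returned clients and release any per-session state held for them.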

(3) Intelligent heartbeat

In domestic mobile networks, the NAT timeout varies widely between regional carriers and between network types.

  • An application-layer heartbeat with a fixed frequency is relatively simple to implement
  • But to avoid NAT timeouts, the heartbeat interval has to be set below the smallest NAT timeout found in any network environment
  • This solves the problem, but it cannot maximize the savings in device CPU, battery, and network traffic

To improve on this, adopt a “smart heartbeat” solution that balances:

  • Avoiding NAT timeouts
  • Saving device resources

Intelligent heartbeat: the heartbeat interval is adjusted automatically according to the network environment. The interval is lengthened step by step until it approaches the NAT timeout threshold, so that device resources are saved without triggering a NAT timeout.
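
One possible shape for this adjustment is sketched below. It is an assumption about how such a scheme could work, not a description of any specific product: the interval grows by a fixed step after each acknowledged heartbeat, and on the first timeout it falls back to the last interval that worked and stops probing. All names and the numbers in the usage note are hypothetical.

```java
/** Hypothetical adaptive heartbeat interval: grow until a timeout, then settle. */
final class AdaptiveHeartbeat {
    private final long maxMillis;
    private final long stepMillis;
    private long currentMillis;
    private long lastGoodMillis;
    private boolean settled;

    AdaptiveHeartbeat(long minMillis, long maxMillis, long stepMillis) {
        this.maxMillis = maxMillis;
        this.stepMillis = stepMillis;
        this.currentMillis = minMillis;
        this.lastGoodMillis = minMillis;
    }

    long currentIntervalMillis() {
        return currentMillis;
    }

    /** The heartbeat sent after the current interval was acknowledged. */
    void onHeartbeatAcked() {
        lastGoodMillis = currentMillis;
        if (!settled) {
            // Still probing: try a longer interval next time, capped at the maximum
            currentMillis = Math.min(maxMillis, currentMillis + stepMillis);
        }
    }

    /** The heartbeat timed out, most likely because the NAT mapping expired. */
    void onHeartbeatTimeout() {
        currentMillis = lastGoodMillis;
        settled = true;
    }
}
```

For example, starting at 60 s with a 30 s step, two acknowledged heartbeats raise the interval to 120 s; if the heartbeat at 120 s then times out, the interval settles at 90 s. A production scheme would also need to re-probe when the network changes, for instance when the device switches between Wi-Fi and mobile data.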

Three, In practice

(1) WebSocket heartbeat


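import io.netty.channel.ChannelInitializer;
import io.netty.channel.ChannelPipeline;
import io.netty.channel.socket.SocketChannel;
import io.netty.handler.codec.http.HttpObjectAggregator;
import io.netty.handler.codec.http.HttpServerCodec;
import io.netty.handler.codec.http.websocketx.WebSocketServerProtocolHandler;
import io.netty.handler.logging.LogLevel;
import io.netty.handler.logging.LoggingHandler;
import io.netty.handler.timeout.IdleStateHandler;

ChannelInitializer<SocketChannel> initializer = new ChannelInitializer<SocketChannel>() {
    @Override
    protected void initChannel(SocketChannel ch) throws Exception {
        ChannelPipeline pipeline = ch.pipeline();
        // Add the WebSocket-related codecs and protocol handler first
        pipeline.addLast(new HttpServerCodec());
        pipeline.addLast(new HttpObjectAggregator(65536));
        pipeline.addLast(new LoggingHandler(LogLevel.DEBUG));
        pipeline.addLast(new WebSocketServerProtocolHandler("/", null, true));
        // The server's main business-message handler is added here (omitted in the original)
        // Idle handler: if no message passes through the socket for a period of time, the server forcibly disconnects
        // Reader idle 0 s and writer idle 0 s mean "not restricted"; only the all-idle timeout applies
        pipeline.addLast(new IdleStateHandler(0, 0, serverConfig.getAllIdleSecond()));
        pipeline.addLast(closeIdleChannelHandler);
    }
};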

Front-end code:

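var websocket;

function init() {
    if (window.WebSocket) {
        websocket = new WebSocket("ws://"); // server address elided in the original
        // The handler bodies are elided in the original; the calls below are an assumed, typical wiring
        websocket.onmessage = function (event) {
            // Any message from the server shows the link is alive: restart the timer
            heartBeat.reset().start();
        };
        websocket.onopen = function () {
            // Send heartbeat periodically
            heartBeat.start();
        };
        websocket.onclose = function () {
            heartBeat.reset();
        };
        websocket.onerror = function () {
            heartBeat.reset();
        };
    } else {
        alert("Your browser does not support the WebSocket protocol!");
    }
}

/** A heartbeat packet is sent every 2 minutes; any message or server response resets the timer. */
var heartBeat = {
    timeout: 120000,
    timeoutObj: null,
    serverTimeoutObj: null,
    reset: function () {
        // Body elided in the original; clearing both timers is assumed
        clearTimeout(this.timeoutObj);
        clearTimeout(this.serverTimeoutObj);
        return this;
    },
    start: function () {
        var self = this;
        this.timeoutObj = setTimeout(function () {
            var sender_id = $("#sender_id").val();
            var sendMsgJson = '{ "type": 0, "data": {"uid":' + sender_id + ',"timeout": 120000}}';
            websocket.send(sendMsgJson);
            // If the server does not answer within the timeout, report the connection as lost
            self.serverTimeoutObj = setTimeout(function () {
                $("#ws_status").text("Lost connection!");
            }, self.timeout);
        }, this.timeout);
    }
};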

(2) In production

// The remaining IdleStateHandler arguments are cut off in the original
pipeline.addFirst("idleStateHandler", new IdleStateHandler(nettyChannelTimeoutSeconds, 
pipeline.addAfter("idleStateHandler", "idleEventHandler", timeoutHandler);