The problem

We recently found that our project's WebSocket server, which is built on Swoole, upgraded the protocol successfully but then reconnected periodically, and neither heartbeats nor data were sent. The production environment is configured the same way as beta, yet production does not have this problem.

Locating the problem

To make Swoole easier to debug, the following tests were run in a local environment.

Viewing PHP Logs

The PHP log contains the error: ErrorException: Swoole\WebSocket\Server::push(): The connected client of connection[47] is not a websocket client or closed. This indicates that the WebSocket connection had already been closed.

Packet capture

Since the connection was closed, the next question is which side closed it. Swoole listens on port 1215. The output of tcpdump -nn -i lo0 -X port 1215 shows that Swoole sends a FIN segment right after sending the protocol-upgrade response, that is, Swoole actively disconnects. This is why the browser shows the WebSocket connection as established successfully but then reconnects periodically.

10:22:58.060810 IP 127.0.0.1.1215 > 127.0.0.1.53823: Flags [P.], seq 1:185, ack 1372, win 6358, options [nop,nop,TS val 1981911666 ecr 1981911665], length 184
    0x0000:  4500 00ec 0000 4000 4006 0000 7f00 0001  E.....@.@.......
    0x0010:  7f00 0001 04bf d23f 9377 304a 6d2f 9604  .......?.w0Jm/..
    0x0020:  8018 18d6 fee0 0000 0101 080a 7621 9272  ............v!.r
    0x0030:  7621 9271 4854 5450 2f31 2e31 2031 3031  v!.qHTTP/1.1.101
    0x0040:  2053 7769 7463 6869 6e67 2050 726f 746f  .Switching.Proto
    0x0050:  636f 6c73 0d0a 5570 6772 6164 653a 2077  cols..Upgrade:.w
    0x0060:  6562 736f 636b 6574 0d0a 436f 6e6e 6563  ebsocket..Connec
    0x0070:  7469 6f6e 3a20 5570 6772 6164 650d 0a53  tion:.Upgrade..S
    0x0080:  6563 2d57 6562 536f 636b 6574 2d41 6363  ec-WebSocket-Acc
    0x0090:  6570 743a 2052 6370 3851 6663 446c 3146  ept:.Rcp8QfcDl1F
    0x00a0:  776e 666a 6377 3862 4933 6971 7176 4551  wnfjcw8bI3iqqvEQ
    0x00b0:  3d0d 0a53 6563 2d57 6562 536f 636b 6574  =..Sec-WebSocket
    0x00c0:  2d56 6572 7369 6f6e 3a20 3133 0d0a 5365  -Version:.13..Se
    0x00d0:  7276 6572 3a20 7377 6f6f 6c65 2d68 7474  rver:.swoole-htt
    0x00e0:  702d 7365 7276 6572 0d0a 0d0a            p-server....
10:22:58.060906 IP 127.0.0.1.53823 > 127.0.0.1.1215: Flags [.], ack 185, win 6376, options [nop,nop,TS val 1981911666 ecr 1981911666], length 0
    0x0000:  4500 0034 0000 4000 4006 0000 7f00 0001  E..4..@.@.......
    0x0010:  7f00 0001 d23f 04bf 6d2f 9604 9377 3102  .....?..m/...w1.
    0x0020:  8010 18e8 fe28 0000 0101 080a 7621 9272  .....(......v!.r
    0x0030:  7621 9272                                v!.r
10:22:58.061467 IP 127.0.0.1.1215 > 127.0.0.1.53823: Flags [F.], seq 185, ack 1372, win 6358, options [nop,nop,TS val 1981911667 ecr 1981911666], length 0
    0x0000:  4500 0034 0000 4000 4006 0000 7f00 0001  E..4..@.@.......
    0x0010:  7f00 0001 04bf d23f 9377 3102 6d2f 9604  .......?.w1.m/..
    0x0020:  8011 18d6 fe28 0000 0101 080a 7621 9273  .....(......v!.s
    0x0030:  7621 9272                                v!.r
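The flags in the capture can be cross-checked against the raw bytes. Each packet has a 20-byte IP header (first byte 45), so the 16-bit word at offset 0x0020 holds the TCP data offset and the flag bits: 8018, 8010 and 8011 for the three packets. Below is a minimal sketch of that decoding. The helper tcp_flags_to_string is hypothetical, written only for this illustration (it is not part of tcpdump or Swoole), and it covers just the five flags relevant here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// TCP flag bits as they appear in the low byte of the offset/flags word.
constexpr std::uint8_t TCP_FIN = 0x01;
constexpr std::uint8_t TCP_SYN = 0x02;
constexpr std::uint8_t TCP_RST = 0x04;
constexpr std::uint8_t TCP_PSH = 0x08;
constexpr std::uint8_t TCP_ACK = 0x10;

// Hypothetical helper: takes the 16-bit word made of bytes 12-13 of the TCP
// header and renders the flags in the notation used by the capture above
// (F, S, R, P, and "." for ACK).
std::string tcp_flags_to_string(std::uint16_t offset_and_flags) {
    // The top 4 bits are the header length in 32-bit words; a TCP header
    // is never shorter than 5 words (20 bytes).
    assert((offset_and_flags >> 12) >= 5);

    const std::uint8_t flags = static_cast<std::uint8_t>(offset_and_flags & 0xff);
    std::string out;
    if (flags & TCP_FIN) out += 'F';
    if (flags & TCP_SYN) out += 'S';
    if (flags & TCP_RST) out += 'R';
    if (flags & TCP_PSH) out += 'P';
    if (flags & TCP_ACK) out += '.';
    return out;
}
```

With the words from the capture, tcp_flags_to_string(0x8018) gives "P.", 0x8010 gives "." and 0x8011 gives "F.", which matches the Flags fields in the three packet summaries. The last one is the FIN sent from port 1215, the Swoole side.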

Tracing the Swoole source code

We now know that Swoole closed the connection itself, but when and why? The source code has the answer.

According to the packet capture, the interval between sending the response and closing the connection is very short, so the problem is presumably in the handshake phase. In swoole_websocket_onHandshake() below, the connection is closed explicitly only when swoole_websocket_handshake() fails. Since the 101 response was actually sent, the connection must be getting closed inside swoole_websocket_handshake().

    // swoole_websocket_server.cc
    int swoole_websocket_onHandshake(swServer *serv, swListenPort *port, http_context *ctx)
    {
        int fd = ctx->fd;
        bool success = swoole_websocket_handshake(ctx);
        if (success) {
            swoole_websocket_onOpen(serv, ctx);
        } else {
            serv->close(serv, fd, 1);
        }
        if (!ctx->end) {
            swoole_http_context_free(ctx);
        }
        return SW_OK;
    }

Tracing into swoole_websocket_handshake(), we see that it sets the response headers. The response itself is sent by swoole_http_response_end(), and the result of that call is also the return value of swoole_websocket_handshake().

    // swoole_websocket_server.cc
    bool swoole_websocket_handshake(http_context *ctx)
    {
        ...
        swoole_http_response_set_header(ctx, ZEND_STRL("Upgrade"), ZEND_STRL("websocket"), false);
        swoole_http_response_set_header(ctx, ZEND_STRL("Connection"), ZEND_STRL("Upgrade"), false);
        swoole_http_response_set_header(ctx, ZEND_STRL("Sec-WebSocket-Accept"), sec_buf, sec_len, false);
        swoole_http_response_set_header(ctx, ZEND_STRL("Sec-WebSocket-Version"), ZEND_STRL(SW_WEBSOCKET_VERSION), false);
        ...
        ctx->response.status = 101;
        ctx->upgrade = 1;

        zval retval;
        swoole_http_response_end(ctx, nullptr, &retval);
        return Z_TYPE(retval) == IS_TRUE;
    }

The swoole_http_response_end() code shows that the connection is closed when ctx->keepalive is 0. For the connection to stay open after the handshake, ctx->keepalive therefore has to be 1.

    // swoole_http_response.cc
    void swoole_http_response_end(http_context *ctx, zval *zdata, zval *return_value)
    {
        if (ctx->chunk) {
            ...
        } else {
            ...
            if (!ctx->send(ctx, swoole_http_buffer->str, swoole_http_buffer->length)) {
                ctx->send_header = 0;
                RETURN_FALSE;
            }
        }

        if (ctx->upgrade && !ctx->co_socket) {
            swServer *serv = (swServer*) ctx->private_data;
            swConnection *conn = swWorker_get_connection(serv, ctx->fd);
            // websocket_status is already WEBSOCKET_STATUS_ACTIVE at this point,
            // so the condition below is false
            if (conn && conn->websocket_status == WEBSOCKET_STATUS_HANDSHAKE) {
                if (ctx->response.status == 101) {
                    conn->websocket_status = WEBSOCKET_STATUS_ACTIVE;
                } else {
                    /* connection should be closed when handshake failed */
                    conn->websocket_status = WEBSOCKET_STATUS_NONE;
                    ctx->keepalive = 0;
                }
            }
        }

        if (!ctx->keepalive) {
            ctx->close(ctx);
        }
        ctx->end = 1;
        RETURN_TRUE;
    }
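The tail of swoole_http_response_end() reduces to a small decision: does the function end up calling ctx->close()? The sketch below models only that decision. HandshakeState, WebSocketStatus and closes_after_response are hypothetical names introduced for illustration, not Swoole types, and the model leaves out the co_socket and null-connection checks from the excerpt.

```cpp
#include <cassert>

// Hypothetical stand-in for conn->websocket_status.
enum WebSocketStatus { WS_NONE, WS_HANDSHAKE, WS_ACTIVE };

// Hypothetical snapshot of the fields the excerpt reads.
struct HandshakeState {
    int response_status;        // models ctx->response.status
    bool upgrade;               // models ctx->upgrade
    bool keepalive;             // models ctx->keepalive
    WebSocketStatus ws_status;  // models conn->websocket_status
};

// Returns true when the logic quoted above would close the connection
// after the response has been sent.
bool closes_after_response(HandshakeState s) {
    assert(s.response_status >= 100 && s.response_status <= 599);

    if (s.upgrade && s.ws_status == WS_HANDSHAKE) {
        if (s.response_status == 101) {
            s.ws_status = WS_ACTIVE;
        } else {
            // A failed handshake forces the connection to be closed.
            s.ws_status = WS_NONE;
            s.keepalive = false;
        }
    }
    // A successful upgrade never touches keepalive, so the outcome
    // depends entirely on the value computed from the request.
    return !s.keepalive;
}
```

For the failing case in this article, closes_after_response({101, true, false, WS_ACTIVE}) is true: the 101 response is sent and the connection is then closed because keepalive is 0. With keepalive set, the same call returns false and the WebSocket connection stays open.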

Finally, we find that ctx->keepalive is set from swoole_http_should_keep_alive(). The code shows that under HTTP/1.1 the connection is kept alive unless the request header sets Connection: close, whereas under HTTP/1.0 the request header must explicitly set Connection: keep-alive.

The WebSocket protocol specifies that the Connection request header must be set to Upgrade, so an HTTP/1.0 handshake that carries only Connection: Upgrade fails the keep-alive check. The request therefore has to reach Swoole as HTTP/1.1.

    int swoole_http_should_keep_alive(swoole_http_parser *parser)
    {
        if (parser->http_major > 0 && parser->http_minor > 0) {
            /* HTTP/1.1 */
            if (parser->flags & F_CONNECTION_CLOSE) {
                return 0;
            } else {
                return 1;
            }
        } else {
            /* HTTP/1.0 or earlier */
            if (parser->flags & F_CONNECTION_KEEP_ALIVE) {
                return 1;
            } else {
                return 0;
            }
        }
    }
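To see how this rule plays out for a WebSocket handshake, here is a minimal standalone sketch of the same decision. RequestInfo, the CONN_* flag values and should_keep_alive are hypothetical stand-ins for the parser state, assumed for illustration only. The version test mirrors the excerpt above.

```cpp
#include <cassert>

// Hypothetical flags describing which Connection tokens the request carried.
enum ConnectionFlag : unsigned {
    CONN_KEEP_ALIVE = 1u << 0,
    CONN_CLOSE      = 1u << 1,
    CONN_UPGRADE    = 1u << 2,
};

// Hypothetical summary of a parsed request.
struct RequestInfo {
    int http_major;
    int http_minor;
    unsigned flags;  // bitwise OR of ConnectionFlag values
};

// Same rule as the excerpt: HTTP/1.1 is persistent unless "close" is
// present; HTTP/1.0 is persistent only if "keep-alive" is present.
bool should_keep_alive(const RequestInfo &req) {
    assert(req.http_major >= 0 && req.http_minor >= 0);

    if (req.http_major > 0 && req.http_minor > 0) {
        return !(req.flags & CONN_CLOSE);
    }
    return (req.flags & CONN_KEEP_ALIVE) != 0;
}
```

A handshake forwarded as HTTP/1.0 with Connection: Upgrade corresponds to should_keep_alive({1, 0, CONN_UPGRADE}), which is false, so the connection is closed right after the 101 response. The same request forwarded as HTTP/1.1, {1, 1, CONN_UPGRADE}, yields true.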

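Solving the problem

From the conclusion above, the key factors are the Connection request header and the HTTP protocol version of the request.

The load balancer (LB) in the production environment changes HTTP to 1.1 when forwarding requests, which is why only the beta environment has this problem. The Nginx access_log confirms this.

The Nginx configuration is as follows. The decisive directive is proxy_http_version 1.1, because Nginx otherwise proxies to upstream servers over HTTP/1.0. Note that $connection_upgrade is not a built-in variable; it is normally defined with a map block in the http context, which is not shown here.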

    upstream service {
        server 127.0.0.1:1215;
    }

    server {
        listen 80;
        server_name dev-service.ts.com;

        location / {
            proxy_set_header Host $http_host;
            proxy_set_header Scheme $scheme;
            proxy_set_header SERVER_PORT $server_port;
            proxy_set_header REMOTE_ADDR $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection $connection_upgrade;
            proxy_http_version 1.1;
            proxy_pass http://service;
        }
    }

After restarting Nginx, the WebSocket connection finally works.
