Preface

This article is my first look at the WebSocket protocol. For real applications, every major language has implementation libraries, and I recommend using one rather than writing your own. The article is based on Node.js, but the ideas apply to other languages.

This article mainly covers:

  • Background knowledge of WebSocket
  • An outline of the WebSocket protocol
  • A simple demo of a hybrid HTTP and WebSocket server

Why implement a hybrid HTTP and WebSocket server? Because a friend was brought to tears by his CTO… His technical director told him to use PHP to connect to other WebSocket servers because: the websocket address…. must not appear in the JS. I happened to be reading RFC 6455, so I jokingly said: implement an HTTP+WebSocket server 😏

So I wrote such a demo and, while I was at it, implemented a chat room, which grew the code from the original simple 50 lines to 150. This demo is purely for fun and learning; in real work it is best not to write your own unless you do so deliberately. The demo below has some problems, because I kept it simple and easy to implement and did not extend it any further.




Screenshot of the result

Knowledge of WebSocket

WebSocket in brief

WebSocket is a bidirectional data transfer protocol built on TCP. Like HTTP, it sits at the application layer. It was created to solve the problem of persistent, bidirectional data transfer in web pages.

The relationship between WebSocket, HTTP, and TCP connections

Both WebSocket and HTTP run over a TCP connection; each one specifies the protocol spoken over that TCP conversation. For the WebSocket protocol (version 13), see RFC 6455; for HTTP/1.1, see RFC 2616.

The WebSocket request (the handshake) is HTTP compatible and can be understood as an “upgrade”, but the rules for the reply are different.

WebSocket protocol overview

The basic steps of WebSocket are as follows:

  1. A TCP connection is established first
  2. The client sends a handshake
  3. The server responds to the handshake
  4. After the handshake, both sides can send data to each other
  5. To end the connection, a close control frame is sent and the TCP connection is closed

TCP connection

I won’t go into detail about TCP connections; every language handles them similarly. Server: listens on a TCP port. Client: initiates a TCP connection.

Since this article uses the browser + server format, the connection is initiated by the browser

Handshake protocol

Once the TCP connection is established, the client sends the handshake (a string of lines, each ending with \r\n; the // annotations below are explanatory and not part of the request):

    GET /chat HTTP/1.1                                  // required; the HTTP version must be 1.1 or higher
    Host: server.example.com                            // required
    Upgrade: websocket                                  // required; the value "websocket" is case-insensitive
    Connection: Upgrade                                 // required; must include the token "Upgrade" (case-insensitive)
    Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==         // required; chosen by the client, base64-encoded, used to verify the handshake
    Origin: http://example.com                          // sent by all browsers; non-browser clients can forge it
    Sec-WebSocket-Protocol: chat, superchat             // optional; the desired subprotocols, in order of preference
    Sec-WebSocket-Version: 13                           // required; the WebSocket protocol version
    Sec-WebSocket-Extensions: x-webkit-deflate-frame    // optional; the desired extensions

Because the client handshake is a valid HTTP request, it follows the rules of HTTP requests: a request line, header lines, and a terminating blank line.
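To make the handshake rules concrete, here is a minimal sketch of parsing and checking a client handshake in Node.js. The names parseHandshake and isWebSocketUpgrade are hypothetical helpers invented for this illustration, and the check covers only the required fields listed above; it is not a complete validator.

```javascript
// Hypothetical helper: split a raw handshake into its request line and headers.
function parseHandshake(raw) {
    const headEnd = raw.indexOf('\r\n\r\n');
    const head = headEnd === -1 ? raw : raw.slice(0, headEnd);
    const lines = head.split('\r\n');
    const [method, path, version] = lines[0].split(' ');
    const headers = {};
    for (const line of lines.slice(1)) {
        const colon = line.indexOf(':');
        if (colon === -1) continue;
        // Header names are case-insensitive, so store them lowercased.
        // Splitting at the first colon keeps values such as "http://example.com" intact.
        headers[line.slice(0, colon).trim().toLowerCase()] = line.slice(colon + 1).trim();
    }
    return { method, path, version, headers };
}

// Hypothetical helper: check the fields the handshake marks as required.
function isWebSocketUpgrade(request) {
    const h = request.headers;
    const connectionTokens = (h['connection'] || '')
        .toLowerCase()
        .split(',')
        .map(token => token.trim());
    return request.method === 'GET'
        && (h['upgrade'] || '').toLowerCase() === 'websocket'
        && connectionTokens.includes('upgrade')
        && h['sec-websocket-version'] === '13'
        && Boolean(h['sec-websocket-key']);
}

const sampleHandshake = [
    'GET /chat HTTP/1.1',
    'Host: server.example.com',
    'Upgrade: websocket',
    'Connection: Upgrade',
    'Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==',
    'Origin: http://example.com',
    'Sec-WebSocket-Version: 13',
    '',
    ''
].join('\r\n');

const parsedHandshake = parseHandshake(sampleHandshake);
console.log(parsedHandshake.path, isWebSocketUpgrade(parsedHandshake));
```

Storing header names in lowercase is a design choice of this sketch: it avoids missing a header because a client capitalized it differently.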

On receiving the client’s handshake, the server should respond with its own handshake (a string of lines, each ending with \r\n, followed by a blank line; the // annotations are explanatory and not part of the response):

    HTTP/1.1 101 Switching Protocols                      // required; the status must be 101, anything else is an error
    Upgrade: websocket                                    // required; the value is "websocket"
    Connection: Upgrade                                   // required; the value is "Upgrade"
    Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=    // required; the client's Sec-WebSocket-Key concatenated with 258EAFA5-E914-47DA-95CA-C5AB0DC85B11, SHA-1 hashed, then base64-encoded
    Sec-WebSocket-Protocol: chat                          // optional; the subprotocol in use
    Sec-WebSocket-Extensions: x-webkit-deflate-frame      // optional; the extensions in use
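The Sec-WebSocket-Accept value is the part people most often get wrong, so here is a small sketch of computing it with Node's built-in crypto module. The helper names computeAccept and buildHandshakeResponse are my own for illustration; the key and expected result are the sample values from the handshake shown above.

```javascript
const crypto = require('crypto');

// Fixed GUID defined by RFC 6455 for the handshake.
const WS_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Hypothetical helper: key + GUID, SHA-1 hashed, then base64-encoded.
function computeAccept(secWebSocketKey) {
    return crypto
        .createHash('sha1')
        .update(secWebSocketKey + WS_GUID)
        .digest('base64');
}

// Hypothetical helper: the minimal server handshake, without optional headers.
function buildHandshakeResponse(secWebSocketKey) {
    return [
        'HTTP/1.1 101 Switching Protocols',
        'Upgrade: websocket',
        'Connection: Upgrade',
        'Sec-WebSocket-Accept: ' + computeAccept(secWebSocketKey),
        '',
        ''
    ].join('\r\n');
}

console.log(computeAccept('dGhlIHNhbXBsZSBub25jZQ=='));
```

Note that the key is used exactly as received, as a string; it is not base64-decoded before hashing.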

After accepting the client’s handshake and returning the server’s handshake (with no error), a WebSocket connection is established

Data transfer

Each time data is sent, a frame must be constructed and sent as a whole (note that the entire frame is converted to binary data and sent to the other end over the TCP connection).

Frame structure

Frame composition: FIN + RSV1 + RSV2 + RSV3 + Opcode + Mask + Payload length + Masking-key + Extension data + Application data

That is: final-frame flag + three reserved bits (RSV1-3) + opcode + mask flag + payload length + masking key (optional) + extension data (optional) + application data

Therefore, you need to prepend the fields above to the data you send. Note that frames sent by the server must not be masked (a client that receives a masked frame should treat it as an error), while frames sent by the client must always be masked.

  • FIN: 1 bit. Indicates whether this is the last frame of a message; used for fragmentation
  • RSV1-3: 1 bit each. Their meaning is defined by extensions; without an extension they must be 0
  • Opcode: 4 bits. Specifies what the frame is for; the value determines whether it is a control frame or a data frame
  • Mask: 1 bit. Indicates whether the payload is masked
  • Payload length: 7 bits, 7+16 bits, or 7+64 bits (the length of the data after conversion to binary). This field is a bit complicated:
    • ① When the data length is less than 126, 7 bits are used and the length is written directly into them
    • ② When the length is less than 65536 (2^16), 7+16 bits are used. The first 7 bits hold 126 (the byte is 0x7E when there is no mask), and the following 16 bits hold the actual length
    • ③ When the length is greater than or equal to 65536, 7+64 bits are used. The first 7 bits hold 127 (the byte is 0x7F when there is no mask), and the following 64 bits hold the actual length
  • Masking-key: 0 or 32 bits. If the Mask bit above is set, a 4-byte masking key follows. The key is chosen by the sender and is XORed with the application data (this is masking, not encryption for secrecy)
  • Extension data: 0 bits, or a number of bytes defined by the extension. If no extension is in use, or the extension defines none, it is 0 bits
  • Application data: the information to be sent. If Mask is 1, it must be masked with the Masking-key

The value of Opcode indicates what the frame does (0-7: data frames, 8-F: control frames)

  • 0: continuation frame, used in fragmentation
  • 1: text frame, indicating that the data sent is text
  • 2: binary frame, indicating that the data sent is binary
  • 3-7: reserved data frames, currently unused
  • 8: close frame, indicating that the other side is about to close the connection
  • 9: ping frame; the other side pinged you and expects a pong back →_→
  • A: pong frame; when the other side pings you, you respond with a pong frame
  • B-F: reserved control frames, currently unused

That is the frame structure; whenever you send data, build it in the format above and send it.
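As an illustration of the frame layout described above, here is a minimal sketch of building an unmasked frame (the kind a server sends) in Node.js. The function name encodeFrame is my own, and the sketch ignores extensions, so RSV1-3 are always 0.

```javascript
// Hypothetical helper: build one unmasked frame around a Buffer payload.
function encodeFrame(opcode, payload, fin = true) {
    const length = payload.length;
    let header;
    if (length < 126) {
        // Case 1: the length fits directly in the 7-bit field.
        header = Buffer.alloc(2);
        header[1] = length;
    } else if (length < 65536) {
        // Case 2: 7-bit field holds 126, followed by a 16-bit length.
        header = Buffer.alloc(4);
        header[1] = 126;
        header.writeUInt16BE(length, 2);
    } else {
        // Case 3: 7-bit field holds 127, followed by a 64-bit length.
        header = Buffer.alloc(10);
        header[1] = 127;
        header.writeBigUInt64BE(BigInt(length), 2);
    }
    // First byte: FIN bit, RSV1-3 left at 0, then the 4-bit opcode.
    header[0] = (fin ? 0x80 : 0x00) | (opcode & 0x0f);
    return Buffer.concat([header, payload]);
}

const helloFrame = encodeFrame(0x1, Buffer.from('Hello'));
console.log(helloFrame);
```

The Mask bit stays 0 here because this is the server side; a client would set it and add a 4-byte masking key after the length.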

Masking and unmasking

When sending data that must be masked (unmasking works the same way), a 4-byte masking key is used. The rule is as follows:

The i-th byte of the application data is XORed with byte i % 4 of the masking key, i.e.:

    var mask = [0x24, 0x48, 0xad, 0x54];
    // data holds the bytes of the application data
    for (var i = 0; i < data.length; i++) {
        data[i] = data[i] ^ mask[i % 4];
    }
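To see the XOR rule above at work on a complete frame, here is a sketch of decoding a masked client frame in Node.js. The names applyMask and decodeFrame are hypothetical, and the sketch assumes the buffer holds exactly one complete frame, which real TCP streams do not guarantee. The sample bytes are the masked "Hello" text frame given as an example in RFC 6455.

```javascript
// Hypothetical helper: XOR each byte with masking-key byte i % 4.
// Applying it twice with the same key restores the original data.
function applyMask(payload, maskingKey) {
    const out = Buffer.alloc(payload.length);
    for (let i = 0; i < payload.length; i++) {
        out[i] = payload[i] ^ maskingKey[i % 4];
    }
    return out;
}

// Hypothetical helper: decode one complete frame held in buf.
function decodeFrame(buf) {
    const fin = (buf[0] & 0x80) !== 0;
    const opcode = buf[0] & 0x0f;
    const masked = (buf[1] & 0x80) !== 0;
    let length = buf[1] & 0x7f;
    let offset = 2;
    if (length === 126) {
        length = buf.readUInt16BE(offset);
        offset += 2;
    } else if (length === 127) {
        length = Number(buf.readBigUInt64BE(offset));
        offset += 8;
    }
    let payload;
    if (masked) {
        const maskingKey = buf.subarray(offset, offset + 4);
        offset += 4;
        payload = applyMask(buf.subarray(offset, offset + length), maskingKey);
    } else {
        payload = Buffer.from(buf.subarray(offset, offset + length));
    }
    return { fin, opcode, masked, payload };
}

const maskedHello = Buffer.from([
    0x81, 0x85, 0x37, 0xfa, 0x21, 0x3d, 0x7f, 0x9f, 0x4d, 0x51, 0x58
]);
const decodedHello = decodeFrame(maskedHello);
console.log(decodedHello.payload.toString());
```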

Data fragmentation

When the data you want to send does not fit in one frame, you can fragment it. Note that control frames cannot be fragmented (their payload length must not exceed 125).

The method is as follows (if there are only two fragments, perform steps 1 and 3):

  1. Send the first frame with FIN 0 and Opcode set to the data type of the message
  2. Send the remaining middle fragments with both FIN and Opcode set to 0
  3. Send the last frame with FIN 1 and Opcode 0

Only FIN and Opcode change; the rest of each frame is written as usual.
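The three steps above can be sketched as follows. The helpers smallFrame and fragmentMessage are hypothetical names, and to keep the example short it only handles fragments of up to 125 bytes, so the extended length fields are not needed.

```javascript
// Hypothetical helper: one unmasked frame with a payload of at most 125 bytes.
function smallFrame(fin, opcode, payload) {
    if (payload.length > 125) {
        throw new RangeError('this sketch only handles payloads up to 125 bytes');
    }
    return Buffer.concat([
        Buffer.from([(fin ? 0x80 : 0x00) | opcode, payload.length]),
        payload
    ]);
}

// Hypothetical helper: split a message into frames of chunkSize bytes each.
function fragmentMessage(opcode, payload, chunkSize) {
    if (!Number.isInteger(chunkSize) || chunkSize < 1) {
        throw new RangeError('chunkSize must be a positive integer');
    }
    const chunks = [];
    for (let start = 0; start < payload.length; start += chunkSize) {
        chunks.push(payload.subarray(start, start + chunkSize));
    }
    if (chunks.length === 0) chunks.push(Buffer.alloc(0)); // empty message: one empty frame
    return chunks.map((chunk, index) => smallFrame(
        index === chunks.length - 1,   // FIN is 1 only on the last frame
        index === 0 ? opcode : 0x0,    // the real opcode only on the first frame
        chunk
    ));
}

const helloFragments = fragmentMessage(0x1, Buffer.from('Hello'), 3);
console.log(helloFragments);
```

If the whole message fits in one chunk, the single frame carries both FIN 1 and the real opcode, which is just an ordinary unfragmented frame.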

Close frame
  • When a close frame is received, any pending data (such as remaining fragments) should be sent as soon as possible, followed by a close frame in response.
  • A close frame may carry data explaining why the connection is being closed, but that data is not required to be human-readable, so it is not necessarily a displayable string
Ping frame
  • When a ping frame is received, a pong frame should be returned; since the ping frame may carry data, the pong frame should return that same data
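Here is a small sketch of building these control frames on the server side. The helper names are my own. One detail from RFC 6455 that the text above does not spell out: when a close frame carries data, the first two bytes are a status code (1000 means normal closure) and the rest is an optional UTF-8 reason.

```javascript
// Hypothetical helper: an unmasked control frame (FIN is always 1, payload at most 125 bytes).
function controlFrame(opcode, payload) {
    if (payload.length > 125) {
        throw new RangeError('control frame payloads must not exceed 125 bytes');
    }
    return Buffer.concat([Buffer.from([0x80 | opcode, payload.length]), payload]);
}

// A pong echoes back whatever data the ping carried.
function pongFor(pingPayload) {
    return controlFrame(0xA, pingPayload);
}

// A close frame body: 2-byte status code, then an optional UTF-8 reason.
function closeFrame(code, reason = '') {
    const body = Buffer.alloc(2 + Buffer.byteLength(reason));
    body.writeUInt16BE(code, 0);
    body.write(reason, 2);
    return controlFrame(0x8, body);
}

// Read the status code and reason back out of a close frame payload.
function parseCloseBody(payload) {
    if (payload.length < 2) return { code: null, reason: '' };
    return { code: payload.readUInt16BE(0), reason: payload.subarray(2).toString('utf8') };
}

console.log(pongFor(Buffer.from('Hello')), closeFrame(1000, 'bye'));
```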

So that's the basics of data transfer

Closing the connection

When the connection is no longer needed, it should be closed. The basic flow is:

  1. One end sends a close frame
  2. The other end responds with a close frame
  3. The TCP connection is closed

Normally all three steps are completed, but there are special cases:

  • The program on one end exits, and the TCP connection is closed without a close frame being sent
  • The program on one end disconnects TCP immediately after sending its close frame, so the other end gets an error when it tries to send its close frame in response

I have fallen into these pits more than once, so take care: when the TCP connection reports an error, simply treat the connection as already closed. If a browser sends a close frame and the server never responds, the browser will drop the TCP connection after roughly 30 to 60 seconds, so you need not worry about a browser sending a close frame and never disconnecting TCP (but if you are implementing the client yourself, pay attention!!!)
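If you do implement the closing side yourself, the advice above might look something like this sketch: send a close frame, then tear down the TCP connection when the peer ends the connection, when an error occurs, or when a timeout expires, whichever comes first. The name closeGracefully and the 5-second default are my own choices for illustration, the socket is assumed to behave like a Node.js net.Socket, and as a simplification the sketch does not parse the peer's close frame; it only waits for the connection to end.

```javascript
// Hypothetical helper: send a close frame and guarantee the socket is torn down.
function closeGracefully(socket, timeoutMs = 5000) {
    let done = false;
    const finish = () => {
        if (done) return;          // run the teardown only once
        done = true;
        clearTimeout(timer);
        socket.destroy();
    };
    // Do not wait forever for a peer that never answers.
    const timer = setTimeout(finish, timeoutMs);
    // A TCP error is treated the same as "already closed".
    socket.once('error', finish);
    socket.once('end', finish);
    socket.once('close', finish);
    // Close frame with an empty payload: FIN + opcode 8, length 0.
    socket.write(Buffer.from([0x88, 0x00]));
    return finish;
}
```

A usage example would be calling closeGracefully(socket) where a server wants to end a session, instead of calling socket.end() directly and hoping the other side cooperates.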

TCP connection → handshake → data transfer → closing the connection: that is essentially the whole life of a WebSocket connection.

HTTP and Websocket hybrid server simple demo

After reading the above, you know the basic principles of WebSocket. The following demo is optional reading, but it may deepen your understanding. I originally wanted to explain it piece by piece, but that turned out to be largely redundant, so I am pasting the code directly with a few comments added afterwards.

    var i = 0;
    var websocket_pool = new Set();
    var history = [];

    var Websocket_Http_Util = {
        // Parse the request line and headers of an HTTP request
        getHeaders: function (headerString) {
            var header_arr = headerString.split("\r\n");
            var headers = {};
            for (var line of header_arr) {
                // Split at the first colon only, so values containing ":" stay intact
                var pos = line.indexOf(":");
                if (pos < 0) continue;
                headers[line.slice(0, pos).trim()] = line.slice(pos + 1).trim();
            }
            // this part is not standard
            var first_line = header_arr[0].split(" ");
            headers.method = first_line[0];
            headers.path = first_line[1];
            return headers;
        },
        // Pack a text message into a single unmasked, unfragmented frame
        packMessage: function (message) {
            var message_len = Buffer.byteLength(message);
            var len = message_len > 65535 ? 10 : (message_len > 125 ? 4 : 2);
            var buf = Buffer.alloc(message_len + len);
            buf[0] = 0x81;
            if (len == 2) {
                buf[1] = message_len;
            } else if (len == 4) {
                buf[1] = 126;
                buf.writeUInt16BE(message_len, 2);
            } else {
                buf[1] = 127;
                // 64-bit length: high 32 bits first, then low 32 bits
                buf.writeUInt32BE(Math.floor(message_len / 0x100000000), 2);
                buf.writeUInt32BE(message_len >>> 0, 6);
            }
            buf.write(message, len);
            return buf;
        }
    };

    var http_websocket = require('net').createServer(socket => {
        // This function runs each time a TCP connection is established
        socket.once("data", data => {
            // Parse the first packet as an HTTP request
            var headers = Websocket_Http_Util.getHeaders(data.toString());
            // Rough check for a WebSocket handshake; this is not rigorous!
            if (headers["Sec-WebSocket-Key"]) {
                // WebSocket
                var name = 'socke[' + (i++) + ']';
                var tmpData = Buffer.alloc(0); // buffered fragment data
                var tmpType = null;            // type of the fragmented message being buffered
                var websocketEmitter = new (require('events').EventEmitter)();
                websocket_pool.add(socket);
                socket.write("HTTP/1.1 101 Switching Protocols\r\nUpgrade: websocket\r\nConnection: Upgrade\r\nSec-WebSocket-Accept: " +
                    require('crypto').createHash('sha1').update(headers["Sec-WebSocket-Key"] + "258EAFA5-E914-47DA-95CA-C5AB0DC85B11").digest('base64') +
                    "\r\n\r\n");
                socket.on("data", data => {
                    // Simplification: assumes each "data" event carries exactly one complete frame
                    var index = 2;
                    var isFinish = data[0] >>> 7 == 1;
                    var opcode = data[0] & 15;
                    var len = data[1] & 127;
                    if (len == 126) {
                        len = data.readUInt16BE(index);
                        index += 2;
                    } else if (len == 127) {
                        len = Number(data.readBigUInt64BE(index));
                        index += 8;
                    }
                    var mask = (data[1] >>> 7 > 0) ? data.slice(index, index += 4) : null;
                    if (mask) {
                        for (var j = index; j < data.length; j++) {
                            data[j] = mask[(j - index) % 4] ^ data[j];
                        }
                    }
                    // Keep only the payload
                    data = data.slice(index);
                    if (isFinish) {
                        // Final frame: respond to the message
                        if (opcode == 8) {
                            // Close frame: respond with a close frame
                            socket.write(Buffer.from([0x88, 0x00]));
                            socket.end();
                            websocketEmitter.emit("end");
                        } else if (opcode == 9) {
                            // Ping frame: respond with a pong carrying the same data
                            console.log("received ping from: " + name);
                            socket.write(Buffer.concat([Buffer.from([0x8A, data.length]), data]));
                        } else if (opcode == 0 || opcode == 1 || opcode == 2) {
                            // Data frame: deliver the complete message
                            websocketEmitter.emit("data", Buffer.concat([tmpData, data]));
                            tmpData = Buffer.alloc(0);
                            tmpType = null;
                        }
                    } else {
                        // Fragment: buffer it until the final frame arrives
                        tmpData = Buffer.concat([tmpData, data]);
                        if (tmpType === null) tmpType = opcode;
                    }
                });
                websocketEmitter.on("data", data => {
                    // Message received: broadcast it to every connected client
                    console.log(name, data.toString());
                    var message = Websocket_Http_Util.packMessage(name + ":" + data);
                    for (var s of websocket_pool) {
                        s.write(message);
                    }
                    history.push(name + ":" + data.toString());
                    if (history.length > 100) history.shift();
                });
                websocketEmitter.once("end", () => {
                    // Connection closed (bound once)
                    console.log(name, "connection closed");
                    websocket_pool.delete(socket);
                });
                socket.on("end", () => {
                    console.log("socket disconnected");
                    websocketEmitter.emit("end");
                });
                socket.on("error", (err) => {
                    websocketEmitter.emit("end");
                });
            } else {
                // HTTP: very rough
                // Return the content directly without checking the headers or request body
                // More code could go here to implement a real HTTP server
                console.log("page access");
                socket.write("HTTP/1.1 200 OK\r\nserver: Meislzhua\r\n\r\n");
                socket.write("you get path in:" + headers.path);
                for (var message of history) {
                    socket.write(message + "\r\n");
                }
                socket.end();
            }
        });
    });
    // Bind to port 80
    http_websocket.listen(80);

The browser side is very ugly; it only needs to work.




    
(Page source not recoverable here: a page titled "socket-test", a #message-box element styled as below, and a "submit" button.)

    #message-box {
        max-height: 300px;
        overflow: auto;
    }