This article introduces WebSocket (abbreviated as WS below) and implements its basic functionality natively in Node. The main difficulty lies in parsing and assembling data frames. Knowledge you will need:
- WebSocket
- Buffer
- Bitwise operators
- Binary numbers
- Hexadecimal notation
First let’s look at the WS data frame format:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-------+-+-------------+-------------------------------+
|F|R|R|R| opcode|M| Payload len |    Extended payload length    |
|I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
|N|V|V|V|       |S|             |   (if payload len==126/127)   |
| |1|2|3|       |K|             |                               |
+-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
|     Extended payload length continued, if payload len == 127  |
+ - - - - - - - - - - - - - - - +-------------------------------+
|                               |Masking-key, if MASK set to 1  |
+-------------------------------+-------------------------------+
| Masking-key (continued)       |          Payload Data         |
+-------------------------------- - - - - - - - - - - - - - - - +
:                     Payload Data continued ...                :
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
|                     Payload Data continued ...                |
+---------------------------------------------------------------+
The diagram above is essential to understanding WS, but it means little to readers who are unfamiliar with data frames. So let's first explain what the diagram describes and how to read it.
Data frame
- Bit
  - The smallest unit of data storage in a computer is the bit (abbreviated b). Each 0 or 1 is one bit.
- Byte
  - Eight bits make up one byte.
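To make the bit and byte terminology concrete, here is a small sketch that uses only built-in JavaScript number formatting. The helper name `toBits` is introduced here purely for illustration.

```javascript
// Hypothetical helper: render one byte (0-255) as an 8-character bit string.
function toBits(byte) {
  return byte.toString(2).padStart(8, '0');
}

const sampleByte = 0x81;                  // hexadecimal literal, 129 in decimal
console.log(toBits(sampleByte));          // 10000001
console.log(parseInt('10000001', 2));     // 129
console.log(sampleByte.toString(16));     // 81
```

The same value can be written in binary, decimal, or hexadecimal; the frame diagram works at the level of the binary form.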
With these two concepts in mind, look at the picture above:
- First line (32 bits)
  - FIN, in the top left corner of the diagram, occupies 1 bit, so its value can only be 0 or 1.
  - RSV1, RSV2, and RSV3 occupy 1 bit each.
  - Next comes opcode(4), the operation code, which occupies 4 bits. Its value ranges from 0000 to 1111 in binary.
  - Then comes MASK, the mask flag, which occupies 1 bit.
  - Payload len(7), the length of the received data, occupies 7 bits.
  - Extended payload length(16/64) fills the remaining 16 bits of the first row. How these bits are interpreted depends on the value of Payload len; more on that later.
- Second line (32 bits)
  - Extended payload length continued, if payload len == 127
  - The line breaks exist only to make the diagram easier to display. We could just as well splice the second row onto the end of the first, and that is how the data is actually processed: as one continuous sequence of bits with no line breaks.
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-------+-+-------------+-------------------------------+------------------------------------+
|F|R|R|R| opcode|M| Payload len |    Extended payload length    | Extended payload length continued, |
|I|S|S|S|  (4)  |A|     (7)     |             (16/64)           | if payload len == 127              |
|N|V|V|V|       |S|             |   (if payload len==126/127)   |                                    |
| |1|2|3|       |K|             |                               |                                    |
+-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +------------------------------------+
- In the same way, each of the following rows can be spliced onto the end of the one before it.
If the client (browser) sends "Hello" to the server, the server receives a sequence of binary digits, something like this: 10001000111… To find out exactly what was sent, we need to parse this sequence of 0s and 1s. The diagram above defines the rules for doing so: by following them step by step, we can extract the data we want.
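As an illustration of what arrives on the wire, the sketch below prints a frame as bits. The bytes are the masked "Hello" text frame given as an example in RFC 6455; `frameToBits` is a hypothetical helper, not part of any library.

```javascript
// Example bytes of a single masked text frame containing "Hello" (RFC 6455 example).
const helloFrame = Buffer.from([
  0x81, 0x85,                   // FIN + opcode, MASK + payload len
  0x37, 0xfa, 0x21, 0x3d,       // masking key
  0x7f, 0x9f, 0x4d, 0x51, 0x58, // masked payload
]);

// Hypothetical helper: one 8-bit group per byte, separated by spaces.
function frameToBits(buf) {
  return Array.from(buf, (b) => b.toString(2).padStart(8, '0')).join(' ');
}

console.log(frameToBits(helloFrame));
// starts with: 10000001 10000101 00110111
```

Each group of eight digits is one byte, and the fields in the diagram are read off from left to right.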
Here’s an example:
Suppose we receive 10000001 from the client. This is only the first byte of the frame, and many more follow. Its bits map to the fields as follows:
FIN | RSV1 | RSV2 | RSV3 | opcode |
---|---|---|---|---|
1 | 0 | 0 | 0 | 0001 |
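The same split can be done in code with a shift and a mask per field. This is a minimal sketch; `parseFirstByte` is a name chosen here for illustration.

```javascript
// Hypothetical helper: split the first byte of a frame into its fields.
function parseFirstByte(byte1) {
  return {
    fin: (byte1 >>> 7) & 0x1,  // highest bit
    rsv1: (byte1 >>> 6) & 0x1,
    rsv2: (byte1 >>> 5) & 0x1,
    rsv3: (byte1 >>> 4) & 0x1,
    opcode: byte1 & 0x0f,      // lowest four bits
  };
}

console.log(parseFirstByte(0b10000001));
// { fin: 1, rsv1: 0, rsv2: 0, rsv3: 0, opcode: 1 }
```

Shifting right moves the wanted bit to the lowest position, and `& 0x1` discards everything above it.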
Data frame format details
- FIN: 1 bit
Indicates whether this is the last frame of a message. The first frame may also be the last. %x0: more frames follow; %x1: last frame
- RSV1, RSV2, and RSV3: 1 bit each
Must be 0 unless an extension has been negotiated that gives non-zero values a meaning. If a non-zero value is received and no negotiated extension defines it, the WebSocket connection fails.
- Opcode: 4 bits
The following opcode values are defined: %x0: continuation frame; %x1: text frame; %x2: binary frame; %x3-7: reserved for further non-control frames; %x8: connection close; %x9: ping; %xA: pong; %xB-F: reserved for further control frames
- Mask: 1 bit
Defines whether the "payload data" is masked. If set to 1, a masking key is present in "Masking-key". All frames sent from the client to the server have this bit set to 1.
- Payload length: 7 bits | 7 + 16 bits | 7 + 64 bits
If the value is 0 to 125, it is the payload length itself. If it is 126, the following 2 bytes, read as a 16-bit unsigned integer, are the payload length. If it is 127, the following 8 bytes, read as a 64-bit unsigned integer, are the payload length.
- Why are there three cases? The payload len field has only 7 bits, so the largest value it can hold is 1111111 in binary, which is 127 in decimal. A longer payload cannot be represented in 7 bits, so more bits are needed. The values 126 and 127 therefore act as markers meaning that the real length is stored in the extended payload length bits that follow. Why not simply always use a 64-bit length? That would work, but it is wasteful: the length of the "hello" mentioned above is only 5, which is 101 in binary. Three bits are enough, so spending 64 bits on it would be a waste. That is why the three cases are defined separately.
- Masking-key: 0 or 32 bits
All frames sent from the client to the server contain a 32-bit masking key (the mask bit is set to 1). If the mask bit is 0, this field is absent (0 bits). When a frame is masked, the real payload is recovered by XORing each received payload byte with a masking-key byte: byte i of the payload is XORed with byte (i mod 4) of the key.
- Payload data: (x+y) bytes
The "Extension data" followed by the "Application data". The extension data is usually empty.
- Extension data: x bytes
0 bytes unless an extension has been negotiated. Any extension must specify the length of its extension data.
- Application data: y bytes
Takes up the rest of the frame after the "Extension data".
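The three payload-length cases and the optional masking key determine where the payload starts. The sketch below works that out from a buffer; `readHeader` and the shape of its return value are assumptions made for this example, and it assumes a Node version that provides `Buffer.prototype.readBigUInt64BE`.

```javascript
// Hypothetical helper: find the payload length and where the payload starts.
// Returns null when the buffer does not yet hold the whole header.
function readHeader(buf) {
  if (buf.length < 2) return null;
  const byte2 = buf.readUInt8(1);
  const masked = (byte2 >>> 7) & 0x1;
  let payloadLength = byte2 & 0x7f;
  let offset = 2;

  if (payloadLength === 126) {
    // Real length is in the next 2 bytes (16-bit unsigned, big-endian).
    if (buf.length < offset + 2) return null;
    payloadLength = buf.readUInt16BE(offset);
    offset += 2;
  } else if (payloadLength === 127) {
    // Real length is in the next 8 bytes (64-bit unsigned, big-endian).
    if (buf.length < offset + 8) return null;
    const big = buf.readBigUInt64BE(offset);
    // Assumption: lengths beyond the safe integer range are rejected.
    if (big > BigInt(Number.MAX_SAFE_INTEGER)) {
      throw new RangeError('payload too large');
    }
    payloadLength = Number(big);
    offset += 8;
  }

  let maskKeyOffset = -1;
  if (masked) {
    if (buf.length < offset + 4) return null;
    maskKeyOffset = offset;
    offset += 4;
  }
  return { masked, payloadLength, maskKeyOffset, payloadOffset: offset };
}

console.log(readHeader(Buffer.from([0x81, 0x85, 0x37, 0xfa, 0x21, 0x3d])));
console.log(readHeader(Buffer.from([0x82, 0x7e, 0x01, 0x00])));
```

For the first call the payload length is 5 and the payload starts at byte 6; for the second the length is 256 and the payload starts at byte 4.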
In practice
Now that we know the frame structure and what each field means, we can parse the data according to the rules.
- Parsing data
// Length of the masking key in bytes
const MASK_KEY_LENGTH = 4;

function parseFrams() {
  // Buffer holding the data received so far
  const buffer = this.buffer;
  // At least 2 header bytes are needed before anything can be parsed
  if (buffer.length < 2) {
    return;
  }
  // By default the payload starts at the third byte (payload of 125 bytes or less)
  let payloadIndex = 2;
  // First byte: contains FIN and opcode
  const byte1 = buffer.readUInt8(0);
  // 0: more frames follow
  // 1: last frame
  const FIN = (byte1 >>> 7) & 0x1;
  // Get the opcode
  const opcode = byte1 & 0x0f;
  if (!FIN && opcode !== 0) {
    // First frame of a fragmented message: its opcode must be saved, because
    // the protocol requires every later fragment to carry opcode 0:
    // Fragment number  0   1  ...  N-2  N-1
    // FIN              0   0  ...   0    1
    // opcode          !0   0  ...   0    0
    this.frameOpcode = opcode;
  }
  // Second byte: contains MASK and the payload length
  const byte2 = buffer.readUInt8(1);
  // MASK defines whether the payload data is masked
  // If it is 1, a Masking-key is present
  // All frames sent from the client to the server have it set to 1
  const MASK = (byte2 >>> 7) & 0x1;
  // Get the payload length
  let payloadLength = byte2 & 0x7f;
  let mask_key;
  if (payloadLength === 126) {
    // Lengths from 126 to 65535: the next 2 bytes hold the real length,
    // so the payload starts 2 bytes later
    if (buffer.length < payloadIndex + 2) {
      return;
    }
    payloadLength = buffer.readUInt16BE(payloadIndex);
    payloadIndex += 2;
  } else if (payloadLength === 127) {
    // Lengths of 65536 and above: the next 8 bytes hold the real length.
    // The protocol allows 64 bits, but data that large is hard to handle,
    // so only the low 32 bits are used here.
    // Bytes 2-5 are therefore expected to be 0, and the length is read from bytes 6-9
    if (buffer.length < payloadIndex + 8) {
      return;
    }
    // 4: skip the high 4 bytes (bytes 2-5)
    payloadLength = buffer.readUInt32BE(payloadIndex + 4);
    // 8: the length field occupies 8 bytes, so the payload starts 8 bytes later
    payloadIndex += 8;
  }
  // If MASK is 1, the masking key occupies 4 bytes (MASK_KEY_LENGTH === 4)
  const maskKeyLen = MASK ? MASK_KEY_LENGTH : 0;
  // If fewer bytes have arrived than header + masking key + payload, the frame
  // is incomplete; do nothing until the rest has been received
  if (buffer.length < payloadIndex + maskKeyLen + payloadLength) {
    return;
  }
  let payload;
  if (MASK) {
    // When masked, the payload is preceded by a 4-byte masking key
    mask_key = buffer.slice(payloadIndex, payloadIndex + MASK_KEY_LENGTH);
    // The payload starts another 4 bytes later
    payloadIndex += MASK_KEY_LENGTH;
    // Masked data must be decoded; the algorithm is fixed by the protocol (see the source code)
    payload = unmask(mask_key, buffer.slice(payloadIndex, payloadIndex + payloadLength));
  } else {
    // Without a mask the payload can be sliced out directly
    payload = buffer.slice(payloadIndex, payloadIndex + payloadLength);
  }
  // The message may be fragmented, so cache the payload of each frame and
  // process the complete data only after all frames have arrived
  this.payloadFrames = Buffer.concat([this.payloadFrames, payload]);
  // Keep any bytes that follow this frame; they belong to the next frame
  this.buffer = buffer.slice(payloadIndex + payloadLength);
  // The whole message has been received
  if (FIN) {
    const _opcode = opcode || this.frameOpcode;
    const payloadFrames = this.payloadFrames.slice(0);
    this.payloadFrames = Buffer.alloc(0);
    this.frameOpcode = 0;
    // Handle the data according to its opcode
    this.processPayload(_opcode, payloadFrames);
  }
}
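The parser above calls an `unmask` helper that is not shown in the article. A minimal sketch of what such a helper could look like, assuming the same `unmask(maskKey, data)` call shape, follows; it is an illustration of the XOR rule from the protocol, not the author's source.

```javascript
// Sketch of an unmask helper (assumed signature: unmask(maskKey, data)).
function unmask(maskKey, data) {
  const out = Buffer.alloc(data.length);
  for (let i = 0; i < data.length; i++) {
    // Byte i of the payload is XORed with byte (i mod 4) of the masking key.
    out[i] = data[i] ^ maskKey[i % 4];
  }
  return out;
}

// Masking key and masked payload from the RFC 6455 "Hello" example frame.
const key = Buffer.from([0x37, 0xfa, 0x21, 0x3d]);
const maskedPayload = Buffer.from([0x7f, 0x9f, 0x4d, 0x51, 0x58]);
console.log(unmask(key, maskedPayload).toString('utf8')); // Hello
```

Because XOR is its own inverse, applying the same function to unmasked data with the same key masks it again.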
- Building the data to send back, which is the inverse of parsing
/**
 * @param {number} opcode
 * @param {string|Buffer} payload
 * @param {boolean} isFinal
 */
function encodeMessage(opcode, payload, isFinal = true) {
  // Accept strings as well as Buffers; all lengths below are in bytes
  if (!Buffer.isBuffer(payload)) {
    payload = Buffer.from(payload);
  }
  const len = payload.length;
  let buffer;
  const byte1 = (isFinal ? 0x80 : 0x00) | opcode;
  if (len < 126) {
    // Payload of 0 to 125 bytes
    // Allocate the frame
    buffer = Buffer.alloc(2 + len); // 2: [FIN + RSV1/2/3 + opcode] (1 byte) + [MASK + payload length] (1 byte)
    // Write FIN + RSV1/2/3 + opcode
    buffer.writeUInt8(byte1, 0);
    // Write MASK + payload length into the second byte
    buffer.writeUInt8(len, 1);
    // Write the payload starting at the third byte
    payload.copy(buffer, 2);
  } else if (len < 1 << 16) {
    // Payload of 126 to 65535 bytes
    buffer = Buffer.alloc(2 + 2 + len);
    buffer.writeUInt8(byte1, 0);
    buffer.writeUInt8(126, 1);
    buffer.writeUInt16BE(len, 2);
    payload.copy(buffer, 4);
  } else {
    // Payload of 65536 bytes or more
    buffer = Buffer.alloc(2 + 8 + len);
    buffer.writeUInt8(byte1, 0);
    buffer.writeUInt8(127, 1);
    // The high 4 bytes of the 64-bit length are 0; the low 4 bytes hold the length
    buffer.writeUInt32BE(0, 2);
    buffer.writeUInt32BE(len, 6);
    payload.copy(buffer, 10);
  }
  return buffer;
}
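As a quick sanity check on the small-payload case (length below 126), the sketch below builds an unmasked text frame by hand and prints it as hex, so it can be compared with the unmasked "Hello" frame example in RFC 6455. `buildSmallFrame` is a hypothetical, reduced stand-in for `encodeMessage` that covers only this one case.

```javascript
// Hypothetical reduced encoder: handles payloads of 0 to 125 bytes only.
function buildSmallFrame(opcode, payload, isFinal = true) {
  const data = Buffer.isBuffer(payload) ? payload : Buffer.from(payload);
  if (data.length > 125) {
    throw new RangeError('only the 0-125 byte case is covered');
  }
  const frame = Buffer.alloc(2 + data.length);
  // First byte: FIN + RSV1/2/3 (all 0) + opcode
  frame.writeUInt8((isFinal ? 0x80 : 0x00) | opcode, 0);
  // Second byte: MASK bit left at 0, followed by the 7-bit length
  frame.writeUInt8(data.length, 1);
  data.copy(frame, 2);
  return frame;
}

const textFrame = buildSmallFrame(0x1, 'Hello');
console.log(textFrame.toString('hex')); // 810548656c6c6f
```

The first byte 0x81 is FIN plus the text opcode, the second byte 0x05 is the length, and the remaining five bytes are the UTF-8 encoding of "Hello".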
Both pieces of code above are commented in detail and should be understandable as they are, so they are not analyzed further here. For the rest, see the source code on GitHub.