www.blue-zero.com/WebSocket/ — Online test

I came across an answer on Zhihu (www.zhihu.com/question/20.) by accident and immediately felt that nothing I had read before gave me as deep an understanding of WebSocket as this one did, so I am reposting it on my blog to share. I like this kind of writing: easy to read, not boring, no preaching, just sharing. Enough rambling; kudos to the author ~

First, WebSocket vs. HTTP

WebSocket is a protocol that arrived with HTML5. The HTTP protocol itself did not change, or rather is unaffected by it; but HTTP does not support persistent connections (keep-alive connections and polling loops don't count).

HTTP comes in versions 1.0 and 1.1, and the so-called keep-alive merges multiple HTTP requests into one connection. WebSocket, however, is a new protocol that has essentially nothing to do with HTTP; it only uses an HTTP handshake in order to stay compatible with existing browsers. In other words, it is a supplement to the HTTP protocol, which the original answer illustrates with a diagram of the two protocols.

The two overlap, but only partly.

HTML5, for its part, refers to a set of new APIs, specifications, and technologies. The HTTP protocol discussed here has only versions 1.0 and 1.1 and is not directly tied to HTML. In layman's terms, you can use HTTP to transfer non-HTML data, and that's all there is to it.

Put simply, they sit at different layers.

Second, what kind of protocol WebSocket is and its specific advantages

First, WebSocket is a persistent protocol, in contrast to HTTP, which is not. Take a simple example, using the PHP life cycle, which is widely used today, as the reference point.

The HTTP life cycle is delimited by requests: one Request, one Response. In HTTP 1.0, that is the end of the HTTP exchange.

HTTP 1.1 improved on this with keep-alive: within one HTTP connection, multiple Requests can be sent and multiple Responses received. But remember that Request = Response: in HTTP there is always exactly one Response per Request. Moreover, the Response is passive; the server cannot initiate it.

Teacher, you've rambled on for so long; what does any of this have to do with WebSocket? _(:з」∠)_ Well, I was just about to get to WebSocket.

First of all, WebSocket builds on HTTP, or rather borrows the HTTP protocol to carry out part of the handshake.

First let’s look at a typical Websocket handshake.

GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13
Origin: http://example.com

Those familiar with HTTP may have noticed a few extra things in this HTTP-like handshake request. Let me go through them in passing.

Upgrade: websocket
Connection: Upgrade

This is the core of WebSocket. It tells Apache, Nginx, and other servers: attention, this request is for the WebSocket protocol, so quickly find the matching handler for me, not that old-fashioned HTTP.

Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13

First, Sec-WebSocket-Key is a Base64-encoded value that the browser generates at random. It tells the server: don't try to fool me, I'm going to verify that you really are a WebSocket handler.

Sec-WebSocket-Protocol is a user-defined string that distinguishes the protocols required by different services under the same URL. Simply put: tonight I want service A, don't get it wrong.

Finally, Sec-WebSocket-Version tells the server which WebSocket draft (protocol version) to use. In the early days the WebSocket protocol was still at the draft stage and all sorts of odd variants existed; Firefox and Chrome not even using the same version of the protocol was a real problem. In short: waiter, I'll have version 13 →_→

The server then returns the following, indicating that it has accepted the request and the WebSocket connection has been established.

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: HSmrc0sMlYUkAGmm5OPpG2HaGWk=
Sec-WebSocket-Protocol: chat

This is the last part handled by HTTP: it tells the client that the protocol has been switched successfully.

Upgrade: websocket
Connection: Upgrade

As before, these are fixed: they tell the client that the protocol being upgraded to is WebSocket, not mozillasocket, lurnarsocket, or shitsocket.

Sec-WebSocket-Accept is the Sec-WebSocket-Key after the server has transformed it (hashed and Base64-encoded rather than truly encrypted) to confirm the handshake. Server: fine, fine, I get it, here is my ID card as proof.

Sec-WebSocket-Protocol is the protocol finally selected.

At this point HTTP has done all of its work, and everything from here on follows the WebSocket protocol. The details of the protocol are not covered here.

—————— Technical analysis section is complete ——————

You've been rambling for ages; what on earth is WebSocket good for, when HTTP long polling or Ajax polling can already deliver real-time information?

All right, young man, let's talk about what WebSocket is good for. Here, have a carrot first.

Third, what WebSocket is for

Before I talk about WebSocket, let me cover long polling and Ajax polling in passing.

Ajax polling

Ajax polling works very simply: the browser sends a request every few seconds to ask the server whether there is any new information.

Scene representation:

Client: La la la, any new information (Request)

Server: None (Response)

Client: La la la, any new information (Request)

Server: No. (Response)

Client: La la la, any new information (Request)

Server: You're so annoying. No! (Response)

Client: La la la, any new messages (Request)

Server: All right, all right, here you go. (Response)

Client: La la la, any new messages (Request)

Server:… No… No… No (Response) — loop

Long polling

Long polling works on much the same principle as Ajax polling: it still polls, but it uses a blocking model (keep the call open and don't hang up until someone answers). That is, after the client opens a connection, the server does not return a Response if there is no message; it responds only once there is one, after which the client opens a new connection and the cycle starts again.

Scene representation:

Client: La la la, any new information? If not, reply to me when you have some (Request)

Server: Er... (waits until there is news) Here you go (Response)

Client: La la la, any new information? If not, reply to me when you have some (Request), and so on in a loop

As you can see, both approaches keep establishing HTTP connections and then wait for the server to respond, which reflects another characteristic of the HTTP protocol: passivity.

Passivity means that the server cannot contact the client on its own initiative; only the client can initiate.

To put it simply, the server is a very lazy fridge (an inside joke): it cannot and will not initiate a connection, but the boss has ordered that any customer who comes must be served, no matter how tired it is.

With that said, let's talk about the drawbacks of the above (forgive me for rambling so much, OAQ).

It is easy to see that both approaches are very resource-intensive.

Ajax polling demands fast processing and plenty of resources on the server (speed). Long polling demands high concurrency, that is, the ability to hold many clients at the same time (capacity).

So this can happen with both Ajax polling and long polling.

Client: La la la, any new information?

Server: The line is busy, please try again later (503 Service Unavailable)

Client: ... All right, la la la, any new information?

Server: The line is busy, please try again later (503 Service Unavailable)

Client: ... (Meanwhile the server is rushed off its feet: fridges, I need more fridges! More... more... Sorry, that's another meme.)

Anyway, let’s talk about Websocket

From the examples above, we can see that neither approach is ideal; both need a lot of resources.

One needs more speed and the other needs more “phones”. Both lead to an ever-growing demand for “phones”.

Oh, and I forgot to mention that HTTP is also a stateless protocol.

Put simply, the server is forgetful because it deals with so many clients every day: as soon as you hang up, it forgets everything about you and throws all your things away. The next time, you have to tell the server everything again.

This is where WebSocket comes in. It solves these problems of HTTP. First, passivity: once the protocol upgrade (HTTP -> WebSocket) is complete, the server can actively push information to the client. So the scenario above changes as follows.

Client: La la la, I want to set up the WebSocket protocol; required service: chat; WebSocket protocol version: 13 (HTTP Request)

Server: OK, confirmed, upgraded to WebSocket (HTTP 101 Switching Protocols)

Client: Please push information to me whenever you have some.

Server: OK, I’ll tell you sometime.

Server: Balabalabalabala

Server: Balabalabalabala

Server: ha ha ha ha ha ha ha ha

Server: LMAO ha ha ha ha ha ha

This way, a single HTTP request is enough to receive a continuous stream of information. (In programming this design is called a callback: notify me when you have information, instead of me foolishly coming to ask you every time.)

Such a protocol solves the problems above: delayed synchronization and heavy resource use. So why does it reduce resource consumption on the server?

In practice, our programs sit behind two layers: the HTTP protocol is parsed by a server such as Nginx, which then passes the request to the appropriate handler (PHP and so on). Put simply, we have a very fast operator (Nginx) who is responsible for routing each problem to the right customer service agent (the handler).

The operator itself is basically fast enough, but things get stuck at customer service (the handler) every time: the agents are too slow, so there are never enough of them. WebSocket solves this. Once established, the connection to the operator is persistent; when there is information, customer service finds a way to notify the operator, and the operator delivers it to the customer.

This solves the problem of customer service being too slow.

Also, in the traditional approach HTTP connections are constantly set up and torn down. Since HTTP is stateless, the identity info has to be retransmitted every time to tell the server who you are.

Although the operator is fast, having to listen to all of this every time is inefficient, and it must keep forwarding the information to customer service, which not only wastes customer service processing time but also consumes extra traffic and time on the network.

WebSocket, by contrast, needs only one HTTP handshake, so the whole exchange takes place within a single connection and state, which sidesteps HTTP's statelessness. The server keeps knowing who you are until you close the connection, so the operator no longer has to parse HTTP repeatedly or re-check the identity info.

At the same time, the model changes from the client asking to the server pushing whenever there is information (of course, the client can still send information on its own initiative). When there is no information, the connection is simply held by the operator (Nginx) and does not occupy the inherently slow customer service (the handler).
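To put rough numbers on the comparison above, here is a small sketch that counts how many HTTP requests each approach needs to receive the same messages. It is a toy model, not a benchmark: the function names, the 60-second window, the 2-second polling interval, and the message times are all assumptions chosen for illustration.

```javascript
// Toy model: count the HTTP requests needed to receive the same messages.
// All names and numbers here are illustrative assumptions.

// Ajax polling: one request per interval, whether or not there is news.
function ajaxPollingRequests(durationSec, intervalSec) {
  return Math.floor(durationSec / intervalSec);
}

// Long polling: one request is answered per message, plus one request
// that is still hanging, waiting for the next message.
function longPollingRequests(messageTimes) {
  return messageTimes.length + 1;
}

// WebSocket: a single HTTP handshake, after which the server pushes frames.
function webSocketRequests() {
  return 1;
}

const messageTimes = [5, 20, 47]; // seconds at which the server has news
console.log('ajax polling:', ajaxPollingRequests(60, 2));        // 30
console.log('long polling:', longPollingRequests(messageTimes)); // 4
console.log('websocket:   ', webSocketRequests());               // 1
```

The model ignores header sizes and connection setup cost, which would widen the gap further in favor of a single long-lived connection.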

— — — –

As for how to use WebSocket on clients that do not support it, the answer is: you can't.

But you can simulate a similar effect with the long polling and Ajax polling described above.

— — — –

Content reposted from Zhihu: www.zhihu.com/question/20…

 

— — — — — — — — — — — — — — — — — — — — — — — — — — — —

WebSocket: 5 minutes from beginner to master

I. Content overview

WebSocket gives the browser real-time two-way communication. This article walks step by step through how WebSocket establishes connections, exchanges data, and formats data frames. It also briefly introduces attacks against WebSocket and how the protocol defends against them.

II. What is WebSocket

WebSocket is a network technology, introduced with HTML5, for full-duplex communication between browser and server. It is an application-layer protocol built on the TCP transport protocol, and it reuses the HTTP handshake channel.

That description is a bit dry for most web developers; just keep a few things in mind:

  1. WebSocket can be used in the browser
  2. Support for two-way communication
  3. It’s easy to use

1. What are the advantages

In terms of advantages, the comparison here is with HTTP. In a nutshell, WebSocket supports two-way communication and is more flexible, more efficient, and more extensible.

  1. Two-way communication, and therefore more real-time.
  2. Better binary support.
  3. Less control overhead. Once the connection is created, the protocol headers on the packets that the WS client and server exchange are small. Without extensions, the server-to-client header is only 2 to 10 bytes (depending on the payload length), and client-to-server frames carry an additional 4-byte masking key. HTTP, by contrast, sends a complete header with every exchange.
  4. Support for extensions. The WS protocol defines extensions, and users can extend the protocol or implement custom subprotocols (for example, custom compression algorithms).

Readers who have not studied the WebSocket specification may not find the last two points intuitive, but that does not get in the way of learning and using WebSocket.

2. What you need to learn

When learning an application-layer network protocol, the most important parts are usually how connections are established and how data is exchanged. The data format cannot be skipped either, since it directly determines the capabilities of the protocol; a good data format makes a protocol more efficient and more extensible.

The following points are mainly discussed below:

  1. How to Establish a connection
  2. How to exchange data
  3. Data frame format
  4. How to maintain connections

III. Getting-started example

Before going into the details of the protocol, let's look at a simple example to get a feel for it. The example includes a WebSocket server and a WebSocket client (web page). The full code can be found here.

Here the server uses the ws library. Compared with the more familiar socket.io, ws is lighter and better suited to learning.

1. Server

The code is as follows: listen on port 8080. When a new connection request arrives, a log is printed and a message is sent to the client. Logs are also generated when a message is received from the client.

 

var app = require('express')();
var server = require('http').Server(app);
var WebSocket = require('ws');

// WebSocket server listening on port 8080
var wss = new WebSocket.Server({ port: 8080 });

wss.on('connection', function connection(ws) {
    console.log('server: receive connection.');

    // Log every message received from the client
    ws.on('message', function incoming(message) {
        console.log('server: received: %s', message);
    });

    // Greet the client as soon as it connects
    ws.send('world');
});

// Serve the test page over plain HTTP on port 3000
app.get('/', function (req, res) {
    res.sendFile(__dirname + '/index.html');
});

app.listen(3000);

 

2. Client

Initiate a WebSocket connection to port 8080. After the connection is established, logs are generated and messages are sent to the server. Logs are also generated after receiving messages from the server.
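The client code from the original post is not preserved in this copy. A minimal sketch matching the description above might look like the following. It assumes a browser that provides the standard WebSocket API; the function name startClient and its parameters are made up for this example, and the constructor is passed in so that the same code can be exercised outside a browser.

```javascript
// Hypothetical helper: open a connection, log events, send one message.
// WebSocketImpl is the WebSocket constructor (window.WebSocket in a browser).
function startClient(WebSocketImpl, url, log) {
  const ws = new WebSocketImpl(url);

  // Fired once the handshake has completed.
  ws.onopen = function () {
    log('client: ws connection is open');
    ws.send('hello');
  };

  // Fired for every message pushed by the server.
  ws.onmessage = function (event) {
    log('client: received ' + event.data);
  };

  return ws;
}

// In a browser page this could be started with:
// startClient(WebSocket, 'ws://localhost:8080', console.log.bind(console));
```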

 


 

3. Running results

You can view server logs and client logs separately.

Server output:

 

 

server: receive connection.
server: received hello

Client output:

 

client: ws connection is open
client: received world

 

IV. How to establish a connection

As mentioned earlier, WebSocket reuses the HTTP handshake channel. Specifically, the client negotiates a protocol upgrade with the WebSocket server through an HTTP request. Once the upgrade is complete, all subsequent data exchange follows the WebSocket protocol.

1. Client: request the protocol upgrade

First, the client sends a protocol upgrade request. As you can see, it uses the standard HTTP message format, and only the GET method is supported.

 

GET / HTTP/1.1
Host: localhost:8080
Origin: http://127.0.0.1:3000
Connection: Upgrade
Upgrade: websocket
Sec-WebSocket-Version: 13
Sec-WebSocket-Key: w4v7O6xFTi36lq3RNcgctw==

The key request headers mean the following:

  • Connection: Upgrade: the protocol is to be upgraded.
  • Upgrade: websocket: the protocol to upgrade to is WebSocket.
  • Sec-WebSocket-Version: 13: the WebSocket version. If the server does not support this version, it must return a Sec-WebSocket-Version header listing the versions it supports.
  • Sec-WebSocket-Key: paired with the Sec-WebSocket-Accept header in the server's response; it provides basic protection, for example against malicious or unintentional connections.

Note that the request above omits some less important headers. Since this is a standard HTTP request, headers such as Host, Origin, and Cookie are sent as usual. During the handshake, these headers can be used for security restrictions, permission checks, and so on.
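The header checks described above can be sketched as a small function. This only illustrates the checks and is not how the ws library implements them; the name isWebSocketUpgrade and its arguments (the request method plus an object of lower-cased header names, the form in which Node.js's http module exposes them) are assumptions for this example.

```javascript
// Hypothetical check of an incoming handshake request.
// `headers` uses lower-case names, as in Node.js's req.headers.
function isWebSocketUpgrade(method, headers) {
  const connection = (headers['connection'] || '').toLowerCase();
  const upgrade = (headers['upgrade'] || '').toLowerCase();
  const key = headers['sec-websocket-key'] || '';

  return (
    method === 'GET' &&                                 // only GET is supported
    connection.split(/\s*,\s*/).includes('upgrade') &&  // Connection: Upgrade
    upgrade === 'websocket' &&                          // Upgrade: websocket
    headers['sec-websocket-version'] === '13' &&        // version 13
    Buffer.from(key, 'base64').length === 16            // key is 16 random bytes
  );
}

console.log(isWebSocketUpgrade('GET', {
  connection: 'Upgrade',
  upgrade: 'websocket',
  'sec-websocket-version': '13',
  'sec-websocket-key': 'w4v7O6xFTi36lq3RNcgctw==',
})); // true
```

A real server would typically also inspect Origin or Cookie at this point to decide whether the connection is allowed.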

2. Server: respond to the protocol upgrade

The server returns the following. Status code 101 indicates a protocol switch. The upgrade is complete at this point, and subsequent data exchange uses the new protocol.

 

HTTP/1.1 101 Switching Protocols
Connection: Upgrade
Upgrade: websocket
Sec-WebSocket-Accept: Oy4NRAQ13jhfONC7bP8dTKb4PTU=

 

Note: each header line ends with \r\n, and an extra blank line (\r\n) follows the last header. In addition, HTTP status codes can only be used during the handshake phase; after the handshake, only the protocol's own specific error codes can be used.
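As a sketch of the note above, the handshake response can be assembled by hand as a string. The function name buildHandshakeResponse is an assumption for this example, and the Sec-WebSocket-Accept value is computed as explained in the next subsection; real servers normally leave all of this to a library such as ws.

```javascript
const crypto = require('crypto');

// GUID fixed by the WebSocket specification (RFC 6455).
const MAGIC = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Hypothetical helper: build the raw 101 response for a given client key.
function buildHandshakeResponse(secWebSocketKey) {
  const accept = crypto
    .createHash('sha1')
    .update(secWebSocketKey + MAGIC)
    .digest('base64');

  // Every header line ends with \r\n; the two trailing empty strings
  // produce the final \r\n plus the blank line that ends the header block.
  return [
    'HTTP/1.1 101 Switching Protocols',
    'Connection: Upgrade',
    'Upgrade: websocket',
    'Sec-WebSocket-Accept: ' + accept,
    '',
    '',
  ].join('\r\n');
}

// JSON.stringify makes the \r\n sequences visible in the output.
console.log(JSON.stringify(buildHandshakeResponse('w4v7O6xFTi36lq3RNcgctw==')));
```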

3. Calculation of Sec-WebSocket-Accept

Sec-WebSocket-Accept is calculated from the Sec-WebSocket-Key header of the client request.

The calculation is:

  1. Concatenate Sec-WebSocket-Key with 258EAFA5-E914-47DA-95CA-C5AB0DC85B11.
  2. Compute the SHA-1 digest of the result and encode it as a Base64 string.

The pseudocode is as follows:

toBase64( sha1( Sec-WebSocket-Key + 258EAFA5-E914-47DA-95CA-C5AB0DC85B11 ) )

Verify the previous result:

 

const crypto = require('crypto');
const magic = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';
const secWebSocketKey = 'w4v7O6xFTi36lq3RNcgctw==';

let secWebSocketAccept = crypto.createHash('sha1')
    .update(secWebSocketKey + magic)
    .digest('base64');

console.log(secWebSocketAccept);
// Oy4NRAQ13jhfONC7bP8dTKb4PTU=

 

V. Data frame format

The exchange of data between client and server depends on the definition of the data frame format. So before actually discussing data exchange, let's look at WebSocket's data frame format.

The smallest unit of communication between a WebSocket client and server is the frame; one or more frames make up a complete message.

  1. Sender: cuts the message into frames and sends them to the receiver;
  2. Receiver: receives the frames and reassembles the related frames into the complete message.

This section focuses on the format of data frames. For the detailed definition, see section 5.2 of RFC6455.

1. Overview of data frame format

The unified format of WebSocket data frames is given below. Readers familiar with TCP/IP will recognize this kind of diagram.

  1. Read from left to right; the unit is the bit. For example, FIN and RSV1 each take 1 bit, and opcode takes 4 bits.
  2. The content includes flags, opcode, mask, data, and data length. (Expanded in the next subsection.)

 

 

  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-------+-+-------------+-------------------------------+
 |F|R|R|R| opcode|M| Payload len |    Extended payload length    |
 |I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
 |N|V|V|V|       |S|             |   (if payload len==126/127)   |
 | |1|2|3|       |K|             |                               |
 +-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
 |     Extended payload length continued, if payload len == 127  |
 + - - - - - - - - - - - - - - - +-------------------------------+
 |                               |Masking-key, if MASK set to 1  |
 +-------------------------------+-------------------------------+
 | Masking-key (continued)       |          Payload Data         |
 +-------------------------------- - - - - - - - - - - - - - - - +
 :                     Payload Data continued ...                :
 + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
 |                     Payload Data continued ...                |
 +---------------------------------------------------------------+

 

2. Detailed explanation of data frame format

Building on the format overview above, this subsection explains the fields one by one. If anything is unclear, refer to the protocol specification or leave a comment.

FIN: 1 bit.

A value of 1 means this is the last fragment of the message; a value of 0 means it is not.

RSV1, RSV2, RSV3: each occupies one bit.

In general, they’re all 0’s. When the client and server negotiate to use WebSocket extension, the three flag bits can be non-0, and the meaning of the value is defined by the extension. If a non-zero value is present and the WebSocket extension is not used, the connection fails.

Opcode: 4 bits.

The opcode determines how the subsequent data payload is interpreted. If the opcode is unknown, the receiver should fail the connection. The defined opcodes are:

  • %x0: a continuation frame. An opcode of 0 means the message is fragmented and the received frame is one of its fragments.
  • %x1: a text frame.
  • %x2: a binary frame.
  • %x3-7: reserved for further non-control frames.
  • %x8: the connection is being closed.
  • %x9: a ping.
  • %xA: a pong.
  • %xB-F: reserved for further control frames.

Mask: 1 bit.

Indicates whether the data payload is masked. Data sent from the client to the server must be masked; data sent from the server to the client is not masked.

If the server receives data that has not been masked, it must close the connection.

If Mask is 1, a masking key is carried in the Masking-key field and is used to unmask the data payload. Mask is 1 for every data frame sent from the client to the server.

The algorithm and usage of the mask are explained in the next section.

Payload length: the length of the data payload, in bytes. It occupies 7 bits, 7+16 bits, or 7+64 bits.

Suppose Payload length === x:

  • x is 0 to 125: the length of the data is x bytes.
  • x is 126: the next 2 bytes represent a 16-bit unsigned integer whose value is the length of the data.
  • x is 127: the next 8 bytes represent a 64-bit unsigned integer (most significant bit 0) whose value is the length of the data.

In addition, if the payload length occupies more than one byte, it is encoded in big-endian (network) byte order.
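The three length encodings can be sketched as follows. The function name encodePayloadLength is made up for this example; it returns only the length bytes (with the MASK bit left at 0), not a complete frame header.

```javascript
// Hypothetical helper: encode a payload length the way a frame header does.
// Returns a Buffer holding the 7-bit length field (MASK bit = 0) plus any
// extended length bytes, in big-endian (network) byte order.
function encodePayloadLength(length) {
  if (length <= 125) {
    return Buffer.from([length]);           // fits in the 7-bit field
  }
  if (length <= 0xffff) {
    const buf = Buffer.alloc(3);
    buf[0] = 126;                           // marker: a 16-bit length follows
    buf.writeUInt16BE(length, 1);
    return buf;
  }
  const buf = Buffer.alloc(9);
  buf[0] = 127;                             // marker: a 64-bit length follows
  buf.writeBigUInt64BE(BigInt(length), 1);
  return buf;
}

console.log(encodePayloadLength(5));      // <Buffer 05>
console.log(encodePayloadLength(300));    // <Buffer 7e 01 2c>
console.log(encodePayloadLength(70000));  // <Buffer 7f 00 00 00 00 00 01 11 70>
```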

Masking-key: 0 or 4 bytes (32 bits)

All data frames sent from the client to the server are masked: Mask is 1 and a 4-byte Masking-key is carried. If Mask is 0, there is no Masking-key.

Note: the payload length does not include the length of the masking key.

Payload data: (x+y) bytes

Payload data consists of extension data and application data, where the extension data is x bytes and the application data is y bytes.

Extension data: 0 bytes if no extension has been negotiated. Every extension must declare the length of its extension data, or how that length can be calculated. How an extension is used must also be negotiated during the handshake. If extension data is present, the payload length includes its length.

Application data: arbitrary application data, placed after the extension data (if any) and occupying the rest of the frame. Its length is the payload length minus the extension data length.
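Putting the fields together, the header of a frame can be decoded roughly as follows. This is a simplified sketch that assumes the whole frame is already in one Buffer and that no extension has been negotiated; the name parseFrame and the shape of the returned object are assumptions, not part of any library. The payload is returned as received, so a masked payload still has to be unmasked with the algorithm in the next subsection.

```javascript
// Hypothetical decoder for the header of a single, complete WebSocket frame.
function parseFrame(buf) {
  const fin = (buf[0] & 0x80) !== 0;      // bit 0: FIN
  const rsv = (buf[0] & 0x70) >> 4;       // bits 1-3: RSV1..RSV3
  const opcode = buf[0] & 0x0f;           // bits 4-7: opcode
  const masked = (buf[1] & 0x80) !== 0;   // first bit of the second byte: MASK
  let length = buf[1] & 0x7f;             // 7-bit payload length
  let offset = 2;

  if (length === 126) {                   // 16-bit extended length
    length = buf.readUInt16BE(offset);
    offset += 2;
  } else if (length === 127) {            // 64-bit extended length
    length = Number(buf.readBigUInt64BE(offset));
    offset += 8;
  }

  let maskingKey = null;
  if (masked) {                           // 4-byte masking key, if present
    maskingKey = buf.subarray(offset, offset + 4);
    offset += 4;
  }

  const payload = buf.subarray(offset, offset + length);
  return { fin, rsv, opcode, masked, length, maskingKey, payload };
}

// A masked text frame carrying "Hello" (example bytes from RFC 6455).
const frame = Buffer.from([0x81, 0x85, 0x37, 0xfa, 0x21, 0x3d,
                           0x7f, 0x9f, 0x4d, 0x51, 0x58]);
const parsed = parseFrame(frame);
console.log(parsed.fin, parsed.opcode, parsed.masked, parsed.length); // true 1 true 5
```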

3. Mask algorithm

Masking-key is a random 32-bit value chosen by the client. Masking does not change the length of the data payload. The same algorithm is used for both masking and unmasking:

First, assume:

  • original-octet-i: the i-th byte of the original data.
  • transformed-octet-i: the i-th byte of the transformed data.
  • j: the result of i mod 4.
  • masking-key-octet-j: the j-th byte of the masking key.

The algorithm: XOR original-octet-i with masking-key-octet-j to obtain transformed-octet-i.

j = i MOD 4

transformed-octet-i = original-octet-i XOR masking-key-octet-j
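The two formulas translate almost directly into code. In this sketch the function name applyMask is an assumption; because XOR is its own inverse, the same function both masks and unmasks. The key and payload are the example bytes from RFC 6455.

```javascript
// Hypothetical helper: mask or unmask a payload with a 4-byte key.
function applyMask(data, maskingKey) {
  const out = Buffer.alloc(data.length);
  for (let i = 0; i < data.length; i++) {
    const j = i % 4;                     // j = i MOD 4
    out[i] = data[i] ^ maskingKey[j];    // transformed = original XOR key[j]
  }
  return out;
}

const key = Buffer.from([0x37, 0xfa, 0x21, 0x3d]);
const masked = applyMask(Buffer.from('Hello'), key);
console.log(masked);                            // <Buffer 7f 9f 4d 51 58>
console.log(applyMask(masked, key).toString()); // Hello
```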

VI. Data transmission

Once the WebSocket client and server establish a connection, subsequent operations are based on the transmission of data frames.

WebSocket distinguishes operation types based on Opcode. For example, 0x8 indicates disconnection, and 0x0-0x2 indicates data interaction.

1. Data fragmentation

Each WebSocket message may be split into multiple data frames. When the receiver gets a data frame, it uses the value of FIN to decide whether the last data frame of the message has arrived.

FIN=1 means the current data frame is the last one of the message; the receiver now has the complete message and can process it. FIN=0 means the receiver must keep listening for the remaining data frames.

In addition, in a data exchange the opcode indicates the type of data: 0x01 means text and 0x02 means binary. 0x00 is special: it marks a continuation frame, which, as the name suggests, means that the data frames of the complete message have not all been received yet.

2. Example of data fragmentation

An example makes this more concrete. The following example from MDN illustrates fragmentation well. The client sends two messages to the server, and the server responds to the client after receiving each. The focus here is on the messages sent from the client to the server.

First message

FIN=1 means this is the last data frame of the current message; once the server receives it, it can process the message. opcode=0x1 means the client is sending a text message.

Second message

  1. FIN=0, opcode=0x1: the message is of type text, has not been fully sent yet, and more data frames follow.
  2. FIN=0, opcode=0x0: the message has not been fully sent yet and more data frames follow; the current frame's data is appended to that of the previous frame.
  3. FIN=1, opcode=0x0: the message has been fully sent and no more data frames follow; the current frame's data is appended to that of the previous frame. The server can now assemble the related data frames into the complete message.

 

 

Client: FIN=1, opcode=0x1, msg="hello"
Server: (process complete message immediately) Hi.
Client: FIN=0, opcode=0x1, msg="and a"
Server: (listening, new message containing text started)
Client: FIN=0, opcode=0x0, msg="happy new"
Server: (listening, payload concatenated to previous message)
Client: FIN=1, opcode=0x0, msg="year!"
Server: (process complete message) Happy new year to you too!
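The receiver's side of this exchange can be sketched as a small reassembler. The name createAssembler and the frame objects (with the payload already decoded to a string) are assumptions for illustration; the sketch also ignores control frames, which a real implementation must accept in between fragments.

```javascript
// Hypothetical reassembler for fragmented messages.
// Each frame is { fin, opcode, payload }, with payload as a string here.
function createAssembler() {
  let parts = null; // fragments of the message currently being received

  return function onFrame(frame) {
    if (frame.opcode !== 0x0) {
      parts = [];                 // a text/binary frame starts a new message
    } else if (parts === null) {
      throw new Error('continuation frame without a started message');
    }
    parts.push(frame.payload);

    if (!frame.fin) return null;  // FIN=0: keep listening for more frames
    const message = parts.join('');
    parts = null;
    return message;               // FIN=1: the message is complete
  };
}

const onFrame = createAssembler();
console.log(onFrame({ fin: true,  opcode: 0x1, payload: 'hello' }));     // hello
console.log(onFrame({ fin: false, opcode: 0x1, payload: 'and a' }));     // null
console.log(onFrame({ fin: false, opcode: 0x0, payload: 'happy new' })); // null
console.log(onFrame({ fin: true,  opcode: 0x0, payload: 'year!' }));     // and ahappy newyear!
```

The fragments are concatenated exactly as sent, so any spaces between the pieces would have to be part of the payloads themselves.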

 

VII. Keeping the connection alive + heartbeat

To maintain real-time two-way communication, WebSocket needs to keep the TCP channel between client and server open. However, a connection that stays open for a long time without any data exchange may waste connection resources.

Still, in some scenarios the client and server need to stay connected even though no data has been exchanged for a long time. A heartbeat can be used for this.

  • Sender -> Receiver: ping
  • Receiver -> Sender: pong

Ping and pong correspond to two WebSocket control frames, with opcodes 0x9 and 0xA respectively.

For example, a WebSocket server can ping a client with the following code (using the ws module; the three-argument form shown here follows the signature of older ws releases, so check the API of the version you use):

ws.ping('', false, true);
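A common way to build a heartbeat on top of ping/pong is to remember, for each connection, whether a pong has arrived since the last ping. The sketch below shows only that bookkeeping. The isAlive flag and the function names are assumptions for this example, while ping(), terminate(), and the 'pong' event are provided by connection objects in the ws module. The 30-second interval in the comments is arbitrary.

```javascript
// Hypothetical heartbeat bookkeeping for ws-style connection objects.

// Call once for every new connection.
function trackConnection(ws) {
  ws.isAlive = true;
  ws.on('pong', function () {
    ws.isAlive = true;            // the peer answered our last ping
  });
}

// Call periodically, for example every 30 seconds.
function checkHeartbeats(clients) {
  for (const ws of clients) {
    if (!ws.isAlive) {
      ws.terminate();             // no pong since the last ping: drop it
      continue;
    }
    ws.isAlive = false;           // set back to true when the pong arrives
    ws.ping();
  }
}

// Wiring with the ws module (not executed here):
// wss.on('connection', trackConnection);
// setInterval(function () { checkHeartbeats(wss.clients); }, 30000);
```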

 

VIII. The role of Sec-WebSocket-Key/Accept

As mentioned earlier, Sec-WebSocket-Key/Sec-WebSocket-Accept mainly provides basic protection against malicious and accidental connections.

Their roles can be summarized as follows:

  1. Prevent the server from accepting illegitimate WebSocket connections (for example, if an HTTP client accidentally requests a connection to the WebSocket service, the server can reject it outright).
  2. Make sure the server understands WebSocket connections. Because the WS handshake uses HTTP, the WS connection might be handled and answered by a plain HTTP server; the client can use Sec-WebSocket-Key to make sure the server actually knows the WS protocol. (This is not 100% safe: there will always be some bored HTTP server that handles Sec-WebSocket-Key without implementing the WS protocol...)
  3. When sending Ajax requests from the browser, setting Sec-WebSocket-Key and related headers is forbidden. This prevents a client from accidentally requesting a protocol upgrade (WebSocket upgrade) while sending an Ajax request.
  4. Prevent a reverse proxy (one that does not understand the WS protocol) from returning wrong data. For example, the reverse proxy receives two WS upgrade requests, caches the response to the first, and then returns the cached response directly when the second arrives (a meaningless response).
  5. The main purpose of Sec-WebSocket-Key is not to secure the data, because the formula for converting Sec-WebSocket-Key to Sec-WebSocket-Accept is public and very simple; its main role is to prevent common accidents (unintentional ones).

Note: the Sec-WebSocket-Key/Sec-WebSocket-Accept exchange provides only a basic safeguard; it does not actually guarantee that the connection is secure, that the data is secure, or that the client and server are legitimate WS clients and servers.

IX. The role of data masking

In the WebSocket protocol, data masking strengthens the security of the protocol. But masking is not meant to protect the data itself, because the algorithm is public and the operation is simple. Apart from encrypting the channel itself, there do not seem to be many effective ways to protect the communication.

So why introduce masking at all? It seems to bring little benefit beyond extra computation (which is exactly what puzzles many readers).

The answer is still one word: security. Not to prevent data leaks, but to prevent issues such as the proxy cache poisoning attacks that existed in earlier versions of the protocol.

1. Proxy cache poisoning attacks

Below is an excerpt from a 2010 paper on security, which describes the security problems that flaws in proxy servers' protocol implementations can cause.

“We show, empirically, that the current version of the WebSocket consent mechanism is vulnerable to proxy cache poisoning attacks. Even though the WebSocket handshake is based on HTTP, which should be understood by most network intermediaries, the handshake uses the esoteric “Upgrade” mechanism of HTTP [5]. In our experiment, we find that many proxies do not implement the Upgrade mechanism properly, which causes the handshake to succeed even though subsequent traffic over the socket will be misinterpreted by the proxy.”

[TALKING] Huang, L-S., Chen, E., Barth, A., Rescorla, E., and C. Jackson, "Talking to Yourself for Fun and Profit", 2010.

Before formally describing the attack steps, we assume the following actors:

  • The attacker, a server controlled by the attacker (the “evil server”), and a resource forged by the attacker (the “evil resource”)
  • The victim, and the resource the victim wants to access (the “justice resource”)
  • The server the victim actually wants to access (the “justice server”)
  • An intermediate proxy server

Attack Step 1:

  1. The attacker browser initiates a WebSocket connection to the evil server. According to the previous section, the first is a protocol upgrade request.
  2. The protocol upgrade request actually reaches the proxy server.
  3. The proxy server forwards protocol upgrade requests to the evil server.
  4. The evil server agrees to the connection, and the proxy server forwards the response to the attacker.

Because of a flaw in its Upgrade implementation, the proxy server believes it has been forwarding ordinary HTTP messages. So once the evil server agrees to the connection, the proxy server assumes that the session is over.

Attack Step 2:

  1. The attacker sends data to the evil server over the previously established connection through the WebSocket interface; the data is carefully constructed HTTP-formatted text. It contains the address of the justice resource and a forged Host header (pointing to the justice server). (See the message at the end of this subsection.)
  2. The request reaches the proxy server. Although it reuses the earlier TCP connection, the proxy server thinks it is a new HTTP request.
  3. The proxy server requests the evil resource from the evil server.
  4. The evil server returns the evil resource. The proxy server caches the evil resource (the URL is right, but the Host is the address of the justice server).

Here’s where the victim comes in:

  1. The victim accesses justice resources on the Justice server through a proxy server.
  2. The proxy server checks the url and host of the resource and finds a local cache (forged).
  3. Proxy servers return evil resources to victims.
  4. The victim is compromised.

Attached: the carefully constructed “HTTP request message” mentioned earlier.

 

Client → Server:
POST /path/of/attackers/choice HTTP/1.1
Host: host-of-attackers-choice.com
Sec-WebSocket-Key:
Server → Client:
HTTP/1.1 200 OK
Sec-WebSocket-Accept:

 

2. Current solution

The original proposal was to encrypt the data. Weighing security against efficiency, a compromise was finally adopted: masking the data payload.

Note that this only requires browsers to mask the data payload. An attacker can perfectly well implement their own WebSocket client and server that do not play by the rules, and the attack can still be carried out.

But imposing this restriction on browsers greatly increases the difficulty of the attack and limits its reach. Without it, an attacker would only need to put a phishing website online and trick people into visiting it to launch a large-scale attack in a short time.

X. Closing remarks

There is a lot more that could be written about WebSocket, such as WebSocket extensions and how clients and servers negotiate and use them. Extensions can add many capabilities to the protocol, such as data compression, encryption, and multiplexing.

Space is limited, so I won't expand on it here; interested readers can leave a comment to discuss. Please point out any mistakes or omissions in the article.

XI. Related links

RFC6455: WebSocket specification tools.ietf.org/html/r…

Specification: Data frame mask details tools.ietf.org/html/r…

Specification: Data frame format tools.ietf.org/html/r…

Server – example github.com/websockets…

Writing WebSocket servers developer.mozilla.org…

Attacks on network infrastructure (something to be prevented by data mask operations) tools.ietf.org/html/r…

Talking to Yourself for Fun and Profit w2spconf.com/2011/pape…

What is Sec-WebSocket-Key for? stackoverflow.com/que…

10.3. Attacks On Infrastructure (Masking) tools.ietf.org/html/r…

Why are WebSockets masked? stackoverflow.com/que…

How does websocket frame masking protect against cache poisoning? security.stackexchang…

What is the mask in a WebSocket frame? stackoverflow.com/que…