HTTP protocol pain points
1. HTTP scalability
In the old architecture, HTTP-1.0 was sufficient for requesting a single document from the server, but as the Web evolved, pages became more interactive and demanded shorter browser request and server response times.
In HTTP-1.0, each server request required a separate connection, which did not scale well.
The next revision of HTTP, HTTP-1.1, added reusable (persistent) connections. HTTP-1.1 reduces request latency by reducing the number of connections the client has to open to the server.
2. HTTP is stateless
Stateless protocols have some advantages: each request is independent, and the server does not need to store information about the session, which saves a great deal of storage space.
However, this also means that redundant information is sent in every HTTP request and response to tell the server the client's “identity information,” as shown in Figure 1 below. This is not required for the WebSocket connection in Figure 2.
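To put rough numbers on that overhead, here is a minimal sketch that compares the size of one small HTTP request's headers with the size of a WebSocket frame header. The header names and values in `sampleHttpRequest` are invented for illustration only. The frame header sizes assume RFC 6455 framing: 2 base bytes, plus 0, 2, or 8 bytes of extended length, plus a 4-byte masking key on client-to-server frames.

```javascript
// Hypothetical request headers, chosen only for illustration.
const sampleHttpRequest = [
  'GET /updates HTTP/1.1',
  'Host: www.xxxx.com',
  'User-Agent: ExampleBrowser/1.0',
  'Accept: application/json',
  'Cookie: session=abc123',
  '',
  '',
].join('\r\n');

// Size in bytes of a WebSocket frame header (assuming RFC 6455 framing).
function frameHeaderSize(payloadLength, masked) {
  let size = 2; // FIN/RSV/opcode byte + mask/length byte
  if (payloadLength > 65535) size += 8; // 64-bit extended length
  else if (payloadLength > 125) size += 2; // 16-bit extended length
  if (masked) size += 4; // masking key
  return size;
}

const httpOverhead = Buffer.byteLength(sampleHttpRequest, 'ascii');
const wsOverhead = frameHeaderSize(5, true);
console.log(`HTTP request headers: ${httpOverhead} bytes`);
console.log(`WebSocket frame header: ${wsOverhead} bytes`);
```

Real browsers usually send far more headers than this sample, so the gap in practice tends to be larger.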
3. HTTP is not “addressable”
The client can find a server resource through its URL, but the server application has no URL-like identifier with which to find the client and push a resource to it, which makes web communication asymmetric.
Reversing the notification process: one way around this limitation is for the client to issue an HTTP request that the server answers when it has something to report. The umbrella term for these techniques is 'Comet' (essentially polling, long polling, and HTTP streaming).
Polling: a timed, synchronous call in which the client sends a request to the server to see whether new information is available. Requests are made at regular intervals whether or not information is available, and the client always gets a response: if information is available, the server sends it; if not, the server returns a negative response and the client closes the connection.
    let res;
    const timer = setInterval(() => {
      // A finished XMLHttpRequest must be re-opened before it can be sent
      // again, so each poll creates a fresh one.
      const xhr = new XMLHttpRequest();
      xhr.open('GET', 'http://www.xxxx.com', true);
      xhr.onreadystatechange = function () {
        if (xhr.readyState === 4 && xhr.status === 200) {
          if (xhr.responseText) {
            res = xhr.responseText;
            clearInterval(timer); // stop polling once data has arrived
          }
        }
      };
      xhr.send();
    }, 100000);
Disadvantages: the polling interval has to be chosen in advance, but the arrival of real-time data is unpredictable, so polling makes many unnecessary requests.
Long polling: another popular communication method, in which the client requests information from the server and the connection is held open for a set period of time. If the server has no information, it keeps the request open until information becomes available or until the specified timeout expires. At that point, the client re-requests information from the server.
Long polling is also called reverse AJAX. The technique of delaying the completion of an HTTP response until the server has something to send to the client is often referred to as a “hanging GET” or “pending POST.” Note that long polling has no significant performance advantage over traditional polling when the server data changes quickly, because the client must reconnect to the server frequently to read new information, so network performance is no better than with polling. Another problem with long polling is the lack of a standard implementation.
Disadvantages: as with polling, unnecessary requests, plus the lack of a standard implementation.
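The long-polling loop described above can be sketched as follows. This is an illustration under stated assumptions, not a standard implementation (none exists, as noted). `request` is a hypothetical async function supplied by the caller that resolves with `{ timedOut: true }` when the server's hold time expires, or with `{ timedOut: false, data }` when the server has something to send. `maxRounds` is also an assumption, added only so that the sketch terminates.

```javascript
// Long-polling loop: re-issue the request as soon as the previous one ends.
async function longPoll(request, onMessage, maxRounds) {
  for (let round = 0; round < maxRounds; round++) {
    let response;
    try {
      response = await request(); // held open by the server until data or timeout
    } catch (err) {
      // Transport failure: stop here; a real client would back off and retry.
      return { rounds: round, error: err };
    }
    if (!response.timedOut) onMessage(response.data);
    // Either way, the next iteration reconnects immediately.
  }
  return { rounds: maxRounds, error: null };
}

// Demo with canned replies standing in for a server.
const cannedReplies = [{ timedOut: true }, { timedOut: false, data: 'price: 42' }];
let nextReply = 0;
longPoll(async () => cannedReplies[nextReply++], (data) => console.log('got', data), 2);
```

In a browser, `request` could wrap `fetch()` or an `XMLHttpRequest`; it is injected here so that the loop logic can be read and tested without a network.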
HTTP streaming: in streaming, the client sends a request, and the server sends and maintains an open response that is updated continuously (either indefinitely or for a specified period of time). Whenever the server has information to deliver to the client, it updates the response.
Streaming is a good fit for unpredictable information delivery, but the server never signals that the HTTP response is complete, so the connection stays open. In this situation, proxies and firewalls may buffer the response, which delays delivery of the information. As a result, many streaming solutions are unfriendly to networks with firewalls and proxies.
Disadvantages: Special circumstances can lead to increased latency
Flash-based: Adobe Flash exchanges data through its own Socket implementation and exposes the corresponding interface for JavaScript to call, which achieves real-time transmission. This method is more efficient than polling and, because Flash was so widely installed, it applied to a wide range of scenarios. However, Flash was poorly supported on mobile devices: iOS did not support it at all, and although Android did, it performed poorly in practice and demanded a high hardware configuration. In 2012, Adobe officially announced that it would no longer support Flash on Android 4.1+, which marked the death of Flash on mobile devices.
Of course, these are all workarounds for HTTP-1.0 and HTTP-1.1. HTTP-2 already implements server push by creating “streams” (a single TCP connection carries multiple streams to distinguish different requests).
The birth of WebSocket
WebSocket, standardized by the Internet Engineering Task Force (IETF), is a natively full-duplex, bidirectional, single-socket connection.
Advantages
1. Reduced latency: unlike polling, WebSocket makes only one request to set up the connection. After that, the server does not need to wait for a request from the client before sending, and the client can likewise send a message to the server at any time. Compared with polling, which sends a request at every interval regardless of whether a message is available, this greatly reduces latency.
2. Simplicity: one connection carries many messages. Unlike HTTP, WebSocket does not have to re-establish 'context' for each message; to maintain that context, HTTP has to carry a 'huge' header in every request and response.
3. Server push: as shown below
Communication mechanism
Establishing the connection
Just as TCP requires a handshake, each WebSocket connection starts with an HTTP request. This request is similar to other requests but contains a special header: Upgrade. The Upgrade header indicates that the client wants to upgrade the connection to a different protocol, in this case WebSocket.
A 101 response status indicates that the upgrade succeeded. Besides ws there is also wss; the relationship between ws and wss mirrors that between HTTP and HTTPS, with wss running over TLS on top of TCP/IP.
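As an illustration of the opening handshake, the sketch below builds the text of a client upgrade request and checks a response status line for 101. The helper names `buildUpgradeRequest` and `isUpgradeAccepted`, the host, and the `/chat` path are assumptions made for this example. The header set is the minimal one required by RFC 6455; browsers add more (for example Origin).

```javascript
const nodeCrypto = require('crypto');

// Builds the text of a client opening handshake (illustrative helper).
function buildUpgradeRequest(host, path) {
  // Sec-WebSocket-Key is a random 16-byte nonce, base64-encoded.
  const key = nodeCrypto.randomBytes(16).toString('base64');
  const request = [
    `GET ${path} HTTP/1.1`,
    `Host: ${host}`,
    'Upgrade: websocket',
    'Connection: Upgrade',
    `Sec-WebSocket-Key: ${key}`,
    'Sec-WebSocket-Version: 13',
    '',
    '',
  ].join('\r\n');
  return { key, request };
}

// True when a response status line signals a successful upgrade.
function isUpgradeAccepted(statusLine) {
  return /^HTTP\/1\.1 101\b/.test(statusLine);
}

// Placeholder host and path.
const { request } = buildUpgradeRequest('www.xxxx.com', '/chat');
console.log(request);
console.log(isUpgradeAccepted('HTTP/1.1 101 Switching Protocols'));
```

In a browser none of this is written by hand: `new WebSocket(url)` performs the handshake itself.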
Protocol negotiation: calculating the response key
To complete the handshake, a WebSocket server must respond with a computed key, which proves that it actually understands the WebSocket protocol. Without this step, a client could be fooled by a credulous HTTP server that accidentally “upgrades” the connection. The response process is as follows: the server reads the key from the Sec-WebSocket-Key header sent by the client, computes the value the client expects, and returns it in the Sec-WebSocket-Accept header.
    const crypto = require('crypto');

    // GUID fixed by RFC 6455; it must be used exactly as written (uppercase).
    const KEY_SUFFIX = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

    const hashWebSocketKey = function (key) {
      // crypto.createHash returns a hash object for the SHA-1 algorithm
      const sha1 = crypto.createHash('sha1');
      // Update the hash with the client key followed by the GUID
      sha1.update(key + KEY_SUFFIX, 'ascii');
      // The Sec-WebSocket-Accept value is the base64-encoded digest
      return sha1.digest('base64');
    };
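As a self-contained check of the key calculation, the sketch below computes Sec-WebSocket-Accept for the sample key given in RFC 6455 and wraps it in a minimal 101 response. The names `computeAcceptKey` and `buildHandshakeResponse` are chosen for this example and are not part of any library.

```javascript
const nodeCrypto = require('crypto');

// GUID fixed by RFC 6455.
const WS_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Sec-WebSocket-Accept = base64(SHA-1(client key + GUID)).
function computeAcceptKey(clientKey) {
  return nodeCrypto
    .createHash('sha1')
    .update(clientKey + WS_GUID, 'ascii')
    .digest('base64');
}

// Minimal server handshake response (illustrative helper).
function buildHandshakeResponse(clientKey) {
  return [
    'HTTP/1.1 101 Switching Protocols',
    'Upgrade: websocket',
    'Connection: Upgrade',
    `Sec-WebSocket-Accept: ${computeAcceptKey(clientKey)}`,
    '',
    '',
  ].join('\r\n');
}

// Sample key from the handshake example in RFC 6455.
console.log(computeAcceptKey('dGhlIHNhbXBsZSBub25jZQ=='));
// → s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

The client performs the same calculation on the key it sent and rejects the connection if the values differ.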
For the other Sec- headers defined by RFC 6455, see the RFC specification.
Transmitting data
1. Message Format - Protocol frames
Once the WS connection is established, the client and server can send data to each other at any time. What format does that data take?
The smallest unit of a WS message is a frame, which is generally represented in binary. In early drafts of the protocol, 'frame' and 'message' were used interchangeably because a message usually contained only one frame, but a message can in fact consist of multiple frames.
Protocol frames are divided into control frames and data frames, as shown in the following figure
- FIN: 1 bit. Indicates whether this is the last frame of the message. The first frame may also be the last frame.
- RSV1, RSV2, RSV3: 1 bit each. Extension fields; they must be 0 unless an extension has been negotiated that gives meaning to non-zero values.
- Opcode: 4 bits. Defines how the payload data is interpreted. If an unrecognized opcode is received, the connection must be closed.
    0x0: continuation frame
    0x1: text frame
    0x2: binary frame
    0x3-7: reserved for further non-control frames
    0x8: connection close frame
    0x9: ping frame
    0xA: pong frame
    0xB-F: reserved for further control frames
- Mask: 1 bit. Indicates whether the payload data is masked. If it is 1, the Masking-key field holds the masking key used to decode the payload data. The protocol requires data sent by the client to be masked, so this bit is 1 for client frames.
- Payload len: 7 bits, 7 + 16 bits, or 7 + 64 bits. The length of the payload data, in bytes.
    0 to 125: the value itself is the payload length.
    126 (0x7E): the next 2 bytes, read as a 16-bit unsigned integer, are the payload length.
    127 (0x7F): the next 8 bytes, read as a 64-bit unsigned integer, are the payload length.
- Masking-key: 0 or 4 bytes. Present only when the Mask bit is 1.
- Payload data: (x + y) bytes, the sum of Extension data and Application data. The Extension data is usually empty.
- Extension data: x bytes; 0 unless an extension has been negotiated.
- Application data: y bytes, occupying all the space after the Extension data.
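The field layout above can be exercised with a short decoder. This is a minimal sketch that assumes the whole frame is already in one Buffer and ignores the RSV bits and fragmentation; `decodeFrame` is a name chosen for this example. The sample bytes are the masked single-frame "Hello" text message from the examples in RFC 6455.

```javascript
// Decodes one complete WebSocket frame from a Buffer (sketch only).
function decodeFrame(buf) {
  const fin = (buf[0] & 0x80) !== 0; // top bit of byte 0
  const opcode = buf[0] & 0x0f; // low 4 bits of byte 0
  const masked = (buf[1] & 0x80) !== 0; // top bit of byte 1
  let length = buf[1] & 0x7f; // low 7 bits of byte 1
  let offset = 2;

  if (length === 126) {
    length = buf.readUInt16BE(offset); // 16-bit extended length
    offset += 2;
  } else if (length === 127) {
    // 64-bit extended length; Number is exact only up to 2^53 - 1.
    length = Number(buf.readBigUInt64BE(offset));
    offset += 8;
  }

  let payload;
  if (masked) {
    const maskingKey = buf.subarray(offset, offset + 4);
    offset += 4;
    payload = Buffer.alloc(length);
    for (let i = 0; i < length; i++) {
      // Each payload byte is XORed with the key byte at position i mod 4.
      payload[i] = buf[offset + i] ^ maskingKey[i % 4];
    }
  } else {
    payload = Buffer.from(buf.subarray(offset, offset + length));
  }
  return { fin, opcode, masked, payload };
}

const sampleFrame = Buffer.from([
  0x81, 0x85, 0x37, 0xfa, 0x21, 0x3d, 0x7f, 0x9f, 0x4d, 0x51, 0x58,
]);
const decoded = decodeFrame(sampleFrame);
console.log(decoded.opcode, decoded.payload.toString('utf8')); // → 1 Hello
```

A production parser would also have to handle frames that arrive split across several TCP reads, which this sketch does not attempt.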
Closing the connection
A WebSocket connection can be closed at any time, but not always through a call to ws.close(). The underlying TCP socket can also close abruptly (for example, when the network cable is unplugged, the process is killed, or the application exits; these are of course extreme cases). Letting an application tell intentional closure from accidental termination is therefore key to closing gracefully.
When a WebSocket is closed, the terminating endpoint (client or server) can send a numeric code along with a string explaining why it chose to close the socket. The code is a 16-bit unsigned integer, and the reason is a short UTF-8 encoded string.
RFC 6455 defines a number of specific close codes. Codes 1000 to 1015 are specified for the WebSocket connection layer. Most of these codes indicate some failure in the network or in the protocol itself (1000 denotes a normal closure). The following figure lists the codes, their descriptions, and the situations in which each one applies.
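The code-plus-reason layout of a close frame payload can be sketched as follows. `encodeClosePayload` and `decodeClosePayload` are names chosen for this example. The sketch does not enforce the limit RFC 6455 places on control frame payloads (125 bytes, which leaves 123 bytes for the reason).

```javascript
// Builds a close frame payload: a 16-bit unsigned status code in
// network byte order, followed by a UTF-8 encoded reason.
function encodeClosePayload(code, reason) {
  const reasonBytes = Buffer.from(reason, 'utf8');
  const payload = Buffer.alloc(2 + reasonBytes.length);
  payload.writeUInt16BE(code, 0);
  reasonBytes.copy(payload, 2);
  return payload;
}

// Reads the status code and reason back out of a close frame payload.
function decodeClosePayload(payload) {
  if (payload.length < 2) {
    // An empty close payload carries no status code.
    return { code: null, reason: '' };
  }
  return {
    code: payload.readUInt16BE(0),
    reason: payload.subarray(2).toString('utf8'),
  };
}

// 1000 is the code for a normal closure.
const closePayload = encodeClosePayload(1000, 'done');
console.log(decodeClosePayload(closePayload));
```

In the browser API the same two values surface as `event.code` and `event.reason` on the close event.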
Extension: relationship with sockets
A socket is not actually a protocol. It works at the session layer (Layer 5) of the OSI model and is an abstraction layer that makes it easier to use lower-level protocols (typically TCP or UDP) directly. Sockets use TCP/IP to establish TCP connections; TCP relies on the underlying IP protocol, and IP in turn relies on the link layer and the other lower layers. Sockets are mostly available in programming languages such as Java or C++, but not in browser-based scripting languages such as JavaScript. The WebSocket protocol is supported by most browsers, which gives us socket-like, TCP-based communication in the browser.
WebSocket API-W3C
Developer.mozilla.org/zh-CN/docs/…
The constructor
WebSocket
Instance attributes
- onclose
- onopen
- onmessage
- onerror
- .
API
    // The URL is a placeholder; the WebSocket constructor requires one.
    const socket = new WebSocket('ws://www.xxxx.com');

    socket.onmessage = function (event) {
      console.log('response message', event.data);
    };
    socket.onopen = function (event) {
      console.log('connection open');
    };
    socket.onclose = function (event) {
      console.log('connection closed');
    };
    socket.onerror = function (event) {
      console.error('connection error');
    };
Demo
References
www.pubnub.com/blog/http-l…
Tools.ietf.org/html/rfc645…
zhuanlan.zhihu.com/p/72289051
Github.com/wettper/exp…
www.alloyteam.com/2017/01/htt…
The Definitive Guide to HTML5 WebSocket