This article may not be republished without permission. It is updated continuously.
The HTTP protocol
Protocol: the rules or conventions that two computers in a communication network must follow in order to communicate with each other, so that messages are sent and received in an agreed, predetermined format.
To transfer web content between the client and the server, both sides must comply with the web content transfer protocol.
Web content is also called hypertext, so the transfer protocol for web content is called the Hypertext Transfer Protocol (HTTP). It is a simple request-response protocol that typically runs on top of TCP. It specifies what messages the client may send to the server and what responses it may get back.
HTTP is divided into two parts, request and response: the client initiates a request and the server responds to it.
Request
The client makes a request to the server in the form of a request line, request headers, and a request body.
A GET request
// -------------------- request line --------------------
GET /day02/01.php?username=lw&password=123456 HTTP/1.1
// GET: request method
// /day02/01.php?username=lw&password=123456: request path + parameters
// HTTP/1.1: HTTP version
// -------------------- request headers --------------------
// Host: the host the request is addressed to
Host: www.study.com
// Persistent connections are the default in HTTP/1.1: after a response, the TCP connection stays open and can be reused for the next request
Connection: keep-alive
// Tells the server that the browser prefers a secure (HTTPS) version of the page
Upgrade-Insecure-Requests: 1
// User agent string of the browser (version information)
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.96 Safari/537.36
// Content types the browser accepts
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
// The page from which the request was made
Referer: http://www.study.com/day02/01-login.html
// Compression methods supported by the browser
Accept-Encoding: gzip, deflate, sdch
// Languages the browser accepts; Chinese is preferred
Accept-Language: zh-CN,zh;q=0.8,en;q=0.6
// -------------------- request body --------------------
// A GET request normally has no request body; its parameters are appended to the path in the request line
A POST request
// -------------------- request line --------------------
POST /day02/01.php HTTP/1.1
// -------------------- request headers --------------------
Host: www.study.com
Connection: keep-alive
// Length of the request body in bytes
Content-Length: 27
Cache-Control: max-age=0
Origin: http://www.study.com
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.96 Safari/537.36
// Content type of the body: URL-encoded form data. A POST request that carries a body should specify it.
Content-Type: application/x-www-form-urlencoded
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Referer: http://www.study.com/day02/01-login.html
Accept-Encoding: gzip, deflate
Accept-Language: zh-CN,zh;q=0.8,en;q=0.6
// -------------------- request body --------------------
username=lw&password=123456
GET versus POST requests:
- A GET request has no body, because its parameters are appended to the URL as a query string.
- A POST request has a request body, which carries the parameters being passed.
- A POST request needs a Content-Type header that describes the format of its body.
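To make the difference concrete, here is a minimal Python sketch that builds both kinds of raw request from the same parameters. The helper names `build_get` and `build_post` are made up for this illustration, and only the headers needed to show the point are included; a real browser sends many more.

```python
from urllib.parse import urlencode


def build_get(path: str, host: str, params: dict) -> bytes:
    """GET: the parameters travel in the request line, and there is no body."""
    query = urlencode(params)
    return (
        f"GET {path}?{query} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "\r\n"
    ).encode("ascii")


def build_post(path: str, host: str, params: dict) -> bytes:
    """POST: the parameters travel in the body, described by two headers."""
    body = urlencode(params).encode("ascii")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"  # body length in bytes
        "\r\n"
    ).encode("ascii")
    return head + body


params = {"username": "lw", "password": "123456"}
print(build_get("/day02/01.php", "www.study.com", params).decode())
print(build_post("/day02/01.php", "www.study.com", params).decode())
```

The blank line (`\r\n\r\n`) marks the end of the headers in both cases; only the POST request has anything after it.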
Response
// -------------------- status line --------------------
HTTP/1.1 200 OK
// HTTP/1.1: HTTP version
// 200: response status code
// 200 indicates success
// 404 indicates that the resource cannot be found
// 500 indicates a server error
// -------------------- response headers --------------------
Date: Thu, 22 Jun 2017 16:51:22 GMT
Server: Apache/2.4.23 (Win32) OpenSSL/1.0.2j PHP/5.4.45
X-Powered-By: PHP/5.4.45
Content-Length: 20
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
// The content type tells the browser how to parse the response
Content-Type: text/html; charset=utf-8
// -------------------- response body --------------------
User login succeeded
Common HTTP status codes
- 1xx: Informational (provisional) response. The request has been received and the client should continue.
- 2xx: The request succeeded.
- 3xx: Redirection. The target of the request has changed and the client is expected to take further action.
  - 301: The resource has moved permanently.
  - 302: The resource has moved temporarily.
  - 304: The resource has not been modified, so the client can use its cached copy.
- 4xx: The client's request is incorrect.
  - 400: The request has a syntax error that the server cannot understand.
  - 401: The request is not authorized and requires authentication (typically the token is missing or invalid).
  - 403: The server received the request but refuses to execute it, usually because of insufficient permissions.
  - 404: The requested resource does not exist.
- 5xx: An error occurred on the server.
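The first digit of a status code is enough to tell which class it belongs to. A small sketch using Python's standard `http.HTTPStatus` enum follows; the `describe` helper and the `CLASSES` table are names invented for this example.

```python
from http import HTTPStatus

# Status class keyed by the first digit of the code.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return the code, its standard reason phrase, and its class."""
    category = CLASSES.get(code // 100, "unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # The code is not one of the registered codes known to the enum.
        phrase = "unregistered code"
    return f"{code} {phrase} ({category})"


for code in (200, 301, 304, 401, 404, 500):
    print(describe(code))
```

A client that meets a code it does not know can still fall back on the class, which is why the lookup by first digit is kept separate from the lookup of the exact code.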
Common HTTP request methods
- GET: typically used to visit a page or retrieve data.
- POST: typically used to submit data, such as a form.
- HEAD: similar to GET, except that the response has no body; it is used to retrieve only the headers.
- PUT: creates or replaces a resource.
- DELETE: deletes a resource.
- CONNECT: establishes a tunnel to the server identified by the request target. It is typically used through an HTTP proxy to carry SSL-encrypted traffic (HTTPS).
- OPTIONS: describes the communication options for the target resource, such as the HTTP methods that a given URL supports.
- TRACE: echoes back the request as the server received it, for testing and diagnostics.
Differences between HTTP/2 and HTTP/1.x
- HTTP/2 uses a binary format rather than a text format. Binary is more efficient to parse and less error-prone than text.
- HTTP/2 is fully multiplexed instead of ordered and blocking. A single connection is enough for parallelism: multiplexing means that multiple requests and responses can be in flight at the same time. With HTTP/1.x, a connection can handle only one request at a time, which is slower.
- HTTP/2 uses header compression to reduce overhead. HTTP/1.x headers are highly redundant; HTTP/2 represents the different parts of a header by indexes into a table, compresses strings with Huffman coding, and finally packs the result into frames.
- HTTP/2 allows the server to actively "push" responses into the client cache. For example, the server can push JS and CSS files without waiting for the client to parse the HTML and request them.
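The binary framing and header indexing described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a usable HTTP/2 encoder: `STATIC_TABLE` copies only five entries of the HPACK static table (RFC 7541, Appendix A), Huffman coding and the dynamic table are left out, and the function names are invented here.

```python
import struct

# Partial copy of the HPACK static table: (name, value) -> index.
STATIC_TABLE = {
    (":method", "GET"): 2,
    (":method", "POST"): 3,
    (":path", "/"): 4,
    (":scheme", "http"): 6,
    (":scheme", "https"): 7,
}

HEADERS_FRAME = 0x1  # frame type HEADERS
END_HEADERS = 0x4    # flag: this frame completes the header block


def encode_indexed(headers) -> bytes:
    """Encode fully indexed header fields: one byte each, high bit set."""
    out = bytearray()
    for field in headers:
        index = STATIC_TABLE[field]  # KeyError outside this partial table
        out.append(0x80 | index)
    return bytes(out)


def frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Prefix a payload with the 9-byte HTTP/2 frame header."""
    length = len(payload)
    header = struct.pack(
        "!BHBBI",
        length >> 16, length & 0xFFFF,  # 24-bit payload length
        frame_type,
        flags,
        stream_id & 0x7FFFFFFF,         # reserved bit + 31-bit stream id
    )
    return header + payload


block = encode_indexed([(":method", "GET"), (":scheme", "http"), (":path", "/")])
print(frame(HEADERS_FRAME, END_HEADERS, 1, block).hex())
```

Three common header fields shrink to three bytes, compared with dozens of bytes of text in HTTP/1.x, and the stream id in each frame header is what lets frames from different requests be interleaved on one connection.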
HTTP request process
DNS resolution turns the domain name into an IP address -> the server is located by that IP address -> a TCP three-way handshake is initiated -> once the TCP connection is established, the HTTP request is sent -> the server responds to the HTTP request and the browser receives the HTML -> the browser parses the HTML and requests the resources it references (such as JS, CSS, and images) -> the browser renders the page for the user
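The network part of the steps above (everything before parsing and rendering) can be sketched with plain sockets. This is a simplified illustration for `http://` URLs only: the helper names are invented, the request asks the server to close the connection so the response can be read until end of stream, and chunked transfer encoding is not handled.

```python
import socket
from urllib.parse import urlsplit


def build_request(url: str):
    """Return (host, port, raw GET request) for a plain http:// URL."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    raw = (
        f"GET {target} HTTP/1.1\r\n"
        f"Host: {parts.netloc}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    return parts.hostname, parts.port or 80, raw


def parse_response(raw: bytes):
    """Split a raw response into (status code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status = int(lines[0].split(" ", 2)[1])
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers, body


def fetch(url: str):
    host, port, request = build_request(url)
    # DNS resolution: domain name -> IP address.
    family, socktype, proto, _, address = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]
    with socket.socket(family, socktype, proto) as sock:
        sock.settimeout(10)
        sock.connect(address)   # the TCP three-way handshake happens here
        sock.sendall(request)   # send the HTTP request
        chunks = []
        while True:             # read the response until the server closes
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return parse_response(b"".join(chunks))
```

Calling `fetch("http://www.study.com/day02/01-login.html")` would walk through DNS lookup, connection setup, request, and response for the example host used in this article, assuming that host is reachable.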
The HTTPS protocol
HTTPS is an HTTP channel designed for security; in short, it is the secure version of HTTP. HTTPS is based on SSL (since succeeded by TLS). SSL sits between TCP/IP and the application-layer protocols and provides security support for data communication. The SSL protocol is divided into two layers. The SSL Record Protocol is built on a reliable transport protocol (such as TCP) and provides basic functions such as data encapsulation, compression, and encryption to higher-level protocols. The SSL Handshake Protocol is built on the SSL Record Protocol and is used, before any data is transmitted, for identity authentication, negotiation of the encryption algorithm, and exchange of encryption keys between the communicating parties.
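The results of the handshake can be inspected from Python's standard `ssl` module. The sketch below is only an illustration; `make_context` and `describe_tls` are names invented here, and the host passed in is up to the caller.

```python
import socket
import ssl


def make_context() -> ssl.SSLContext:
    """Default client settings: verify the certificate and the host name."""
    return ssl.create_default_context()


def describe_tls(host: str, port: int = 443) -> dict:
    """Connect, run the handshake, and report what was negotiated."""
    context = make_context()
    with socket.create_connection((host, port), timeout=10) as tcp:
        # wrap_socket performs the handshake: the server is authenticated,
        # the cipher is negotiated, and the session keys are exchanged.
        with context.wrap_socket(tcp, server_hostname=host) as tls:
            return {
                "protocol": tls.version(),
                "cipher": tls.cipher()[0],
                "subject": tls.getpeercert().get("subject"),
            }
```

The returned fields line up with the three jobs of the handshake layer: `subject` comes from identity authentication, `cipher` from algorithm negotiation, and the fact that the call returns at all means key exchange succeeded.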
Difference between HTTP and HTTPS
- HTTPS requires a certificate from a Certificate Authority (CA). Free certificates are relatively rare, so this usually has a cost. (For example, the NetEase official website used to be served over HTTP, while NetEase mail used HTTPS.)
- HTTP is the Hypertext Transfer Protocol and transmits information in plain text. HTTPS is a secure transfer protocol encrypted with SSL.
- HTTP and HTTPS connect in completely different ways and use different ports: 80 for the former and 443 for the latter.
- HTTP connections are simple and stateless. HTTPS is a network protocol built from SSL and HTTP that provides encrypted transmission and identity authentication, and it is more secure than HTTP. Stateless means that each packet is sent, transmitted, and received independently of the others. Connectionless means that neither party keeps any information about the other for long.
TCP network protocol
TCP is a connection-oriented, reliable, byte-stream-based transport-layer protocol. It provides full-duplex communication, in which data is transmitted independently in both directions. Each TCP connection has exactly two endpoints, so a TCP connection is always point-to-point.
TCP three-way handshake
- First handshake: The client sends a SYN packet to the server, which is passively open (listening), and waits for the server to confirm.
- Second handshake: The server receives the SYN packet, acknowledges the client's SYN, and sends its own SYN together with the ACK to the client.
- Third handshake: After receiving the SYN+ACK packet from the server, the client sends an ACK packet to the server. The client and the server then enter the connected state, and the three-way handshake is complete.
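The three packets are tied together by sequence and acknowledgement numbers: each side acknowledges the other's initial sequence number plus one. A small simulation, with made-up initial sequence numbers and an invented `Segment` class (no real packets are sent):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    flags: str
    seq: int
    ack: Optional[int] = None


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three segments exchanged to open a connection."""
    # 1. Client -> server: SYN carrying the client's initial sequence number.
    syn = Segment("SYN", seq=client_isn)
    # 2. Server -> client: its own SYN plus an ACK of the client's SYN.
    syn_ack = Segment("SYN+ACK", seq=server_isn, ack=syn.seq + 1)
    # 3. Client -> server: ACK of the server's SYN.
    ack = Segment("ACK", seq=client_isn + 1, ack=syn_ack.seq + 1)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```

In a real stack the initial sequence numbers are chosen by the operating system; the fixed values here are only for readability.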
TCP’s four waves
Note: either the client or the server can initiate the close.
- First wave: Host 1 (either the client or the server) sends a FIN packet to host 2 and enters the FIN_WAIT_1 state. This indicates that host 1 has no more data to send to host 2.
- Second wave: Host 2 receives the FIN packet from host 1 and sends an ACK packet back, telling host 1 that it "agrees" to the close request. Host 1 enters the FIN_WAIT_2 state.
- Third wave: Host 2 sends a FIN packet to host 1 to close its side of the connection and enters the LAST_ACK state.
- Fourth wave: Host 1 receives the FIN packet from host 2 and sends an ACK packet to host 2; host 1 then enters the TIME_WAIT state. Host 2 closes the connection as soon as it receives the ACK packet from host 1. If host 1 receives nothing further after waiting for 2MSL, it closes the connection as well.
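The state changes on both sides can be written down as a small transition table. The sketch below replays the four waves from each host's point of view. `CLOSE_WAIT`, the state host 2 is in between the second and third wave, is not named in the list above but is the standard TCP state; the event names and the `replay` helper are invented for this example.

```python
# (current state, event) -> next state
TRANSITIONS = {
    # Host 1, the side that closes first
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",   # host 1 sends the final ACK
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
    # Host 2, the side that closes second
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT", # host 2 sends an ACK
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def replay(events, state="ESTABLISHED"):
    """Apply events in order and return every state passed through."""
    states = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        states.append(state)
    return states


print("host 1:", replay(["send FIN", "recv ACK", "recv FIN", "2MSL timeout"]))
print("host 2:", replay(["recv FIN", "send FIN", "recv ACK"]))
```

The gap between host 2's `recv FIN` and `send FIN` is where it can still finish sending its own data, which is why the close takes four segments rather than three.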
Advantages and disadvantages of TCP
- Advantages: TCP numbers the packets it sends so that they stay in order. After receiving a packet, the peer sends back an acknowledgement; if the sender does not receive it within a certain time, it automatically resends the packet.
- Disadvantages: Put simply, TCP is heavyweight. When the amount of data is small, connection setup accounts for a large share of the total cost, and repeated retransmission adds network delay. That is why video chat, for example, usually uses UDP: losing a few packets does not matter, but speed does.
Differences between TCP and UDP
UDP is a connectionless transport-layer protocol in the OSI reference model. It provides a simple, unreliable, transaction-oriented message delivery service.
Differences between the two:
- TCP (Transmission Control Protocol) is connection-oriented. UDP provides a connectionless datagram service and does not need to establish a connection before sending data.
- TCP provides a reliable service: data sent over a TCP connection arrives error-free, without loss or duplication, and in order. UDP does not guarantee reliable delivery.
- TCP is byte-stream-oriented and UDP is message-oriented. UDP has no congestion control, so network congestion does not reduce the sending rate of the source host (useful for real-time applications such as IP telephony and real-time video conferencing).
- Each TCP connection is point-to-point only. UDP supports one-to-one, one-to-many, many-to-one, and many-to-many communication.
- The TCP header costs at least 20 bytes; the UDP header has a small overhead of only 8 bytes.
- The logical communication channel of TCP is a full-duplex reliable channel; that of UDP is an unreliable channel.
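The header sizes follow directly from the fields each protocol carries. A sketch that packs both headers with Python's `struct` module shows where the 8 and 20 bytes come from. Checksums are left at zero and the window size is a placeholder, so these bytes are for illustration only and are not valid packets; the function names are invented here.

```python
import struct


def udp_header(src_port: int, dst_port: int, payload_len: int) -> bytes:
    """Four 16-bit fields: source port, destination port, length, checksum."""
    length = 8 + payload_len  # the length field covers header plus payload
    return struct.pack("!HHHH", src_port, dst_port, length, 0)


def tcp_header(src_port: int, dst_port: int, seq: int, ack: int, flags: int) -> bytes:
    """Minimal TCP header without options."""
    data_offset = 5 << 4  # header length: five 32-bit words = 20 bytes
    return struct.pack(
        "!HHIIBBHHH",
        src_port, dst_port,
        seq,          # sequence number, used for ordering
        ack,          # acknowledgement number, used for reliability
        data_offset,
        flags,        # for example 0x02 = SYN, 0x10 = ACK
        65535,        # window size (placeholder value)
        0,            # checksum (not computed in this sketch)
        0,            # urgent pointer
    )


print(len(udp_header(5000, 53, 12)))            # UDP header size in bytes
print(len(tcp_header(5000, 80, 100, 0, 0x02)))  # TCP header size in bytes
```

The extra TCP fields (sequence number, acknowledgement number, flags, window) are exactly the ones that implement ordering, retransmission, and connection state, none of which UDP provides.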
Why do TCP connections require three handshakes and four waves
- Why three handshakes
  To prevent an expired connection request segment from suddenly reaching the server and causing an error. Suppose the first connection request segment sent by the client is held up at a network node for a long time and only reaches the server after the connection has been released. The server receives this expired segment and mistakes it for a new connection request from the client, so it sends an acknowledgement agreeing to establish a connection. If only two handshakes were required, the connection would be established at this point, and the server would wait for data that the client never sends, wasting server resources. With three handshakes, the client ignores the unexpected acknowledgement and sends no ACK, so no connection is established.
- Why four waves
  TCP is full-duplex: the client and the server can both send data to each other, so closing the connection is a joint action that both sides must confirm. Suppose there were only three waves. When the client first releases the client-to-server direction, the connection is half-closed: the client can no longer send data to the server, but the server can still send data to the client. If the server had to combine its acknowledgement with its own close request and the client acknowledged that immediately, the connection would close while the server was still sending data, and the client would not receive all of the server's segments.
Components of the browser
- User interface: includes the address bar, forward/back buttons, bookmark menu, and so on. Every part of the browser display belongs to the user interface except the main window, where the requested page is shown.
- Browser engine: passes instructions between the user interface and the rendering engine.
- Rendering engine: responsible for displaying the requested content. If the requested content is HTML, it parses the HTML and CSS and displays the parsed content on the screen.
- Networking: used for network calls such as HTTP requests. Its interface is platform-independent, with underlying implementations for each platform.
- JavaScript interpreter: parses and executes JavaScript code. For example, Chrome's JavaScript engine is V8.
- UI backend: draws basic widgets such as combo boxes and windows. It exposes a generic, platform-independent interface and uses the operating system's user interface methods underneath.
- Data persistence: the persistence layer. The browser needs to save all kinds of data, such as cookies, on disk. The HTML5 specification defines a "web database", a complete (but lightweight) in-browser database.
Browser rendering process
- Parse the HTML to build the DOM tree, and parse the CSS to build the CSS rules.
- Combine the DOM tree and the CSS rules into the render tree.
- Layout: calculate the position and size of each node.
- Paint the page.
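The first two steps can be imitated in miniature with Python's standard `html.parser`. This is a toy model, not how a real engine works: it assumes well-formed markup with a closing tag for every element, ignores text and attributes, and uses a made-up rule format (tag name mapped to a dict of properties) in place of real CSS selectors. Layout and painting are not modeled.

```python
from html.parser import HTMLParser


class Node:
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []


class DomBuilder(HTMLParser):
    """Step 1: turn the tag stream into a tree of nodes (a tiny DOM)."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        if self.current.parent is not None:
            self.current = self.current.parent


def build_render_tree(node, rules):
    """Step 2: keep only the nodes that will be displayed."""
    if rules.get(node.tag, {}).get("display") == "none":
        return None  # the whole subtree is left out of the render tree
    children = []
    for child in node.children:
        subtree = build_render_tree(child, rules)
        if subtree is not None:
            children.append(subtree)
    return (node.tag, children)


builder = DomBuilder()
builder.feed("<html><head><title>t</title></head><body><p>Hi</p></body></html>")
builder.close()
print(build_render_tree(builder.root, {"head": {"display": "none"}}))
```

The point of the sketch is the difference between the two trees: `head` and `title` exist in the DOM tree, but a `display: none` rule keeps them out of the render tree, so they never reach layout or painting.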