Asked about the HTTP protocol in an interview? This article covers the relevant topics

Blog.csdn.net/qq_35116353…

HTTP: the Hypertext Transfer Protocol

HTTP uses the connection-oriented TCP as its transport-layer protocol, but HTTP itself is connectionless.

The request message

CRLF is a carriage return followed by a line feed (\r\n); it ends every line of the message header.

A request message that uses the GET method

GET /search?hl=zh-CN&source=hp&q=domety&aq=f&oq= HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, application/x-silverlight, application/x-shockwave-flash, */*
Referer: http://www.google.cn/
Accept-Language: zh-cn
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; TheWorld)
Host: www.google.cn
Connection: Keep-Alive
Cookie: PREF=ID=80a06da87be9ae3c:U=f7167333e2c3b714:NW=1:TM=1261551909:LM=1261551917:S=ybYcq2wpfefs4V9g; NID=31=ojj8d-IygaEtSxLgaJmqSjVhCspkviJrB6omjamNrSm8lZhKy_yMfO2M4QMRKcH1g0iQv9u-2hfBW7bUFwVh7pGaRUb0RnHcJU37y-FxlRugatx63JLv7CWMD6UB_O_r

A request message that uses the POST method

POST /search HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, application/x-silverlight, application/x-shockwave-flash, */*
Referer: http://www.google.cn/
Accept-Language: zh-cn
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; TheWorld)
Host: www.google.cn
Connection: Keep-Alive
Cookie: PREF=ID=80a06da87be9ae3c:U=f7167333e2c3b714:NW=1:TM=1261551909:LM=1261551917:S=ybYcq2wpfefs4V9g; NID=31=ojj8d-IygaEtSxLgaJmqSjVhCspkviJrB6omjamNrSm8lZhKy_yMfO2M4QMRKcH1g0iQv9u-2hfBW7bUFwVh7pGaRUb0RnHcJU37y-FxlRugatx63JLv7CWMD6UB_O_r

hl=zh-CN&source=hp&q=domety
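To make the layout of a request message concrete, here is a minimal Python sketch that splits a raw request into the request line, the header fields, and the body. The helper name parse_request is hypothetical, and the sample message is a shortened version of the POST example above (only two header fields are kept). It assumes a well-formed message and does no error handling.

```python
def parse_request(raw: bytes):
    """Split a raw HTTP/1.x request into method, target, version, headers, body."""
    # A blank line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # The first line is the request line: METHOD SP TARGET SP VERSION
    method, target, version = lines[0].split(" ", 2)

    # Every following line is "Name: value"; names are case-insensitive.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, body


# Shortened sample based on the POST example (an assumption, not a full capture).
raw = (
    b"POST /search HTTP/1.1\r\n"
    b"Host: www.google.cn\r\n"
    b"Connection: Keep-Alive\r\n"
    b"\r\n"
    b"hl=zh-CN&source=hp&q=domety"
)
method, target, version, headers, body = parse_request(raw)
print(method, target, version)
print(headers)
print(body)
```

For a GET request the same function returns an empty body, because the parameters travel in the request target instead.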

Request methods

OPTIONS: Asks the server to return all HTTP request methods that the resource supports. Sending an OPTIONS request with ‘*’ in place of the resource name tests whether the web server is functioning properly.
HEAD: Like GET, requests the specified resource from the server, except that the server does not return the body of the resource. The advantage is that you can retrieve “information about the resource” (meta information, or metadata) without transferring the entire content.
GET: Requests a representation of the specified resource. GET should only be used to read data and should not be used for operations that have “side effects”, for example in web applications. One reason is that GET requests may be issued arbitrarily by web spiders and similar crawlers. See also: safe methods.
POST: Submits data to the specified resource and asks the server to process it (for example, submitting a form or uploading a file). The data is included in the request body. The request may create a new resource, modify an existing resource, or both.
PUT: Uploads the latest content to the specified resource location.
DELETE: Requests that the server delete the resource identified by the Request-URI.
TRACE: Echoes back the request as the server received it; used mainly for testing or diagnosis.
CONNECT: Reserved in HTTP/1.1 for proxy servers that can switch the connection into a tunnel. Typically used for connections to SSL-encrypted servers through an unencrypted HTTP proxy server.

Although there are eight HTTP request methods, GET and POST are the ones most commonly used in practice, and the other request methods can also be realized indirectly through these two.
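The difference between GET, HEAD, and OPTIONS is easy to see against a small local server. The following is only a sketch using Python's standard library: DemoHandler, PAGE, and demo are hypothetical names, the server listens on a throwaway loopback port, and the Allow value is whatever this toy handler chooses to advertise.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

PAGE = b"<html>hello</html>"


class DemoHandler(BaseHTTPRequestHandler):
    def _send_head(self):
        # Status line and headers shared by GET and HEAD.
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_GET(self):
        self._send_head()
        self.wfile.write(PAGE)  # GET returns the body

    def do_HEAD(self):
        self._send_head()       # HEAD returns the same headers, no body

    def do_OPTIONS(self):
        self.send_response(200)
        self.send_header("Allow", "OPTIONS, GET, HEAD")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def demo():
    # Port 0 lets the operating system pick a free port.
    server = ThreadingHTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    results = {}
    try:
        for method in ("GET", "HEAD", "OPTIONS"):
            conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
            conn.request(method, "/")
            resp = conn.getresponse()
            results[method] = {
                "status": resp.status,
                "allow": resp.getheader("Allow"),
                "length": resp.getheader("Content-Length"),
                "body": resp.read(),
            }
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return results


results = demo()
for method, info in results.items():
    print(method, info)
```

HEAD reports the same Content-Length as GET but carries an empty body, which is what makes it useful for checking metadata cheaply.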

URL

The general form of a URL is <protocol>://<host>:<port>/<path>

Protocol

http	Hypertext Transfer Protocol resource
https	Hypertext Transfer Protocol over Secure Socket Layer
ftp	File Transfer Protocol
mailto	E-mail address
ldap	Lightweight Directory Access Protocol search
file	Files shared on a local computer or network
news	Usenet newsgroup
gopher	Gopher protocol
telnet	Telnet protocol

Host: a domain name on the Internet.
Port: can sometimes be omitted, in which case the default port for the protocol is used.
Path: An absolute URL gives the complete path of the file, so it does not depend on the location of the document that contains the URL. A relative URL describes the location of the target relative to the folder that contains the referring document. If the URL omits the path, it refers to the home page of the site.

https://zhidao.baidu.com/
https://zhidao.baidu.com/question/1742817.html
 

The first URL omits the path and so refers to the home page of Baidu Zhidao (Baidu Knows). The second includes the path to the file 1742817.html, which gives its location on the server. Both use the HTTPS protocol, and both omit the port number.
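The URL components can be pulled apart with Python's standard urllib.parse module. In this sketch, describe and DEFAULT_PORTS are hypothetical helpers for illustration, and 1742818.html is a made-up relative reference used only to show how a relative URL is resolved against an absolute one.

```python
from urllib.parse import urljoin, urlsplit

# Well-known default ports, used when the URL omits the port.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def describe(url):
    """Return (protocol, host, port, path) for an absolute URL."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    path = parts.path or "/"  # an omitted path means the home page
    return parts.scheme, parts.hostname, port, path


print(describe("https://zhidao.baidu.com/"))
print(describe("https://zhidao.baidu.com/question/1742817.html"))

# A relative URL is resolved against the URL of the document that contains it.
base = "https://zhidao.baidu.com/question/1742817.html"
print(urljoin(base, "1742818.html"))
```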

The version number

The protocol previously used was HTTP/1.0, which has been upgraded to HTTP/1.1. What’s the difference between the two?

Requesting a single WWW document takes 2*RTT plus the document transfer time. Establishing a TCP connection with the server requires a three-way handshake; the third segment of the handshake carries the HTTP request, and the server then returns the response. That is four one-way trips in total, or 2*RTT. On top of other overhead, a World Wide Web server usually serves a large number of clients, and in HTTP/1.0 every request needs its own connection, so non-persistent (short-lived) connections place a heavy burden on the server.

HTTP/1.1 uses persistent (long-lived) connections, which stay open after the server has sent a response. Persistent connections work in either non-pipelined or pipelined mode. In non-pipelined mode, the client must receive the response to one request before it sends the next. In pipelined mode, the client can send the next request without waiting for the previous response, and the server can send its responses back to back, which saves time.
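The timing argument can be turned into a rough back-of-the-envelope model. The sketch below is a simplification under stated assumptions: every document has the same transfer time, connection setup costs one RTT, and in pipelined mode all requests are sent together so that only one further RTT is paid before the responses stream back. The function names and the sample numbers (100 ms RTT, 50 ms per document) are hypothetical.

```python
def non_persistent(n, rtt, transfer):
    # HTTP/1.0 style: every document pays connection setup plus request/response.
    return n * (2 * rtt + transfer)


def persistent_non_pipelined(n, rtt, transfer):
    # One connection setup, then one request/response round trip per document.
    return rtt + n * (rtt + transfer)


def persistent_pipelined(n, rtt, transfer):
    # One connection setup, one round trip for the batch of requests,
    # then the documents are transferred back to back.
    return rtt + rtt + n * transfer


n, rtt_ms, transfer_ms = 10, 100, 50
print(non_persistent(n, rtt_ms, transfer_ms))
print(persistent_non_pipelined(n, rtt_ms, transfer_ms))
print(persistent_pipelined(n, rtt_ms, transfer_ms))
```

With n = 1 the non-persistent formula reduces to the 2*RTT plus transfer time stated above; the savings of the persistent modes only appear as the number of documents grows.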
HTTP/1.1 persistent connections are implemented with the help of a request header, Connection.

For example, when the Connection request header has the value Keep-Alive, the client is telling the server to keep the connection open after returning the result. When the Connection request header has the value close, the client is telling the server to close the connection after returning the result.
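A server-side decision based on this header might look like the following sketch. The function keeps_connection_open is a hypothetical helper, and it assumes the headers arrive in a dict keyed by the exact name "Connection". It also goes slightly beyond the text in one respect: it treats Keep-Alive as an opt-in for HTTP/1.0 clients, while HTTP/1.1 connections stay open unless close is requested.

```python
def keeps_connection_open(version: str, headers: dict) -> bool:
    """Decide whether the connection should stay open after the response."""
    # The Connection header may carry several comma-separated tokens.
    raw = headers.get("Connection", "")
    tokens = {token.strip().lower() for token in raw.split(",") if token.strip()}

    if "close" in tokens:
        return False                 # the client asked to close
    if version == "HTTP/1.1":
        return True                  # persistent by default
    return "keep-alive" in tokens    # HTTP/1.0 must opt in


print(keeps_connection_open("HTTP/1.1", {"Connection": "Keep-Alive"}))
print(keeps_connection_open("HTTP/1.1", {"Connection": "close"}))
print(keeps_connection_open("HTTP/1.0", {}))
```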
HTTP/1.1 also provides request and response headers for mechanisms such as authentication, state management, and caching.

HTTP header fields

There are four types of HTTP header fields: general header fields, request header fields, response header fields, and entity header fields.

General header fields: headers used by both request and response messages.
Request header fields: headers used in request messages sent from the client to the server.
Response header fields: headers used in response messages returned from the server to the client.
Entity header fields: headers that describe the entity portion of request and response messages.

HTTP/1.1 header fields

General header fields

Header field name	Description
Cache-Control	Controls caching behavior
Connection	Hop-by-hop header control and connection management
Date	Date and time the message was created
Pragma	Message directives
Trailer	Lists the header fields at the end of the message
Transfer-Encoding	Transfer coding applied to the message body
Upgrade	Upgrade to another protocol
Via	Information about proxy servers
Warning	Error notification

Request header fields

Header field name	Description
Accept	Media types the user agent can handle
Accept-Charset	Preferred character sets
Accept-Encoding	Preferred content encodings
Accept-Language	Preferred (natural) languages
Authorization	Web authentication information
Expect	Expects specific behavior from the server
From	Email address of the user
Host	Server on which the requested resource resides
If-Match	Compares entity tags (ETag)
If-Modified-Since	Compares the update time of the resource
If-None-Match	Compares entity tags (the opposite of If-Match)
If-Range	Sends a byte range request for the entity if the resource has not been updated
If-Unmodified-Since	Compares the update time of the resource (the opposite of If-Modified-Since)
Max-Forwards	Maximum number of hops
Proxy-Authorization	Authentication information that the client supplies to the proxy server
Range	Byte range request for the entity
Referer	URI of the resource from which the request URI was obtained
TE	Preferred transfer codings
User-Agent	Information about the HTTP client program

Response header fields

Header field name	Description
Accept-Ranges	Whether byte range requests are accepted
Age	Estimated time elapsed since the resource was created
ETag	Matching information for the resource (entity tag)
Location	Redirects the client to the specified URI
Proxy-Authenticate	Authentication information that the proxy server requires from the client
Retry-After	When the client should make the request again
Server	Information about the HTTP server software
Vary	Cache management information for proxy servers
WWW-Authenticate	Authentication information that the server requires from the client

Entity header fields

Header field name	Description
Allow	HTTP methods supported by the resource
Content-Encoding	Content encoding applied to the entity body
Content-Language	Natural language of the entity body
Content-Length	Size of the entity body in bytes
Content-Location	Alternate URI for the resource
Content-MD5	Message digest of the entity body
Content-Range	Position range of the entity body
Content-Type	Media type of the entity body
Expires	Date and time at which the entity body expires
Last-Modified	Date and time the resource was last modified
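The four tables above can be expressed as a small lookup, which is a handy way to check which category a header field belongs to. This is only an illustrative sketch: the set names and the classify function are hypothetical, and header names are compared in lower case because HTTP header field names are case-insensitive.

```python
GENERAL = {
    "cache-control", "connection", "date", "pragma", "trailer",
    "transfer-encoding", "upgrade", "via", "warning",
}
REQUEST = {
    "accept", "accept-charset", "accept-encoding", "accept-language",
    "authorization", "expect", "from", "host", "if-match",
    "if-modified-since", "if-none-match", "if-range", "if-unmodified-since",
    "max-forwards", "proxy-authorization", "range", "referer", "te",
    "user-agent",
}
RESPONSE = {
    "accept-ranges", "age", "etag", "location", "proxy-authenticate",
    "retry-after", "server", "vary", "www-authenticate",
}
ENTITY = {
    "allow", "content-encoding", "content-language", "content-length",
    "content-location", "content-md5", "content-range", "content-type",
    "expires", "last-modified",
}

CATEGORIES = (
    ("general", GENERAL),
    ("request", REQUEST),
    ("response", RESPONSE),
    ("entity", ENTITY),
)


def classify(name: str) -> str:
    """Return the category of a header field, or 'unknown' if it is not in the tables."""
    key = name.strip().lower()
    for category, names in CATEGORIES:
        if key in names:
            return category
    return "unknown"


for header in ("Host", "ETag", "Content-Length", "Date", "Cookie"):
    print(header, "->", classify(header))
```

Cookie comes out as unknown here simply because it is not listed in the tables of this article.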

HTTP operation procedure

HTTP is a transaction-oriented application-layer protocol. Every World Wide Web site runs a server process that continuously listens on TCP port 80 for connection requests from browsers. Once a connection is established, the browser sends the web server a request for the page it wants to view. The exchange between browser and server must follow a fixed format and set of rules, and that is the Hypertext Transfer Protocol, HTTP.

Using HTTP/1.0 as the example, the following events take place after a user makes a browsing request (by entering a URL in the browser's address bar or by clicking a link, after which the browser locates the page to fetch):

1. The browser parses the URL.
2. The browser asks DNS to resolve the domain name to an IP address.
3. DNS returns the IP address.
4. The browser establishes a TCP connection with the server (IP address + port number).
5. The browser sends the request GET /question/1742817.html.
6. The server responds by sending 1742817.html to the browser.
7. The TCP connection is released.
8. The browser displays the HTML text.
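The same sequence of events can be sketched with raw sockets in Python. This is a minimal illustration, not how a real browser is implemented: build_get_request and fetch are hypothetical helpers, the request is HTTP/1.0 so the server closes the connection when the response is complete, and plain HTTP on port 80 is assumed (a site that only serves HTTPS would need TLS on top of this).

```python
import socket


def build_get_request(host: str, path: str) -> bytes:
    """Build a minimal HTTP/1.0 GET request; the empty strings produce the final blank line."""
    lines = [f"GET {path} HTTP/1.0", f"Host: {host}", "", ""]
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, path: str, port: int = 80, timeout: float = 5.0) -> bytes:
    # Resolve the domain name to an IP address through DNS.
    ip = socket.gethostbyname(host)

    # Establish the TCP connection (IP address + port number).
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        # Send the GET request.
        sock.sendall(build_get_request(host, path))

        # Read the response until the server closes the connection.
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    # Leaving the with-block releases the TCP connection.
    return b"".join(chunks)


print(build_get_request("zhidao.baidu.com", "/question/1742817.html"))
```

Calling fetch with a reachable host would return the raw response bytes, status line and headers included; displaying the HTML is left to the caller.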

The response message

Status code and phrase

1xx: Informational. The request has been received and processing continues.
2xx: Success. The request was successfully received, understood, and accepted.
3xx: Redirection. Further action must be taken to complete the request.
4xx: Client error. The request has a syntax error or cannot be fulfilled.
5xx: Server error. The server failed to fulfill a valid request.

The following describes common status codes and status descriptions.

200 OK: The client request succeeded.
400 Bad Request: The client request has a syntax error and cannot be understood by the server.
401 Unauthorized: The request is not authorized. This status code must be used together with the WWW-Authenticate header field.
403 Forbidden: The server received the request but refuses to provide the service.
404 Not Found: The requested resource does not exist, for example because an incorrect URL was entered.
500 Internal Server Error: An unexpected error occurred on the server.
503 Service Unavailable: The server is currently unable to process the client request and may recover after a period of time.

For example, a status line looks like this: HTTP/1.1 200 OK (CRLF).
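A short Python sketch shows how a status line breaks down and how the first digit determines the class of the response. The helpers parse_status_line and status_class and the CLASSES table are hypothetical names; http.HTTPStatus is part of the Python standard library and supplies the standard reason phrases.

```python
from http import HTTPStatus

# The first digit of the status code selects the response class.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def parse_status_line(line: str):
    """Split 'HTTP/1.1 200 OK' into (version, code, phrase)."""
    version, code, phrase = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), phrase


def status_class(code: int) -> str:
    return CLASSES[code // 100]


version, code, phrase = parse_status_line("HTTP/1.1 200 OK\r\n")
print(version, code, phrase, status_class(code))

for code in (400, 401, 403, 404, 500, 503):
    print(code, HTTPStatus(code).phrase, "->", status_class(code))
```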

The difference between GET and POST methods

1. How data is submitted: With GET, the request data is appended to the URL (that is, it is placed in the request line of the HTTP message). A ? separates the URL from the data, and multiple parameters are joined with &, for example: login.action?name=hyddd&password=idontknow&verify=%E4%BD%A0%E5%A5%BD. English letters and digits are sent as is, a space is converted to +, and Chinese or other characters are percent-encoded (URL-encoded), as in %E4%BD%A0%E5%A5%BD, where XX in %XX is the hexadecimal value of one byte of the encoded character.
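The encoding rules can be checked with Python's standard urllib.parse module. In this sketch the parameter values are taken from the example above, with verify set to the two Chinese characters 你好, whose UTF-8 bytes are E4 BD A0 E5 A5 BD; the assumption is that the form is encoded as UTF-8, which is the default for urlencode.

```python
from urllib.parse import parse_qs, quote_plus, urlencode

params = {"name": "hyddd", "password": "idontknow", "verify": "你好"}

# urlencode joins the parameters with & and percent-encodes each value.
query = urlencode(params)
print("login.action?" + query)

# Letters and digits pass through unchanged; a space becomes +.
print(quote_plus("hello world 123"))

# Decoding reverses the process.
print(parse_qs(query))
```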

POST: The submitted data is placed in the request body of the HTTP message. In the POST example above, this is the trailing hl=zh-CN&source=hp&q=domety.

Therefore, data submitted with GET is displayed in the address bar, while data submitted with POST does not change the address bar.
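The following sketch builds both kinds of request with Python's urllib.request without sending anything, just to show where the same data ends up. The host www.example.com is a placeholder assumption; login.action and the parameter names come from the example above.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "http://www.example.com/login.action"  # placeholder host
data = urlencode({"name": "hyddd", "password": "idontknow"})

# GET: the data is appended to the URL after a ?.
get_req = Request(BASE + "?" + data)

# POST: the URL stays clean and the data travels in the request body.
post_req = Request(BASE, data=data.encode("ascii"))

for req in (get_req, post_req):
    print(req.get_method(), req.full_url, req.data)
```

Supplying a data argument is what makes urllib treat the request as a POST; nothing is transmitted until the request is passed to urlopen.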

2. Size of transmitted data:

First, the HTTP protocol does not limit the size of the data transferred, and the HTTP specification does not limit the length of URLs. The main limits encountered in actual development are:

GET: The URL length is limited by the particular browser and server. For example, Internet Explorer limits URLs to 2083 bytes (2K+35). Other browsers, such as Netscape and Firefox, have no length limit in theory; the limit depends on what the operating system supports.

Therefore, for GET submissions, the transmitted data is limited by the length of the URL.

POST: Since the values are not sent through the URL, the data size is unlimited in theory. In practice, however, each web server sets a limit on the size of data submitted by POST; Apache and IIS6 each have their own configuration for this.

3. Security: POST is more secure than GET. Note that this is not the same as the “safe” mentioned for GET above, which only means that the data is not modified. Here, security is meant in the real sense. For example, when data is submitted with GET, the user name and password appear in clear text in the URL. Because (1) the login page may be cached by the browser and (2) other people can view the browser history, others could obtain your account and password.