World Wide Web (WWW)

In Chinese, the World Wide Web is called 万维网.

The World Wide Web is not the same as the Internet. It is only one of the services that runs on top of the Internet.

The WWW rests on three core concepts:

  • URI: Uniform Resource Identifier, which identifies a resource
  • HTTP: Hypertext Transfer Protocol, a transfer protocol that runs over TCP/IP
  • HTML: Hypertext Markup Language. Hypertext can contain non-text elements such as images

URIs, URLs, and URNs

A URI (Uniform Resource Identifier) identifies a resource. URIs are divided into URLs and URNs.

A URL (Uniform Resource Locator) gives us the address of a resource. In practice, a URL is generally what is used as the URI.

A URN (Uniform Resource Name) names a resource without saying where it is, for example a book's ISBN. To use a URN we must already know that name (such as the ISBN), so in practice URLs are preferred.

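The common components of a URL are the scheme (protocol), host (domain name), port, path, query parameters, and fragment (anchor), in the form scheme://host:port/path?query#fragment.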
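As a rough illustration of how a URL breaks down into parts, the sketch below uses Python's standard urllib.parse module. The URL is made up for illustration and does not come from the original article.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL, used only for illustration.
url = "https://www.example.com:8080/mix/76.html?name=kelvin&page=2#top"

parts = urlsplit(url)
components = {
    "scheme": parts.scheme,          # protocol, here https
    "host": parts.hostname,          # domain name, resolved through DNS
    "port": parts.port,              # explicit port, or None when omitted
    "path": parts.path,
    "query": parse_qs(parts.query),  # query parameters as a dict of lists
    "fragment": parts.fragment,      # the anchor; browsers do not send it to the server
}

for name, value in components.items():
    print(f"{name}: {value}")
```

When the port is left out of a URL, parts.port is None and the client falls back to the default port of the scheme.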

The relationship among the three is as follows: every URL and every URN is a URI, so URLs and URNs are both subsets of URIs.

DNS

Entering a URL is not enough by itself to reach a resource. Resources are fetched from a server, so the browser must first find the IP address of the server that hosts them.

DNS converts domain names to IP addresses.

Without DNS, we could only visit the Baidu home page by typing an IP address such as 119.75.217.109. With DNS, we can type the easy-to-remember name www.baidu.com instead.

DNS translates the domain name in a web address into an IP address, so nobody has to remember a string of numbers.

DNS (Domain Name System): After you enter a URL, the browser sends the domain name in it to a DNS server for resolution. The DNS server returns the corresponding IP address to the browser. The browser then opens a TCP connection to that address with a three-way handshake and downloads the resources.

You can also look up Baidu's IP address from the command line with nslookup baidu.com, which prints a line such as Address: 220.181.57.216. Baidu has many servers, so different people may see different addresses; DNS typically directs you to a server near you.
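The same lookup can be sketched in Python with the standard socket module. The helper name resolve is hypothetical, chosen for this example; it simply asks the operating system's resolver, much as a browser would.

```python
import socket

def resolve(hostname):
    """Return the sorted, de-duplicated IP addresses that hostname resolves to."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

On a machine with network access, resolve("www.baidu.com") should return one or more addresses. As with nslookup, the exact addresses depend on where and when you ask, so no fixed output is shown here.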

What HTTP does when the server and browser communicate

Server + client + HTTP

  • The browser is responsible for making the request
  • The server receives the request on a port (80 by default for HTTP, 443 for HTTPS)
  • The server is responsible for returning content (response)
  • The browser is responsible for downloading the response

HTTP tells the browser and the server how to communicate: it dictates what to write in a request and what to write in a response.
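The four steps above can be tried out locally. The following is a minimal sketch using only Python's standard library: a tiny server thread plays the server, and http.client plays the browser. The names HelloHandler and fetch_once are hypothetical, and the server binds to a free port chosen by the operating system rather than 80 or 443.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"hello from the server"
        self.send_response(200)                         # status line
        self.send_header("Content-Type", "text/plain")  # response headers
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()                              # blank line
        self.wfile.write(body)                          # response body

    def log_message(self, format, *args):
        pass  # keep the example quiet

def fetch_once():
    # Port 0 asks the OS for any free port; public sites use 80 or 443.
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/")       # the client makes the request
        response = conn.getresponse()  # the server returns the response
        result = response.status, response.read()  # the client downloads it
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()

status, body = fetch_once()
print(status, body)
```

Using a throwaway local server keeps the example independent of any real website.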

When you visit a web page, the browser sends a request to the server that hosts it. Before the browser receives and displays the page, the server answers with response headers that include an HTTP status code.

The server has many ports, each conventionally reserved for a particular use.

Format of request

An HTTP request has four parts: the request line, the request headers, a blank line, and the request data (body).

    1 Verb Path Protocol/Version
    2 Key1: value1
    2 Key2: value2
    2 Key3: value3
    2 Content-Type: application/x-www-form-urlencoded
    2 Host: www.baidu.com
    2 User-Agent: curl/7.54.0
    3
    4 Data to be uploaded

  1. The request contains a maximum of four parts and a minimum of three parts. (That means Part 4 can be empty.)
  2. Part 3 is always an empty line (a bare CRLF, \r\n). It separates Part 2 from Part 4, the body, which may carry data such as a password
  3. The verb (request method) is one of GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS, etc.
  4. PUT replaces the resource as a whole, while PATCH updates part of it
  5. The path here includes the query parameters but not the anchor (fragment)
  6. If you do not write a path, the path defaults to /
  7. The Content-Type in Part 2 describes the format of Part 4
  8. Host indicates the destination of the request (the host domain name). User-Agent carries client information; it is defined by the browser, sent automatically with every request, and is the main clue for detecting the browser type
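To make the four-part layout concrete, here is a small sketch that assembles a request as text and then splits it back into its parts. The helper names build_request and split_request are hypothetical, and the form data is sample data; this is not how any particular browser builds its requests.

```python
def build_request(method, path, headers, body=""):
    """Assemble the four parts of an HTTP/1.1 request into one string."""
    headers = dict(headers)
    if body:
        # The body length in bytes tells the server where Part 4 ends.
        headers["Content-Length"] = str(len(body.encode("utf-8")))
    request_line = f"{method} {path or '/'} HTTP/1.1"                       # Part 1
    header_lines = [f"{name}: {value}" for name, value in headers.items()]  # Part 2
    # Part 3 is the empty line produced by the doubled CRLF; Part 4 is the body.
    return "\r\n".join([request_line, *header_lines]) + "\r\n\r\n" + body

def split_request(raw):
    """Split a raw request back into method, path, version, headers, body."""
    head, _, body = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    method, path, version = request_line.split(" ")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return method, path, version, headers, body

raw = build_request(
    "POST",
    "",  # an empty path falls back to "/"
    {
        "Host": "www.baidu.com",
        "User-Agent": "curl/7.54.0",
        "Content-Type": "application/x-www-form-urlencoded",
    },
    body="name=kelvin&password=123456",
)
print(raw)
```

Splitting on the first empty line is what lets a server tell where the headers stop and the body begins.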

Example:

    GET /mix/76.html?name=kelvin&password=123456 HTTP/1.1
    Host: www.fishbay.cn
    Upgrade-Insecure-Requests: 1
    User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
    Accept-Encoding: gzip, deflate, sdch
    Accept-Language: zh-CN,zh;q=0.8,en;q=0.6

    (data)

Format of response

The HTTP response also consists of four parts: the status line, the message header, the blank line, and the response body.

    1 Protocol/Version Status-Code Description
    2 Key1: value1
    2 Key2: value2
    2 Content-Length: 17931
    2 Content-Type: text/html
    3
    4 Content to be downloaded

  1. A GET request and a POST request to the same path can have the same or different responses
  2. Part 4 of the response can be very long
  3. The Content-Type in Part 2 describes the format of Part 4
  4. The Content-Type in Part 2 follows the MIME specification
  5. The response headers carry information the client can use, such as Date (when the response was generated), Content-Type (MIME type and character encoding), and Connection (persistent by default in HTTP/1.1)
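The response layout can be read back in the same way. The sketch below parses a hand-written sample response; parse_response is a hypothetical helper, the sample bytes are invented for illustration, and real clients also handle cases that are ignored here, such as chunked bodies.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP response into its four parts."""
    head, _, body = raw.partition(b"\r\n\r\n")        # Part 3 is the empty line
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)  # Part 1
    headers = {}
    for line in header_lines:                          # Part 2
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return version, int(code), reason, headers, body   # body is Part 4

# Invented sample response; the body is exactly 14 bytes long.
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Length: 14\r\n"
    b"\r\n"
    b"<h1>Hello</h1>"
)

version, code, reason, headers, body = parse_response(sample)
# Content-Type holds a MIME type, optionally followed by parameters.
mime_type, _, params = headers["content-type"].partition(";")
print(code, reason, mime_type, params.strip(), len(body))
```

The MIME type tells the client how to interpret the body, and Content-Length tells it how many bytes to read.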

HTTP status codes

The status code is what the server says to the browser.

The details are as follows:

  • 1xx (informational: the request has been received and processing continues)

    • 101 Switching Protocols: The server is switching protocols as the client requested
  • 2xx (success: the request was received and processed successfully)

    • 200 OK: The server has successfully processed the request (a resource served from the strong cache is also reported as 200)
    • 201 Created: The request succeeded and a new resource was created. This is typically the response to a POST request, and sometimes to a PUT request. (User creates or modifies data successfully)
    • 202 Accepted: The request has been accepted and queued for background processing, but processing has not finished
    • 204 No Content: The server successfully processed the request and has no content to return (for example, user deleted data successfully)
  • 3xx (redirection: the client must request another URL to complete the request)

    • 301 Moved Permanently: The resource has permanently moved to a new URL
    • 302 Found (formerly Moved Temporarily): The resource still exists but has been temporarily moved to another URL
    • 304 Not Modified: The resource has not changed since the last request, so the client can use its cached copy, saving bandwidth and overhead (negotiated cache)
    • 307 Temporary Redirect: Unlike 302, the client must repeat the request to the new address using the same method
  • 4xx (client error: something is wrong with the request)

    • 400 Bad Request: The server does not understand the request syntax
    • 401 Unauthorized: The request lacks valid credentials (for example, a wrong user name or password)
    • 403 Forbidden: The server knows who the user is (unlike 401) but refuses access
    • 404 Not Found: The server could not find the requested page
    • 408 Request Timeout: The client did not finish sending the request within the time the server was prepared to wait
  • 5xx (server error: the server failed to fulfill a valid request)

    • 500 Internal Server Error: The server encountered an unexpected error and cannot process the request
    • 501 Not Implemented: The server does not support the request method and cannot process it. Only GET and HEAD are required to be supported by servers
    • 503 Service Unavailable: The server is currently unavailable (overloaded or down for maintenance)
    • 505 HTTP Version Not Supported: The server does not support the HTTP version used in the request
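Because the first digit determines the class, a status code can be classified with a single integer division. The sketch below uses Python's standard http.HTTPStatus enum for the official reason phrases; status_class is a hypothetical helper name.

```python
from http import HTTPStatus

def status_class(code):
    """Map a status code to its class using the first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")

codes = (101, 200, 201, 202, 204, 301, 302, 304, 307,
         400, 401, 403, 404, 408, 500, 501, 503, 505)
for code in codes:
    print(code, HTTPStatus(code).phrase, "->", status_class(code))
```

A client that meets a code it does not recognize can still fall back on the class, for example treating any unfamiliar 4xx code as a client error.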