For a better reading experience, check out the latest content at blog.climbtw.com.
This article covers HTTP headers, caching mechanisms, character set types, URL encoding, and other related content concerning HTML and meta information tags.
Contents
- 1 Summary of key/value pairs in HTTP headers
- 1.1 HTTP Request Header (Usually set by clients and transparent to users)
- 1.2 HTTP Response Headers
- 2 Brief description of MIME message content types
- 3 Brief description of compression formats
- 4 Browser, server caching mechanism
- 4.1 Description of cache-control values in Request Headers
- 4.2 Description of cache-control values in Response Headers
- 5 HTTP status message
- 6 HTTP Methods (GET, POST, etc.)
- 6.1 Comparison between GET and POST
- 6.2 Other HTTP Request Methods
- 7 GMT time string (and method of converting to custom time format)
- 8 Uniform Resource Locator URL
- 8.1 Whether to configure the www host for the domain
- 9 Character set type
- 9.1 ASCII Character Set (1 byte, 128 characters, entity number, entity name) (excluding Chinese)
- 9.2 ANSI Character Set (2 bytes, 65536 characters, incompatible between countries)
- 9.3 GB2312, GBK encoding (Chinese version of ANSI, 2 bytes, 65536 characters)
- 9.4 ISO Character Set (ISO-8859-1, etc., 1 byte, 256 characters, entity number, entity name)
- 9.5 Unicode encoding (2 or 4 bytes; UTF-8, UTF-16, 1-4 bytes) (including Chinese), encoding representation method
- 10 Default encoding format of URL characters (including GET and POST)
- 10.1 Encoding Format of URL Characters
- 10.2 Default encoding of pathInfo (non-parameter part) and queryString (parameter part) in URLs in Different Browsers
- 10.3 URL Manual Encoding Solution
- 10.4 The Server Configures the URL decoding mode and controls the URL encoding mode of the browser
- 10.5 Summary: Encoding and decoding process between the server and the browser
- 11 HTTP Referer hotlinking and hotlink protection
- 12 Website host selection
- 12.1 Issues to Be Considered when Setting up a Server by Yourself
- 12.2 Using an Internet Service Provider (ISP)
1 Summary of key/value pairs in HTTP headers
- HTTP: HyperText Transfer Protocol, a request-response protocol between a client and a server. A message contains a header and a body.
- HTTP headers include the request headers sent by the client and the response headers returned by the server.
- Each header is a key/value pair separated by a colon. Header names are case-insensitive.
- The http-equiv/content attribute pair in an HTML meta tag sets a key/value that would otherwise be set in the response header sent by the server.
- All of the key/value pairs in the request headers and the response headers are described below.
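As a small illustration of the case-insensitivity rule above, here is a minimal sketch of a header lookup. The helper name `getHeader` and the sample header object are assumptions made for this example, not part of any library.

```javascript
// Look up a header value without caring about the case of its name.
// getHeader is a hypothetical helper written for this sketch.
function getHeader(headers, name) {
  const wanted = name.toLowerCase();
  for (const [key, value] of Object.entries(headers)) {
    if (key.toLowerCase() === wanted) return value;
  }
  return undefined; // header not present
}

// Sample headers with mixed-case names (illustrative values).
const sampleHeaders = {
  'Content-Type': 'text/html; charset=UTF-8',
  'CACHE-CONTROL': 'max-age=3600',
};

console.log(getHeader(sampleHeaders, 'content-type'));
console.log(getHeader(sampleHeaders, 'Cache-Control'));
```

Normalizing names to lower case once, when the headers are first read, is a common alternative to comparing on every lookup.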
1.1 HTTP Request Header (Usually set by clients and transparent to users)
Request header | Description | Example |
---|---|---|
accept | The MIME content types of the response body that the client can accept. Corresponds to the content-type field in the response header. It is usually set by the client and is transparent to the user. It only expresses a preference: the server decides which content type it actually returns, and the client receives the response whatever content type comes back. A client cannot refuse to receive a response because of its content type; that would not comply with the HTTP specification. When we make a GET or POST request from the browser, the browser adds this field automatically, and the server usually does not parse it. In Ajax requests or other clients we can set this field ourselves, but we usually don't. An application scenario: there are two terminals, a plain-text reader such as a Kindle (which cannot display pictures) and a mobile terminal (which can display pictures and play video), and both request information about "zebra" from the server. The server has to decide what to return to each terminal, and it can base that decision on accept. If the parsed value of accept is "text/plain", the client supports only text; if it looks like the example on the right, the client can handle text, pictures, and video. Without this check, a picture returned to the text reader may not be displayed properly. | accept field in a Baidu search request: accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3 (The MIME message content types are summarized below.) |
content-type | The MIME type of the request body sent by the client. In a POST request it indicates how the submitted data is encoded (that is, whether special characters and spaces are encoded). Which character encoding is used for the URL characters is determined by the content-type or charset field in the response header. If the server does not set content-type or charset, the browser uses its default encoding, as follows. For GET requests: Chrome and Firefox encode both the path and the parameters as UTF-8; IE also encodes the path as UTF-8, but encodes the parameters in the encoding of the local environment, such as GB2312 on Chinese systems. For POST requests: Chrome, IE, and Firefox all encode paths and parameters as UTF-8. | content-type: application/x-www-form-urlencoded (How browsers encode URL characters is summarized below.) |
accept-charset | The character encodings the client can accept. Corresponds to the charset value in the response header. This value is typically not set unless the user asks the browser to use a specific encoding. Even so, it is better to declare the page encoding in the response header than to rely on the browser's default encoding. When the server returns a message, the characters are converted into a sequence of bytes according to some encoding and sent to the client. The server may use any encoding, and the client has to receive the complete response message either way. Almost all clients now support the common encodings, so the server only needs to encode with its chosen encoding and state that encoding in the response message. The client then decodes the received message accordingly and avoids garbled characters. However, if the client has already committed to a particular decoding, the server cannot choose freely: it needs to parse accept-charset and set the encoding based on that value, as follows: 1. if an HTML page is returned, declare the encoding in a <meta /> tag; 2. if raw content is written through the response output stream, specify the encoding in the content-type response header; 3. if a JSP page is returned, specify pageEncoding. In short, to make sure text is never garbled, the server must tell the client which encoding it used. | accept-charset: gbk,utf-8;q=0.8 |
accept-encoding | The compression formats supported by the client. Corresponds to the content-encoding field in the response header. The client sets it, the server compresses accordingly, and the browser decompresses; this is transparent to the user. Network data transfer consumes bandwidth, and compressing the data reduces its volume and therefore the transfer time. For this reason the server usually compresses data before returning it to the client. Several compression methods exist, and which one is used depends on which decompression methods the client supports, as stated in the accept-encoding value. Compression is done by the server or a proxy, usually without programmer intervention; decompression is usually done automatically by the browser. For Ajax requests that we initiate ourselves, the data volume is usually small and this field is not required. | accept-encoding field in a Baidu search request: accept-encoding: gzip, deflate, br (The compression formats are summarized below.) |
accept-language | The list of languages the client accepts for the response body. Corresponds to the content-language field in the response header. It is usually set by the client and is transparent to the user. When the browser makes a request directly, it fills this field from its locale (the default language). Servers generally ignore this field when parsing requests. A usage scenario: if a file exists in several language versions, the server can use the accept-language value to decide which version to return to the client. (In practice this scenario is uncommon. Checking accept-language is not recommended because it is unreliable; the language version can be expressed directly in the URL instead.) | accept-language field in a Baidu search request: accept-language: zh-cn,zh;q=0.9 |
origin | Sent with a cross-origin resource sharing (CORS) request; its value is the origin (scheme, host, and port) of the page making the request. Corresponds to the access-control-allow-origin field in the response header. The request requires the server to add access-control-allow-origin to the response header, stating which cross-origin sources the server allows. | origin: http://www.itbilu.com |
cookie | Sends the client's cookie information to the server. Corresponds to the set-cookie field in the response header. It is usually set by the client and is transparent to the user. A key and its value are joined with =, and different key/value pairs are separated with ;. | cookie field in a Baidu search request (excerpt): cookie: BAIDUID=2B0B46FB4D7624852C26884029FB5E4A:FG=1; BIDUPSID=2B0B46FB4D7624852C26884029FB5E4A; PSTM=1567362808; BD_UPN=12314753 |
cache-control | Specifies whether cached files held by proxies may be used for the current request. Corresponds to the cache-control field in the response header. | cache-control field in a Baidu search request: cache-control: max-age=0 (The browser caching mechanism is summarized below.) |
if-modified-since | The last modification time of the resource in the client cache. Corresponds to the last-modified field in the response header. It is usually set by the browser and is transparent to the user. How the browser sets the value: when sending a request, the browser copies the last-modified value (the time the file was last modified on the server) from the previous response. The server then checks whether the requested resource was modified later than if-modified-since. If it was not, the cached resource is still the latest and the server returns the 304 Not Modified HTTP status, which tells the client to use its local cache and saves bandwidth. Browsers generally cache only static resources such as HTML, JPG, CSS, and JS files; they do not cache the dynamic results of JSP pages or Ajax requests. Because server bandwidth is precious, static resources should also be served through a CDN and hosted on static servers. | if-modified-since: Thu, 22 Jun 2017 11:07:30 GMT (The value is a string in GMT format. HTTP status messages, HTTP methods, and the GMT time format are summarized below.) |
if-none-match (higher priority) | The hash value of the resource in the client cache. Corresponds to the etag field in the response header. It serves the same purpose as if-modified-since. It is usually set by the browser and is transparent to the user. How the browser sets the value: when sending a request, the browser copies the etag value (the hash value of the file on the server) from the previous response. The server then checks whether the hash values match and decides whether to return 304 Not Modified or the file. etag / if-none-match has a higher priority than last-modified / if-modified-since. | if-none-match: "9jd00cdj34pss9ejqiw39d82f20d0ikd" |
referer | The page from which the current page was reached (the source of the jump). It is usually set by the client and is transparent to the user. Websites often use it for access statistics. For example, if I place advertisement links to my home page in many places, the referer shows which placements bring the most visitors, that is, which advertisements work well. referer is also often used for hotlink protection, by configuring interception on the server. The correct spelling is "referrer", but the misspelling made its way into the RFC and has been used ever since. | referer field in a Baidu search request: referer: https://www.baidu.com/ |
connection | Controls whether the connection between the client and the server is kept open. It is usually set by the client and is transparent to the user. HTTP is a stateless protocol that runs over connection-oriented TCP. It has no memory of earlier transactions, so the server does not know the state of the browser. For example, even if you log in and then visit different pages on the same site, the server will not know who you are. Login information, user operations, user behavior, and similar data must be stored with cookies or sessions. Since HTTP/1.1, all browsers enable connection: keep-alive by default to keep the connection open. For example, after a web page is opened, the TCP connection used to carry HTTP data between the client and the server is not closed, and if the client requests another page from the same server, the existing TCP connection is reused. connection: keep-alive does not keep the connection open forever; both the client and the server can close it at any time. The client can set connection: close in the request header. On the server, the connection hold time is configured according to the server type (for example, Apache). | connection field in a Baidu search request: connection: keep-alive |
host | The domain name or IP address of the HTTP server the client wants to access, optionally followed by a port number (if omitted, the default HTTP port 80 is used). It is usually set by the client and is transparent to the user. | host field in a Baidu search request: host: www.baidu.com (Uniform Resource Locators and URL character encoding are summarized below.) |
user-agent | Describes the client's software environment. It is usually set by the client and is transparent to the user. The server can evaluate the client's environment from this field and respond differently (for example, by returning different versions of a page depending on whether the request came from a mobile device or a computer). | In Chrome, the user-agent field of a Baidu search request is: user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36. The reason for the confusing UA string: websites at first recognized only Mozilla, which was developed first, and later gave preference to browsers with the better Gecko engine. To be recognized, newer browsers began to imitate those UA identifiers: IE masquerades as Mozilla, KHTML as Gecko, and WebKit as KHTML. Finally, Opera can masquerade as any of the browsers above and lets the user decide which browser it claims to be. As a result, every browser pretends to be some other browser, and none can be identified at face value. |
from | The email address of the user who initiated this request | from: [email protected] |
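The cookie row above describes the wire format: key and value joined with =, pairs separated with ;. A minimal sketch of parsing such a value on the server side could look like this. The function name `parseCookieHeader` is an assumption for this example, and the input is an excerpt of the sample above.

```javascript
// Parse a Cookie request header value into a plain object.
// parseCookieHeader is a hypothetical helper written for this sketch.
function parseCookieHeader(headerValue) {
  const cookies = {};
  for (const pair of headerValue.split(';')) {
    const index = pair.indexOf('='); // split on the first "=" only
    if (index === -1) continue;      // skip fragments without "="
    const name = pair.slice(0, index).trim();
    const value = pair.slice(index + 1).trim();
    if (name) cookies[name] = value;
  }
  return cookies;
}

const parsedCookies = parseCookieHeader(
  'BAIDUID=2B0B46FB4D7624852C26884029FB5E4A:FG=1; PSTM=1567362808; BD_UPN=12314753'
);
console.log(parsedCookies.PSTM);
console.log(parsedCookies.BAIDUID);
```

Splitting on the first = only matters because cookie values may themselves contain =, as the BAIDUID value does.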
1.2 HTTP Response Headers
Response header | Description | Example |
---|---|---|
content-type | Tells the browser the MIME type of the current content | content-type: text/html; charset=UTF-8 |
charset | Tells the browser which character encoding to use to decode the current content. It is sent as a parameter of content-type, not as a separate header | content-type: text/html; charset=UTF-8 |
content-encoding | Tells the browser the compression format used for the current resource | content-encoding: gzip |
content-language | Declares the language used by the content | content-language: zh-cn |
access-control-allow-origin | Tells the browser which origins are allowed to share this resource across origins | access-control-allow-origin: * |
set-cookie | Sets an HTTP cookie | set-cookie: ctoken=O5kWnZU24hNA4eJq; domain=.mayibank.net; expires=Wednesday, 20-Jun-2007 22:33:00 GMT; path=/; Max-Age=3600; Version=1 |
cache-control | Tells every cache between the server and the client whether it may cache the object and for how long, in seconds | cache-control: max-age=3600 |
last-modified | The date the requested resource was last modified. The server automatically adds a last-modified field to responses for static files; the browser uses it to set if-modified-since on later requests | last-modified: Sat, 26 Dec 2015 17:30:00 GMT |
etag | The hash value of the requested resource. It serves the same purpose as last-modified. The server also adds an etag field to the response automatically; the browser uses it to set if-none-match | etag: "737060cd8c284d8af7ad3082f209582d" |
age | How long, in seconds, the response object has been in the proxy cache | age: 12 |
expires | A date/time after which this response is considered expired | expires: Thu, 01 Dec 1994 16:00:00 GMT |
refresh | Used for redirection, or when a new resource has been created. In the example, the browser redirects after 5 seconds | refresh: 5; url=http://itbilu.com |
location | Used when redirecting, or when a new resource has been created | location: http://www.itbilu.com/nodejs |
status | A Common Gateway Interface (CGI) response header field that describes the response status of the current HTTP connection | status: 200 OK |
server | Server name | server: nginx/1.6.3 |
warning | A general warning that there may be a problem with the entity body | warning: 199 Miscellaneous warning |
allow | The valid methods for a particular resource | allow: GET, HEAD |
content-length | The length of the response message body in bytes (octets), written as a decimal number | content-length: 348 |
content-location | An alternative location for the returned data | content-location: /index.htm |
proxy-authenticate | Requests authentication information when accessing through a proxy | proxy-authenticate: Basic |
public-key-pins | Used to prevent man-in-the-middle attacks; declares the certificate hash values used with Transport Layer Security when authenticating the website | public-key-pins: max-age=2592000; pin-sha256="…" |
vary | Tells downstream proxy servers how to match future request headers to decide whether a cached response can be used instead of requesting new content from the origin server | vary: * |
via | Tells the client which proxy servers the current response was sent through | via: 1.0 fred, 1.1 itbilu.com (nginx/1.6.3) |
www-authenticate | The authentication scheme that should be used when requesting this entity | www-authenticate: Basic |
2 Brief description of MIME message content types
See MIME types for details.
Type/subtype | File extension or description |
---|---|
text/plain | txt |
text/html | html, htm |
text/css | css |
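text/javascript | js. RFC 4329 deprecated the text/* JavaScript types in favor of application/javascript; the later RFC 9239 reversed this and made text/javascript the recommended type. |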
application/javascript | js |
application/ecmascript | es |
image/webp | – |
image/apng | – |
application/xml;q=0.9 | The q parameter is the weight that sets the priority of a content type. It is a real number between 0 and 1. The default value is 1, which is also the maximum; the smallest non-zero value is 0.001. (A value of 0 means that the content type is not accepted.) |
application/xhtml+xml | – |
application/x-www-form-urlencoded | The default encoding the browser uses when submitting a form: special characters are encoded and spaces are replaced by + signs. |
multipart/form-data | The second encoding the browser can use when submitting a form. Special characters and spaces are not encoded. A boundary is used as the separator instead of &; the value of the boundary looks like —-Web… AJv3. This form is usually used for binary data. For example, a file upload must use multipart/form-data. |
application/json | The third encoding the browser can use when submitting a request. |
*/*;q=0.8 | Any type |
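To make the q weights in the table concrete, here is a minimal sketch that parses an accept value and orders the types by weight. The function name `parseAccept` is an assumption for this example, and the parsing is deliberately simplified: it ignores quoted parameter values and the finer tie-breaking rules of the HTTP specification.

```javascript
// Parse an Accept header value into [{ type, q }], highest weight first.
// parseAccept is a hypothetical helper written for this sketch.
function parseAccept(headerValue) {
  return headerValue
    .split(',')
    .map((item) => {
      const [type, ...params] = item.split(';').map((part) => part.trim());
      let q = 1; // default weight when no q parameter is given
      for (const param of params) {
        const [key, value] = param.split('=').map((s) => s.trim());
        if (key.toLowerCase() === 'q') {
          const parsedQ = Number.parseFloat(value);
          if (!Number.isNaN(parsedQ)) q = parsedQ;
        }
      }
      return { type, q };
    })
    .filter((entry) => entry.type && entry.q > 0) // q=0 means "not accepted"
    .sort((a, b) => b.q - a.q);                   // stable sort keeps header order on ties
}

const accepted = parseAccept(
  'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8'
);
console.log(accepted);
```

A server that has several representations of a resource could walk this list in order and return the first type it can produce.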
3 Brief description of compression formats
Compression format | Description |
---|---|
deflate | A patent-free compression algorithm that provides lossless data compression and has many open-source implementations. deflate compresses faster and uses less CPU. As a web compression format it is dated, and browsers do not support it very well. |
gzip | The Apache 1.x series has no built-in web compression, so it uses the additional third-party mod_gzip module to perform compression. Apache 2.x has mod_deflate built in to replace mod_gzip. Both use the gzip compression algorithm and work similarly. gzip has a slightly higher compression ratio and CPU usage. |
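Both formats in the table can be tried out locally. The following is a minimal sketch that assumes a Node.js runtime and uses its built-in zlib module; the sample text is made up for illustration.

```javascript
// Compress and restore a buffer with Node.js's built-in zlib module.
const zlib = require('zlib');

// Repetitive sample data (illustrative), which compresses well.
const original = Buffer.from('zebra '.repeat(200), 'utf8');

const gzipped = zlib.gzipSync(original);    // what content-encoding: gzip carries
const deflated = zlib.deflateSync(original); // what content-encoding: deflate carries

// The client side of the exchange: decompress back to the original bytes.
const restoredFromGzip = zlib.gunzipSync(gzipped);
const restoredFromDeflate = zlib.inflateSync(deflated);

console.log('original bytes:', original.length);
console.log('gzip bytes:', gzipped.length);
console.log('deflate bytes:', deflated.length);
console.log('round trip ok:', restoredFromGzip.equals(original) && restoredFromDeflate.equals(original));
```

In a real deployment this step is normally handled by the web server or a proxy, as described earlier, rather than by application code.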
4 Browser, server caching mechanism
The cache-control fields in the browser (request) headers and the server (response) headers take values from the same fixed set; only the party they are addressed to differs.
The server and the browser validate caches using last-modified/if-modified-since or etag/if-none-match; when both are present, the latter takes precedence.
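The precedence rule above can be sketched as a small server-side check. This is a simplified illustration, not a complete implementation: `isNotModified` is a hypothetical helper, request headers are assumed to be a plain object with lower-case keys, and if-none-match is assumed to hold a single strong validator (real requests may carry a list or weak validators).

```javascript
// Decide whether a conditional request can be answered with 304 Not Modified.
// resource = { etag: string, lastModified: Date } (shape assumed for this sketch).
function isNotModified(requestHeaders, resource) {
  const ifNoneMatch = requestHeaders['if-none-match'];
  if (ifNoneMatch !== undefined) {
    // etag / if-none-match takes precedence: when present, the date is not consulted.
    return ifNoneMatch === resource.etag;
  }
  const ifModifiedSince = requestHeaders['if-modified-since'];
  if (ifModifiedSince !== undefined) {
    const since = Date.parse(ifModifiedSince);
    if (Number.isNaN(since)) return false; // unparseable date: send the full response
    // HTTP dates have one-second resolution, so compare whole seconds.
    const modifiedSeconds = Math.floor(resource.lastModified.getTime() / 1000);
    return modifiedSeconds <= Math.floor(since / 1000);
  }
  return false; // unconditional request
}

const cachedResource = {
  etag: '"737060cd8c284d8af7ad3082f209582d"',
  lastModified: new Date('2017-06-22T11:07:30Z'),
};

console.log(isNotModified({ 'if-none-match': '"737060cd8c284d8af7ad3082f209582d"' }, cachedResource));
console.log(isNotModified({ 'if-modified-since': 'Thu, 22 Jun 2017 11:07:30 GMT' }, cachedResource));
```

When the function returns true the server would reply with 304 and an empty body; otherwise it sends 200 with the full resource.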
4.1 Description of cache-control values in Request Headers
When a client sends a request to the server, the request may pass through many layers of proxies, any of which may hold a cached copy of the requested file. cache-control in the request header controls whether the cached file in the proxy may be used.
Value | Description |
---|---|
no-store | The cached file in the proxy is not wanted; the request should go directly to the server, and nothing from this exchange should be stored. |
no-cache | A cached response may be kept, but before it is used it must be revalidated with the server via a token (etag) to confirm that the cache is still valid. |
max-age=xxx | The client accepts cached content that is no more than xxx seconds old, so the proxy can answer from its cache without forwarding the same request to the server. Once that time has passed, the cached content is treated as invalid. |
4.2 Description of cache-control values in Response Headers
Value | Description |
---|---|
no-store | Do not cache the response content (even if etag and last-modified fields are present in the response header). |
no-cache | A proxy or browser may store the response, but before using the cached copy it must check with the server that the copy is up to date. |
max-age=xxx | The proxy or browser is free to use the cached content for the next xxx seconds without the browser having to send the same request. This directive is only available in HTTP/1.1 and has the higher priority when used together with last-modified. When the time expires, the cache becomes invalid. |
must-revalidate / proxy-revalidate | Once the cached content has expired, it must be revalidated with the server before it is used (proxy-revalidate applies this rule to proxy caches only). |
public | The response may be cached by any cache (both client and proxy). |
private | Content is only cached in private caches (i.e. only clients can cache, not proxy servers) |
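The directives in the two tables share one syntax: comma-separated names, some with =value. A minimal sketch of parsing them and applying the max-age rule might look as follows. Both helper names (`parseCacheControl`, `isFresh`) are assumptions for this example, and the parser is simplified: it does not handle quoted values that contain commas.

```javascript
// Parse a Cache-Control header value into an object of directives.
// Valueless directives map to true; numeric values become numbers.
function parseCacheControl(headerValue) {
  const directives = {};
  for (const part of headerValue.split(',')) {
    const [rawName, rawValue] = part.split('=');
    const name = rawName.trim().toLowerCase();
    if (!name) continue;
    if (rawValue === undefined) {
      directives[name] = true;
    } else {
      const value = rawValue.trim().replace(/^"|"$/g, ''); // strip surrounding quotes
      directives[name] = /^\d+$/.test(value) ? Number(value) : value;
    }
  }
  return directives;
}

// Can a cached copy that is ageSeconds old be used without contacting the server?
function isFresh(directives, ageSeconds) {
  if (directives['no-store'] || directives['no-cache']) return false;
  return typeof directives['max-age'] === 'number' && ageSeconds < directives['max-age'];
}

const responseDirectives = parseCacheControl('public, max-age=3600');
console.log(responseDirectives);
console.log(isFresh(responseDirectives, 12));
console.log(isFresh(responseDirectives, 4000));
```

Here the age argument plays the role of the age response header described earlier: the number of seconds the object has already spent in a cache.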
5 HTTP status message
An error may occur when a browser requests a service from a web server. The status codes are summarized below.
- 1xx: Informational
Message | Description |
---|---|
100 Continue | The server has received only part of the request; as long as the server has not rejected it, the client should continue sending the rest of the request. |
101 Switching Protocols | The server switches protocols: it complies with the client's request to change to another protocol. |
- 2xx: Success
Message | Description |
---|---|
200 OK | The request was successful (the response document for GET and POST requests follows). |
201 Created | The request has been fulfilled and a new resource has been created. |
202 Accepted | The request has been accepted for processing, but processing has not completed. |
203 Non-authoritative Information | The document was returned normally, but some of the response headers may be incorrect because a copy of the document is being used. |
204 No Content | There is no new document; the browser should continue to display the original document. This status code is useful when the user refreshes the page periodically and the Servlet can determine that the user's document is already up to date. |
205 Reset Content | There is no new document, but the browser should reset what it displays. Used to force the browser to clear form input. |
206 Partial Content | The client sent a GET request with a Range header, and the server fulfilled it. |
- 3xx: Redirection
Message | Description |
---|---|
300 Multiple Choices | Multiple choices. A list of links; the user can select a link to reach the destination. A maximum of five addresses are allowed. |
301 Moved Permanently | The requested page has been moved to a new URL. |
302 Found | The requested page has been temporarily moved to a new URL. |
303 See Other | The requested page can be found at a different URL. |
304 Not Modified | The document has not been modified; the server tells the browser that the cache has not expired and can still be used. The client has a cached document and makes a conditional request (typically with an if-modified-since header, meaning that it only wants the document if it is newer than the given date). The server tells the client that the cached document can still be used. |
305 Use Proxy | The document requested by the client should be retrieved through the proxy server specified in the Location header. |
306 Unused | This code was used in an earlier version of the specification. It is no longer in use, but the code remains reserved. |
307 Temporary Redirect | The requested page has been temporarily moved to a new URL. |
- 4xx: Client error
Message | Description |
---|---|
400 Bad Request | The server failed to understand the request. |
401 Unauthorized | The requested page requires a username and password. |
402 Payment Required | This code is not yet in use. |
403 Forbidden | Access to the requested page is forbidden. |
404 Not Found | The server could not find the requested page. |
405 Method Not Allowed | The method specified in the request is not allowed. |
406 Not Acceptable | The server can only generate a response that the client does not accept. |
407 Proxy Authentication Required | The user must first authenticate with a proxy server before the request can be processed. |
408 Request Timeout | The request exceeded the server wait time. |
409 Conflict | The request could not be completed due to a conflict. |
410 Gone | The requested page is no longer available. |
411 Length Required | content-length is not defined. The server will not accept the request without it. |
412 Precondition Failed | The preconditions in the request were evaluated as false by the server. |
413 Request Entity Too Large | The server will not accept the request because the request entity is too large. |
414 Request-URI Too Long | The server will not accept the request because the URL is too long. This happens when a POST request is converted into a GET request with long query information. |
415 Unsupported Media Type | The server will not accept the request because the media type is not supported. |
416 Requested Range Not Satisfiable | The server could not satisfy the Range header specified by the client in the request. |
417 Expectation Failed | The server could not meet the expectation given in the Expect request header. |
- 5xx: Server error
Message | Description |
---|---|
500 Internal Server Error | Request not completed. The server encountered an unexpected condition. |
501 Not Implemented | Request not completed. The requested functionality is not supported by the server. |
502 Bad Gateway | Request not completed. The server received an invalid response from the upstream server. |
503 Service Unavailable | Request not completed. The server is temporarily overloaded or down. |
504 Gateway Timeout | The gateway timed out. |
505 HTTP Version Not Supported | The server does not support the HTTP protocol version specified in the request. |
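The five groups above are determined by the first digit of the status code. A minimal sketch of that classification, with `statusClass` as a hypothetical helper name and the class labels chosen for this example:

```javascript
// Map an HTTP status code to its class, based on the first digit.
// statusClass is a hypothetical helper written for this sketch.
function statusClass(code) {
  if (!Number.isInteger(code) || code < 100 || code > 599) return 'unknown';
  const classes = {
    1: 'informational',
    2: 'success',
    3: 'redirection',
    4: 'client error',
    5: 'server error',
  };
  return classes[Math.floor(code / 100)];
}

for (const code of [100, 200, 304, 404, 503]) {
  console.log(code, statusClass(code));
}
```

Client code often branches on the class first (for example, retrying only on 5xx) and on the exact code second.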
6 HTTP Methods (GET, POST, etc.)
The two most common HTTP methods are GET and POST.
- GET: Requests data from the specified resource.
- POST: Submits data to be processed to a specified resource.
6.1 Comparison between GET and POST
Item | GET | POST |
---|---|---|
Data submission method | Sent in the request URL | Sent in the HTTP message body of the request |
History | Parameters are recorded in browser history | Parameters are not recorded |
Bookmarks | Can be bookmarked | Cannot be bookmarked |
Cache | Can be cached | Cannot be cached |
Data length limit | Limited by the URL length; the maximum URL length is 2048 characters (a limit imposed by some browsers, not by HTTP itself) | Unlimited |
Back button/refresh | The cache is used | The data will be resubmitted (browsers should inform users that the data will be resubmitted). |
Encoding | application/x-www-form-urlencoded | application/x-www-form-urlencoded or multipart/form-data |
Restrictions on data types | Only ASCII characters are allowed; non-ASCII characters need to be URL-encoded | No restrictions. Binary data is also allowed. |
Visibility | The data is visible to everyone in the URL. | The data is not displayed in the URL. |
Security | GET is less secure than POST because the data sent is part of the URL. Never use GET when sending passwords or other sensitive information. | POST is more secure than GET because parameters are not saved in browser history or web server logs. |
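The first and the encoding rows of the comparison can be illustrated with the standard URLSearchParams class: the same application/x-www-form-urlencoded string ends up in the URL for GET and in the message body for POST. The host www.example.com, the path, and the parameter names are made up for this sketch.

```javascript
// Encode the same parameters once, then place them differently for GET and POST.
const formParams = new URLSearchParams({ q: 'zebra crossing', lang: 'zh-cn' });
const encoded = formParams.toString(); // spaces become "+", special characters are percent-encoded

// GET: the data travels in the URL (visible, bookmarkable, length-limited).
const getUrl = 'http://www.example.com/search?' + encoded;

// POST: the data travels in the message body; the URL stays clean.
const postRequest = {
  method: 'POST',
  url: 'http://www.example.com/search',
  headers: { 'content-type': 'application/x-www-form-urlencoded' },
  body: encoded,
};

console.log(getUrl);
console.log(postRequest.body);
```

The postRequest object is only a description of the request for illustration; passing its method, headers, and body to an HTTP client such as fetch would actually send it.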
6.2 Other HTTP Request Methods
Method | Description |
---|---|
HEAD | Same as GET, but only the HTTP header is returned, not the body of the document. |
PUT | Uploads a representation of the specified URI. |
DELETE | Deletes the specified resource. |
OPTIONS | Returns the HTTP methods supported by the server. |
CONNECT | Converts the request connection into a transparent TCP/IP tunnel. |
7 GMT time string (and method of converting to custom time format)
GMT Time format: Wed, 20 Jun 2007 22:33:00 GMT
Note:
- When a Date object is printed directly, it is automatically converted to a string in GMT format (in a browser console this is local time with a GMT offset, for example Thu Jun 22 2017 19:07:30 GMT+0800).
- To customize the string format, assemble the desired string manually from the fields of the Date object.
- The new Date() constructor accepts a date string (including GMT format) and builds a Date object for that date.
GMT time format conversion example code:
    // Build a Date object from a GMT-format string, then assemble a custom output format
    function GMTToStr(gmtStr) {
      const date = new Date(gmtStr);
      const str = date.getFullYear() + '-' +
        (date.getMonth() + 1) + '-' +
        date.getDate() + ' ' +
        date.getHours() + ':' +
        date.getMinutes() + ':' +
        date.getSeconds();
      return str;
    }
    // Build a Date object from a time-format string; printing the object yields a GMT-format string
    function StrToGMT(timeStr) {
      const gmt = new Date(timeStr);
      return gmt;
    }
    // Test
    // GMT string to custom format
    function printCustom() {
      const dateTime = 'Thu Jun 22 2017 19:07:30 GMT+0800';
      const a = GMTToStr(dateTime);
      console.log(a);
    }
    // Output (when the local time zone is UTC+8): 2017-6-22 19:7:30
    // Time-format string to GMT
    function printGMT() {
      // Non-ISO strings like this one are parsed in an implementation-dependent way
      const dateTime = '2017-6-22 19:7:30';
      const a = StrToGMT(dateTime);
      console.log(a);
    }
    // Output (browser console, local time zone UTC+8): Thu Jun 22 2017 19:07:30 GMT+0800
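As a complement to the conversion code above, here is a minimal sketch of two refinements: zero-padded fields for the custom format, and the built-in toUTCString() method, which produces the HTTP-style GMT string shown at the start of this section. The function name `formatLocal` is an assumption for this example.

```javascript
// Format a Date as "YYYY-MM-DD HH:mm:ss" in the local time zone, with zero padding.
// formatLocal is a hypothetical helper written for this sketch.
function formatLocal(date) {
  const pad = (n) => String(n).padStart(2, '0');
  return `${date.getFullYear()}-${pad(date.getMonth() + 1)}-${pad(date.getDate())} ` +
    `${pad(date.getHours())}:${pad(date.getMinutes())}:${pad(date.getSeconds())}`;
}

const sampleDate = new Date('Thu Jun 22 2017 19:07:30 GMT+0800');

// toUTCString() always reports the time in GMT, whatever the local time zone is.
console.log(sampleDate.toUTCString()); // Thu, 22 Jun 2017 11:07:30 GMT

// The local format depends on the time zone of the machine running the code.
console.log(formatLocal(sampleDate));
```

Because 19:07:30 at GMT+0800 is 11:07:30 GMT, the first line is the same on every machine, while the second varies with the local time zone.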
8 Uniform Resource Locator URL
URL: Uniform Resource Locator, also called a web address. It consists of protocol + domain name + port number + path. (An Internet Protocol (IP) address can be used instead of a domain name, for example 192.168.1.253.) When surfing the Web, most people type in the domain name because names are easier to remember than IP numbers.
Example: http://www.w3school.com.cn/html/index.asp
Syntax: scheme://host.domain:port/path/filename
- scheme: defines the type of Internet service (the protocol). The most common types are HTTP (Hypertext Transfer Protocol), HTTPS (Secure Hypertext Transfer Protocol), FTP (File Transfer Protocol), and file (local resources).
- host: defines the domain host. The default host for HTTP is www, which is used to mark the primary domain name.
- domain: defines the Internet domain name, such as w3school.com.cn (including the top-level domain, second-level domain, and so on).
- :port: defines the port number on the host (the default HTTP port number is 80; the default port of a server depends on the server type).
- path: defines the path on the server (if omitted, the document must be located in the root directory of the website).
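As a small sketch, the parts listed above can be read with the standard URL class available in browsers and Node.js. The port 8080 in the first address is an assumption added only to make the port visible; it is not part of the original example.

```javascript
// Port 8080 is a hypothetical value used only for illustration.
const url = new URL('http://www.w3school.com.cn:8080/html/index.asp');

console.log(url.protocol); // http:
console.log(url.hostname); // www.w3school.com.cn
console.log(url.port);     // 8080
console.log(url.pathname); // /html/index.asp

// When the default port of the scheme is used (80 for HTTP), port is an empty string.
const defaultPort = new URL('http://www.w3school.com.cn/html/index.asp');
console.log(defaultPort.port === ''); // true
```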
8.1 Whether the WWW is configured for the domain host
- If you have one top-level domain name with second-level domain names under it, it is best to configure the domain host www in front of the domain name to make clear which site is the primary one. (This mainly applies to small companies that have few websites and share a single top-level domain name.)
- If you use multiple top-level domain names to manage different sites, the domain host www is generally not configured, which is more convenient for users. (A large company such as Huawei uses different top-level domains to manage different websites, which makes management easier.)
How to configure this has to be set on the server; to be added.
9 Character set type
Overview:
- American standard (ANSI): ASCII code -> ANSI code (supports multiple languages, such as GB2312 for Chinese; different ANSI encodings are incompatible with each other).
- International standard (ISO): ISO encodings (support multiple languages, but each value is limited to 1 byte, so Chinese characters cannot be included; the character set varies by locale, for example ISO-8859-1 is used in North America, Western Europe, Latin America, the Caribbean, Canada, Africa, and so on; different ISO encodings are incompatible with each other).
- Unicode Consortium: Unicode (to solve the compatibility problems described above, every symbol in the world is given a unique code, but each character is represented by two or four bytes, which wastes resources) -> UTF-8 and other encodings (variable length, designed to solve that waste of resources: a symbol takes 1 to 4 bytes, and the length varies with the symbol).
Variable-length encoding for UTF-8:
1. Characters in the ASCII range are represented by 1 byte. UTF-8 keeps the one-byte ASCII encoding as part of itself, so ASCII characters always take one byte in UTF-8. Think of UTF-8 as an extension of ASCII.
2. Other characters, such as Chinese characters, are represented by multiple bytes.
3. It is worth noting that a Chinese character takes 2 bytes in the Unicode encoding but 3 bytes in UTF-8. The reason is that Unicode only defines the code values, while UTF-8 also has to consider storage (such as embedding the 1-byte ASCII characters).
4. Unicode to UTF-8 is not a direct correspondence; the conversion follows an algorithm and rules.
5. In computer memory, Unicode encoding is used uniformly. When data is saved to disk or transferred, it is converted to UTF-8.
6. For example, when editing with Notepad, the UTF-8 characters read from the file are converted to Unicode characters in memory. After editing, Unicode is converted back to UTF-8 and saved to the file.
7. Historically, the default browser encoding was ISO-8859-1.
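The variable length described above can be checked with the standard TextEncoder, which always encodes to UTF-8. This is a minimal sketch; the helper name utf8ByteLength is hypothetical.

```javascript
// Hypothetical helper: number of bytes a string occupies when encoded as UTF-8.
function utf8ByteLength(text) {
  return new TextEncoder().encode(text).length;
}

console.log(utf8ByteLength('A'));         // 1  (ASCII letter)
console.log(utf8ByteLength('\u00e9'));    // 2  (Latin letter e with acute accent)
console.log(utf8ByteLength('\u4e2d'));    // 3  (Chinese character U+4E2D)
console.log(utf8ByteLength('\u{1F600}')); // 4  (emoji outside the basic plane)
```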
History:
- The character set first used on the World Wide Web was ASCII. (A character set includes both letters and symbols, collectively called characters.)
- Since many countries use characters that do not belong to ASCII, the default character set of modern browsers is an ISO international standard, such as ISO-8859-1.
- Therefore, if a web page uses a character set other than ISO-8859-1, it should be specified in the <meta> tag to tell the browser how to decode it.
9.1 ASCII Character Set (1 byte, 128 characters, entity number, entity Name) (excluding Chinese)
1. HTML and XHTML use standard 7-bit ASCII code to transfer data over the network. It supports the digits 0 to 9, uppercase and lowercase letters, and special characters.
2. 7-bit ASCII code provides 128 different character values: 2^7 = 128.
3. ASCII uses only the low seven bits; the high bit is always 0. The reason: a full byte could hold 256 values, but that is still not enough to represent Chinese or Japanese characters, so the high bit is kept as a reserved bit.
4. The extended ASCII character sets, that is, the ANSI character sets, can represent the characters of other languages, such as Chinese, by setting the reserved high bit to 1 and using two bytes. This is covered in the next section.
ASCII characters that are inconvenient to type directly can be represented by entity numbers, for example (partial list):
ASCII character | Entity number |
---|---|
Space | &#32; |
! | &#33; |
" | &#34; |
# | &#35; |
$ | &#36; |
For other symbols, refer to the ASCII reference manual.
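An entity number is simply the decimal code of the character wrapped in &# and ;. A minimal sketch, with the hypothetical helper name toEntityNumber:

```javascript
// Hypothetical helper: build the HTML entity number for the first character of a string.
function toEntityNumber(ch) {
  return '&#' + ch.codePointAt(0) + ';';
}

console.log(toEntityNumber(' ')); // &#32;
console.log(toEntityNumber('$')); // &#36;

// Convert several characters at once.
console.log('!"#$'.split('').map(toEntityNumber).join(' '));
// &#33; &#34; &#35; &#36;
```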
Note: distinguish between entity numbers and entity names. All ASCII characters have entity numbers, but only some have entity names. The common ones are:
ASCII character | Entity number | Entity name |
---|---|---|
" | &#34; | &quot; |
& | &#38; | &amp; |
' | &#39; | &apos; |
< | &#60; | &lt; |
> | &#62; | &gt; |
9.2 ANSI Character Set (2 bytes, 65536 bytes, incompatible by country)
1. The ANSI character set is an extension of ASCII.
2. ANSI encoding uses 0x00 to 0x7F (0 to 127 in decimal), 1 byte, to represent one English character, and uses the range 0x80 to 0xFFFF to represent characters of other languages.
3. Different countries have different ANSI character set standards. For example, China developed GB2312 to encode Chinese characters, Japan encodes Japanese with Shift_JIS, and Korea encodes Hangul with EUC-KR. ANSI codes of different languages cannot be converted to each other, which results in garbled text when languages are mixed.
4. Similar to the ANSI character sets, which support multiple national languages, there are the international standard ISO character sets, but they do not support Chinese characters and the like. They are covered in section 9.4.
5. Unicode was created to solve the conflicts between the ANSI encodings of different countries. It is covered in section 9.5.
9.3 GB2312, GBK encoding (ANSI Chinese version, 2 bytes, 65536 bytes)
- GB2312: the Chinese national standard character set for Simplified Chinese. GB 2312 cannot handle the rare characters that appear in personal names, classical Chinese, and so on, which led to the later GBK.
- GBK: the Chinese character internal code extension specification. It uses 2 bytes per Chinese character, so for storing Chinese it takes less space than UTF-8 (3 bytes).
GBK is backward compatible with GB 2312 and supports the international standard ISO 10646 going forward, acting as a bridge in the transition from the former to the latter.
9.4 ISO Character Set (ISO-8859-1, etc., 1 byte, 256 characters, entity number, entity name)
1. The ISO character sets are standard character sets defined by the International Organization for Standardization (ISO) for different alphabets and languages.
2. Different regions use different ISO character sets, and these are incompatible with each other. ISO-8859-1, for example, is used in North America, Western Europe, Latin America, the Caribbean, Canada, and Africa.
3. However, because it is a single-byte encoding, which matches the most basic unit of representation of a computer, many protocols use ISO-8859-1 by default.
4. The default encoding for browser pages is ISO-8859-1.
5. The default encoding for URLs in most browsers is UTF-8.
6. The default character encoding in HTML5 is UTF-8.

- The lower part of ISO-8859-1 (codes 1 to 127) is the original 7-bit ASCII. Some of these characters have entity names; see the ASCII entity table above.
- The higher part of ISO-8859-1 (codes 160 to 255) all have entity names; see the following table (partial list):
ISO-8859-1 higher character | Entity number | Entity name |
---|---|---|
Non-breaking space | &#160; | &nbsp; |
¥ (yen / RMB) | &#165; | &yen; |
© (copyright) | &#169; | &copy; |
® (registered trademark) | &#174; | &reg; |
× (multiply) | &#215; | &times; |
÷ (divide) | &#247; | &divide; |
Unicode characters: | ||
™ (trademark) | &#8482; | &trade; |
For more information about ISO-8859-1, see the ISO-8859-1 reference manual.
9.5 Unicode encoding (2 or 4 bytes. Utf-8, UTF-16, 1-4 bytes) (including Chinese), encoding representation method
1. Because the ANSI and ISO encodings each exist in multiple, mutually incompatible versions, the Unicode Consortium developed the Unicode standard, which is intended to replace all existing character sets through the standard Unicode Transformation Formats. In the Unicode encoding, every character is represented by 2 or 4 bytes.
2. The Unicode encoding covers all the characters in the world and is cross-platform.
3. Unicode has been implemented in XML, Java, ECMAScript (JavaScript), LDAP, CORBA 3.0, and WML. Unicode is also supported by many operating systems and by all modern browsers.
4. The Unicode Consortium works with the leading standards organizations, such as ISO, W3C, and ECMA. Unicode can be compatible with different character sets. The most common encoding forms are UTF-8 and UTF-16.
5. The first 256 Unicode characters correspond to the 256 characters of ISO-8859-1.
6. Unicode to UTF-8 is not a direct correspondence; the conversion follows an algorithm and rules.
7. The implementation of Unicode is different from its encoding. The Unicode code of a character is fixed. In actual transmission, however, because different platforms are designed differently and in order to save space, Unicode is implemented in different ways. These implementations are called Unicode Transformation Formats (UTF for short).
8. If a Unicode file contains only basic 7-bit ASCII characters and every character is transferred with the original 2-byte Unicode encoding, the 8 bits of the first byte are always 0. This is a considerable waste. In this case UTF-8 can be used: it is a variable-length encoding that still represents the basic 7-bit ASCII characters with 7 bits, occupying one byte (the leading bit is padded with 0). When mixed with other Unicode characters, they are converted by a fixed algorithm, each character taking 1 to 3 bytes and being identified by its leading 0 or 1 bits. This greatly reduces the encoded length of Western documents that consist mainly of 7-bit ASCII characters (see UTF-8 for details). Similarly, the 2-byte UTF-16 encoding needs an algorithmic conversion for the 4-byte supplementary-plane characters and for other UCS-4 extension characters that may appear in the future.
Representation of Unicode codes (where another encoding is compatible with Unicode, the Unicode code value of a character is the same as its value in that encoding):
Environment | Unicode representation |
---|---|
In HTML | &# + the decimal Unicode code (or other code value) + ; |
In JS | \u + the 4-digit hexadecimal Unicode code; \x + up to 2 hexadecimal digits of another code value; \0 + up to 2 octal digits of another code value; and other JS conversion methods |
In CSS | \ + the 4-digit hexadecimal Unicode code |
To get the Unicode code of a character:
// Get the Unicode code in decimal, then parse it back
'安'.charCodeAt(); // 23433, the Unicode code of the Chinese character (note that it is decimal)
String.fromCharCode(23433); // '安'
// Convert to hexadecimal and parse it back
var unicode = '\\u' + '茗'.charCodeAt().toString(16); // the string "\u8317"
JSON.parse('"' + unicode + '"'); // the Chinese character "茗"
eval('"' + unicode + '"'); // eval can also be used for parsing
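The hexadecimal conversion above produces fewer than four digits for small code values (for example 'A' gives 41), which is not a valid \u escape. A minimal sketch that pads every UTF-16 code unit to four digits follows; the helper name toUnicodeEscape is hypothetical.

```javascript
// Hypothetical helper: turn every UTF-16 code unit of a string into a \uXXXX escape.
function toUnicodeEscape(text) {
  let out = '';
  for (let i = 0; i < text.length; i++) {
    // padStart keeps the escape at exactly four hexadecimal digits.
    out += '\\u' + text.charCodeAt(i).toString(16).padStart(4, '0');
  }
  return out;
}

console.log(toUnicodeEscape('A'));      // \u0041
console.log(toUnicodeEscape('\u5b89')); // \u5b89

// JSON.parse turns the escapes back into the original characters.
const roundTrip = JSON.parse('"' + toUnicodeEscape('\u5b89\u8317') + '"');
console.log(roundTrip === '\u5b89\u8317'); // true
```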
- UTF-8
  - UTF-8 represents characters with a variable length of 1 to 4 bytes. Chinese characters take up three bytes.
  - UTF-8 can represent any character in the Unicode standard.
  - UTF-8 is widely used for web pages.
  - For single-byte text such as English, UTF-8 takes up less storage than the Unicode encoding.
  - For Chinese text, GBK takes up less storage than UTF-8.
- UTF-16
  - UTF-16 is similar to UTF-8, but its encoding is exactly the same as the Unicode code values. It is mainly used in operating systems and runtime environments, for example Microsoft Windows 2000/XP/2003/Vista/CE and the Java and .NET bytecode environments.
  - Because different machine environments interpret byte order differently, the same byte stream may be read as different content. Therefore UTF-16 implementations rely on the concepts of big-endian and little-endian and on the Byte Order Mark (BOM). (For details, see UTF-16.)
In the Notepad that ships with Microsoft Windows, the Save As dialog offers four encoding options. Apart from the non-Unicode ANSI encoding, the other three, Unicode, Unicode Big Endian, and UTF-8, correspond to little-endian UTF-16, big-endian UTF-16, and UTF-8 respectively.
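The byte-order issue can be made visible by writing the same UTF-16 code units in both orders. This is a minimal sketch using the standard DataView; the helper names utf16Bytes and toHex are hypothetical.

```javascript
// Hypothetical helper: UTF-16 bytes of a string in the requested byte order.
function utf16Bytes(text, littleEndian) {
  const view = new DataView(new ArrayBuffer(text.length * 2));
  for (let i = 0; i < text.length; i++) {
    view.setUint16(i * 2, text.charCodeAt(i), littleEndian);
  }
  return Array.from(new Uint8Array(view.buffer));
}

// Hypothetical helper: print bytes as two-digit hexadecimal values.
function toHex(bytes) {
  return bytes.map(b => b.toString(16).padStart(2, '0')).join(' ');
}

console.log(toHex(utf16Bytes('\u4e2d', false))); // 4e 2d  (big-endian)
console.log(toHex(utf16Bytes('\u4e2d', true)));  // 2d 4e  (little-endian)

// The BOM U+FEFF tells the reader which order was used.
console.log(toHex(utf16Bytes('\ufeff', false))); // fe ff
console.log(toHex(utf16Bytes('\ufeff', true)));  // ff fe
```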
10 Default encoding format of URL characters (including GET and POST)
The content-type field in the request header determines whether special characters in the URL are encoded. The encoding format used when encoding is required is described below.
Analysis:
- Only part of the ASCII character set can be sent directly in a URL over the Internet; that is, most ASCII characters can be passed in a URL without encoding. The ASCII characters that can appear directly in a URL include: 0-9, a-z, A-Z, [, ], (, ), -, _, ., +, *, ', $, ! and so on.
- All characters other than these must be encoded before they can be passed in a URL.
10.1 ENCODING Format of URL Characters
Almost all browsers use UTF-8 encoding for URLs, representing each byte individually. URLs cannot contain spaces; spaces are usually replaced with + or %20.
URL encoding format | Form | Notes |
---|---|---|
UTF-8 | % + two hexadecimal digits of one UTF-8 byte | 1. Each unit represents 1 byte. 2. The default URL encoding of browsers. |
Unicode | %u + four hexadecimal digits of the Unicode code | 1. Each unit represents 1 character. 2. This is also how the JS function escape() encodes; not recommended. |
Common URL character encoding table:
Character | ASCII | UTF-8 encoding in the URL | Unicode encoding in the URL |
---|---|---|---|
Carriage return | 13 | %0D | %u000D |
Line feed | 10 | %0A | %u000A |
中文 | Chinese characters have no ASCII code | %e4%b8%ad %e6%96%87 | %u4e2d %u6587 |
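The UTF-8 column of the table can be reproduced by hand: encode the text to UTF-8 bytes and write each byte as % plus two hexadecimal digits. This is a minimal sketch; the helper name percentEncodeUtf8 is hypothetical, and unlike the browser it encodes every byte, including plain ASCII letters.

```javascript
// Hypothetical helper: percent-encode every UTF-8 byte of a string.
// Note: real URL encoders leave unreserved ASCII characters unencoded.
function percentEncodeUtf8(text) {
  const bytes = new TextEncoder().encode(text);
  return Array.from(bytes)
    .map(b => '%' + b.toString(16).toUpperCase().padStart(2, '0'))
    .join('');
}

console.log(percentEncodeUtf8('\u4e2d\u6587')); // %E4%B8%AD%E6%96%87
console.log(percentEncodeUtf8('\r\n'));         // %0D%0A

// The built-in function gives the same result for these characters.
console.log(encodeURIComponent('\u4e2d\u6587')); // %E4%B8%AD%E6%96%87
```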
10.2 Default encoding of pathInfo (non-parameter part) and queryString (parameter part) in URLs in different browsers
- The page encoding set by the server affects not only how the page is displayed but also how the queryString parameters in a GET request URL are encoded.
- Since mainstream browsers encode the pathInfo (non-parameter) part as UTF-8 by default, we only need to care about the encoding of the queryString in GET request URLs.
Solution:
- Browser URL encoding: when the server returns the page, set the page encoding to UTF-8.
- Server URL decoding: modify the server configuration so that the URL and the GET request parameters are decoded as UTF-8. For example, the default encoding of the Tomcat server is UTF-8 for POST requests but ISO8859-1 for GET requests.
Here are the results of testing the default URL encoding of browsers when no page encoding is set:
Browser | pathInfo encoding | queryString encoding |
---|---|---|
GET request: | ||
IE | UTF-8 | GB2312 (depends on the local environment) |
Chrome | UTF-8 | UTF-8 |
FireFox | UTF-8 | UTF-8 |
POST request: | ||
Chrome | UTF-8 | UTF-8 |
FireFox | UTF-8 | UTF-8 |
10.3 URL Manual Encoding Solution
JavaScript functions are commonly used to encode URLs manually. The common ones are escape(), encodeURI(), and encodeURIComponent().
JS URL encoding function | Encoding | Characters not encoded | Example | Note |
---|---|---|---|---|
escape() | Unicode | 69 characters are not encoded: * + - . / @ _ 0-9 a-z A-Z | var url = escape("http://www.baidu.com/春节"); gives http%3A//www.baidu.com/%u6625%u8282 | Legacy function; not recommended |
encodeURI() | UTF-8 | 82 characters are not encoded: ! ' ( ) * - . _ ~ 0-9 a-z A-Z # $ & + , / : ; = ? @ | var url = encodeURI("http://www.baidu.com/春节"); gives http://www.baidu.com/%E6%98%A5%E8%8A%82 | Recommended |
encodeURIComponent() (encodes more characters) | UTF-8 | 71 characters are not encoded: ! ' ( ) * - . _ ~ 0-9 a-z A-Z. Compared with encodeURI(), the 11 additional characters that are encoded are: # $ & + , / : ; = ? @ | var url = window.encodeURIComponent("http://www.baidu.com/春节"); gives http%3A%2F%2Fwww.baidu.com%2F%E6%98%A5%E8%8A%82 | Recommended |
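The difference between the two recommended functions matters most when a value is placed inside a query string. The sketch below assumes a hypothetical search path /s and parameter name wd, which are not from the table above: encodeURI() suits a whole URL, encodeURIComponent() suits a single parameter value.

```javascript
// Whole URL: the separators : and / must survive, so encodeURI() is the better fit.
const page = 'http://www.baidu.com/\u6625\u8282';
console.log(encodeURI(page));
// http://www.baidu.com/%E6%98%A5%E8%8A%82

// The same string as a component: separators are encoded as well.
console.log(encodeURIComponent(page));
// http%3A%2F%2Fwww.baidu.com%2F%E6%98%A5%E8%8A%82

// Single parameter value: & and = inside the value must not split the query,
// so encodeURIComponent() is used. Path and parameter name are hypothetical.
const keyword = 'a&b=\u6625\u8282';
const query = 'http://www.baidu.com/s?wd=' + encodeURIComponent(keyword);
console.log(query);
// http://www.baidu.com/s?wd=a%26b%3D%E6%98%A5%E8%8A%82
```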
10.4 The Server Configures the URL decoding mode and controls the URL encoding mode of the browser
By default, the Tomcat server decodes the parameters of a GET request URL as ISO8859-1 (this is the default in Tomcat 7 and earlier; Tomcat 8 and later default to UTF-8). It has to match the encoding used by the front end for the URL (generally the front end encodes manually as UTF-8), so it should be set to UTF-8. Tomcat decodes POST requests using UTF-8 by default.
Related notes:
- The data submitted by GET is in the URL, so by the time the request is handled on the server it has already been encoded by the browser and decoded by the server. To change how it is decoded, you can only modify the server configuration, or manually convert each of the parameters.
- The data submitted by POST is encoded by the browser. After it arrives at the server, request.setCharacterEncoding("UTF-8"); can be used to set the decoding format separately; if it is not set, the default decoding format of the server is used.
Codec | Request type | Method |
---|---|---|
Decoding | GET | 1. Convert the parameters manually: String queryStr = new String(request.getParameter("queryStr").getBytes("ISO8859-1"), "UTF-8"); 2. Alternatively, change the default of the server from ISO8859-1 to UTF-8. Using Tomcat as an example, modify the server.xml file: <Connector port="8080" protocol="HTTP/1.1" connectionTimeout="20000" redirectPort="8443" URIEncoding="UTF-8"/> |
Decoding | POST | Generally no configuration is required. To be safe, you can set the decoding format for the parameters on the server: request.setCharacterEncoding("UTF-8"); If it is not set, the server uses its default decoding format, UTF-8 or something else. Then use request.getParameter() to get the parameters. |
Encoding | All requests | 1. When the server sends data, it encodes it. The default encoding of the server is ISO-8859-1. To set it: response.setCharacterEncoding("UTF-8"); In a JSP, specify pageEncoding at the top of the page to set the server encoding. How it works: the translation from JSP to .java files is performed by the middleware container, the Tomcat server, which encodes the data as ISO-8859-1 by default, so pageEncoding has to be set to change the encoding: <%@ page pageEncoding="utf-8" %> 2. Then the browser has to be told the encoding format of the response content. The first way is to set the page encoding directly in a <meta /> tag: <meta http-equiv="content-type" content="text/html; charset=UTF-8" /> The second way is to set the content-type response header directly: response.setHeader("content-type", "text/html; charset=UTF-8"); As a shortcut, the server encoding and the browser decoding mode can be specified at the same time, replacing the two steps above: response.setContentType("text/html; charset=utf-8"); |
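The manual GET conversion in the table (reading the bytes back as ISO8859-1 and decoding them again as UTF-8) can be simulated in JavaScript to see why it works. This is only a sketch of the idea, not the server code; the helper names decodeLatin1 and encodeLatin1 are hypothetical.

```javascript
// Hypothetical helper: decode bytes as ISO-8859-1 (each byte becomes one character).
function decodeLatin1(bytes) {
  return Array.from(bytes, b => String.fromCharCode(b)).join('');
}

// Hypothetical helper: encode ISO-8859-1 text back to bytes (one byte per character).
function encodeLatin1(text) {
  return Uint8Array.from(text, ch => ch.charCodeAt(0));
}

// The browser sends the parameter as UTF-8 bytes.
const sent = new TextEncoder().encode('\u6625\u8282');

// A server that decodes with ISO-8859-1 sees six garbled characters.
const garbled = decodeLatin1(sent);
console.log(garbled.length); // 6

// Same idea as new String(s.getBytes("ISO8859-1"), "UTF-8") in the table.
const repaired = new TextDecoder('utf-8').decode(encodeLatin1(garbled));
console.log(repaired === '\u6625\u8282'); // true
```

The repair only works because ISO-8859-1 maps every byte to exactly one character, so no information is lost in the wrong decoding step.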
10.5 Summary: Codec process of the server <-> browser
Server -> Browser
1. response.setContentType("text/html; charset=utf-8"); The server sets how the data is encoded and tells the browser how it is encoded. (The default encoding format of the server is ISO-8859-1.) This step can be broken down into two steps: response.setCharacterEncoding("utf-8"); and response.setHeader("content-type", "text/html; charset=UTF-8");
2. The browser decodes the page according to the encoding specified by the server. (The default decoding format of browsers is also ISO-8859-1; for some it is UTF-8.)
Browser -> Server
1. The browser encodes the request according to the encoding specified by the server and sends it to the server. (The default URL encoding of browsers is UTF-8; for some parts of the URL it depends on the local environment, such as GB2312.)
2. The server receives the browser request. (By default, the Tomcat server decodes the URL of a GET request as ISO-8859-1 and a POST request as UTF-8.)
   - For GET requests: by the time the request reaches the server code, the server has already decoded it with its default decoding mode, ISO-8859-1. a. Convert the parameters manually: String queryStr = new String(request.getParameter("queryStr").getBytes("ISO8859-1"), "UTF-8"); b. Alternatively, modify the configuration to change the default decoding mode of the server to UTF-8.
   - For POST requests: the server has not yet decoded the data when it arrives. a. The decoding mode can be set at this point with request.setCharacterEncoding("utf-8"); The Tomcat server defaults to UTF-8 for POST requests, but setting it explicitly is safer.
3. Finally, the server uses request.getParameter() to get the parameters sent by the browser.
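On the receiving side, the charset announced in the content-type header is what decides how the body is decoded. As a minimal sketch of that lookup, the hypothetical helper parseCharset below extracts the charset from a header value and returns null when none is given, leaving the fallback default to the caller.

```javascript
// Hypothetical helper: extract the charset parameter from a content-type value.
function parseCharset(contentType) {
  const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(contentType);
  return match ? match[1].toLowerCase() : null;
}

console.log(parseCharset('text/html; charset=UTF-8')); // utf-8
console.log(parseCharset('text/html; charset=utf-8')); // utf-8
console.log(parseCharset('text/html'));                // null
```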
11 HTTP Referer hotlink protection and bypassing hotlink protection
To be summarized when it is actually used.
12 Website host selection
12.1 Issues to Be Considered when Setting up a Server by Yourself
- Hardware spending: To run a "real" website, you have to buy powerful server hardware. Don't count on a low-priced PC to do the job. You also need a stable (24 hours a day) high-speed connection.
- Software spending: Remember that server licenses are usually more expensive than client licenses. Also note that server licenses may come with user limits.
- Labor costs: Don't count on low labor costs. You must install your own hardware and software. You also have to deal with bugs and viruses, and keep your server running properly at all times in an "anything can happen" environment.
12.2 Using an Internet Service Provider (ISP)
Most small businesses host their websites on servers provided by ISPs.
Advantages of an ISP:
- Powerful hardware: ISP web servers are usually powerful enough for several websites to share their resources. You should also check whether your ISP provides efficient load balancing and the necessary backup servers.
- Connection speed: Most ISPs have high-speed connections to the Internet.
- Security and reliability: ISPs are experts in web hosting. They should provide more than 99% uptime, the latest software patches, and the best virus protection.
Considerations for choosing an ISP:
- Traffic: Research the traffic limits of your ISP. If there is an unexpected spike in traffic because your site becomes popular, make sure you don't have to pay extra.
- Bandwidth or content restrictions: Research the bandwidth and content restrictions of your ISP. If you plan to publish pictures or broadcast video or audio, make sure you are allowed to.
- Database access: If you plan to use a database for your website, make sure your ISP supports the database access you require.
- Daily backup: Make sure your ISP performs daily backups, or you risk losing valuable data.
- E-mail: Make sure your ISP supports the e-mail features you need.
- 24-hour support: Make sure your ISP provides 24-hour support. Don't put yourself in the awkward position of being unable to solve a serious problem and having to wait for the next working day. A toll-free phone service is also necessary if you do not wish to pay for long-distance calls.
For others, see Introduction to Network hosts and various servers.