Origin of the problem

Let's start with a question: can a URL be written with any characters you like? If not, what must a URL look like?

RFC 1738 states that URLs may use only English letters, Arabic numerals, and certain punctuation marks, namely “$-_.+!*'(),” (excluding the double quotes), plus reserved characters used for their reserved purpose. No other characters or symbols may appear unencoded.
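The rule above can be sketched as a simple character check. This is only an illustration: the names RFC1738_SAFE and needsEncoding are hypothetical, and reserved characters such as "/" or "?" are deliberately left out of the set, since they are only allowed when used for their reserved purpose.

```javascript
// Characters that RFC 1738 allows unencoded anywhere in a URL:
// English letters, digits, and the special characters $-_.+!*'(),
// Reserved characters (e.g. / ? : @ & =) are NOT included in this sketch.
const RFC1738_SAFE = /^[0-9a-zA-Z$\-_.+!*'(),]*$/;

// Hypothetical helper: returns true if the text contains at least one
// character that would have to be encoded before it could appear in a URL.
function needsEncoding(text) {
  return !RFC1738_SAFE.test(text);
}

console.log(needsEncoding("index.html"));   // false
console.log(needsEncoding("my page"));      // true (contains a space)
console.log(needsEncoding("\u6625\u8282")); // true (two Chinese characters)
```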

This means that any Chinese characters in a URL must be encoded before use. The trouble is that RFC 1738 does not specify how to encode them, leaving the decision to the application (the browser). This makes “URL encoding” a confusing field.

In particular, different browsers and operating systems may follow completely different encoding rules.

For example, in Ajax calls, IE always uses GB2312 encoding (the default encoding of the operating system), while Firefox always uses UTF-8 encoding.

The solution

The lack of a uniform encoding rule is a nightmare for programmers. So is there a way to ensure that clients always use a single encoding method when making requests to the server?

The answer is yes. The idea is to encode the URL in JavaScript before submitting it to the server, without giving the browser a chance to interfere.

escape

The escape function is the oldest encoding function and is not recommended today.

escape() is not actually used for URL encoding directly; what it really does is return the Unicode code value of each character.

Encoding rules:

  • Encodes all characters except ASCII letters, digits, and the punctuation marks “@ * _ + - . /”.
  • Characters between \u0000 and \u00FF are converted to %xx, and all other characters are converted to %uxxxx.
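The two rules can be seen by calling escape() on a few sample strings. The inputs below are chosen purely for illustration, and since escape() is not recommended today, this is a demonstration of its behavior rather than something to copy into new code.

```javascript
// ASCII letters, digits, and @ * _ + - . / pass through unchanged.
console.log(escape("abc123@*_+-./")); // abc123@*_+-./

// Other characters up to \u00FF become %xx.
console.log(escape(" "));      // %20
console.log(escape("\u00E9")); // %E9  (the letter e with an acute accent)

// Characters above \u00FF become %uxxxx, one group per character.
// "\u6625\u8282" is a two-character Chinese word.
console.log(escape("\u6625\u8282")); // %u6625%u8282
```

Note that the output for the Chinese characters is the Unicode code value, not a UTF-8 byte sequence.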

encodeURI

encodeURI() is the function that actually encodes URLs in JavaScript.

It is designed to encode an entire URL, so in addition to the common symbols, it also leaves unencoded the symbols that have special meaning in a URL: “; / ? : @ & = + $ , #”. After encoding, it outputs the UTF-8 form of each character, with every byte prefixed by %.
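As a small sketch of this behavior, the example below runs encodeURI() over a complete URL. The address www.example.com and its path and parameters are made up for illustration.

```javascript
// A whole URL containing a space and two Chinese characters (\u6625\u8282).
const rawUrl = "http://www.example.com/my page.php?name=\u6625\u8282&x=1#top";

const encoded = encodeURI(rawUrl);

// The structural symbols : / ? = & # are left alone, the space becomes %20,
// and each Chinese character becomes three UTF-8 bytes, each prefixed with %.
console.log(encoded);
// http://www.example.com/my%20page.php?name=%E6%98%A5%E8%8A%82&x=1#top
```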

encodeURIComponent

The last JavaScript encoding function is encodeURIComponent(). It differs from encodeURI() in that it is used to encode individual parts of a URL, not the entire URL.

Therefore, the symbols “; / ? : @ & = + $ , #”, which encodeURI() leaves unencoded, are all encoded by encodeURIComponent(). The encoding method itself is the same for both.
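To make the difference concrete, here is a minimal sketch that compares the two functions on the same string and then uses encodeURIComponent() to assemble a request URL in JavaScript before it is sent. The helper buildUrl, the address www.example.com, and the parameter names are all hypothetical.

```javascript
// The same input, encoded by both functions.
const value = "a&b=c/d?e#f";
console.log(encodeURI(value));          // a&b=c/d?e#f  (unchanged)
console.log(encodeURIComponent(value)); // a%26b%3Dc%2Fd%3Fe%23f

// Hypothetical helper: encodes every key and value separately, so that
// symbols such as & and = inside a value cannot break the query string.
function buildUrl(base, params) {
  const query = Object.keys(params)
    .map((key) => encodeURIComponent(key) + "=" + encodeURIComponent(params[key]))
    .join("&");
  return base + "?" + query;
}

const url = buildUrl("http://www.example.com/search", {
  q: "\u6625\u8282", // two Chinese characters
  ref: "a&b=c",
});
console.log(url);
// http://www.example.com/search?q=%E6%98%A5%E8%8A%82&ref=a%26b%3Dc
```

Because the URL is already fully percent-encoded as UTF-8 when it leaves the script, the browser has nothing left to encode on its own.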