In RESTful service design, it is often necessary to use Chinese characters as parameters in the URL. In that case, the Chinese text generally has to be encoded correctly.


1. Introducing the problem

In RESTful service design, a common URL for querying information looks like GET /basic/service?keyword=history, and so on. In actual development and use, however, garbling does occur when the keyword is Chinese: the keyword read in the background is garbled and cannot be interpreted correctly.

2. How does the garbling arise?

Because we pass the parameters in the URL, this method depends on the browser environment: the URL and the key=value parameters it contains are processed in the browser address bar, encoded there, and then passed to the background for decoding.

Since we have not done any processing ourselves, when JavaScript requests the URL with a Chinese parameter (for example, when Chinese text is typed into an input box), the Chinese parameter is encoded according to the browser's own mechanism. This is where the garbled text comes from.


3. First encoding: using encodeURI() in JavaScript


When encodeURI() is used to encode Chinese URL parameters in JavaScript, the word "测试" ("test") is converted to "%E6%B5%8B%E8%AF%95". But problems remain. In a URL, "%" is the escape character, so the "%XX" sequences in the address are treated as escapes and decoded automatically on the way to the background, not passed through as literal text. That automatic decoding does not necessarily use UTF-8, so the value the background receives may no longer match what encodeURI() produced.
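The single encoding step can be sketched as follows. This is a minimal illustration, assuming the input is the Chinese word 测试 ("test"); the variable names are hypothetical and not taken from the original project.

```javascript
// Hypothetical input: the Chinese word for "test".
const keyword = "测试";

// encodeURI() percent-encodes the UTF-8 bytes of each non-ASCII character.
const encodedOnce = encodeURI(keyword);
console.log(encodedOnce); // %E6%B5%8B%E8%AF%95

// decodeURI() restores the text when the bytes are read back as UTF-8.
const roundTrip = decodeURI(encodedOnce);
console.log(roundTrip === keyword); // true
```

The round trip only succeeds here because both sides agree on UTF-8; the garbling described above appears when the decoding side assumes a different charset.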

4. Second encoding: applying encodeURI() twice

Operation: encodeURI(encodeURI("/order?name=" + name));

After this step the parameter is no longer the string "%E6%B5%8B%E8%AF%95" produced by a single encodeURI(), but "%25E6%25B5%258B%25E8%25AF%2595". The second pass re-encodes each "%", which would otherwise be parsed as an escape character, into "%25", so it travels as an ordinary character.

At this point the front-end JavaScript encoding of the Chinese URL is complete, and the value is passed to the background as a URL parameter. Since automatic decoding only turns each "%25" back into "%", the Action receives the ungarbled ASCII string "%E6%B5%8B%E8%AF%95", which corresponds to the Chinese word "测试" that we typed.
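To make the effect of the second pass concrete, here is a small sketch. The /order path comes from the operation above; the input value 测试 and the variable names are assumptions for illustration.

```javascript
// Hypothetical user input.
const orderName = "测试";

// First pass: non-ASCII characters become UTF-8 percent-escapes.
const once = encodeURI("/order?name=" + orderName);
console.log(once); // /order?name=%E6%B5%8B%E8%AF%95

// Second pass: every "%" from the first pass becomes "%25".
const twice = encodeURI(once);
console.log(twice); // /order?name=%25E6%25B5%258B%25E8%25AF%2595
```

After the second pass the URL contains only ASCII characters, so a single automatic decode with any ASCII-compatible charset yields the same intermediate string.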

5. How to correctly parse Chinese character information in the background?

After the second encodeURI(), the background still does not receive the Chinese text directly; the value has to be decoded explicitly. Proceed as follows:

URLDecoder.decode("chinese string", "UTF-8")

The URLDecoder.decode(String s, String enc) method takes two parameters: the first is the string to decode, and the second is the character encoding to use when decoding.
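The two decoding stages can be simulated in JavaScript. This is only a sketch of the data flow, not server code: decodeURIComponent() stands in for both the automatic decode and the explicit URLDecoder.decode(value, "UTF-8") call, and the variable names are hypothetical.

```javascript
// Parameter value as sent by the front end after encodeURI(encodeURI(...)).
const wireValue = "%25E6%25B5%258B%25E8%25AF%2595";

// Stage 1: the automatic decode that runs before application code.
// It only turns "%25" back into "%", so the result is still plain ASCII.
const seenByHandler = decodeURIComponent(wireValue);
console.log(seenByHandler); // %E6%B5%8B%E8%AF%95

// Stage 2: the explicit decode, in the role of URLDecoder.decode(value, "UTF-8").
const originalText = decodeURIComponent(seenByHandler);
console.log(originalText); // 测试
```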

6. encodeURI, encodeURIComponent, escape

6.1 The escape() function

The escape() function encodes the string so that it can be read on all computers.


Return value: a copy of the string in which certain characters have been replaced with hexadecimal escape sequences.


Note: this method does not encode ASCII letters and digits, or the following ASCII punctuation marks: @ * _ + - . / All other characters are replaced by escape sequences. Spaces, other punctuation, and characters with code values up to 0xFF are converted to the %xx form (xx is the two-digit hexadecimal code of the character); for example, the space character becomes %20. Characters with larger code values, including Chinese characters, are converted to the %uxxxx form.
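A few calls illustrate how escape() behaves. This is a minimal sketch with sample inputs chosen for illustration; it assumes a JavaScript environment where the legacy global escape() is available, which is the case in browsers and Node.js.

```javascript
// A space is replaced by its two-digit hexadecimal code.
const escapedSpace = escape("a b");
console.log(escapedSpace); // a%20b

// These punctuation marks pass through unchanged.
const escapedPunct = escape("@*_+-./");
console.log(escapedPunct); // @*_+-./

// Chinese characters become %uXXXX code units, not UTF-8 bytes.
const escapedChinese = escape("测试");
console.log(escapedChinese); // %u6D4B%u8BD5
```

Note that the result for Chinese text differs from the UTF-8 percent-escapes produced by encodeURI(), so the receiving side must decode it accordingly.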


6.2 The encodeURI() method

Encodes a URI string by escaping characters as UTF-8 byte sequences. Besides ASCII letters and digits, characters not encoded by this method: - _ . ! ~ * ' ( ) ; / ? : @ & = + $ , #


6.3 The encodeURIComponent() method

Encodes a URI component by escaping characters as UTF-8 byte sequences. This method encodes more characters than encodeURI(), including /, ?, & and =. It should therefore not be applied to a string that contains several parts of a URI; once the / characters are encoded, the URL no longer works as an address.


Besides ASCII letters and digits, characters not encoded by this method: - _ . ! ~ * ' ( )
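The difference between the two methods is easiest to see side by side. The following sketch uses a made-up URL and parameter value; only encodeURI() and encodeURIComponent() themselves are standard.

```javascript
// Hypothetical full URL containing a Chinese parameter value.
const fullUrl = "/order?name=测试&page=1";

// encodeURI() keeps the URL structure (/ ? = &) intact.
const viaEncodeURI = encodeURI(fullUrl);
console.log(viaEncodeURI); // /order?name=%E6%B5%8B%E8%AF%95&page=1

// encodeURIComponent() escapes the separators too, so the result is no longer a usable URL.
const viaComponent = encodeURIComponent(fullUrl);
console.log(viaComponent); // %2Forder%3Fname%3D%E6%B5%8B%E8%AF%95%26page%3D1

// Typical use: encode a single parameter value that may contain separators.
const rawValue = "A&B/C";
const safeUrl = "/order?name=" + encodeURIComponent(rawValue);
console.log(safeUrl); // /order?name=A%26B%2FC
```

In short, encodeURI() suits a whole URL, while encodeURIComponent() suits one parameter value at a time.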


Therefore, for Chinese strings: if you do not need the string converted to UTF-8 (for example, when the source page and the destination page use the same charset), escape() is enough. If your page is GB2312 or another encoding and the page that receives the parameters is UTF-8, use encodeURI() or encodeURIComponent().

Having said that, I usually use the following scheme:

7. Another way to handle garbled Chinese in URLs (recommended)

On the request side, encode the characters once with encodeURI(), for example:

var url = "/ajax?name=" + encodeURI(name);

Server-side code:

name = new String(name.getBytes("ISO-8859-1"), "UTF-8");

Note: name is the string obtained from the request, and ISO-8859-1 is the default character encoding of the project. If the encoding is a Chinese one such as GBK or GB2312, this step is not required.

Analysis: testing confirms that this approach works, which indicates that the parameter is being decoded as ISO-8859-1 by default. encodeURI() escapes non-ASCII characters as UTF-8 bytes, but ASCII letters, digits, and the other visible ASCII characters it leaves alone pass through unchanged, because their byte values are identical in ISO-8859-1 and UTF-8. Escaping functions such as encodeURI() therefore only affect characters outside that set, such as Chinese characters, spaces, and %.
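The byte-level reasoning behind this recovery can be reproduced in a short sketch. It assumes Node.js, whose Buffer supports the "latin1" (ISO-8859-1) encoding, and it simulates the server-side new String(name.getBytes(...), "UTF-8") step instead of running Java; the variable names are hypothetical.

```javascript
// Hypothetical value sent by the front end.
const sentName = "测试";

// The UTF-8 bytes that arrive after percent-decoding: E6 B5 8B E8 AF 95.
const utf8Bytes = Buffer.from(sentName, "utf8");

// The receiver wrongly reads those bytes as ISO-8859-1, producing garbled text.
const garbled = utf8Bytes.toString("latin1");
console.log(garbled === sentName); // false

// Recovery: turn the garbled text back into the same bytes, then read them as UTF-8.
const recovered = Buffer.from(garbled, "latin1").toString("utf8");
console.log(recovered === sentName); // true
```

The recovery works because ISO-8859-1 maps every byte value to exactly one character, so no information is lost in the wrong decode.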