While I was discussing a problem with a friend, he pointed to a string and asked me: do you know why it ends with two equal signs (=)?
Me: I don’t know……
He: Because this is base64 encoding, @##@%^&*&%#@@#$%^, there aren’t enough bits, so it gets padded with =.
Me: Oh ~ (nodding, pretending to understand)
When my friend said that, I suddenly remembered that an interviewer once asked me about base64 encoding too. Back then, apart from knowing that it is an encoding scheme and that the front end uses it for image encoding, I didn’t understand its specific rules (WHAT, WHY, HOW) at all. So today I went through it carefully, and I’m tidying up my notes here. BTW, the reason to learn something is not that it will be asked in an interview, but because…… to learn is to earn. ^ ^
First, WHAT
What is Base64 encoding? It’s a way of representing binary data based on 64 printable characters. Which 64 printable characters?
[‘A’, ‘B’, ‘C’, … ‘a’, ‘b’, ‘c’, … ‘0’, ‘1’, … ‘+’, ‘/’] (26 + 26 + 10 + 2 = 64)
These 64 characters make up what is called the base64 index table, as specified by the standard Base64 protocol. Why is it called Base64? Because it is based on these 64 printable characters. And since there are 64 of them, each character can be represented by a 6-bit binary number (2^6 = 64). Similarly, base32 uses a 5-bit binary number per character, and base16 uses a 4-bit binary number.
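As a quick check of the counting above, here is a minimal Python sketch that builds the index table and derives the bits per character. The names `BASE64_ALPHABET` and `bits_per_char` are made up for this illustration and are not part of any library.

```python
import string

# The standard Base64 index table: A-Z, a-z, 0-9, then '+' and '/'.
BASE64_ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
)

def bits_per_char(alphabet_size: int) -> int:
    """Bits carried by one output character (alphabet size must be a power of two)."""
    return alphabet_size.bit_length() - 1

print(len(BASE64_ALPHABET))  # 64
for size in (64, 32, 16):
    print(f"base{size}: {bits_per_char(size)} bits per character")
```

The position of a character in `BASE64_ALPHABET` is its index in the table, so `BASE64_ALPHABET[0]` is `A` and `BASE64_ALPHABET[63]` is `/`.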
Second, WHY
Why use this encoding method? Isn’t the data binary anyway? Why not just transmit the binary directly instead of converting it to base64?
One might think base64 is for encryption, but it is not: a quick look at its encoding rules shows that it can easily be decoded. (Of course, you can get some obfuscation by shuffling the order of the index table, which we won’t discuss here.) So why base64? The real reason is binary incompatibility. Some byte values are interpreted and handled differently by different hardware or software, such as routers, older computers, older software, and network protocols. As a result, those byte values can be mishandled in transit, which is definitely not what we want. So by first encoding the data in Base64 and turning it all into printable characters (values that every piece of software and hardware handles the same way), errors become much less likely.
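To make both points concrete, the sketch below uses Python's standard `base64` module on a few awkward bytes (control characters and a non-ASCII value, chosen arbitrarily for this example). The encoded form contains only printable characters, and anyone can reverse it without a key.

```python
import base64

# Arbitrary "awkward" bytes: NUL, line feed, escape, and 0xFF.
raw = bytes([0x00, 0x0A, 0x1B, 0xFF])

encoded = base64.b64encode(raw)      # bytes containing only printable ASCII
decoded = base64.b64decode(encoded)  # no key needed: this is not encryption

print(encoded.decode("ascii"))
print(decoded == raw)  # True
```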
Third, HOW
So how is base64 encoded?
- STEP 1: Divide the string to be converted into groups of three bytes. Each byte takes up 8 bits, so each group has 24 binary bits in total. (Why groups of three bytes? Each byte is 8 bits and base64 uses 6 bits per character; the least common multiple of 8 and 6 is 24, which is exactly three bytes.)
- STEP 2: Divide the above 24 bits into 4 groups of 6 bits each.
- STEP 3: Add two zeros to the front of each group (note that the length increases after encoding). Each group changes from six to eight bits, for a total of 32 bits, or four bytes.
- STEP 4: Look up the character for each resulting value in the Base64 index table below.
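Index | Char | Index | Char | Index | Char | Index | Char |
---|---|---|---|---|---|---|---|
0 | A | 17 | R | 34 | i | 51 | z |
1 | B | 18 | S | 35 | j | 52 | 0 |
2 | C | 19 | T | 36 | k | 53 | 1 |
3 | D | 20 | U | 37 | l | 54 | 2 |
4 | E | 21 | V | 38 | m | 55 | 3 |
5 | F | 22 | W | 39 | n | 56 | 4 |
6 | G | 23 | X | 40 | o | 57 | 5 |
7 | H | 24 | Y | 41 | p | 58 | 6 |
8 | I | 25 | Z | 42 | q | 59 | 7 |
9 | J | 26 | a | 43 | r | 60 | 8 |
10 | K | 27 | b | 44 | s | 61 | 9 |
11 | L | 28 | c | 45 | t | 62 | + |
12 | M | 29 | d | 46 | u | 63 | / |
13 | N | 30 | e | 47 | v | | |
14 | O | 31 | f | 48 | w | | |
15 | P | 32 | g | 49 | x | | |
16 | Q | 33 | h | 50 | y | | |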
Each character in the Base64 table needs only six bits, but two zeros are added in front to make it eight bits. Therefore, Base64-encoded text is one third larger than the original.
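The one-third growth follows from the grouping: every 3 input bytes become 4 output characters. A small sketch of that arithmetic, checked against Python's standard `base64` module (the helper name `encoded_length` is made up for this example):

```python
import base64

def encoded_length(n_bytes: int) -> int:
    """Length of padded Base64 output: 4 characters per group of up to 3 bytes."""
    return 4 * ((n_bytes + 2) // 3)

for n in (3, 300, 3000):
    actual = len(base64.b64encode(bytes(n)))  # bytes(n) is n zero bytes
    print(n, "->", actual, "matches formula:", actual == encoded_length(n))
```

For inputs whose length is not a multiple of three, the padded output is slightly more than one third larger, because the last group is still rounded up to four characters.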
For example 🌰, let’s Base64-encode Hah:
- The ASCII code values of H, a, and h are 72, 97, and 104 respectively, and the corresponding binary values are 01001000, 01100001, and 01101000. As shown in the second and third rows of the table below, together they form a 24-bit binary string.
- Divide 24 bits into four groups of six bits.
- Add two zeros to the front of each of the above groups to expand to 32 binary bits, which become four bytes: 00010010, 00000110, 00000101, 00101000. The corresponding values (base64 encoded index) are 18, 6, 5, and 40.
- Look these values up in the Base64 index table; they correspond to S, G, F, and o respectively.
Text | H | a | h |
---|---|---|---|
ASCII | 72 | 97 | 104 |
Binary value | 01001000 | 01100001 | 01101000 |
6-bit groups | 010010, 000110, 000101, 101000 | | |
Base64 index | 18, 6, 5, 40 | | |
Base64 encoding | S, G, F, o | | |
So the base64 encoding of Hah is SGFo.
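The Hah walkthrough can be reproduced step by step in a few lines of Python. This is only an illustrative sketch; the variable names are my own, and `ALPHABET` is simply the standard index table written out.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

text = "Hah"
codes = [ord(ch) for ch in text]                           # ASCII values
bits = "".join(format(code, "08b") for code in codes)      # one 24-bit string
groups = [bits[i:i + 6] for i in range(0, len(bits), 6)]   # four 6-bit groups
indices = [int(group, 2) for group in groups]              # Base64 indices
encoded = "".join(ALPHABET[i] for i in indices)            # table lookup

print(codes)    # [72, 97, 104]
print(groups)   # ['010010', '000110', '000101', '101000']
print(indices)  # [18, 6, 5, 40]
print(encoded)  # SGFo
```

Note that the "add two zeros in front" step does not change the numeric value of a group, which is why the sketch converts the 6-bit groups to integers directly.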
We mentioned above that the first step of encoding is grouping “every three bytes”. What if the data to be converted is shorter than three bytes? For example, let’s base64-encode “A” and “BC” ~
One byte: one byte contains 8 binary bits, still grouped according to the rules. With 6 bits per group, the second group is 4 bits short, so it is padded with zeros, which gives two Base64 characters. The last two groups have no data at all, so they are filled with “=”. Therefore, “A” is encoded as “QQ==”;
Two bytes: two bytes are 16 bits, still grouped according to the rules. With 6 bits per group, the third group is 2 bits short, so it is padded with zeros, which gives three Base64 characters. The fourth group has no data at all, so it is filled with “=”. Therefore, “BC” is encoded as “QkM=”.
(Now I finally understand what my friend meant at the beginning ~)
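Putting the grouping and padding rules together, here is a minimal encoder sketch in Python. The function name `b64encode_sketch` is made up for this illustration; in real code you would use the standard `base64` module, which the sketch is compared against at the end.

```python
import base64
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def b64encode_sketch(data: bytes) -> str:
    """Encode bytes as Base64 by following the grouping rules literally."""
    out = []
    for start in range(0, len(data), 3):
        chunk = data[start:start + 3]                      # up to 3 bytes
        bits = "".join(format(byte, "08b") for byte in chunk)
        bits += "0" * (-len(bits) % 6)                     # zero-fill the last 6-bit group
        chars = [ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
        out.append("".join(chars) + "=" * (4 - len(chars)))  # '=' for empty groups
    return "".join(out)

for sample in (b"A", b"BC", b"Hah"):
    mine = b64encode_sketch(sample)
    reference = base64.b64encode(sample).decode("ascii")
    print(sample, mine, mine == reference)
```

Building bit strings is slow but keeps the code close to the written steps; a practical implementation would use integer shifts and masks instead.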
Fourth, front-end scenario: image encoding
As we know, every image we see on a web page requires its own HTTP request to download. It would be nice if images could be delivered together with the HTML instead of being requested from the server separately. Base64 can solve this problem.
For example, a Base64-encoded image address might look like this:
// I'm just a 🌰
data:image/gif;base64,R03t7txgBjboSvB8EpLoFZywOAo3LFE5lYs/QW9LT1TRk1V7S2xYJADs=
We can put it directly in HTML or CSS and get the image without a separate request to the server.
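As a rough sketch of how such an address is assembled, the Python snippet below builds a data URI from raw bytes. The helper `to_data_uri` is a made-up name, and the input is just the 6-byte GIF signature standing in for a real image file, so the result is not a displayable image.

```python
import base64

def to_data_uri(data: bytes, mime_type: str) -> str:
    """Build a data URI of the form data:<mime type>;base64,<payload>."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{payload}"

# Placeholder bytes: only the GIF header signature, not a complete image.
placeholder = b"GIF89a"
uri = to_data_uri(placeholder, "image/gif")
print(uri)  # data:image/gif;base64,R0lGODlh
```

With the bytes of a real image file, the resulting string could be used as the `src` of an `<img>` tag or inside `url(...)` in CSS.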
It should be noted that, as mentioned earlier, one drawback of base64 is that the size increases by 1/3 after encoding, so using Base64 does not necessarily improve performance; it depends on the specific scenario. For example, if an image is very small, cannot be merged into a sprite because of how it is used, is highly reusable, and rarely changes throughout the project, then it may be a good candidate for Base64 encoding (such as the background image of a page).