1. What is binary data?
- The letter L corresponds to the number 76
- "L".charCodeAt(0) === 76
- And then we convert 76 to binary 01001100
- Character set: defines which character each number represents
- Examples: Unicode and ASCII
- UTF-8 works in 8-bit units (1 byte); a single character takes 1 to 4 of them
- So 76 for L is 01001100.
- If the value is 12, the binary form is 00001100
- A byte is a unit of binary data. A byte is usually 8 bits long
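
The steps above (character, then number, then 8-bit binary) can be sketched in JavaScript. The helper name `toBinary8` is made up for this illustration; `charCodeAt`, `toString(2)`, and `padStart` are standard built-ins.

```javascript
// Hypothetical helper: format a number from 0 to 255 as an 8-bit binary string.
function toBinary8(value) {
  return value.toString(2).padStart(8, "0");
}

const code = "L".charCodeAt(0);
console.log(code);            // 76
console.log(toBinary8(code)); // 01001100
console.log(toBinary8(12));   // 00001100
```

`padStart` only adds the leading zeros; without it, `(12).toString(2)` would give `1100`.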
Byte conversion
- Common storage units include bit, B, KB, MB, and GB. They convert as follows:
- 1B=8bit
- 1KB=1024B
- 1MB=1024KB
- 1GB=1024MB
- B is short for Byte.
- A bit is the smallest unit; think of it as a cell that can store only 1 or 0
- Byte = 8 bits
- In practice, manufacturers label hard disk capacity in units of 1000, but the system typically displays it in units of 1024, so a disk sold as 1GB shows up as slightly less
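
As a rough sketch of the conversions above, the snippet below defines the 1024-based units and shows why a disk labeled in 1000-based units looks smaller on screen. The constant and function names are made up for this illustration.

```javascript
// Unit sizes in bytes, using the 1024-based relations listed above.
const KB = 1024;
const MB = 1024 * KB;
const GB = 1024 * MB;

// Hypothetical helper: convert a byte count to gigabytes (1024-based).
function bytesToGB(bytes) {
  return bytes / GB;
}

// A disk sold as "1GB" in 1000-based units holds 1000 * 1000 * 1000 bytes.
const advertisedBytes = 1000 ** 3;
console.log(GB);                                    // 1073741824
console.log(bytesToGB(advertisedBytes).toFixed(2)); // 0.93
```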
Character set
ASCII:
- An international standard that maps numbers to only 128 characters: basic English letters plus digits, symbols, and control codes
- These codes need only 7 bits, so 1 byte is enough for every ASCII character, for example 00001100; a full byte can distinguish up to 256 values
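
To make the 128 and 256 limits concrete, here is a small sketch that checks whether a string stays inside the ASCII range. The helper name `isAscii` is an assumption for this example, not a built-in.

```javascript
// Hypothetical helper: true when every code unit in the string is in 0..127.
function isAscii(text) {
  for (let i = 0; i < text.length; i++) {
    if (text.charCodeAt(i) > 127) return false;
  }
  return true;
}

console.log(2 ** 7);            // 128 values fit in 7 bits
console.log(2 ** 8);            // 256 values fit in 8 bits
console.log(isAscii("Hello!")); // true
console.log(isAscii("中文"));   // false
```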
GB2312 GBK GB18030
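- GB18030 is the largest of these Chinese character sets, with about 70,000 Chinese characters
- One byte is not enough for that many characters, so GB2312 and GBK use two bytes to represent a Chinese character (GB18030 also uses four-byte sequences)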
UTF
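- UTF-8 works in 8-bit units, so its smallest unit is one byte.
- UTF-16 works in 16-bit units, so its smallest unit is two bytes.
- A UCS-2 code is usually written in hexadecimal, and the UTF-8 byte stream it maps to is usually written in binary.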
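
The difference in unit size can be observed directly. This sketch assumes an environment with the standard `TextEncoder` global (modern browsers and Node.js); the helper name `byteLengths` is made up for illustration.

```javascript
// Hypothetical helper: byte counts of a string in UTF-8 and UTF-16.
function byteLengths(text) {
  return {
    utf8: new TextEncoder().encode(text).length, // TextEncoder always emits UTF-8
    utf16: text.length * 2, // .length counts 16-bit code units, 2 bytes each
  };
}

console.log(byteLengths("L"));      // { utf8: 1, utf16: 2 }
console.log(byteLengths("\u00e9")); // { utf8: 2, utf16: 2 }  (é)
console.log(byteLengths("中"));     // { utf8: 3, utf16: 2 }
console.log(byteLengths("😀"));     // { utf8: 4, utf16: 4 }
```

The last line shows that UTF-16 is also variable-length: characters outside the 16-bit range take two 16-bit units.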
Why UTF-8
UTF-8 encoding rules:
1. It is a variable-length encoding; the first byte of a character determines how many bytes the character takes.
2. Code points above 127 are encoded with multiple bytes: one leading byte plus continuation bytes. The leading byte starts with as many 1 bits as the character has bytes (so reading the first byte is enough to know the length), followed by a 0. Each continuation byte starts with 10.
3. Working from right to left, each continuation byte holds 6 bits of the original code point, and the remaining bits go into the leading byte.
4. Leading bytes and continuation bytes never share a bit pattern, so UTF-8 is self-synchronizing. For example, a byte that begins with 110 must be the leading byte of a 2-byte character.
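
The rules above can be sketched as a hand-written encoder. This is a minimal illustration, not a complete implementation: `encodeCodePoint` and `sequenceLength` are hypothetical names, and the encoder assumes its input is a valid code point from 0 to 0x10FFFF without checking for surrogates.

```javascript
// Hypothetical helper: encode one Unicode code point as an array of UTF-8 bytes.
function encodeCodePoint(cp) {
  if (cp <= 0x7f) {
    return [cp]; // 0xxxxxxx
  }
  if (cp <= 0x7ff) {
    return [0xc0 | (cp >> 6), 0x80 | (cp & 0x3f)]; // 110xxxxx 10xxxxxx
  }
  if (cp <= 0xffff) {
    // 1110xxxx 10xxxxxx 10xxxxxx
    return [
      0xe0 | (cp >> 12),
      0x80 | ((cp >> 6) & 0x3f),
      0x80 | (cp & 0x3f),
    ];
  }
  // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
  return [
    0xf0 | (cp >> 18),
    0x80 | ((cp >> 12) & 0x3f),
    0x80 | ((cp >> 6) & 0x3f),
    0x80 | (cp & 0x3f),
  ];
}

// Hypothetical helper: read the sequence length from a leading byte.
function sequenceLength(leadByte) {
  if ((leadByte & 0x80) === 0x00) return 1; // 0xxxxxxx
  if ((leadByte & 0xe0) === 0xc0) return 2; // 110xxxxx
  if ((leadByte & 0xf0) === 0xe0) return 3; // 1110xxxx
  if ((leadByte & 0xf8) === 0xf0) return 4; // 11110xxx
  return 0; // continuation byte (10xxxxxx) or invalid
}

const bytes = encodeCodePoint("中".codePointAt(0)); // U+4E2D
console.log(bytes.map((b) => b.toString(2).padStart(8, "0")).join(" "));
// 11100100 10111000 10101101
console.log(sequenceLength(bytes[0])); // 3
console.log(sequenceLength(bytes[1])); // 0, a continuation byte cannot start a character
```

Because a continuation byte is always recognizable, a reader that starts in the middle of a stream can skip forward to the next leading byte, which is what self-synchronizing means in practice.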