This is the ninth day of my participation in the August More Text Challenge. For details, see: August More Text Challenge

Collation: a collation rule, also called a sorting rule.

Character Set, or Character Encoding: a character encoding, also called a character encoding set.

A clear explanation of character sets and collations

A character set is a set of symbols and encodings. A collation is a set of rules for comparing characters in a character set.

The following example uses an imaginary character set to illustrate character sets and collations:

Suppose we have an alphabet with four letters: A, B, a, b. We give each letter a number: A = 0, B = 1, a = 2, b = 3. The letter A is a symbol, the number 0 is the encoding for A, and the combination of all four letters and their encodings is a character set.

Suppose we want to compare two string values, A and B. The simplest way to do this is to look at the encodings: 0 for A and 1 for B. Because 0 is less than 1, we say A is less than B. What we have just done is apply a collation to our character set. The collation is a set of rules (only one rule in this example): “compare the encodings.” We call this simplest of all collations a binary collation.
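The binary collation described above can be sketched in a few lines of Python. This is only an illustration of the toy example: the names `CHARSET`, `encode`, and `binary_compare` are made up for this sketch, and the four-letter character set (A = 0, B = 1, a = 2, b = 3) is assumed.

```python
# Toy character set: each symbol paired with its encoding.
CHARSET = {"A": 0, "B": 1, "a": 2, "b": 3}


def encode(text):
    """Translate a string into its list of encodings."""
    return [CHARSET[ch] for ch in text]


def binary_compare(left, right):
    """Binary collation: compare the encodings and nothing else.

    Returns -1, 0, or 1, like a classic comparison function.
    """
    l_codes, r_codes = encode(left), encode(right)
    return (l_codes > r_codes) - (l_codes < r_codes)


print(binary_compare("A", "B"))  # -1: encoding 0 is less than encoding 1
print(binary_compare("a", "B"))  # 1: encoding 2 is greater than encoding 1
```

Note that under this rule the lowercase a sorts after the uppercase B, simply because its encoding is larger.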

But what if you want to say that lowercase letters and uppercase letters are equivalent?

Then we would have at least two rules: (1) treat the lowercase letters a and b as equivalent to A and B; (2) then compare the encodings. We call this a case-insensitive collation. It is a little more complex than a binary collation.
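A minimal sketch of the two-rule, case-insensitive collation, again in Python and again assuming the toy character set A = 0, B = 1, a = 2, b = 3. The names `FOLD` and `case_insensitive_key` are hypothetical helpers, not part of any database API.

```python
CHARSET = {"A": 0, "B": 1, "a": 2, "b": 3}

# Rule 1: lowercase letters are treated as their uppercase equivalents.
FOLD = {"a": "A", "b": "B"}


def case_insensitive_key(text):
    """Apply rule 1 (fold case), then rule 2 (use the encodings)."""
    return [CHARSET[FOLD.get(ch, ch)] for ch in text]


def case_insensitive_equal(left, right):
    """Two strings are equal if their folded encodings match."""
    return case_insensitive_key(left) == case_insensitive_key(right)


print(case_insensitive_equal("a", "A"))    # True
print(case_insensitive_equal("ab", "AB"))  # True
print(case_insensitive_equal("a", "B"))    # False

# Sorting with the collation: a and A tie, so their input order is kept.
print(sorted(["b", "A", "B", "a"], key=case_insensitive_key))
# ['A', 'a', 'b', 'B']
```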

In real life, most character sets have many characters: not just A and B but whole alphabets, sometimes several alphabets or writing systems from around the world, along with many special symbols and punctuation marks.

Also in real life, most collations have many rules, covering not only whether to distinguish letter case but also whether to distinguish accents (an “accent” is a mark attached to a character, as in the German Ö) and multi-character mappings (such as the rule that Ö = OE in one of the two German collations).
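The two extra kinds of rule mentioned here, accent handling and multi-character mappings, can be sketched with Python's standard `unicodedata` module. This is a simplified illustration, not how any particular database implements its German collations; the `EXPANSIONS` table and the function names are assumptions made for the example.

```python
import unicodedata


def strip_accents(text):
    """Accent-insensitive rule: decompose, then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Hypothetical expansion table modelled on the "Ö = OE" rule.
EXPANSIONS = {"\u00d6": "OE", "\u00f6": "oe"}  # Ö and ö


def expand(text):
    """Multi-character mapping: replace one character with two."""
    composed = unicodedata.normalize("NFC", text)
    return "".join(EXPANSIONS.get(ch, ch) for ch in composed)


def accent_insensitive_key(text):
    """Ignore accents and case."""
    return strip_accents(text).casefold()


def expansion_key(text):
    """Treat Ö as OE, and ignore case."""
    return expand(text).casefold()


oel = "\u00d6l"  # "Öl"
print(accent_insensitive_key(oel) == accent_insensitive_key("Ol"))  # True
print(expansion_key(oel) == expansion_key("Oel"))                   # True
print(expansion_key(oel) == expansion_key("Ol"))                    # False
```

The same word compares equal to different spellings depending on which rule set is in force, which is why a language can have more than one collation.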

How database systems handle character sets and collations

A database system therefore needs to support and process different character sets and different collations, so that it can cope with text from many different regions and languages.

An RDBMS needs to:

Store character text using a variety of character sets.
Compare character text using a variety of collations.
Mix character text with different character sets or collations in the same server, the same database, or even the same table.
Allow character sets and collations to be specified at the database system, database, table, and column levels.

It also needs to provide other features and operations related to character encoding and collation.
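As a rough illustration of specifying a collation at the column level and overriding it in a single query, here is a sketch using Python's built-in `sqlite3` module. SQLite is only a convenient stand-in: it has no server-level or database-level defaults of the kind listed above, and the syntax in other database systems differs. The table and column names are made up for the example; `BINARY` and `NOCASE` are SQLite's built-in collations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column level: name_ci declares the built-in case-insensitive collation,
# while name_bin keeps SQLite's default BINARY collation.
conn.execute("CREATE TABLE t (name_bin TEXT, name_ci TEXT COLLATE NOCASE)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("b", "b"), ("A", "A"), ("a", "a"), ("B", "B")],
)


def count(where_clause):
    """Count the rows of the demo table that match a WHERE clause."""
    return conn.execute(f"SELECT count(*) FROM t WHERE {where_clause}").fetchone()[0]


binary_matches = count("name_bin = 'a'")                   # only 'a'
nocase_matches = count("name_ci = 'a'")                    # 'a' and 'A'
override_matches = count("name_bin = 'a' COLLATE NOCASE")  # query-level override

# The column collation also drives ORDER BY.
binary_order = [row[0] for row in conn.execute("SELECT name_bin FROM t ORDER BY name_bin")]

print(binary_matches, nocase_matches, override_matches)  # 1 2 2
print(binary_order)  # ['A', 'B', 'a', 'b']
```

The same data gives different answers to the same comparison depending on which collation applies, so it matters at which level the collation is set.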

RDBMSs usually come with a default character set and collation, which are rarely changed in practice. Support for character sets and collations differs between database systems, so consult the official documentation of the system you use.

Even so, you should have some understanding of which character sets and collations an RDBMS offers, how to change the default settings, and how to use them in databases, tables, columns, and queries.

The choice of character set and collation has practical effects, for example on text comparison, on the behavior of string operations and functions, on indexing, and on physical data storage.
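The storage effect is easy to see outside a database. The sketch below uses Python's standard codecs as a stand-in for database character sets (an assumption made for illustration; database character set names and storage details vary by system) and shows that the same two-character string occupies a different number of bytes under each encoding.

```python
def storage_sizes(text, encodings=("latin-1", "utf-8", "utf-16-le")):
    """Return the number of bytes the text occupies under each encoding."""
    return {name: len(text.encode(name)) for name in encodings}


word = "\u00d6l"  # "Öl": two characters

print(len(word))            # 2 (characters)
print(storage_sizes(word))  # {'latin-1': 2, 'utf-8': 3, 'utf-16-le': 4}
```

This is also why the length of a string in characters and its length in bytes can differ.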

Character Sets and Collations in General
