The NF, or Normal Form, was summarized by e.F.Codd (Edgar. F.codd, father of relational databases). Paradigm is the basis of relational database theory, and it is also the rule and guiding method to be followed in the process of designing database structure.
The textbook definition is that a normal form is “a set of relational patterns that conform to a certain level, indicating the degree of rationalization of the relationships among the attributes within a relationship”, commonly understood as the level of design criteria that the table structure of a data table conforms to.
At present, there are eight known paradigms, namely, 1NF, 2NF, 3NF, BCNF, 4NF, 5NF, DKNF and 6NF. Usually, database design only meets the first normal form (1NF), second normal form (2NF), third normal form (3NF). Relational databases are usually designed with BCFN at most in mind.
Design that conforms to the higher normal form must conform to the lower normal form.
Focus on the understanding of database three paradigm, about BC paradigm, the fourth/fifth paradigm to do an understanding.
In some places, BC normal form is divided into the fourth normal form, so there are 1NF, 2NF, 3NF, BCNF or 4NF, 5NF, 6NF.
In general, there are six paradigms for relational databases.
Database three paradigm
1. First Normal Form (1NF) : Ensure that each column remains atomic.
The first paradigm is the most basic paradigm. A database table satisfies the first normal form if all field values in the table are non-decomposable atomic values.
The emphasis is on the indivisibility of the meaning of the columns, each of which is an indivisible basic data item.
1NF is defined as: every attribute in the relationship conforming to 1NF is non-divisible. (You can think of a “relationship” as a table with data, and a “relational schema” is the table structure of this data table).
Proper adherence to the first paradigm depends on the actual requirements of the system. For example, some database systems need to use the “address” attribute, originally directly designed “address” attribute into a database table field on the line. However, if the system often accesses the “city” part of the “address” attribute, it is necessary to re-split the “address” attribute into multiple parts such as province, city and detailed address for storage, so that it will be very convenient to operate or classify a part of the address. This design satisfies the first paradigm of databases, as shown in the following table.
The first normal form is indivisible.
In any relational database, the first normal form (1NF) is the basic requirement for the relational schema, and a database that does not meet the first normal form (1NF) is not a relational database.
1NF is the most basic requirement of all relational databases, as long as the data table already exists in the RDBMS, must conform to 1NF.
2. Second normal Form (2NF) : Ensure that every column in a table is associated with a primary key.
The second normal form is a step up from the first. The second normal form is to ensure that every column in a database table is related to a primary key, not just a part of the primary key (mainly for union primary keys), and that all data is related to the same kind of data by fully related to the primary key.
That is, in a database table, a table can only save one kind of data, can not save a variety of data in the same database table.
On the basis of 1NF, 2NF eliminates the partial function dependence of non-primary attributes on codes.
Code: An attribute or a combination of attributes in a relationship that distinguishes each tuple (tuple can be understood as each record in a table, i.e. each row)
- Function dependence: it can be understood as: in a table, if the value of attribute (or attribute group)X is determined, the value of attribute Y must be determined, so it can be said that function Y depends on X, which can be written as X → Y.
That is, there are no two records in the table that have the same value on the X attribute (or set of attributes) but different values on the Y attribute. This is where the term “function dependence” comes from, similar to the function y = f(x) relation, where the value of x is certain, the value of y must be certain.
-
Complete function dependence: In a table, if X → Y is not true for any proper subset of X (if the attribute group X contains more than one attribute), then Y is called fully functional dependent on X, denoted as:
-
Partial function dependenceIf Y depends on X, but Y does not depend entirely on X, we can say that Y partially depends on X, denoted as:
-
Transfer function dependence: If Z function depends on Y and Y function depends on X (assuming Y is not included in X and X does not depend on Y), then Z transfer function depends on X, denoted as:
-
Code: let K be an attribute or attribute group in a table. If all attributes except K are completely functionally dependent on K, K is called candidate code and referred to as code for short. That is, if K is determined, then the values of all attributes of the table except K are also determined, then K is the code (that is, the attribute or attribute group can uniquely identify the current row/tuple). If there are multiple candidate codes in a relationship, one is usually selected as the primary code
-
Primary attributes: Attributes contained in any code are called primary attributes. Attributes that are not included in any candidate code are called non-primary attributes or non-code attributes.
3. Third Normal Form (3NF) : Ensure that each column is directly related to the primary key column, not indirectly
The third normal form requires that each column in the data table is directly related to the primary key, not indirectly.
The third normal form corresponds to the primary foreign key relationship of a table, with indirectly related column data as a new table and foreign keys associated to the primary table.
On the basis of 2NF, 3NF eliminates the dependence of non-primary attributes on the transfer function of codes.
Summary of database three paradigms
The database design in line with the requirements of 3NF basically solves the problems of excessive data redundancy, abnormal insertion, abnormal modification and abnormal deletion. In practice, 2NF or 1NF is often done for performance or for scaling purposes.
The corresponding requirements of the three paradigms are:
- The first normal form is not separable. The columns of a table are not separable (atomicity of columns).
- Second normal form complete function dependence. Non-primary key columns are completely dependent on primary keys.
- The third normal form eliminates transitive dependencies for non-primary attributes. Pass dependent properties to create new tables, associated by foreign keys.
A more detailed or rigorous explanation of the three paradigms can be found in How Zhihu understands the common design paradigms of relational databases. The best answer (the BCFN example is a classic one) and most of this article is based on it. At the same time (especially the marking part) reference database design three paradigms in the introduction, feel the introduction of the table from the perspective of the better understanding of the three paradigms to achieve the pattern.
Other database paradigms
BCNF(Boyce-Codd Normal Form)
BC paradigm
BCNF is generally regarded as the modified third normal form, which further eliminates the partial function dependence and transitive dependence of main attributes on codes on the basis of third normal form.
The BC paradigm removes transitive dependencies for primary attributes.
4NF(Fourth normal Form)
4NF is to restrict non-trivial and non-functional multi-valued dependencies between properties of the relational schema. If the relationship mode is 4NF, it must be BCNF.
In other words, when non-primary attributes in a table are independent of each other (3NF), these non-primary attributes should not have multiple values. Multiple values violate the fourth normal form.
Trivial dependence: If X->Y, and Y is a subset of X (for any relational pattern, trivial function dependence must be true), it is trivial function dependence.
Nontrivial dependence: NONtrivial function dependence if X->Y, but Y is not a subset of X.
5NF(fifth normal form)
The fifth normal form is also called the perfect normal form.
There can be no nontrivial join dependencies that do not adhere to key constraints. A table that conforms to 4NF and in which each join dependency is contained by a candidate key complies with the fifth dependency. The fifth normal form is the final normal form, which eliminates connection dependencies in 4NF.
This part mainly refers to the difference between trivial dependency, non-trivial dependency, complete dependency, partial dependency, transitive dependency, direct dependency and database design paradigm
The fourth normal form can also understand writing, and the fifth normal form is a little too deep to understand join dependencies.
In-depth understanding of each paradigm requires in-depth research and inquiry to understand clearly.