@[TOC]

Introduction:

  • In daily work, we need to comply with the specifications, such as sign in calories, approval process, etc., although these specifications have certain constraints such as feeling, it is very necessary and so on, so that we can guarantee the correctness and rigour, but in some cases, instead of constraint can bring efficiency decline, such as a direct operation tasks, but requires approval to perform.

Database design paradigm

  1. When we design a relational database model, we need to define the rationalization degree of the relationship between various attributes within the relationship, which has different levels of specification requirements, which are called normal form (NF). It can be understood that the design structure of a table needs to meet certain design criteria
  2. At present, there are altogether 6 paradigms of relational database, which are: 1NF (first paradigm), 2NF (second paradigm), 3NF (third paradigm), BCNF (Bas-Coad paradigm), 4NF (fourth paradigm) and 5NF (fifth paradigm, also called perfect paradigm).
  3. The higher the order of database paradigm design, the lower the redundancy, and the higher order paradigm must meet the requirements of the lower order paradigm, such as 2NF must meet 1NF, 3NF must meet 2NF, and so on
  4. In general, data tables should be designed to meet 3NF as far as possible. But not always, sometimes in order to improve certain query functions, we need to break the paradigm rules, that is, de-normalize.

Keys in a data table

The paradigm definition uses primary and candidate keys (because primary and candidate keys uniquely identify tuples), and keys in a database consist of one or more attributes. I summarize the following definitions for the keys and attributes commonly used in data tables:

  • Superkeys: A set of attributes that uniquely identify a tuple is called a superkey
  • Candidate key: If the superkey does not contain redundant attributes, the Shanghai superkey is a candidate key
  • Primary key: The user can select one of the candidates as the primary key
  • Foreign key: If a set of attributes in table R1 is not the primary key of R1 but the primary key of another table R2, then this set is the foreign key of table R1
  • Primary property: Contains the primary property of any of the candidate keys
  • Non-primary: As opposed to primary, an attribute that is not included in any candidate key

We’ve used NBA players and TEAMS before. Here I can define the players table as containing the player number, name, ID number, age, and team number. The team table contains the team number, manager, and team location.

The player list:

The name of the field
Player number id
The name name
Id number card
age age
The team number team_id

The team list:

The name of the field
The team number id
coach coach
address address

For a player table, a superkey is any combination of player numbers or ID numbers, such as (player number) (player number, name) (ID number, age), etc.

The candidate key is the smallest superkey, and in the case of the player table, the candidate key is either (player number) or (ID number).

The primary key is chosen by ourselves, i.e. one of the candidate keys, such as (player number).

The foreign key is the team number in the player table.

In the Player table, the primary attribute is (player number) (ID number) and the other attributes (name) (age) (team number) are non-primary attributes.

From 1 nf to 3 nf

  1. 1NF means that any attribute in the database is atomic and non-separable. It makes sense that when we design a field, for field X, we can’t split field X into fields X-1 and x-2. In fact, any DBMS will meet the requirements of the First paradigm
  2. All non-primary attributes in the table that 2NF refers to are completely dependent on the candidate key of the table. A complete dependency is different from a partial dependency, in that you cannot rely on only some of the attributes of the candidate key, but must rely on all of them

For example, we have designed a player table player_game, which contains attributes such as player number, name, age, match number, match time and venue. The candidate key and primary key are both (player number, match number).

Relationship (Player number, match number) → (name, age, match time, venue, score),Copy the code

The above relationship indicates that the combination of player and game numbers determines the player’s name, age, time of play, place of play, and score for that game. However, this data table does not meet the second normal form, because there is also the following correspondence between the fields in the data table:

(Player NUMBER) → (Name, age) → (Match number) → (Match time, venue)Copy the code

That is, a field in the candidate key determines a non-primary attribute. You can also understand that for non-primary attributes, it’s not entirely dependent on candidate keys. What kind of problems does that create?

1. Data redundancy: If a player can play m games, then the player's name and age are repeated M-1 times. A game can also be played by n players, and the time and place of the game are repeated n-1 times. 2. Insert exception: If we want to add a new match, but we haven't decided who will be playing, we can't insert it. 3. Deletion exception: If I want to delete a player id, I will delete the match information at the same time if THE match table is not saved separately. 4. Update exception: If we adjust the time of a race, then all The Times of that race in the table need to be adjusted, otherwise a race time will be different.Copy the code

To avoid the above situation, we can design the player match table as the following three tables.

The player table contains attributes such as player number, name, and age; The game table contains the game number, game time, game venue and other attributes; Player_game The player_game table contains attributes such as player numbers, match numbers, and scores.Copy the code

In this way, each data table conforms to the second normal form, thus avoiding the occurrence of anomalies. In a way, 2NF is an atomic upgrade to 1NF. 1NF tells us that field attributes need to be atomic, while 2NF tells us that a table is an independent object, that is, a table has only one meaning.

  1. 3NF does not pass dependent candidate keys for any non-primary attributes while satisfying 2NF. That is, there cannot be A case where non-primary attribute A depends on non-primary attribute B, and non-primary attribute B depends on candidate keys.

For example, let’s use the Player table, which contains attributes like player number, name, team name, and team manager. Now, let’s draw the dependencies between the attributes, as shown below:

You can see that the player number determines the team name, while the team name determines the team manager. The manager of a non-primary team will pass depending on the player number, thus not meeting the requirements of 3NF.

If you want to meet the requirements of 3NF, you need to disassemble the data table as follows:

The attributes of the player table include the player number, name, and team name; The attributes of the team table include the team name and the team manager.Copy the code

Conclusion: 1NF needs to ensure that every attribute in the table remains atomic; 2NF needs to ensure that non-primary attributes in the table are completely dependent on candidate keys; 3NF needs to ensure that there are no transitive dependencies between non-primary attributes and candidate keys in the table.

BCNF (Bass Normal Form)

If the relational schema of the data table meets the requirements of 3NF, is there no problem? Warehouse_keeper table

In this data table, a warehouse has only one administrator, and an administrator manages only one warehouse. Let’s first tease out the dependencies between these attributes. The warehouse name determines the manager, the manager determines the warehouse name, and the set of attributes (warehouse name, item name) can determine the quantity attribute. In this way, we can find the table’s candidate keys are (administrator, item name) and (warehouse, item name), and then we can choose one of the candidate keys as the primary key, such as (warehouse, item name). In this case, the primary property is the property contained in any of the candidate keys, namely the warehouse name, administrator, and item name. The non-primary property is the quantity property.

First, every attribute of the data table is atomic and meets the requirements of 1NF. Secondly, the non-primary attribute “quantity” of the data table is completely dependent on the candidate key, (warehouse name, item name) determines the quantity, (administrator, item name) determines the quantity, therefore, the data table meets the requirements of 2NF; Finally, non-primary attributes in the data table do not pass dependent on candidate keys. Therefore, it meets the requirements of 3NF. Existing problems:

  1. Add a warehouse, but don’t store anything yet. According to the requirement of data table entity integrity, the primary key cannot have empty value, so there will be insert exception;
  2. If the warehouse changes administrators, we may modify multiple records in the data table;
  3. If all items in the warehouse are sold short, the warehouse name and the corresponding administrator name will be deleted accordingly.

Exceptions to insert, update, and delete data may exist even if the requirements of 3NF are met

Solution:

  • First, we need to confirm the cause of the exception: the primary attribute repository name is partially dependent on the candidate key (administrator, item name), which could lead to the above exception. On the basis of 3NF, BCNF, also called Bas-Coad paradigm, is proposed, which removes the partial dependence or transitive dependence of the primary attribute on the candidate key.

According to the requirements of BCNF, we need to split the warehouse_keeper table into the following:

Warehouse table :(warehouse name, administrator) inventory table :(warehouse name, item name, quantity) so that there is no partial or transfer dependence of the primary attribute on the candidate key, the above data table is designed to conform to the BCNF.Copy the code

Personal blog: blog.yanxiaolu.cn /