I. Paradigm and relations


Suppose we have an employee table named Employee with nine attributes: Id (Employee ID), NAME (employee name), mobile(phone number), ZIP (zip code), province(province), city(city), District (district), deptNo(department ID), deptName(department name), total table data are as follows:

id name mobile zip province city district deptNo deptName
101 Zhang SAN 13910000001, 13910000002, 100001 Beijing Beijing Haidian district, D1 Unit 1
101 Zhang SAN 13910000001, 13910000002, 100001 Beijing Beijing Haidian district, D2 Unit 2
102 Li si 13910000003 200001 Shanghai Shanghai Jing’an District D3 Unit 3
103 Cathy 13910000004 510001 Guangdong province, Guangzhou Baiyun district D4 Unit 4
103 Cathy 13910000004 510001 Guangdong province, Guangzhou Baiyun district D5 Unit 5

Since this employee list is not normalized, we will face the following problems.

  • Modification exception: There are two records for Zhang SAN in the above table, because he belongs to two departments. If we want to change John’s address, we must modify two lines of record. If one department gets the new address of Joe and updates it, while the other department does not, there will be two different addresses of Joe in the table at this time, resulting in inconsistent data
  • New exception:If a new employee is in the orientation phase of the company and has not been formally assigned to a department, ifdeptNoFields are not allowed to be empty, so we can’temployeeAdd the employee’s data to the table.
  • Delete exception:Suppose the company has abolished the D3 section, then in the deletiondeptNoIf the line is D3, li Si’s information will be deleted. Because he’s part of the D3 division.

First Normal Form (1NF)

Columns in a table can only contain atomic (non-divisible) values.

There are two mobile numbers stored in the mobile column, violating 1NF rule. To make the table satisfy 1NF, the data should be modified as follows:

id name mobile zip province city district deptNo deptName
101 Zhang SAN 13910000001 100001 Beijing Beijing Haidian district, D1 Unit 1
101 Zhang SAN 13910000002 100001 Beijing Beijing Haidian district, D1 Unit 1
101 Zhang SAN 13910000001 100001 Beijing Beijing Haidian district, D2 Unit 2
101 Zhang SAN 13910000002 100001 Beijing Beijing Haidian district, D2 Unit 2
102 Li si 13910000003 200001 Shanghai Shanghai Jing’an District D3 Unit 3
103 Cathy 13910000004 510001 Guangdong province, Guangzhou Baiyun district D4 Unit 4
103 Cathy 13910000004 510001 Guangdong province, Guangzhou Baiyun district D5 Unit 5

Second normal Form (2NF)

The second normal form satisfies both of the following conditions

  • Satisfy the first normal form
  • No partial dependencies

For example, a candidate key for the employee table is {ID, mobile, deptNo}, and deptName depends on deptNo, as well as name depends on ID and therefore is not 2NF. To satisfy the conditions of the second normal form, split the table into employee, DEPT, EMPLOYEe_DEPT, and Employee_Mobile tables. As follows:

Employee table employee

id name zip province city district
101 Zhang SAN 100001 Beijing Beijing Haidian district,
102 Li si 200001 Shanghai Shanghai Jing’an District
103 Cathy 510001 Guangdong province, Guangzhou Baiyun district

Department table dept

deptNo deptName
D1 Unit 1
D2 Unit 2
D3 Unit 3
D4 Unit 4
D5 Unit 5

Table Employee_DEPT Dept

id deptNo
101 D1
101 D2
102 D3
103 D4
104 D5

Employee phone table Employee_Mobile

id mobile
101 13910000001
101 13910000002
102 13910000003
103 13910000004

Third normal Form (3NF)

The third normal form satisfies both of the following conditions

  • Satisfy the second normal form
  • No passing dependencies

For example, the province, city, district of the employee table depends on ZIP, and zip depends on ID. In other words, the province, city, district pass depends on ID, violating 3NF rule. To satisfy the conditions of the third normal form, the table can be split into employee and ZIP tables as follows

employee

id name zip
101 Zhang SAN 100001
102 Li si 200001
103 Cathy 510001

Area table area

zip province city district
100001 Beijing Beijing Haidian district,
200001 Shanghai Shanghai Jing’an District
51000 Guangdong province, Guangzhou Baiyun district

In the design of relational database model, it is generally required to meet the requirements of the third normal form. If a table has a good primary foreign key design, it should be a table that satisfies 3NF. The benefits of canonicalization are to improve the efficiency of updating data by reducing data redundancy while ensuring data integrity. However, we should also guard against the problem of over-standardization in practical application. The more normalized you are, the more tables are divided and the more likely you are to use table join operations when querying data. If too many tables are joined, query performance will be affected. The key issue is to carefully balance data query and data update and specify the most appropriate level of normalization based on business requirements. Do not modify business requirements to follow strict normalization rules

Database one-to-one, one-to-many, many-to-many design


There are three types of correspondence between database entities: one-to-one, one-to-many, and many-to-many

Example of one-to-one relationship:

Each student has a unique ID number for each student profile

Example of a one-to-many relationship:

A student belongs to only one class, but there are more than one student in this class

Examples of many-to-many relationships:

A student can take multiple courses, a single course can have multiple students a single person can have multiple roles, a single role can have multiple people

Relationship between

  • Create table scores with the following structure

    • id
    • students
    • subjects
    • results

 

Consider: what information should be stored in the student column?

Answer: The data of the student column is not created here, but should be referenced from the student table, and the relationship is also a data; The paradigm requires that the student’s number be stored, not other information such as the student’s name

Similarly, the subject table is a relational column that references data from the subject table

The statement to create the table is as follows

Create table scores(id int primary key auto_increment, stuid int, subid int, score decimal(5,2));Copy the code

A foreign key

Consider: How to ensure the validity of relational column data? Any round number will do? A: The stuID must be in the id column of the student table. The stuID can be validated by a foreign key constraint

alter table scores add constraint stu_sco foreign key(stuid) references students(id);
Copy the code

If the value of stuID does not exist in the STUDENTS table when inserting or modifying data, an error will be reported. You can create constraints directly when creating the table

Create table scores(id int primary key auto_increment, stuid int, subid int, score decimal(5,2), foreign key(stuid) references students(id), foreign key(subid) references subjects(id) );Copy the code

Note: The addition of foreign keys can cost performance

Cascade operations for foreign keys

  • If the ID value already exists in scores when deleting data from the STUDENTS table, an exception will be thrown
  • Logical deletion is recommended to solve this problem
  • You can specify the cascade operation when you create the table, or modify the cascade operation for the foreign key after the table is created
  • grammar
alter table scores add constraint stu_sco foreign key(stuid) references students(id) on delete cascade;
Copy the code

The types of cascading operations include:

  • Restrict: Specifies the default value for throwing exceptions (common).
  • Cascade: If a record in the primary table is dropped, all records associated with it will be deleted. (Generally not used)
  • Set null: Set the foreign key to null.
  • No action: to do nothing (usually not used)

Two, even table query


  • Join query classifies as follows:

    • Table A inner join table B: Rows from table A that match table B appear in the result
    • Table A left JOIN table B: rows that match table A and B appear in the result, plus data unique to table A
    • Table A right join table B: rows that match table A and table B will appear in the result, plus data unique to table B. Non-corresponding data will be filled with NULL
  • Table names are recommended for queries or conditions. Syntax for column names

  • You can omit table name if the column names in multiple tables are different. Part of the

  • If the table name is too long, you can use ‘as short name’ or ‘short name’ after the table name to create a temporary short name for the table

practice

  • Query student’s name and average score
select students.sname,avg(scores.score)
from scores
inner join students on scores.stuid=students.id
group by students.sname;
Copy the code
  • Query boy’s name, total score
select students.sname,avg(scores.score)
from scores
inner join students on scores.stuid=students.id
where students.gender=1
group by students.sname;
Copy the code
  • Query subject name and average score
select subjects.stitle,avg(scores.score)
from scores
inner join subjects on scores.subid=subjects.id
group by subjects.stitle;
Copy the code
  • Query name, highest score, average score of undeleted subject
select subjects.stitle,avg(scores.score),max(scores.score)
from scores
inner join subjects on scores.subid=subjects.id
where subjects.isdelete=0
group by subjects.stitle;
Copy the code

Third, sub-query


  • Queries support nested use
  • Select * from student where student’s score is Chinese, math, English

What is a subquery

When a query is a condition of another query, the query is called a subquery (inner query)

When will it be used?

When the query requirements are complex and the results cannot be obtained through one-time query, multiple queries are required.

For example, to give a department name, you need to obtain information about all employees in the department

Analysis:

1. Determine the ID of the department

2. Only then can employees be identified by ID

The way to solve a problem is to break a complex problem into several simple problems

2. How to use it?

First, it is clear that a subquery is a normal query. When a query needs to be used as a subquery, it is wrapped in parentheses

3. Be careful

A subquery in can contain only one column

For example: Query who is in the finance department

Select name from emp where dept_id in (select id from dept where name = “dept_id “);

Select name from emp where dept_id in (select * from dept where name = “dept_id “);

Keyword: exists

Exists is followed by a subquery. The value is True if there is a result or False if there is no result. If True, the outer layer is executed; if False, the outer layer is not executed

How to use it?

select from emp where exists (select from emp where salary > 1000);

This command is executed only if there is a result in a subsequent query

Very tired today!

Date: 2021/11/25

Learning Reference Video: *www.bilibili.com/video/BV1i7…

Learning Reference document Reference section related video copywriting and courseware, only for personal learning and recording