I. Paradigm and relations
Suppose we have an employee table named Employee with nine attributes: Id (Employee ID), NAME (employee name), mobile(phone number), ZIP (zip code), province(province), city(city), District (district), deptNo(department ID), deptName(department name), total table data are as follows:
id | name | mobile | zip | province | city | district | deptNo | deptName |
---|---|---|---|---|---|---|---|---|
101 | Zhang SAN | 13910000001, 13910000002, | 100001 | Beijing | Beijing | Haidian district, | D1 | Unit 1 |
101 | Zhang SAN | 13910000001, 13910000002, | 100001 | Beijing | Beijing | Haidian district, | D2 | Unit 2 |
102 | Li si | 13910000003 | 200001 | Shanghai | Shanghai | Jing’an District | D3 | Unit 3 |
103 | Cathy | 13910000004 | 510001 | Guangdong province, | Guangzhou | Baiyun district | D4 | Unit 4 |
103 | Cathy | 13910000004 | 510001 | Guangdong province, | Guangzhou | Baiyun district | D5 | Unit 5 |
Since this employee list is not normalized, we will face the following problems.
- Modification exception: There are two records for Zhang SAN in the above table, because he belongs to two departments. If we want to change John’s address, we must modify two lines of record. If one department gets the new address of Joe and updates it, while the other department does not, there will be two different addresses of Joe in the table at this time, resulting in inconsistent data
- New exception:If a new employee is in the orientation phase of the company and has not been formally assigned to a department, if
deptNo
Fields are not allowed to be empty, so we can’temployee
Add the employee’s data to the table.- Delete exception:Suppose the company has abolished the D3 section, then in the deletion
deptNo
If the line is D3, li Si’s information will be deleted. Because he’s part of the D3 division.
First Normal Form (1NF)
Columns in a table can only contain atomic (non-divisible) values.
There are two mobile numbers stored in the mobile column, violating 1NF rule. To make the table satisfy 1NF, the data should be modified as follows:
id | name | mobile | zip | province | city | district | deptNo | deptName |
---|---|---|---|---|---|---|---|---|
101 | Zhang SAN | 13910000001 | 100001 | Beijing | Beijing | Haidian district, | D1 | Unit 1 |
101 | Zhang SAN | 13910000002 | 100001 | Beijing | Beijing | Haidian district, | D1 | Unit 1 |
101 | Zhang SAN | 13910000001 | 100001 | Beijing | Beijing | Haidian district, | D2 | Unit 2 |
101 | Zhang SAN | 13910000002 | 100001 | Beijing | Beijing | Haidian district, | D2 | Unit 2 |
102 | Li si | 13910000003 | 200001 | Shanghai | Shanghai | Jing’an District | D3 | Unit 3 |
103 | Cathy | 13910000004 | 510001 | Guangdong province, | Guangzhou | Baiyun district | D4 | Unit 4 |
103 | Cathy | 13910000004 | 510001 | Guangdong province, | Guangzhou | Baiyun district | D5 | Unit 5 |
Second normal Form (2NF)
The second normal form satisfies both of the following conditions
- Satisfy the first normal form
- No partial dependencies
For example, a candidate key for the employee table is {ID, mobile, deptNo}, and deptName depends on deptNo, as well as name depends on ID and therefore is not 2NF. To satisfy the conditions of the second normal form, split the table into employee, DEPT, EMPLOYEe_DEPT, and Employee_Mobile tables. As follows:
Employee table employee
id | name | zip | province | city | district |
---|---|---|---|---|---|
101 | Zhang SAN | 100001 | Beijing | Beijing | Haidian district, |
102 | Li si | 200001 | Shanghai | Shanghai | Jing’an District |
103 | Cathy | 510001 | Guangdong province, | Guangzhou | Baiyun district |
Department table dept
deptNo | deptName |
---|---|
D1 | Unit 1 |
D2 | Unit 2 |
D3 | Unit 3 |
D4 | Unit 4 |
D5 | Unit 5 |
Table Employee_DEPT Dept
id | deptNo |
---|---|
101 | D1 |
101 | D2 |
102 | D3 |
103 | D4 |
104 | D5 |
Employee phone table Employee_Mobile
id | mobile |
---|---|
101 | 13910000001 |
101 | 13910000002 |
102 | 13910000003 |
103 | 13910000004 |
Third normal Form (3NF)
The third normal form satisfies both of the following conditions
- Satisfy the second normal form
- No passing dependencies
For example, the province, city, district of the employee table depends on ZIP, and zip depends on ID. In other words, the province, city, district pass depends on ID, violating 3NF rule. To satisfy the conditions of the third normal form, the table can be split into employee and ZIP tables as follows
employee
id | name | zip |
---|---|---|
101 | Zhang SAN | 100001 |
102 | Li si | 200001 |
103 | Cathy | 510001 |
Area table area
zip | province | city | district |
---|---|---|---|
100001 | Beijing | Beijing | Haidian district, |
200001 | Shanghai | Shanghai | Jing’an District |
51000 | Guangdong province, | Guangzhou | Baiyun district |
In the design of relational database model, it is generally required to meet the requirements of the third normal form. If a table has a good primary foreign key design, it should be a table that satisfies 3NF. The benefits of canonicalization are to improve the efficiency of updating data by reducing data redundancy while ensuring data integrity. However, we should also guard against the problem of over-standardization in practical application. The more normalized you are, the more tables are divided and the more likely you are to use table join operations when querying data. If too many tables are joined, query performance will be affected. The key issue is to carefully balance data query and data update and specify the most appropriate level of normalization based on business requirements. Do not modify business requirements to follow strict normalization rules
Database one-to-one, one-to-many, many-to-many design
There are three types of correspondence between database entities: one-to-one, one-to-many, and many-to-many
Example of one-to-one relationship:
Each student has a unique ID number for each student profile
Example of a one-to-many relationship:
A student belongs to only one class, but there are more than one student in this class
Examples of many-to-many relationships:
A student can take multiple courses, a single course can have multiple students a single person can have multiple roles, a single role can have multiple people
Relationship between
-
Create table scores with the following structure
- id
- students
- subjects
- results
Consider: what information should be stored in the student column?
Answer: The data of the student column is not created here, but should be referenced from the student table, and the relationship is also a data; The paradigm requires that the student’s number be stored, not other information such as the student’s name
Similarly, the subject table is a relational column that references data from the subject table
The statement to create the table is as follows
Create table scores(id int primary key auto_increment, stuid int, subid int, score decimal(5,2));Copy the code
A foreign key
Consider: How to ensure the validity of relational column data? Any round number will do? A: The stuID must be in the id column of the student table. The stuID can be validated by a foreign key constraint
alter table scores add constraint stu_sco foreign key(stuid) references students(id);
Copy the code
If the value of stuID does not exist in the STUDENTS table when inserting or modifying data, an error will be reported. You can create constraints directly when creating the table
Create table scores(id int primary key auto_increment, stuid int, subid int, score decimal(5,2), foreign key(stuid) references students(id), foreign key(subid) references subjects(id) );Copy the code
Note: The addition of foreign keys can cost performance
Cascade operations for foreign keys
- If the ID value already exists in scores when deleting data from the STUDENTS table, an exception will be thrown
- Logical deletion is recommended to solve this problem
- You can specify the cascade operation when you create the table, or modify the cascade operation for the foreign key after the table is created
- grammar
alter table scores add constraint stu_sco foreign key(stuid) references students(id) on delete cascade;
Copy the code
The types of cascading operations include:
- Restrict: Specifies the default value for throwing exceptions (common).
- Cascade: If a record in the primary table is dropped, all records associated with it will be deleted. (Generally not used)
- Set null: Set the foreign key to null.
- No action: to do nothing (usually not used)
Two, even table query
-
Join query classifies as follows:
- Table A inner join table B: Rows from table A that match table B appear in the result
- Table A left JOIN table B: rows that match table A and B appear in the result, plus data unique to table A
- Table A right join table B: rows that match table A and table B will appear in the result, plus data unique to table B. Non-corresponding data will be filled with NULL
-
Table names are recommended for queries or conditions. Syntax for column names
-
You can omit table name if the column names in multiple tables are different. Part of the
-
If the table name is too long, you can use ‘as short name’ or ‘short name’ after the table name to create a temporary short name for the table
practice
- Query student’s name and average score
select students.sname,avg(scores.score)
from scores
inner join students on scores.stuid=students.id
group by students.sname;
Copy the code
- Query boy’s name, total score
select students.sname,avg(scores.score)
from scores
inner join students on scores.stuid=students.id
where students.gender=1
group by students.sname;
Copy the code
- Query subject name and average score
select subjects.stitle,avg(scores.score)
from scores
inner join subjects on scores.subid=subjects.id
group by subjects.stitle;
Copy the code
- Query name, highest score, average score of undeleted subject
select subjects.stitle,avg(scores.score),max(scores.score)
from scores
inner join subjects on scores.subid=subjects.id
where subjects.isdelete=0
group by subjects.stitle;
Copy the code
Third, sub-query
- Queries support nested use
- Select * from student where student’s score is Chinese, math, English
What is a subquery
When a query is a condition of another query, the query is called a subquery (inner query)
When will it be used?
When the query requirements are complex and the results cannot be obtained through one-time query, multiple queries are required.
For example, to give a department name, you need to obtain information about all employees in the department
Analysis:
1. Determine the ID of the department
2. Only then can employees be identified by ID
The way to solve a problem is to break a complex problem into several simple problems
2. How to use it?
First, it is clear that a subquery is a normal query. When a query needs to be used as a subquery, it is wrapped in parentheses
3. Be careful
A subquery in can contain only one column
For example: Query who is in the finance department
Select name from emp where dept_id in (select id from dept where name = “dept_id “);
Select name from emp where dept_id in (select * from dept where name = “dept_id “);
Keyword: exists
Exists is followed by a subquery. The value is True if there is a result or False if there is no result. If True, the outer layer is executed; if False, the outer layer is not executed
How to use it?
select from emp where exists (select from emp where salary > 1000);
This command is executed only if there is a result in a subsequent query
Very tired today!
Date: 2021/11/25
Learning Reference Video: *www.bilibili.com/video/BV1i7…
Learning Reference document Reference section related video copywriting and courseware, only for personal learning and recording