The database

Database DB is a collection of data stored in large quantities and processed by computers that can be accessed efficiently.

The computer System used to manage databases is called a Database Management System (DBMS).

Databases and database management systems are often confused.

The types of DBMS

DBMS mainly through the data storage format (database type) to classify. There are five main types.

  1. Hierarchical database (Hierarchical Database, HDB)

One of the oldest databases, it presents data in a hierarchical structure (tree structure). Hierarchical database used to be the mainstream of database, but with the emergence and popularity of relational database, it has been rarely used.

  1. Relational database (Relational DatabaseAnd RDB)

Relational database is the most widely used database nowadays. The Structured Query Language (SQL) was developed in 1969 to manage data in a two-dimensional table composed of rows and columns.

This type of DBMS is called a Relational Database Management System (RDBMS). Common examples are OracIe Database, SQL Server, DB2, PostgreSQL and MySQL

  1. Object-oriented database (Object Oriented Database, OODB)

Borrowed from the concept of object-oriented language in programming languages. The collection of data and operations on it is managed on an object basis, hence the name. An object-oriented database is a database that holds these objects.

  1. XML database (XML Database, XMLDB)

XML databases can process large amounts of data in XML form at high speed

  1. Key value storage System (Key-Value Store, KVS)

Is a database that simply stores the combination of primary keys and values used in a query. This is like json in programming, or associative arrays or hashes. In recent years, with the application of key-value storage system to Google and other Web services that need to query a large amount of data at super high speed, it is gradually attracting people’s attention.

Structure of databases and tables

RDBMSS typically adopt a client/server type (C/S type) system architecture

A server is a program (software) that receives and processes requests from other programs, or a device (computer) on which such programs are installed. An RDBMS is a server that reads and returns, or changes, data from a database kept on a hard disk.

Correspondingly, the program (software) that makes the request to the server, or the device (computer) on which the program is installed, is called the client. Multiple clients can simultaneously read and write to the same database.

The two-dimensional tables used to manage data are simply called tables in a relational database and are stored in a database managed by an RDBMS

Table: A structured list of data of a particular type. The data in a table is the same type of data or list. A collection of data of a certain kind

The data returned based on the content of the SQL statement must also be in the form of a two-dimensional table (a characteristic of relational databases).

  • The columns of the tablecolumnThe vertical direction is calledfieldRepresents the data item saved in the table.
  • Table rowsrowThe horizontal direction is calledrecordRepresents a piece of data.

A relational database must read and write data in behavioral units

Schema: Information about the layout and features of databases and tables. A schema defines how data is stored in a table, including what kind of data is stored, how the data is decomposed, and how the parts of the information are named. Both databases and tables have schemas. It can also be translated as “architecture”.

The data type

The data type of the column is immediately to the right of the column name specified when the table is created. All columns must specify a data type.

Data types represent the types of data, such as INTEGER, CHAR, and DATE.

Each column can only store data of the same data type as the column.

  • About CHAR and VARCHAR types

Columns of type CHAR store fixed-length strings. Fixed – length strings are listed as fixed – length strings stored in columns that do not reach the maximum length, supplemented by half-corner Spaces.

Columns of type VARCHAR store variable-length strings. If the stored string does not reach the maximum length specified by the VARCHAR, the original string is still stored. More flexible than char and has less storage space.

VARCHAR2 is used in OracIe (VARCHAR is also available, but not recommended)

The DATE type in OracIe also contains minutes and seconds.

  • The constraint

Constraints are the ability to restrict or append conditions to the data stored in a column.

For example, the NOT NULL constraint and PRIMARY KEY constraint.

Each row in the table should have a column (or columns) that uniquely identifies itself and sets that column (or columns) as the primary key. In principle, every table has a primary key.

Values in primary key columns are not allowed to be modified or updated.

A key is a combination of columns to use when specifying specific data.

SQL

The SQL profile

SQL is a language designed to manipulate databases.

The International Organization for Standardization (ISO) has developed the corresponding standard for SQL. The SQL based on this standard is called standard SQL. The statements written with standard SQL can be used in various RDBMS.

Standard SQL is managed by the ANSI Standards Committee, and thus is called ANSI SQL.

ANSI stands for American National Standards Institute

SQL uses keywords, table names, and column names to form SQL statements to describe the contents of operations.

Keywords are predefined words that have special meanings or usage methods and are used to form SQL statements. It is a reserved word in the SQL language.

reserved keywordReserved keywords

SQL statement classification: DDL, DML, DCL

SQL statements can be divided into three categories, depending on the type of instruction an RDBMS executes:

  • DDL(Data Definition LanguageData definition language (DML) is used to create or delete objects such as databases and tables used to store data. The DDL contains the following instructions.

CREATE: Creates objects such as databases and tables; DROP: Deletes objects such as databases and tables. ALTER: Alters the structure of objects such as databases and tables

  • DML(Data Manipulation Language, data Manipulation language) to query or change records in a table. DML includes the following types.

SELECT: queries the data in the table. INSERT: INSERT new data into a table; UPDATE: Updates data in a table. DELETE: deletes data from a table

DQL (Data Query Language — SELECT) and DML (INSERT, UPDATE, DELETE)

  • DCL(Data Control Language, data control language) is used to confirm or cancel changes to data in the database, and to set whether users of the RDBMS have permission to manipulate objects in the database. Includes the following types.

COMMIT: Confirm changes to data in the database; ROLLBACK: Cancels changes to data in the database. GRANT: GRANT operation permission to the user. REVOKE: Revokes a user’s operation rights;

SQL statement rules

Basic rules for SQL statements:

  1. SQL statements start with a semicolon (;) At the end. Most (not all) DBMSS can execute individual statements without a semicolon.

The only exception seems to be SQL Server, where statements can be separated by Spaces or carriage returns without semicolons, but semicolons are still recommended.

Omit the statement delimiter semicolon; Will be removed in a future release. That is, the semicolon () cannot be omitted.

  1. SQL statements are case insensitive. Keywords, table names, column names, and so on are case insensitive. Specific values are certainly case-sensitive, but table names, column names, and so on are (mostly) case-sensitive through DBMS Settings.
  2. Constants are written in a fixed way. Strings and date constants are enclosed in single quotation marks (‘), and numbers are simply written.
  3. Words need to be separated by half-space or line feeds.

ANSI (The American National Standards Institute) or ISO (the International Organization for Standardization) revise SQL standards every few years (revise syntax, add functionality). SQL benchmarked against these standards is standard SQL.

But different DATABASE management systems each have special SQL statements that can only be used in their particular RDBMS

About keywords, unconventional identifiers

An identifier is the name of a database object. For servers, databases, and database objects such as tables, views, columns, indexes, triggers, constraints, stored procedures, anything can have an identifier, that is, a name that identifies itself.

Identifiers generally contain only letters, digits, and underscores (_). This is true of the rules for identifiers in most programming languages.

Known in databases as Regular identifiers, or rule identifiers.

However, if the identifier contains keywords or Spaces, Delimited identifiers must be used.

For example, creating a table named TABLE or SELECT will generate a syntax error because it is a reserved key.

In SQL Server, delimited identifiers need to be enclosed in double quotation marks or square brackets [];

In PostgreSQL, delimited identifiers are enclosed in double quotation marks;

In MySQL/MariaDB, delimit identifiers are enclosed with ‘(backquotes, symbols in the upper left corner of the keyboard and tilde ~).

Note: The actual test failed to create a database with restricted identifiers in the PSQL command line of PostgreSQL. But it can be created successfully in pgAdmin4.

-- PostgreSQL | SQL Server 
create table "select a"(
	id integer
)

-- SQL Server 
create table [select a](
	id integer
)

-- MySQL/MariaDB
create table `select a`(
	id integer
)
Copy the code

Common DDL operations

Create databases and tables

Log in to PostgreSQL using PSQL and create database shop

CREATE DATABASE shop;
Copy the code

Deleting the database is also simple

DROP DATABASE shop;
Copy the code

CREATE TABLE creates a TABLE as follows:

CREATE TABLE <The name of the table>
(
    <The column name1> <The data type> <The column constraints>.<The column name2> <The data type> <The column constraints>.<Table of the constraint1>.<Table of the constraint2>,...). ;Copy the code

The column name and data type for each column must be specified when the table is created. Constraints can be specified directly after the column type or set at the end of the statement.

The names of databases, tables, and columns can contain only letters, digits, and underscores (_), and cannot start with a number

Create a Product table as follows:

CREATE TABLE Product
(
    product_id   CHAR(4)        NOT NULL,
    product_name VARCHAR(100)   NOT NULL,
    product_type VARCHAR(32)    NOT NULL,
    sale_price   INTEGER,
    purchase_price  INTEGER,
    regist_date     DATE.PRIMARY KEY (product_id)
);
Copy the code

In SQL Server, MySQL/MariaDB, the integer type integer is usually represented by an int. PostgreSQL usually uses INTEGER. All of these DBMSS support the use of integer and int

For special historical reasons, it is best to specify utF8MB4 encoding when creating tables in MySQL/Mariadb. This is especially true when using UTF-8 encoded Chinese characters or emoticons. Otherwise, an error will be reported when Chinese content is inserted.

CREATE TABLE `Product` (
  `product_id` char(4) COLLATE utf8mb4_unicode_ci NOT NULL,
  `product_name` varchar(100) COLLATE utf8mb4_unicode_ci NOT NULL,
  `product_type` varchar(32) COLLATE utf8mb4_unicode_ci NOT NULL,
  `sale_price` int(11) DEFAULT NULL,
  `purchase_price` int(11) DEFAULT NULL,
  `regist_date` date DEFAULT NULL.PRIMARY KEY (`product_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
Copy the code

The default CHARSET=latin1.

The comparison is done by modifying the MySQL/Mariadb configuration file to set the character encoding to UTF8MB4 so that the encoding is not specified on every table construction clause (including the column of the statement).

Latin1 is an alias for ISO-8859-1, or latin-1 in some environments.

Iso-8859-1 encoding is single-byte encoding, backward compatible with ASCII, its encoding range is 0x00-0xFF, 0x00-0x7F is completely consistent with ASCII, 0x80-0x9F is the control character, 0xA0-0xFF is the character symbol.

Latin1 is an extension of the encoding of text in The European region.

Iso-8859-1 is an international standard for 8-bit (single-byte) coding.

Delete table

DROP TABLE <The name of the table>;
Copy the code

Update tables and fields

Updating table definitions

1. Add columns

ALTER TABLE <The name of the table> ADD COLUMN <The definition of the column>;
Copy the code

OracIe and SQL Server do not write COLUMN:

ALTER TABLE <The name of the table> ADD <The definition of the column>;
Copy the code

ALTER TABLE < TABLE name > ADD (< column definition >,< column definition >); . For example, SQL Server does not need to add parentheses. Each column is separated by commas.

For example, add a column product_name_pinyin to the Product table:

-- PostgreSQL | MySQL | DB2
ALTER TABLE Product ADD COLUMN product_name_pinyin VARCHAR(100);

-- SQL Server
ALTER TABLE Product ADD product_name_pinyin VARCHAR(100);

-- Oracle
ALTER TABLE Product ADD (product_name_pinyin VARCHAR(100));
Copy the code

2. Delete the column

ALTER TABLE <The name of the table> DROP COLUMN <The column name>;
Copy the code

ALTER TABLE < TABLE name > DROP < COLUMN name >; . When deleting multiple columns, use parentheses.

Delete column product_name_pinyin as follows

-- SQL Server | DB2 | PostgreSQL | MySQL
ALTER TABLE Product DROP COLUMN product_name_pinyin;

-- Oracle
ALTER TABLE Product DROP (product_name_pinyin);
Copy the code

3. Changing the field name

To change the column name, use the following statement

-- PostgreSQL
ALTER TABLE Product RENAME purchase_price TO purchase_price1;

-- SQL Server
EXEC sp_rename 'Table name.[old field name]'.'Field new name' , 'COLUMN';

MySQL/MariaDB uses change to change the name of a column. The new name must specify the type of the column. Otherwise, an error is reported, which means that a column is rebuilt. The column keyword is optional
-- alter table tab_name change [column] old_name new_name DATETYPE [first|after col_name]
alter table Product change column purchase_price purchase_price1 int(11);
Copy the code

In MySQL/MariaDB, use change to modify the name of a column. The new name must specify the type of the column. Otherwise, an error message is displayed, which is equivalent to rebuilding a column. If you want to implement renaming only, you need to assign the front part of the old field’s type, constraint, and so on to the new field name

4. Changing the field Type

-- PostgreSQL
ALTER TABLE Product ALTER COLUMN regist_date TYPE timestamp;  -- Equivalent to timestamp without time zone

- | Oracle MySQL/MariaDB modify used to modify the column definition, is not only a type
alter table Product modify regist_date timestamp; 

MySQL/MariaDB modify column
alter table Product modify column regist_date date;

-- SQL Server
alter table Product alter column regist_date datetime null;

MySQL/MariaDB can also use change to change the field type. Not recommended because it is a rebuilt column
alter table Product change [column] regist_date regist_date timestamp;
Copy the code

In actual testing, the default constraint also changed when changing the field type using MySQL/MariaDB. Other DBMSS are fully tested, but when actually changing field types, be sure to test them in advance in the test database.

There is also a **ALTER COLUMN** clause, but it can only be used to set and remove default values.

For MySQL/MariaDB, use MODIFY or CHANGE to MODIFY the table structure. The operation is to create a new table that meets the requirements, and then insert data. If the table has a large amount of data, many indexes, and insufficient memory, such modification may take several hours or even several days. In the case of large tables, do not run the MODIFY or CHANGE command to MODIFY fields. One unofficially recommended method is to move.frm files. Specific check MySQL in the MODIFY | CHANGE | the difference between the ALTER COLUMN

5. Modify the name of the table

If you find that the table name is incorrectly written, or you want to change the table name, you can use the following statement

-- PostgreSQL | Oracle | MySQL
ALTER TABLE Product RENAME TO Product1;

-- SQL Server
sp_rename 'Product'.'Product1';

-- MySQL | DB2
RENAME TABLE Product to Product1;

- change tables MySQL statement, [to | as] can be omitted. alter table tbl_name rename[to|as] new_tbl_name
Copy the code

RENAME does not exist in standard SQL

DML operations: Insert data

INSERT INTO <The name of the table> VALUES(value1, value2 ,...) ;- or
INSERT INTO <The name of the table>(col1,col2,...) VALUES(value1, value2 ,...) ;Copy the code

Insert data into Product as follows:

--SQL Server PostgreSQL
-- DML: inserts data
BEGIN TRANSACTION;  MySQL = START TRANSACTION; Oracle DB2 does not need to add this row, just delete it
INSERT INTO Product VALUES ('0001'.'T shirt' ,'clothes'.1000.500.'2009-09-20');
INSERT INTO Product VALUES ('0002'.'hole punch'.'Office supplies'.500.320.'2009-09-11');
INSERT INTO Product VALUES ('0003'.'Sport T-shirt'.'clothes'.4000.2800.NULL);
INSERT INTO Product VALUES ('0004'.'chopper'.'Kitchen appliances'.3000.2800.'2009-09-20');
INSERT INTO Product VALUES ('0005'.Pressure cooker.'Kitchen appliances'.6800.5000.'2009-01-15');
INSERT INTO Product VALUES ('0006'.'the fork.'Kitchen appliances'.500.NULL.'2009-09-20'),
                            ('0007'.'Chopping board'.'Kitchen appliances'.880.790.'2008-04-28'),
                            ('0008'.'Ballpoint pen'.'Office supplies'.100.NULL.'2009-11-11');
COMMIT;
Copy the code

Note: When inserting a large number of rows, it is recommended to use the multi-row insertion method, which is much faster, and will be explained later