“SQL will know will be” the essence of 10,000 words
This article is a summary of the book “SQL Must Know must Know”, to help readers quickly get started with SQL or MySQL, the main contents include:
- Database basics
- Operations related to library tables
- A method of retrieving data
This article brings chapters 8 through 13, and the previous chapters will be summarized in SQL Must-know chapters 1 through 7
Summary data
Aggregation function
An aggregate function is a function that runs on some rows and returns a value. Common aggregate functions are:
function | role |
---|---|
AVG() | Returns the average value of the column |
COUNT() | A function that returns a column |
MAX() | Returns the maximum value of a column |
MIN() | Returns the minimum value of the column |
SUM() | Returns the sum of the values of a column |
1. AVG() function
SELECT AVG(prod_price) AS avg_price -- Find the average value
FROM Products;
Copy the code
The above is the average of all rows, or you can specify a particular row:
SELECT AVG(prod_price) AS avg_price -- Find the average value
FROM Products
WHERE vend_id = 'DLLO1'; -- Specify a specific line
Copy the code
Note: AVG() ignores lines with a value of NULL
2. COUNT() function
The COUNT() function counts, which can be used to determine the number of functions in a table or the number of rows that meet a particular condition, in two cases:
- Count (*) : Whether a NULL value or a non-null value is counted
- Count (column) : The count of a specific column ignores the NULL value of that column in the table
SELECT COUNT(*) AS num_cust
FROM Customers;
num_cust
--------
5
SELECT COUNT(cust_email) AS num_cust
FROM Customers;
num_cust
--------
3
Copy the code
Note: If the column name is specified, the COUNT() function ignores rows with a null value for the specified column, but not if the COUNT() function uses an asterisk
MAX()/MIN() function
Returns the maximum or minimum value in the specified column
SELECT MAX(prod_price) AS MAX_price -- Find the maximum value
SELECT MAX(prod_price) AS MIN_price Let's minimize
FROM Products;
Copy the code
Note: The above two maxima functions automatically ignore rows with a value of NULL
4. SUM() function
Returns the sum of the specified column values (total)
SELECT SUM(quantity) AS items_ordered
FROM OrderItems
WHERE order_num = 20005;
Copy the code
The SUM() function can also be used to aggregate calculated values:
SELECT SUM(item_price * quantity) AS total_price Return the sum of all items
FROM OrderItems
WHERE order_num = 20005;
Copy the code
Note: The SUM() function automatically ignores rows with a value of NULL
Aggregate different values
Each of the above five aggregate functions can be used as follows:
- Evaluates ALL rows with or without ALL arguments (because ALL is the default behavior)
- Contains only different values. The DISTINCT parameter indicates that the calculation is performed after deduplication
Note: The ALL parameter does not need to be specified and is the default behavior
SELECT AVG(DISTINCT prod_price) AS avg_price -- Take the average value after de-weighting
FROM Products
WHERE vend_id = 'DLLO1'; -- Specify a specific line
Copy the code
Notes:
1. DISTINCT cannot be used in COUNT(*); If the column name is specified, DISTINCT can only be used with COUNT()
DISTINCT must use column names and cannot be used in calculations or expressions
3. DISTINCT used in MAX() and MIN() doesn’t make much sense because the maximum value is the same with or without deweighting
Combinatorial aggregation function
You can include multiple aggregation functions in the SELECT clause
SELECT
AVG(prod_price) AS avg_price -- Find the average value
,MAX(prod_price) AS max_price -- Find the maximum value
,MIN(prod_price) AS min_price Let's minimize
,COUNT(*) AS num_items -- The number of items
FROM Products;
Copy the code
Grouped data
Grouping uses two clauses:
- GROUP BY()
- HAVING()
Create a group
Groups are created using the GROUP BY clause of the SELECT clause.
SELECT
vend_id
,COUNT(*) AS num_prods
FROM Products
GROUP BY vend_id; -- Grouped columns
Copy the code
Common rules for using the GROUP BY clause:
- The GROUP BY clause can contain any number of columns, and groups can be nested
- Each column listed in the GROUP BY clause is either a retrieval column or a valid expression (but not an aggregation function)
- If you use an expression in SELECT, you must use the same expression in the GROUP BY clause instead of using an alias
- With the exception of the aggregate function, every column in the SELECT statement must be listed in the GROUP BY clause
- If a group contains rows with NULL, NULL is returned as a group; If multiple NULls occur in a column, they are grouped into a group
- The GROUP BY clause must follow the WHERE clause and precede the ORDER BY clause
- Relative positions can be used in the GROUP BY clause: GROUP BY 2, where 1 means GROUP BY the second column and then BY the first column
Packet filtering
Specify in the WHERE clause that rows, not groups, are filtered; In fact, WHERE species have no concept of grouping. SQL using HAVING to filter groups;
Note: WHERE filter rows, HAVING filter groups
SELECT
cust_id
,COUNT(*) AS orders
FROM Orders
GROUP BY cust_id
HAVING COUNT(*) > = 2; -- Filter grouping
Copy the code
WHERE and HAVING
- WHERE groups rows before data filtering and excludes rows that are not in the group statistics
- HAVING to filter data after grouping
SELECT
vend_id
,COUNT(*) AS num_prods
FROM Products
WHERE prod_price > = 4 -- Perform before grouping to find the data that meets the criteria
GROUP BY vend_id
HAVING COUNT(*) > = 2; -- Group and then execute, find the number of data is greater than 2
Copy the code
Grouping and sorting
ORDER BY and GROUP BY
ORDER BY | GROUP BY |
---|---|
Sort the resulting output | The rows are grouped, but the output may not be grouped in order |
Any column can be used (non-selected columns can also be used) | Only select columns or expression columns can be used, and each select column expression must be used |
Not necessarily | Must be used if columns are used with an aggregate function |
SELECT
order_num
,COUNT(*) AS items
FROM OrderItems
GROUP BY order_num
HAVING COUNT(*) > = 3
ORDER BY items, order_num; Group, filter, and sort the output
Copy the code
SELECT clause order
To summarize the order of the SELECT clause:
clause | instructions | Is it mandatory to use |
---|---|---|
SELECT | The column or expression to return | is |
FROM | A table from which data is retrieved | Used only when selecting data from a table |
WHERE | Row-level filter | no |
GROUP BY | Group show | Use only when calculating aggregation by group |
HAVING | Group level filter | no |
ORDER BY | Output sort order | no |
Using subqueries
Any SELECT statement is a query, and SQL also allows you to nest queries within a query.
SELECT cust_id Select cust_ID from order_num in the sub-query
FROM Orders
WHERE order_num IN (SELECT order_num Order_num = order_num
FROM OrderItems
WHERE prod_id = 'RGAN01');
Copy the code
Note: Subqueries are always processed from the inside out
SELECT Customers Finally, query Customers based on cust_id found
FROM cust_id IN(SELECT cust_id Select cust_ID from order_num in the sub-query
FROM Orders
WHERE order_num IN (SELECT order_num Order_num = order_num
FROM OrderItems
WHERE prod_id = 'RGAN01'));
Copy the code
Use subqueries as computed fields
Another way to use subqueries is to create computed fields
SELECT
cust_name
,cust_state
,(SELECT COUNT(*) Output the sub-query as a calculated field: count the number of cust_id each
FROM Orders
WHERE Orders.cust_id = Customers.cust_id) AS orders -- Orders.cust_id = Customers. Cust_id uses fully qualified column names to avoid ambiguity
FROM Customers
ORDER BY cust_name;
Copy the code
Join table
The most powerful function of SQL is the use of join tables in the process of data query.
Create a connection
A join is created by specifying the tables to join and how they are joined.
SELECT
vend_name,
prod_name,
prod_price
FROM Vendors, Products
WHERE Vendors.vend_id = Products.vend_id; -- Specify join conditions
Copy the code
If the above code does not have a WHERE clause to specify the join condition, the return is a Cartesian product, and the returned number is the rows in the first table multiplied by the rows in the second table.
Note: Returns a join of cartesian products, also called a cross join
Join inner join
The most widely used join is the equivalent join, also known as the inner join. The inner join code implements the above statement:
SELECT
vend_name,
prod_name,
prod_price
FROM Vendors
INNER JOIN Products - within link
ON Vendors.vend_id = Products.vend_id; -- Specify join conditions
Copy the code
Join multiple tables
SELECT
vend_name,
prod_name,
prod_price
FROM OrderItems, Products, Vendors
WHERE Products.vend_id = Vendors.vend_id Join multiple tables
AND OrderItems.prod_id = Products.prod_id
AND order_num = 20007;
Copy the code
We implement the result of the sub-query by joining:
- the subquery
SELECT Customers Finally, query Customers based on cust_id found
FROM cust_id IN(SELECT cust_id Select cust_ID from order_num in the sub-query
FROM Orders
WHERE order_num IN (SELECT order_num Order_num = order_num
FROM OrderItems
WHERE prod_id = 'RGAN01'));
- within link
SELECT
cust_name,
cust_contact
FROM Customers, Orders, OrderItems
WHERE Customers.cust_id = Orders.cust_id -- Multiple table join queries
AND OrderItems.order_num = Orders.order_num
AND prod_id = 'RGAN01'
Copy the code
Creating high-level joins
Using table aliases
You can alias a table in a SQL statement:
SELECT
cust_name,
cust_contact
FROM Customers AS C, Orders AS O, OrderItems AS OI -- Alias to make it more concise
WHERE C.cust_id = O.cust_id -- Multiple table join queries
AND OI.order_num = O.order_num
AND prod_id = 'RGAN01'
Copy the code
Use different types of joins
Introduce 3 different types of joins:
- Self join
- Natural join
- Outer join outer join
1, self join
- the subquery
SELECT cust_id,cust_name,cust_contact
FROM Customers
WHERE cust_name = (SELECT cust_name
FROM Customers
WHERE cust_contact = 'Jim Jones');
- within link
SELECT c1.cust_id, c2.cust_name, c1.cust_contact
FROM Customers AS c1, Customers AS c2 Use the same table twice
WHERE c1.cust_name = c2.cust_name
AND c2.cust_contact = 'Jim Jones';
Copy the code
The Customers table is used twice above, and must be distinguished by different aliases to avoid ambiguity.
2. Natural connection
Whenever a table is joined, at least one column should appear in more than one table (the joined column). Natural join exclusion occurs multiple times, with each column returning only once.
SELECT
C.*
,O.order_num
,O.order_date
,OI.prod_id
,OI.quantity
,OI.item_price
FROM Customers AS C, Orders AS O, OrderItems AS OI
WHERE C.cust_id = O.cust_id -- Multiple table join queries
AND OI.order_num = O.order_num
AND prod_id = 'RGAN01'
Copy the code
3. External connection
Sometimes we need to associate rows in one table with rows in another table, but sometimes we also need to include row records that are not associated with rows, as in the following scenario:
- Count the number of orders placed by each customer, including those that have not yet been placed
- List all products and order quantities, including products no one ordered
- Calculate the average sales size, including customers who have not placed orders so far
When a join contains rows that have no associated rows in related tables, it is called an outer join. For example, retrieve all customers including those who have no orders.
SELECT
C.cust_id
,O.order_num
FROM Customers AS C
LEFT OUTER JOIN Orders AS O Outside connection -
ON Customers.cust_id = Orders.cust_id
Copy the code
The above code represents a record that contains all the lines on the left; If it’s on the RIGHT, use the RIGHT OUTER. So outer joins actually take two forms, and they’re interchangeable
- The left outer joins
- Right connection
A special outer join, called a full outer join, retrieves all rows in both tables and associates those that can be associated. An all-out join contains unrelated rows of two tables
SELECT
C.cust_id
,O.order_num
FROM Customers AS C
FULL OUTER JOIN Orders AS O Outside connection -
ON Customers.cust_id = Orders.cust_id
Copy the code
Join with aggregation function
Retrieve all customers and all orders per customer:
SELECT
C.cust_id
,COUNT(O.order_num) AS num_ord Use the aggregate function to count orders
FROM Customers AS C
INNER JOIN Orders
ON C.cust_id = O.cust_id -- Associate two tables
GROUP BY Customers.cust_id Grouping -
Copy the code
Use joins and join conditions
To summarize the join and use points:
- Note the type of join used: inner join is usually used, sometimes outer join is effective
- Ensure that the correct join conditions are used, otherwise incorrect data will be returned
- Remember to provide join conditions, otherwise the cartesian product is returned
- You can have multiple tables in a join, and you can even use different join types for different tables. Be careful to test each join