This series has been in the works for a long time, but every time I try to start it, something else gets in the way. The series will be roughly divided into four series:
- algorithm
- The database
- shell
- concurrent
preface
Let’s get started this week and see if we can do one of the daily questions, which will be posted on GitHub. We have redefined our understanding of the SQL model. Here we will brush through the questions to improve our understanding of SQL. To enrich our SQL model in practice, we should focus on the idea. Try to submit SQL directly in LeetCode, blind write.
Today’s title
You have the Second Highest Salary
Write an SQL query to get the second highest Salary in the Employee table.
Id | Salary |
---|---|
1 | 100 |
2 | 200 |
3 | 300 |
For example, the Employee table above, the SQL query should return 200 as the second highest salary. If no second highest salary exists, the query should return NULL.
SecondHighestSalary |
---|
200 |
The first way to interpret it
And the way to think about it, if you’re doing this programmatically, you can’t just sort it out and go to the element with index 1 in the array. That’s the idea. What about SQL, but SQL doesn’t have subscripts, but we do have paging functions.
So my first version submission looks like this:
SELECT DISTINCT Salary AS SecondHighestSalary FROM Employee ORDER BY Salary DESC LIMIT 1, OFFSET 1
Copy the code
But the answer is wrong:
I output a null, right? If there is only one item, skip this item and select NULL instead of NULL. So why is that? I forgot to configure environment variables when I installed MySQL, but I don’t want to configure environment variables yet, but I can use graphical tools to bring out the command line:
1.
Select * from student where student does not exist;
We in THE SQL query model and sub-query learning will be SQL each execution process will output a virtual table, the final is also a virtual table, here we review:
So we query student for a nonexistent record and return an empty set, which is expected and conforms to our SQL query model. What if we use the query as an expression, as follows:
SELECT (SELECT NAME FROM Student where id = 3) name;
Copy the code
Output is NULL:
In oracle, you can write:
SELECT (SELECT NAME FROM Student where id = 3) name FROM Dual
Copy the code
SELECT (SELECT NAME FROM Student where id = 3) FROM Student where id = 3); That’s called a scalar quantum query,
With this in mind, we can rewrite our SQL as:
SELECT
(SELECT distinct salary AS SecondHighestSalary FROM Employee ORDER BY salary DESC LIMIT 1 OFFSET 1) AS
SecondHighestSalary
Copy the code
Why is it ok to add DISTINCT when it is not added? The problem comes from our wrong SQL query model. In the above SQL query model, Distinct is executed after a SELECT. In fact, it should be executed before a SELECT. Distinct de-duplicates the result set, and then SELECT fetches the result set. So our SQL query model above can be modified to look like this:
Mysql > select * from SQL where salary is the highest in the table, then select the second. Here we are giving the equivalent oracle:
SELECT (SELECT DISTINCT(salary) AS SecondHighestSalary FROM (SELECT salary,ROWNUM AS rowno FROM Employee WHERE ROWNUM <= 2) table_alias WHERE table_alias.rowno > 1) AS SecondHighestSalary FROM DUALCopy the code
The second way to interpret it
The first solution is relatively more attached to the program language like Java, we can actually solve this problem in another way, with the help of the powerful statistical characteristics of SQL, that is, find the largest salary first, and then from the table is less than the largest salary, which is the second largest. This approach is more SQL oriented and cross-database, so let’s write it directly:
SELECT max(salary) AS SecondHighestSalary FROM Employee WHERE salary < (SELECT max(salary) FROM Employee)
Copy the code
But this solution is not as fast as the first one.
Introduction to scalar quantum query [1]
All references to standard quantum queries here are from Amazon SQL Reference, linked at the end of this article.
A scalar quantum query is a regular SELECT query in parentheses that returns only one value: a row with one column. The query will be executed and the return value will be used in the external query. If the subquery returns zero rows, the value of the subquery expression is NULL. If it returns multiple rows, Amazon Redshift returns an error. Subqueries can refer to variables in the parent query, which will act as constants in any call to the subquery.
A scalar quantum query is an invalid expression in the following cases:
- As the default value for the expression
- In the GROUP BY and HAVING clauses
Calculate the average price paid per sale for the whole of 2008 using the sub-query, and then the external query uses the values in the output to compare the average price per sale for each quarter:
select qtr, avg(pricepaid) as avg_saleprice_per_qtr,
Copy the code
select qtr, avg(pricepaid) as avg_saleprice_per_qtr, (select avg(pricepaid) from sales join date on sales.dateid=date.dateid where year = 2008) as avg_saleprice_yearly from sales join date on sales.dateid=date.dateid where year = 2008 group by qtr order by qtr; QTR | avg_saleprice_per_qtr | avg_saleprice_yearly -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - + -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- - | | 647.64 1 642.28 | 2 | 3 | | 636.79 642.28 646.86 642.28 4 | | 638.26 642.28 (4 rows)Copy the code
The resources
[1] Standard quantum query docs.aws.amazon.com/zh_cn/redsh…