Referencing Parent query inside Child query

Question

Can anyone explain how this query works when the subquery references the parent query? How does SQL evaluate it?

Second highest salary of employee:

select max(e1.sal),e1.deptno 
from s_emp e1 
where sal < (select max(sal) from s_emp e2 where e2.deptno = e1.deptno) 
group by e1.deptno;

I tested it and it works.

Answer 1:

First remove the group by and the aggregation, and consider this query:

select e1.sal, e1.deptno 
from s_emp e1 
where e1.sal < (select max(sal) from s_emp e2 where e2.deptno = e1.deptno)

It returns every row of the table except those with the maximum sal in their deptno.
Why?
Because each row's sal is compared to its deptno's maximum sal and must be less than it.
Conceptually, the subquery in the WHERE clause is executed once for every row of the table:

select max(e2.sal) from s_emp e2 where e2.deptno = e1.deptno

and for every row it returns the maximum sal for the current row's deptno.
So the result is all sals that are less than the max sal of the current row's deptno.
Now if you add group by deptno and the aggregation, you get, for each deptno, the max sal of the returned rows. That is the 2nd highest sal for each deptno, since all the top ones are already excluded.
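
The two steps above can be tried on a small table. This is only a sketch: it uses Python's built-in sqlite3 module rather than MySQL, and the sample rows and the ename column are made up for illustration (the question does not show the real s_emp schema).

```python
import sqlite3

# Hypothetical sample data; the real s_emp table is not shown in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s_emp (ename TEXT, deptno INTEGER, sal INTEGER)")
conn.executemany(
    "INSERT INTO s_emp VALUES (?, ?, ?)",
    [
        ("a", 10, 5000), ("b", 10, 4000), ("c", 10, 3000),
        ("d", 20, 6000), ("e", 20, 6000), ("f", 20, 2500),
        ("g", 30, 1000),
    ],
)

# Step 1: no GROUP BY. Every row strictly below its department's maximum survives.
below_max = conn.execute("""
    SELECT e1.sal, e1.deptno
    FROM s_emp e1
    WHERE e1.sal < (SELECT MAX(e2.sal) FROM s_emp e2 WHERE e2.deptno = e1.deptno)
    ORDER BY e1.deptno, e1.sal DESC
""").fetchall()

# Step 2: add GROUP BY and MAX to pick the highest of the surviving rows.
second_highest = conn.execute("""
    SELECT MAX(e1.sal), e1.deptno
    FROM s_emp e1
    WHERE e1.sal < (SELECT MAX(e2.sal) FROM s_emp e2 WHERE e2.deptno = e1.deptno)
    GROUP BY e1.deptno
    ORDER BY e1.deptno
""").fetchall()

print(below_max)       # [(4000, 10), (3000, 10), (2500, 20)]
print(second_highest)  # [(4000, 10), (2500, 20)]
```

Note that deptno 30 has only one row in this sample, so nothing is below its maximum and it drops out of the result entirely. Both 6000 rows in deptno 20 are excluded as well, because the comparison is strict.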

Answer 2:

This is called a correlated subquery, because the result of the subquery is potentially different for every row of the outer query.

When MySQL runs a query, you can think of it like a foreach loop over a collection. For example in PHP syntax:

foreach ($s_emp as $e1) {
  // evaluate the WHERE condition, including the subquery, for this $e1
  ...
}

It must run the subquery for each row of the outer query before it can evaluate the < comparison. This becomes quite expensive as the number of rows increases. If the table has N rows, it will run the subquery N times, even if there are only a few distinct values for deptno! MySQL does not remember the result after having run the subquery for the same deptno value.
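
To make the loop picture concrete, here is a rough model of it in plain Python. It is only a mental model of the row-by-row evaluation described above, not how any particular optimizer actually executes the query. The rows and the helper max_sal_for are hypothetical.

```python
# Hypothetical (deptno, sal) rows standing in for s_emp.
rows = [
    (10, 5000), (10, 4000), (10, 3000),
    (20, 6000), (20, 6000), (20, 2500),
    (30, 1000),
]

calls = {"n": 0}  # counts how many times the "subquery" runs


def max_sal_for(deptno):
    """Plays the role of: select max(sal) from s_emp e2 where e2.deptno = <deptno>."""
    calls["n"] += 1
    return max(sal for d, sal in rows if d == deptno)


# Outer query: one subquery evaluation per outer row, then the < comparison.
kept = [(deptno, sal) for deptno, sal in rows if sal < max_sal_for(deptno)]

# group by deptno with max(sal) over the rows that survived the filter.
second_highest = {}
for deptno, sal in kept:
    second_highest[deptno] = max(sal, second_highest.get(deptno, sal))

print(calls["n"])       # 7 evaluations for 7 rows, though there are only 3 deptnos
print(second_highest)   # {10: 4000, 20: 2500}
```

Each call to max_sal_for scans all the rows again, so in this naive model the total work grows roughly with N squared.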

Instead, you can get the result this way, which calculates max(sal) once for each deptno and keeps those results in a temporary table.

select max(e1.sal), e1.deptno
from s_emp e1
join (select deptno, max(sal) as max_sal from s_emp group by deptno) as e2
  on e1.deptno = e2.deptno
where e1.sal < e2.max_sal
group by e1.deptno
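
As a quick sanity check, the join form and the correlated form can be run side by side. This sketch again uses Python's sqlite3 module instead of MySQL, with made-up sample rows, so it only shows that the two queries agree on this data and says nothing about MySQL's actual execution plan.

```python
import sqlite3

# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s_emp (deptno INTEGER, sal INTEGER)")
conn.executemany(
    "INSERT INTO s_emp VALUES (?, ?)",
    [(10, 5000), (10, 4000), (10, 3000),
     (20, 6000), (20, 6000), (20, 2500),
     (30, 1000)],
)

# Correlated form from the question.
correlated = conn.execute("""
    SELECT MAX(e1.sal), e1.deptno
    FROM s_emp e1
    WHERE e1.sal < (SELECT MAX(sal) FROM s_emp e2 WHERE e2.deptno = e1.deptno)
    GROUP BY e1.deptno
    ORDER BY e1.deptno
""").fetchall()

# Join form: the per-department maximum is computed once in the derived table.
joined = conn.execute("""
    SELECT MAX(e1.sal), e1.deptno
    FROM s_emp e1
    JOIN (SELECT deptno, MAX(sal) AS max_sal FROM s_emp GROUP BY deptno) AS e2
      ON e1.deptno = e2.deptno
    WHERE e1.sal < e2.max_sal
    GROUP BY e1.deptno
    ORDER BY e1.deptno
""").fetchall()

print(correlated)  # [(4000, 10), (2500, 20)]
print(joined)      # [(4000, 10), (2500, 20)]
```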

The purpose of the original query appears to be to return the second highest salary per department, right?

Here's another solution using window functions in MySQL 8.0:

select deptno, sal
from (
  select deptno, sal, dense_rank() over (partition by deptno order by sal desc) as dr
  from s_emp
) as e1
where dr = 2
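
One difference worth knowing: if several employees tie for second place, the dense_rank() version returns one row per tied employee, while the group by versions return a single row per deptno. The sketch below shows this with Python's sqlite3 module, assuming the bundled SQLite is 3.25 or newer (the first release with window functions). The sample rows are hypothetical.

```python
import sqlite3

# Hypothetical sample data; deptno 10 has a tie for second place (two 4000s).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s_emp (deptno INTEGER, sal INTEGER)")
conn.executemany(
    "INSERT INTO s_emp VALUES (?, ?)",
    [(10, 5000), (10, 4000), (10, 4000), (10, 3000),
     (20, 6000), (20, 6000), (20, 2500),
     (30, 1000)],
)

ranked_sql = """
    SELECT {distinct} deptno, sal
    FROM (
      SELECT deptno, sal,
             DENSE_RANK() OVER (PARTITION BY deptno ORDER BY sal DESC) AS dr
      FROM s_emp
    ) AS e1
    WHERE dr = 2
    ORDER BY deptno, sal
"""

# As written in the answer: one row per employee at rank 2.
ranked = conn.execute(ranked_sql.format(distinct="")).fetchall()

# With DISTINCT: one row per (deptno, sal), matching the group by versions.
ranked_distinct = conn.execute(ranked_sql.format(distinct="DISTINCT")).fetchall()

print(ranked)           # [(10, 4000), (10, 4000), (20, 2500)]
print(ranked_distinct)  # [(10, 4000), (20, 2500)]
```

Whether the duplicate rows are wanted depends on whether you are after the salary value or the employees who earn it.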

Source: https://stackoverflow.com/questions/59865236/referencing-parent-query-inside-child-query

Tags

mysql

correlated-subquery