问题
I have a table that looks something like this:
DataTable
+------------+------------+------------+
| Date | DailyData1 | DailyData2 |
+------------+------------+------------+
| 2012-01-23 | 146.30 | 212.45 |
| 2012-01-20 | 554.62 | 539.11 |
| 2012-01-19 | 710.69 | 536.35 |
+------------+------------+------------+
I'm trying to create a view (call it AggregateView
) that will, for each date and for each data column, show a few different aggregates. For example, select * from AggregateView where Date = '2012-01-23'
might give:
+------------+--------------+----------------+--------------+----------------+
| Date | Data1_MTDAvg | Data1_20DayAvg | Data2_MTDAvg | Data2_20DayAvg |
+------------+--------------+----------------+--------------+----------------+
| 2012-01-23 | 697.71 | 566.34 | 601.37 | 192.13 |
+------------+--------------+----------------+--------------+----------------+
where Data1_MTDAvg
shows avg(DailyData1)
for each date in January prior to Jan 23, and Data1_20DayAvg
shows the same but for the prior 20 dates in the table. I'm no SQL ninja, but I was thinking that the best way to do this would be via subqueries. The MTD average is easy:
select t1.Date, (select avg(t2.DailyData1)
from DataTable t2
where t2.Date <= t1.Date
and month(t2.Date) = month(t1.Date)
and year(t2.Date) = year(t1.Date)) Data1_MTDAvg
from DataTable t1;
But I'm getting hung up on the 20-day average due to the need to limit the number of results returned. Note that the dates in the table are irregular, so I can't use a date interval; I need the last twenty records in the table, rather than just all records over the last twenty days. The only solution I've found is to use a nested subquery to first limit the records selected, and then take the average.
Alone, the subquery works for individual hardcoded dates:
select avg(t2.DailyData1) Data1_20DayAvg
from (select DailyData1
from DataTable
where Date <= '2012-01-23'
order by Date desc
limit 0,20) t2;
But trying to embed this as part of the greater query blows up:
select t1.Date, (select avg(t2.DailyData1) Data1_20DayAvg
from (select DailyData1
from DataTable
where Date <= t1.Date
order by Date desc
limit 0,20) t2)
from DataTable t1;
ERROR 1054 (42S22): Unknown column 't1.Date' in 'where clause'
From searching around I get the impression that you can't use correlated subqueries as part of a from
clause, which I think is where the problem is here. The other issue is that I'm not sure if MySQL will accept a view definition containing a from
clause in a subquery. Is there a way to limit the data in my aggregate selection without resorting to subqueries, in order to work around these two issues?
回答1:
No, you can't use correalted subqueries in the FROM
clause. But you can use them in the ON
conditions:
SELECT AVG(d.DailyData1) Data1_20DayAvg
--- other aggregate stuff on d (Datatable)
FROM
( SELECT '2012-01-23' AS DateChecked
) AS dd
JOIN
DataTable AS d
ON
d.Date <= dd.DateChecked
AND
d.Date >= COALESCE(
( SELECT DailyData1
FROM DataTable AS last20
WHERE Date <= dd.DateChecked
AND (other conditions for last20)
ORDER BY Date DESC
LIMIT 1 OFFSET 19
), '1001-01-01' )
WHERE (other conditions for d Datatable)
Similar, for many dates:
SELECT dd.DateChecked
, AVG(d.DailyData1) Data1_20DayAvg
--- other aggregate stuff on d (Datatable)
FROM
( SELECT DISTINCT Date AS DateChecked
FROM DataTable
) AS dd
JOIN
DataTable AS d
ON
d.Date <= dd.DateChecked
AND
d.Date >= COALESCE(
( SELECT DailyData1
FROM DataTable AS last20
WHERE Date <= dd.DateChecked
AND (other conditions for last20)
ORDER BY Date DESC
LIMIT 1 OFFSET 19
), '1001-01-01' )
WHERE (other conditions for d Datatable)
GROUP BY
dd.DateChecked
Both queries assume that Datatable.Date
has a UNIQUE
constraint.
来源:https://stackoverflow.com/questions/9068777/mysql-alternatives-to-nested-subqueries-when-limiting-aggregate-data-in-a-corr