Question
I want to retrieve a list of measurements together with a moving average of those measurements. I can do that with this SQL statement (PostgreSQL interval syntax):
SELECT time, value,
  (
    SELECT AVG(t2.value)
    FROM measurements t2
    WHERE t2.time BETWEEN t1.time - interval '5 days' AND t1.time
  ) moving_average
FROM measurements t1
ORDER BY t1.time;
I would like SQLAlchemy to produce an equivalent statement. I currently have this Python code:
import datetime
from sqlalchemy import select, func

moving_average_days = 5  # configurable value, defaulting to 5
ndays = 90  # configurable value, defaulting to 90
t1 = Measurements.alias('t1')
t2 = Measurements.alias('t2')
query = select([t1.c.time, t1.c.value,
                select([func.avg(t2.c.value)],
                       t2.c.time.between(t1.c.time - datetime.timedelta(moving_average_days), t1.c.time))],
               t1.c.time > (datetime.datetime.utcnow() - datetime.timedelta(ndays))). \
    order_by(t1.c.time)
That, however, generates this SQL:
SELECT t1.time, t1.value, avg_1
FROM measurements AS t1,
  (
    SELECT avg(t2.value) AS avg_1
    FROM measurements AS t2
    WHERE t2.time BETWEEN t1.time - %(time_1)s AND t1.time
  )
WHERE t1.time > %(time_2)s
ORDER BY t1.time;
This SQL places the subquery in the FROM clause, where it cannot refer to the columns of the enclosing query, so PostgreSQL rejects it with this error:
ERROR: subquery in FROM cannot refer to other relations of same query level
LINE 6: WHERE t2.time BETWEEN t1.time - interval '5 days' AN...
What I would thus like to know is: how do I get SQLAlchemy to move the subquery to the SELECT clause?
Alternatively, another way to get a moving average (without running a separate query for each (time, value) pair) would also be an option.
Answer 1:
Right, apparently what I needed was a so-called scalar select. Calling .label() on the inner select turns it into a scalar subquery that is rendered in the SELECT clause instead of the FROM clause. With that I get the following Python code, which works as I want it to: it generates SQL equivalent to the first statement in my question, which was my goal.
import datetime
from sqlalchemy import select, func

moving_average_days = 5  # configurable value, defaulting to 5
ndays = 90  # configurable value, defaulting to 90
t1 = Measurements.alias('t1')
t2 = Measurements.alias('t2')
query = select([t1.c.time, t1.c.value,
                # .label() makes the inner select a scalar subquery in the SELECT clause
                select([func.avg(t2.c.value)],
                       t2.c.time.between(t1.c.time - datetime.timedelta(moving_average_days), t1.c.time)).label('moving_average')],
               t1.c.time > (datetime.datetime.utcnow() - datetime.timedelta(ndays))). \
    order_by(t1.c.time)
This generates the following SQL:
SELECT t1.time, t1.value,
  (
    SELECT avg(t2.value) AS avg_1
    FROM measurements AS t2
    WHERE t2.time BETWEEN t1.time - :time_1 AND t1.time
  ) AS moving_average
FROM measurements AS t1
WHERE t1.time > :time_2
ORDER BY t1.time;
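To see what a correlated scalar subquery in the SELECT clause returns, here is a minimal sketch that runs the same shape of query against an in-memory SQLite database using only the standard library. Everything in it is an assumption for illustration: the sample rows, the `moving_average` helper name, the text-typed `time` column, and the use of SQLite's `date(t1.time, '-5 days')` in place of PostgreSQL's `t1.time - interval '5 days'`. It is not the SQLAlchemy code above, only a way to check the SQL idea on small data.

```python
import sqlite3

# Hypothetical sample data (ISO date, value); not taken from the original post.
ROWS = [
    ("2024-01-01", 10.0),
    ("2024-01-03", 20.0),
    ("2024-01-07", 30.0),
    ("2024-01-08", 40.0),
    ("2024-01-20", 50.0),
]

# SQLite has no interval type, so date(t1.time, ?) with a '-N days' modifier
# stands in for PostgreSQL's "t1.time - interval 'N days'".
QUERY = """
SELECT t1.time, t1.value,
       (SELECT AVG(t2.value)
        FROM measurements AS t2
        WHERE t2.time BETWEEN date(t1.time, ?) AND t1.time
       ) AS moving_average
FROM measurements AS t1
ORDER BY t1.time
"""


def moving_average(rows, days=5):
    """Return (time, value, moving_average) tuples for a trailing window of `days`."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE measurements (time TEXT, value REAL)")
        conn.executemany("INSERT INTO measurements VALUES (?, ?)", rows)
        # The subquery refers to t1.time, so it is evaluated once per outer row.
        return conn.execute(QUERY, (f"-{days} days",)).fetchall()
    finally:
        conn.close()


result = moving_average(ROWS)
for row in result:
    print(row)
```

Because the subquery sits in the SELECT clause, it may refer to `t1.time` from the enclosing query, which is exactly what the FROM-clause version was not allowed to do.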
Source: https://stackoverflow.com/questions/3764358/how-to-use-subqueries-in-sqlalchemy-to-produce-a-moving-average