My SQL is a bit rusty and I\'m having quite a bit of difficulty with this problem. Suppose I have a table with a Timestamp column and a Number column. The goal is to return
Something like
with st
as (SELECT to_timestamp( '2012-06-06 12:40:00', 'yyyy-mm-dd hh24:mi:ss') +
numtodsinterval((level-1)*5, 'MINUTE') start_time,
to_timestamp( '2012-06-06 12:40:00', 'yyyy-mm-dd hh24:mi:ss') +
numtodsinterval(level*5, 'MINUTE') end_time
from dual
connect by level <= 10)
SELECT st.start_time, avg( yt.value )
FROM your_table yt,
st
WHERE yt.time between st.start_time and st.end_time
should work. Rather than generating 10 intervals and hard-coding the lowest interval, you can enhance the query to derive the starting point and the number of rows from the MIN(time)
and MAX(time)
in the table.
Justin's and Sebas' answers can be extended with a LEFT JOIN to eliminate "gaps", which is often desirable.
If that's not necessary, as an alternative, we can go old school Oracle DATE arithmetic...
SELECT TRUNC(t.time)+FLOOR(TO_CHAR(t.time,'sssss')/300)*300/86400 AS time
, AVG(t.value) AS avg_value
FROM foo t
WHERE t.time IS NOT NULL
GROUP BY TRUNC(t.time)+FLOOR(TO_CHAR(t.time,'sssss')/300)*300/86400
ORDER BY TRUNC(t.time)+FLOOR(TO_CHAR(t.time,'sssss')/300)*300/86400
Let's unpack that a bit. We can separate the date and time components, using TRUNC to get the date portion, and using a TO_CHAR to return the number of seconds since midnight. We know 5 minutes is 300 seconds, and we know there are 86400 seconds in a day. So we can divide the number of seconds by 300, and take the FLOOR of that (just the integer portion), which rounds us down to the nearest 5 minute boundary. We multiply that back (by 300), to get seconds again, and then divide that by the number of seconds in a day (86400), and we can add that back to the (truncated) date portion.
Painful, yes. But blazingly fast.
NOTE: this returns the rounded time value as a DATE
, this could be cast back to a timestamp if needed, but for even 5 minute boundaries, a DATE
has sufficient resolution.
As a benefit of this approach, for a large table, we can boost performance of the query by adding a covering index for this query:
CREATE INDEX foo_FBX1
ON foo (TRUNC(t.time)+FLOOR(TO_CHAR(t.time,'sssss')/300)*300/86400,value);
ADDENDUM:
MiMo provided an answer for SQL Server, suggesting that it would be adaptable for Oracle. Here is an adaptation of that approach in Oracle. Note that Oracle does not provide equivalents for the DATEDIFF and DATEADD functions. Oracle uses simple arithmetic instead.
SELECT TO_DATE('00010101','YYYYMMDD')+FLOOR((t.time-TO_DATE('00010101','YYYYMMDD'))*288)/288
AS time
, AVG(t.value) AS avg_value
FROM foo t
WHERE t.time IS NOT NULL
GROUP BY TO_DATE('00010101','YYYYMMDD')+FLOOR((t.time-TO_DATE('00010101','YYYYMMDD'))*288)/288
ORDER BY TO_DATE('00010101','YYYYMMDD')+FLOOR((t.time-TO_DATE('00010101','YYYYMMDD'))*288)/288
The choice of Jan 1, 0001 A.D. as a base date is arbitrary, but I didn't want to mess with negative values, and figuring out if FLOOR would be right, or whether we would need to use CEIL with negative numbers. (The magic number 288 is a result of 1440 minutes in a day divided by 5). In this case, we are taking the fractional day, multiplying by 1440 and dividing by 5, and taking integer portion of that, and then putting it back to fractional days.
It is tempting to pull that "base date" from a PL/SQL package, or get it from a subquery, but doing either of those might prevent this expression from being deterministic. And we'd really like to keep open the option of creating a function based index.
My preference is to avoid the need for including a "base date" in the calculation.
CREATE TABLE tt (time TIMESTAMP, value NUMBER);
INSERT INTO tt (time, value) VALUES ('06-JUN-12 12.40.00.000000000 PM', 2);
INSERT INTO tt (time, value) VALUES ('06-JUN-12 12.41.35.000000000 PM', 3);
INSERT INTO tt (time, value) VALUES ('06-JUN-12 12.43.22.000000000 PM', 4);
INSERT INTO tt (time, value) VALUES ('06-JUN-12 12.47.55.000000000 PM', 5);
INSERT INTO tt (time, value) VALUES ('06-JUN-12 12.52.00.000000000 PM', 2);
INSERT INTO tt (time, value) VALUES ('06-JUN-12 12.54.59.000000000 PM', 3);
INSERT INTO tt (time, value) VALUES ('06-JUN-12 12.56.01.000000000 PM', 4);
WITH tmin AS (
SELECT MIN(time) t FROM tt
), tmax AS (
SELECT MAX(time) t FROM tt
)
SELECT ranges.inf, ranges.sup, AVG(tt.value)
FROM
(
SELECT
5*(level-1)*(1/24/60) + tmin.t as inf,
5*(level)*(1/24/60) + tmin.t as sup
FROM tmin, tmax
CONNECT BY (5*(level-1)*(1/24/60) + tmin.t) < tmax.t
) ranges JOIN tt ON tt.time BETWEEN ranges.inf AND ranges.sup
GROUP BY ranges.inf, ranges.sup
ORDER BY ranges.inf
fiddle: http://sqlfiddle.com/#!4/9e314/11
edit: beated by Justin, as usual... :-)
This is a solution for SQL Server:
declare @startDate datetime = '2000-01-01T00:00:00'
declare @interval int = 5
select
DATEADD(mi, (DATEDIFF(mi, @startDate, time)/@interval)*@interval, @startDate),
AVG(value)
from
table
group by
DATEDIFF(mi, @startDate, time)/@interval
order by
DATEDIFF(mi, @startDate, time)/@interval
The start date is arbitrary. The idea is that you compute the number of minutes from the start date, and then group by this number divided by the interval.
It should be adaptable to Oracle easily using the equivalent for DATEADD
and DATEDIFF