Question
Let's say there is a schema:
|date|value|
The DBMS is SQLite.
I want to split the rows, ordered by date, into N groups and calculate AVG(value) for each group.
Sample:
2020-01-01 10:00|2.0
2020-01-01 11:00|2.0
2020-01-01 12:00|3.0
2020-01-01 13:00|10.0
2020-01-01 14:00|2.0
2020-01-01 15:00|3.0
2020-01-01 16:00|11.0
2020-01-01 17:00|2.0
2020-01-01 18:00|3.0
Result (N=3):
2020-01-01 11:00|7.0/3
2020-01-01 14:00|15.0/3
2020-01-01 17:00|16.0/3
I need to use a windowing function, like NTILE, but it seems NTILE is not usable after GROUP BY. It can create buckets, but then how can I use these buckets for aggregation?
SELECT
/*AVG(*/value/*)*/,
NTILE (3) OVER (ORDER BY date) bucket
FROM
test
/*GROUP BY bucket*/
/*GROUP BY NTILE (3) OVER (ORDER BY date) bucket*/
I also put the test data and this query into DBFiddle.
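For anyone who wants to reproduce the problem locally, here is a minimal sketch using Python's built-in sqlite3 module as a harness. The harness and the column types (TEXT, REAL) are assumptions; the table name, columns, and rows come from the sample above. It also assumes the bundled SQLite is 3.25 or newer, since window functions are not available before that. NTILE labels the rows as expected, but grouping by the window-function alias in the same SELECT is rejected.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("2020-01-01 10:00", 2.0),
    ("2020-01-01 11:00", 2.0),
    ("2020-01-01 12:00", 3.0),
    ("2020-01-01 13:00", 10.0),
    ("2020-01-01 14:00", 2.0),
    ("2020-01-01 15:00", 3.0),
    ("2020-01-01 16:00", 11.0),
    ("2020-01-01 17:00", 2.0),
    ("2020-01-01 18:00", 3.0),
]

conn = sqlite3.connect(":memory:")
# Column types are an assumption; the question only names the columns.
conn.execute("CREATE TABLE test (date TEXT, value REAL)")
conn.executemany("INSERT INTO test (date, value) VALUES (?, ?)", ROWS)

# NTILE on its own works: every row gets a bucket number.
buckets = [
    bucket
    for (bucket,) in conn.execute(
        "SELECT NTILE(3) OVER (ORDER BY date) AS bucket FROM test ORDER BY date"
    )
]
print(buckets)

# Grouping by the window-function alias in the same SELECT fails.
try:
    conn.execute(
        "SELECT AVG(value), NTILE(3) OVER (ORDER BY date) AS bucket "
        "FROM test GROUP BY bucket"
    ).fetchall()
    group_by_error = None
except sqlite3.Error as exc:
    group_by_error = str(exc)
print(group_by_error)
```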
Answer 1:
You can use the NTILE() window function to create the groups in a subquery, then aggregate over them:
SELECT
DATETIME(MIN(DATE), ((STRFTIME('%s', MAX(DATE)) - STRFTIME('%s', MIN(DATE))) / 2) || ' second') date,
ROUND(AVG(value), 2) avg_value
FROM (
SELECT *, NTILE(3) OVER (ORDER BY date) grp
FROM test
)
GROUP BY grp;
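The argument of NTILE() is the number of buckets, not the number of rows per bucket. To get a different number of groups (and therefore a different number of rows in each), change the 3 inside the parentheses of NTILE().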
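As a rough illustration of how the NTILE() argument affects bucket sizes, the sketch below counts the rows per bucket for a few values of N. It uses Python's sqlite3 module as a harness; the helper name bucket_sizes and the column types are hypothetical, and SQLite 3.25 or newer is assumed. When the row count is not divisible by N, the larger buckets come first.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("2020-01-01 10:00", 2.0),
    ("2020-01-01 11:00", 2.0),
    ("2020-01-01 12:00", 3.0),
    ("2020-01-01 13:00", 10.0),
    ("2020-01-01 14:00", 2.0),
    ("2020-01-01 15:00", 3.0),
    ("2020-01-01 16:00", 11.0),
    ("2020-01-01 17:00", 2.0),
    ("2020-01-01 18:00", 3.0),
]

conn = sqlite3.connect(":memory:")
# Column types are an assumption; the question only names the columns.
conn.execute("CREATE TABLE test (date TEXT, value REAL)")
conn.executemany("INSERT INTO test (date, value) VALUES (?, ?)", ROWS)


def bucket_sizes(n):
    """Return the row count of each NTILE(n) bucket, in bucket order.

    Hypothetical helper for illustration only.
    """
    n = int(n)  # forcing an int keeps the interpolated SQL text safe
    if n <= 0:
        raise ValueError("NTILE needs a positive integer")
    sql = f"""
        SELECT COUNT(*)
        FROM (SELECT NTILE({n}) OVER (ORDER BY date) AS grp FROM test)
        GROUP BY grp
        ORDER BY grp
    """
    return [count for (count,) in conn.execute(sql)]


sizes_3 = bucket_sizes(3)  # 9 rows divide evenly into 3 buckets
sizes_2 = bucket_sizes(2)  # uneven split: the first bucket gets the extra row
sizes_4 = bucket_sizes(4)
print(sizes_3, sizes_2, sizes_4)
```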
See the demo.
Results:
| date | avg_value |
| ------------------- | --------- |
| 2020-01-01 11:00:00 | 2.33 |
| 2020-01-01 14:00:00 | 5 |
| 2020-01-01 17:00:00 | 5.33 |
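The date column in these results is the midpoint between the first and last timestamp of each bucket: the query takes the difference in seconds between MAX(date) and MIN(date), halves it, and adds it to MIN(date). A small sketch that shows the three values side by side, using Python's sqlite3 module as a harness (an assumption, as are the column types and the output column names first_date, last_date, and mid_date; SQLite 3.25 or newer is assumed):

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("2020-01-01 10:00", 2.0),
    ("2020-01-01 11:00", 2.0),
    ("2020-01-01 12:00", 3.0),
    ("2020-01-01 13:00", 10.0),
    ("2020-01-01 14:00", 2.0),
    ("2020-01-01 15:00", 3.0),
    ("2020-01-01 16:00", 11.0),
    ("2020-01-01 17:00", 2.0),
    ("2020-01-01 18:00", 3.0),
]

conn = sqlite3.connect(":memory:")
# Column types are an assumption; the question only names the columns.
conn.execute("CREATE TABLE test (date TEXT, value REAL)")
conn.executemany("INSERT INTO test (date, value) VALUES (?, ?)", ROWS)

# Same midpoint expression as in the answer, with MIN and MAX shown next to it.
sql = """
    SELECT MIN(date) AS first_date,
           MAX(date) AS last_date,
           DATETIME(
               MIN(date),
               ((STRFTIME('%s', MAX(date)) - STRFTIME('%s', MIN(date))) / 2) || ' second'
           ) AS mid_date
    FROM (SELECT *, NTILE(3) OVER (ORDER BY date) AS grp FROM test)
    GROUP BY grp
    ORDER BY grp
"""
rows = conn.execute(sql).fetchall()
mid_dates = [mid for (_first, _last, mid) in rows]
for row in rows:
    print(row)
```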
Answer 2:
> I need to use a windowing function, like NTILE, but it seems NTILE is not usable after GROUP BY. It can create buckets, but then how can I use these buckets for aggregation?
First use NTILE to assign a bucket number to each row in a subquery, then group by that bucket number in an outer query.
Using a subquery
SELECT bucket
, AVG(value) AS avg_value
FROM ( SELECT value
, NTILE(3) OVER ( ORDER BY date ) AS bucket
FROM test
) x
GROUP BY bucket
ORDER BY bucket
Using a WITH clause
WITH x AS (
SELECT date
, value
, NTILE(3) OVER ( ORDER BY date ) AS bucket
FROM test
)
SELECT bucket
, COUNT(*) AS bucket_size
, MIN(date) AS from_date
, MAX(date) AS to_date
, MIN(value) AS min_value
, AVG(value) AS avg_value
, MAX(value) AS max_value
, SUM(value) AS sum_value
FROM x
GROUP BY bucket
ORDER BY bucket
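To check the WITH version against the expected result in the question (7.0/3, 15.0/3, 16.0/3), it can be run on the sample rows. This is only a sketch: the Python sqlite3 harness and the column types are assumptions, the query is trimmed to a few of the columns above, and SQLite 3.25 or newer is assumed.

```python
import math
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("2020-01-01 10:00", 2.0),
    ("2020-01-01 11:00", 2.0),
    ("2020-01-01 12:00", 3.0),
    ("2020-01-01 13:00", 10.0),
    ("2020-01-01 14:00", 2.0),
    ("2020-01-01 15:00", 3.0),
    ("2020-01-01 16:00", 11.0),
    ("2020-01-01 17:00", 2.0),
    ("2020-01-01 18:00", 3.0),
]

conn = sqlite3.connect(":memory:")
# Column types are an assumption; the question only names the columns.
conn.execute("CREATE TABLE test (date TEXT, value REAL)")
conn.executemany("INSERT INTO test (date, value) VALUES (?, ?)", ROWS)

sql = """
    WITH x AS (
        SELECT date, value, NTILE(3) OVER (ORDER BY date) AS bucket
        FROM test
    )
    SELECT bucket,
           COUNT(*)   AS bucket_size,
           MIN(date)  AS from_date,
           MAX(date)  AS to_date,
           AVG(value) AS avg_value,
           SUM(value) AS sum_value
    FROM x
    GROUP BY bucket
    ORDER BY bucket
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)

# Averages the question expects for N=3.
expected_avgs = [7.0 / 3, 15.0 / 3, 16.0 / 3]
matches = all(math.isclose(row[4], avg) for row, avg in zip(rows, expected_avgs))
print(matches)
```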
Source: https://stackoverflow.com/questions/62833358/sql-grouping-to-have-exact-rows