SQL: grouping to have exact rows

拈花ヽ惹草 submitted on 2021-02-11 12:42:37

Question


Let's say there is a schema:

|date|value|

The DBMS is SQLite.

I want to get N groups and calculate AVG(value) for each of them.

Sample:

2020-01-01 10:00|2.0
2020-01-01 11:00|2.0
2020-01-01 12:00|3.0
2020-01-01 13:00|10.0
2020-01-01 14:00|2.0
2020-01-01 15:00|3.0
2020-01-01 16:00|11.0
2020-01-01 17:00|2.0
2020-01-01 18:00|3.0

Result (N=3):

2020-01-01 11:00|7.0/3
2020-01-01 14:00|15.0/3
2020-01-01 17:00|16.0/3

I need to use a windowing function, like NTILE, but it seems NTILE is not usable after GROUP BY. It can create buckets, but then how can I use these buckets for aggregation?

SELECT
   /*AVG(*/value/*)*/,
   NTILE (3) OVER (ORDER BY date) bucket
FROM
   test
/*GROUP BY bucket*/
/*GROUP BY NTILE (3) OVER (ORDER BY date) bucket*/

I have also put the test data and this query into DBFiddle.


Answer 1:


You can use the NTILE() window function in a subquery to create the groups, then aggregate over them in the outer query:

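SELECT
  -- midpoint of the bucket: MIN(date) plus half of the bucket's time span in seconds
  DATETIME(MIN(date), ((STRFTIME('%s', MAX(date)) - STRFTIME('%s', MIN(date))) / 2) || ' second') AS date,
  ROUND(AVG(value), 2) AS avg_value
FROM (
  -- assign every row to one of 3 buckets, in date order
  SELECT *, NTILE(3) OVER (ORDER BY date) AS grp
  FROM test
)
GROUP BY grp;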

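The argument of NTILE() is the number of buckets, not the number of rows per bucket. To change how many rows fall into each bucket, change the number 3 inside the parentheses of NTILE(): with 9 rows, NTILE(3) gives 3 rows per bucket.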
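
When the row count is not a multiple of the NTILE() argument, the buckets cannot all be the same size, and SQLite puts the larger buckets first. The following is a minimal sketch of that behavior, run through Python's built-in sqlite3 module. The table t and its column n are hypothetical and exist only for this illustration; window functions require SQLite 3.25 or newer.

```python
import sqlite3

# In-memory database; the table "t" and column "n" are hypothetical,
# used only to show how NTILE() splits 10 rows into 3 buckets.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t (n) VALUES (?)", [(i,) for i in range(1, 11)])

# Count how many of the 10 rows NTILE(3) assigns to each bucket.
bucket_sizes = conn.execute(
    """
    SELECT grp, COUNT(*) AS bucket_size
    FROM (SELECT n, NTILE(3) OVER (ORDER BY n) AS grp FROM t)
    GROUP BY grp
    ORDER BY grp
    """
).fetchall()

print(bucket_sizes)  # [(1, 4), (2, 3), (3, 3)]
```

So the buckets contain exactly the same number of rows only when the total row count is divisible by the number of buckets, as it is for the 9 sample rows in the question.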

See the demo.
Results:

| date                | avg_value |
| ------------------- | --------- |
| 2020-01-01 11:00:00 | 2.33      |
| 2020-01-01 14:00:00 | 5         |
| 2020-01-01 17:00:00 | 5.33      |
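
The date column in these results is the midpoint of each bucket: the query takes the earliest date of the bucket and adds half of the bucket's time span, measured in seconds. As a rough sketch, the same expression can be tried in isolation on the first bucket's boundaries. The variable names first and last and the alias-free SELECT are assumptions made for this illustration, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Boundaries of the first bucket in the sample data (hypothetical variable names).
first, last = "2020-01-01 10:00", "2020-01-01 12:00"

half_span, midpoint = conn.execute(
    """
    SELECT
      -- STRFTIME('%s', ...) yields Unix seconds; half the difference is the offset
      (STRFTIME('%s', :last) - STRFTIME('%s', :first)) / 2,
      -- add that offset to the earliest timestamp to get the midpoint
      DATETIME(:first,
               ((STRFTIME('%s', :last) - STRFTIME('%s', :first)) / 2) || ' second')
    """,
    {"first": first, "last": last},
).fetchone()

print(half_span)  # 3600
print(midpoint)   # 2020-01-01 11:00:00
```

Note that the midpoint coincides with an actual row only when the bucket's rows are evenly spaced and odd in number, which happens to be the case for the sample data.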



Answer 2:


Quoting the question: "I need to use a windowing function, like NTILE, but it seems NTILE is not usable after GROUP BY. It can create buckets, but then how can I use these buckets for aggregation?"

First use NTILE in a subquery to assign a bucket number to each row, then group by that bucket number in the outer query.

Using a subquery:

SELECT bucket
     , AVG(value) AS avg_value
  FROM ( SELECT value
              , NTILE(3) OVER ( ORDER BY date ) AS bucket
           FROM test
       ) x
 GROUP BY bucket
 ORDER BY bucket

Using a WITH clause:

WITH x AS (
   SELECT date
        , value
        , NTILE(3) OVER ( ORDER BY date ) AS bucket
     FROM test
)
SELECT bucket
     , COUNT(*) AS bucket_size
     , MIN(date) AS from_date
     , MAX(date) AS to_date
     , MIN(value) AS min_value
     , AVG(value) AS avg_value
     , MAX(value) AS max_value
     , SUM(value) AS sum_value
  FROM x
 GROUP BY bucket
 ORDER BY bucket
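
To check the approach against the sample data from the question, here is a small end-to-end sketch using Python's built-in sqlite3 module. The column types TEXT and REAL are assumptions, since the question gives only the column names, and the query is a shortened form of the WITH version above. It needs SQLite 3.25 or newer for window functions.

```python
import sqlite3

# Sample rows copied from the question.
rows = [
    ("2020-01-01 10:00", 2.0),
    ("2020-01-01 11:00", 2.0),
    ("2020-01-01 12:00", 3.0),
    ("2020-01-01 13:00", 10.0),
    ("2020-01-01 14:00", 2.0),
    ("2020-01-01 15:00", 3.0),
    ("2020-01-01 16:00", 11.0),
    ("2020-01-01 17:00", 2.0),
    ("2020-01-01 18:00", 3.0),
]

conn = sqlite3.connect(":memory:")
# Column types are assumed; the question only names the columns.
conn.execute("CREATE TABLE test (date TEXT, value REAL)")
conn.executemany("INSERT INTO test (date, value) VALUES (?, ?)", rows)

# Step 1 (CTE): number the buckets with NTILE. Step 2: aggregate per bucket.
result = conn.execute(
    """
    WITH x AS (
        SELECT date, value, NTILE(3) OVER (ORDER BY date) AS bucket
        FROM test
    )
    SELECT bucket, COUNT(*) AS bucket_size, MIN(date) AS from_date,
           MAX(date) AS to_date, AVG(value) AS avg_value
    FROM x
    GROUP BY bucket
    ORDER BY bucket
    """
).fetchall()

for row in result:
    print(row)
```

Each bucket holds 3 rows, and the averages come out as 7/3, 15/3 and 16/3, matching the result the question asks for.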


Source: https://stackoverflow.com/questions/62833358/sql-grouping-to-have-exact-rows
