Group by a variable range in MySQL / MariaDB

Submitted by 纵饮孤独 on 2019-12-25 06:09:08

Question


I have a history log and I want to summarize some entries. The grouping interval should be 5 seconds.

For example, I have this list:

date_start          | date_end           | count | somestring
2015-09-15 12:04:09 | 2015-09-15 12:04:09| 1     | xyz
2015-09-15 12:05:09 | 2015-09-15 12:05:09| 1     | xyz
2015-09-15 12:05:10 | 2015-09-15 12:05:10| 1     | xyz
2015-09-15 12:05:11 | 2015-09-15 12:05:11| 1     | xyz
2015-09-15 12:06:09 | 2015-09-15 12:06:09| 1     | xyz

I now want output like this:

date_start          | date_end           | count | somestring
2015-09-15 12:04:09 | 2015-09-15 12:04:09| 1     | xyz
2015-09-15 12:05:09 | 2015-09-15 12:05:11| 3     | xyz <--
2015-09-15 12:06:09 | 2015-09-15 12:06:09| 1     | xyz

So if there are duplicates within a 5-second interval, I want to group them into one entry. And if there are several entries within an hour, each at most 5 seconds after the previous one, I want to count all of those entries in the same group too.

Does anybody know a way? I have been working on this for weeks now :(

EDIT: answer and comment for Bernd: Thank you so much, @Bernd. The problem is that the output of your query is:

+---------+---------------------+--------------+---------------------+----+---------------------+---------------------+-----------+------+
| mycount | st                  | group_number | tmp_interv          | id | start_date          | end_date            | some_text | cnt  |
+---------+---------------------+--------------+---------------------+----+---------------------+---------------------+-----------+------+
|       2 | 2015-09-14 12:00:05 | 0            | 2015-09-14 12:00:05 |  1 | 2015-09-14 12:00:00 | 2015-09-14 12:00:00 | some      |    1 |
|       4 | 2015-09-14 12:00:05 | 1            | 2015-09-14 12:01:08 |  3 | 2015-09-14 12:01:03 | 2015-09-14 12:01:03 | some      |    1 |
|       1 | 2015-09-14 12:01:08 | 2            | 2015-09-14 12:01:14 |  7 | 2015-09-14 12:01:09 | 2015-09-14 12:01:09 | some      |    1 |
+---------+---------------------+--------------+---------------------+----+---------------------+---------------------+-----------+------+

But it should be something like:

+---------+---------------------+--------------+---------------------+----+---------------------+---------------------+-----------+------+
| mycount | st                  | group_number | tmp_interv          | id | start_date          | end_date            | some_text | cnt  |
+---------+---------------------+--------------+---------------------+----+---------------------+---------------------+-----------+------+
|       2 | 2015-09-14 12:00:05 | 0            | 2015-09-14 12:00:05 |  1 | 2015-09-14 12:00:00 | 2015-09-14 12:00:03 | some      |    1 |
|       5 | 2015-09-14 12:00:05 | 1            | 2015-09-14 12:01:08 |  3 | 2015-09-14 12:01:03 | 2015-09-14 12:01:09 | some      |    1 |
+---------+---------------------+--------------+---------------------+----+---------------------+---------------------+-----------+------+

Pay attention to the count and the end date :)


Answer 1:


Here is my first try:

CREATE TABLE `dtable` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `start_date` timestamp NULL DEFAULT NULL,
  `end_date` timestamp NULL DEFAULT NULL,
  `some_text` varchar(32) DEFAULT NULL,
  `cnt` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=8 DEFAULT CHARSET=utf8;

INSERT INTO `dtable` (`id`, `start_date`, `end_date`, `some_text`, `cnt`)
VALUES
    (1, '2015-09-14 12:00:00', '2015-09-14 12:00:00', 'some', 1),
    (2, '2015-09-14 12:00:03', '2015-09-14 12:00:03', 'some', 1),
    (3, '2015-09-14 12:01:03', '2015-09-14 12:01:03', 'some', 1),
    (4, '2015-09-14 12:01:04', '2015-09-14 12:01:03', 'some', 1),
    (5, '2015-09-14 12:01:05', '2015-09-14 12:01:03', 'some', 1),
    (6, '2015-09-14 12:01:08', '2015-09-14 12:01:08', 'some', 1),
    (7, '2015-09-14 12:01:09', '2015-09-14 12:01:09', 'some', 1);


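SELECT sum(cnt) mycount, t.* FROM (
  SELECT
    -- First row only: initialise the window end to start_date + 5 seconds.
    @interval_end:=IF(@interval_end = 0, d.start_date + INTERVAL 5 SECOND, @interval_end ) st,
    -- Start a new group when the row begins after the current window end.
    @group_nr:= IF( d.start_date > @interval_end, @group_nr:=@group_nr+1, @group_nr ) group_number,
    -- Move the window end only when a new group starts (fixed 5-second window).
    @interval_end:= IF( d.start_date > @interval_end, d.start_date + INTERVAL 5 SECOND , @interval_end ) tmp_interv,
    d.* FROM dtable d,(SELECT @group_nr:=0, @interval_end:=0) tmp
) AS t
GROUP BY t.group_number;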

Please check it and tell me what is wrong.
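
To see where the counts 2, 4 and 1 reported in the question come from, the window logic of this query can be traced outside the database. The following is a minimal Python sketch, not part of the original answer. It assumes the rows are read in id order (the query has no ORDER BY, so the database does not guarantee that), and `fixed_window_counts` is a hypothetical helper name.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(seconds=5)

# start_date values from the INSERT statement above, in id order.
starts = [
    datetime.strptime(s, "%Y-%m-%d %H:%M:%S")
    for s in [
        "2015-09-14 12:00:00",
        "2015-09-14 12:00:03",
        "2015-09-14 12:01:03",
        "2015-09-14 12:01:04",
        "2015-09-14 12:01:05",
        "2015-09-14 12:01:08",
        "2015-09-14 12:01:09",
    ]
]


def fixed_window_counts(starts):
    """Count rows per group when the window end moves only at a group start."""
    counts = []
    interval_end = None
    for start in starts:
        if interval_end is None or start > interval_end:
            # New group: the window is fixed at 5 seconds after its first row.
            interval_end = start + WINDOW
            counts.append(0)
        counts[-1] += 1
    return counts


print(fixed_window_counts(starts))  # [2, 4, 1]
```

The row at 12:01:09 falls one second after the window opened at 12:01:03 has closed, so it starts a third group even though it is only one second after the previous row.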




Answer 2:


I found the solution. Thank you so much, Bernd. I modified your query for my purpose:

SELECT
  sum(cnt) mycount, t.start_date, max(t.start_date) date_end, t.* FROM (
  SELECT
    @interval_end:=IF(@interval_end = 0, d.start_date + INTERVAL 5 SECOND, @interval_end ) st,
    @group_nr:= IF( d.start_date > @interval_end, @group_nr:=@group_nr+1, @group_nr ) group_number,
    -- Both IF branches are identical: the window end always moves to 5 seconds after the current row.
    @interval_end:= IF( d.start_date > @interval_end, d.start_date + INTERVAL 5 SECOND , d.start_date + INTERVAL 5 SECOND) tmp_interv,
    d.* FROM dtable d,(SELECT @group_nr:=0, @interval_end:=0) tmp
) AS t
GROUP BY t.group_number;
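
As a cross-check of the grouping rule this modified query applies (a row joins the current group when it starts at most 5 seconds after the previous row), here is a minimal Python sketch. It is not part of the original answer. It assumes the rows are already sorted by start_date, and `parse` and `chained_groups` are hypothetical helper names.

```python
from datetime import datetime, timedelta


def parse(ts):
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")


def chained_groups(rows, gap=timedelta(seconds=5)):
    """Group (start_date, cnt) rows; rows must be sorted by start_date.

    Returns a list of (date_start, date_end, mycount) tuples.
    """
    groups = []
    previous_start = None
    for start, cnt in rows:
        if previous_start is None or start > previous_start + gap:
            # Gap to the previous row is more than 5 seconds: open a new group.
            groups.append([start, start, 0])
        groups[-1][1] = start  # rows are sorted, so the latest row is the max
        groups[-1][2] += cnt
        previous_start = start
    return [tuple(g) for g in groups]


# (start_date, cnt) pairs from the dtable sample data in Answer 1.
dtable = [
    (parse(s), 1)
    for s in [
        "2015-09-14 12:00:00",
        "2015-09-14 12:00:03",
        "2015-09-14 12:01:03",
        "2015-09-14 12:01:04",
        "2015-09-14 12:01:05",
        "2015-09-14 12:01:08",
        "2015-09-14 12:01:09",
    ]
]

result = chained_groups(dtable)
for date_start, date_end, mycount in result:
    print(date_start, date_end, mycount)
# 2015-09-14 12:00:00 2015-09-14 12:00:03 2
# 2015-09-14 12:01:03 2015-09-14 12:01:09 5
```

The two groups match the counts (2 and 5) and end dates (12:00:03 and 12:01:09) that the question asks for.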


Source: https://stackoverflow.com/questions/32585265/group-by-a-variable-range-in-mysql-mariadb
