Question
I have a history log and I want to summarize some of its entries. Entries that are at most 5 seconds apart should be merged.
For example, I have this list:
date_start          | date_end            | count | somestring
2015-09-15 12:04:09 | 2015-09-15 12:04:09 | 1     | xyz
2015-09-15 12:05:09 | 2015-09-15 12:05:09 | 1     | xyz
2015-09-15 12:05:10 | 2015-09-15 12:05:10 | 1     | xyz
2015-09-15 12:05:11 | 2015-09-15 12:05:11 | 1     | xyz
2015-09-15 12:06:09 | 2015-09-15 12:06:09 | 1     | xyz
I would like to get output like this:
date_start          | date_end            | count | somestring
2015-09-15 12:04:09 | 2015-09-15 12:04:09 | 1     | xyz
2015-09-15 12:05:09 | 2015-09-15 12:05:11 | 3     | xyz <--
2015-09-15 12:06:09 | 2015-09-15 12:06:09 | 1     | xyz
So if several entries fall within 5 seconds of each other, I want to group them into one entry. And if there is a longer run of entries, say over 1 hour, where each entry is at most 5 seconds after the previous one, I want all of those entries counted in the same group too.
Does anybody know a way to do this? I have been working on it for weeks now :(
EDIT (reply to Bernd's answer): Thank you so much, @Bernd. The problem is that the output of your query is:
+---------+---------------------+--------------+---------------------+----+---------------------+---------------------+-----------+------+
| mycount | st | group_number | tmp_interv | id | start_date | end_date | some_text | cnt |
+---------+---------------------+--------------+---------------------+----+---------------------+---------------------+-----------+------+
| 2 | 2015-09-14 12:00:05 | 0 | 2015-09-14 12:00:05 | 1 | 2015-09-14 12:00:00 | 2015-09-14 12:00:00 | some | 1 |
| 4 | 2015-09-14 12:00:05 | 1 | 2015-09-14 12:01:08 | 3 | 2015-09-14 12:01:03 | 2015-09-14 12:01:03 | some | 1 |
| 1 | 2015-09-14 12:01:08 | 2 | 2015-09-14 12:01:14 | 7 | 2015-09-14 12:01:09 | 2015-09-14 12:01:09 | some | 1 |
+---------+---------------------+--------------+---------------------+----+---------------------+---------------------+-----------+------+
But it should be something like:
+---------+---------------------+--------------+---------------------+----+---------------------+---------------------+-----------+------+
| mycount | st | group_number | tmp_interv | id | start_date | end_date | some_text | cnt |
+---------+---------------------+--------------+---------------------+----+---------------------+---------------------+-----------+------+
| 2 | 2015-09-14 12:00:05 | 0 | 2015-09-14 12:00:05 | 1 | 2015-09-14 12:00:00 | 2015-09-14 12:00:03 | some | 1 |
| 5 | 2015-09-14 12:00:05 | 1 | 2015-09-14 12:01:08 | 3 | 2015-09-14 12:01:03 | 2015-09-14 12:01:09 | some | 1 |
+---------+---------------------+--------------+---------------------+----+---------------------+---------------------+-----------+------+
Note the differences in the count (mycount) and the end date (end_date) :)
Answer 1:
Here is my first try:
CREATE TABLE `dtable` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `start_date` timestamp NULL DEFAULT NULL,
  `end_date` timestamp NULL DEFAULT NULL,
  `some_text` varchar(32) DEFAULT NULL,
  `cnt` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=8 DEFAULT CHARSET=utf8;
INSERT INTO `dtable` (`id`, `start_date`, `end_date`, `some_text`, `cnt`)
VALUES
(1, '2015-09-14 12:00:00', '2015-09-14 12:00:00', 'some', 1),
(2, '2015-09-14 12:00:03', '2015-09-14 12:00:03', 'some', 1),
(3, '2015-09-14 12:01:03', '2015-09-14 12:01:03', 'some', 1),
(4, '2015-09-14 12:01:04', '2015-09-14 12:01:03', 'some', 1),
(5, '2015-09-14 12:01:05', '2015-09-14 12:01:03', 'some', 1),
(6, '2015-09-14 12:01:08', '2015-09-14 12:01:08', 'some', 1),
(7, '2015-09-14 12:01:09', '2015-09-14 12:01:09', 'some', 1);
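-- Sum cnt per group; the derived table numbers the groups with user variables.
SELECT sum(cnt) mycount, t.*
FROM (
  SELECT
    -- First row only: initialise the window end to start_date + 5 seconds.
    @interval_end := IF(@interval_end = 0, d.start_date + INTERVAL 5 SECOND, @interval_end) st,
    -- Start a new group when the row begins after the current window end.
    @group_nr := IF(d.start_date > @interval_end, @group_nr := @group_nr + 1, @group_nr) group_number,
    -- Move the window end only when a new group starts.
    @interval_end := IF(d.start_date > @interval_end, d.start_date + INTERVAL 5 SECOND, @interval_end) tmp_interv,
    d.*
  FROM dtable d, (SELECT @group_nr := 0, @interval_end := 0) tmp
) AS t
GROUP BY t.group_number;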
Please check it and tell me what is wrong.
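To make it easier to see why this first query returns counts of 2, 4 and 1 for the sample rows, here is a rough Python sketch of the grouping logic as I read it: the 5-second window is anchored at the first row of each group and is not extended by later rows. The names `group_fixed_window` and `rows` are made up for illustration, and the sketch assumes the rows are processed in `start_date` order, which the SQL does not guarantee without an ORDER BY.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(seconds=5)

# start_date values from the sample INSERT above; every row has cnt = 1.
rows = [
    datetime(2015, 9, 14, 12, 0, 0),
    datetime(2015, 9, 14, 12, 0, 3),
    datetime(2015, 9, 14, 12, 1, 3),
    datetime(2015, 9, 14, 12, 1, 4),
    datetime(2015, 9, 14, 12, 1, 5),
    datetime(2015, 9, 14, 12, 1, 8),
    datetime(2015, 9, 14, 12, 1, 9),
]


def group_fixed_window(starts, window=WINDOW):
    """Count rows per group when a group's window ends `window` after its FIRST row."""
    counts = []
    interval_end = None  # hypothetical stand-in for @interval_end
    for start in starts:
        # A row strictly after the window end opens a new group and a new window.
        if interval_end is None or start > interval_end:
            counts.append(0)
            interval_end = start + window
        counts[-1] += 1
    return counts


fixed_counts = group_fixed_window(rows)
print(fixed_counts)  # [2, 4, 1]
```

The row at 12:01:09 is only 1 second after the previous row, but it is more than 5 seconds after 12:01:03, the first row of its group, so it ends up in a group of its own.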
Answer 2:
I found the solution. Thank you so much, Bernd! I modified your query for my purpose:
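SELECT
  sum(cnt) mycount, t.start_date, max(t.start_date) date_end, t.*
FROM (
  SELECT
    -- First row only: initialise the window end to start_date + 5 seconds.
    @interval_end := IF(@interval_end = 0, d.start_date + INTERVAL 5 SECOND, @interval_end) st,
    -- Start a new group when the row begins after the current window end.
    @group_nr := IF(d.start_date > @interval_end, @group_nr := @group_nr + 1, @group_nr) group_number,
    -- Both branches are identical: every row moves the window end to its own start_date + 5 seconds.
    @interval_end := IF(d.start_date > @interval_end, d.start_date + INTERVAL 5 SECOND, d.start_date + INTERVAL 5 SECOND) tmp_interv,
    d.*
  FROM dtable d, (SELECT @group_nr := 0, @interval_end := 0) tmp
) AS t
GROUP BY t.group_number;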
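The change that matters here is that every row now moves the window end to its own start_date + 5 seconds, so a group keeps growing as long as each row starts at most 5 seconds after the previous one. A minimal Python sketch of that chaining rule, as I understand it, is below. The names `group_chained` and `sample` are hypothetical, and the input is assumed to be sorted by start date.

```python
from datetime import datetime, timedelta


def group_chained(rows, gap=timedelta(seconds=5)):
    """Merge (start_date, cnt) rows whose start is at most `gap` after the previous row.

    `rows` must already be sorted by start_date.
    """
    groups = []
    prev_start = None
    for start, cnt in rows:
        # A gap larger than `gap` since the previous row opens a new group.
        if prev_start is None or start - prev_start > gap:
            groups.append({"date_start": start, "date_end": start, "count": 0})
        current = groups[-1]
        current["date_end"] = start  # the latest row seen so far ends the group
        current["count"] += cnt
        prev_start = start
    return groups


# start_date and cnt values from the sample INSERT in Answer 1.
sample = [
    (datetime(2015, 9, 14, 12, 0, 0), 1),
    (datetime(2015, 9, 14, 12, 0, 3), 1),
    (datetime(2015, 9, 14, 12, 1, 3), 1),
    (datetime(2015, 9, 14, 12, 1, 4), 1),
    (datetime(2015, 9, 14, 12, 1, 5), 1),
    (datetime(2015, 9, 14, 12, 1, 8), 1),
    (datetime(2015, 9, 14, 12, 1, 9), 1),
]

result = group_chained(sample)
for g in result:
    print(g["date_start"], g["date_end"], g["count"])
```

On the sample rows this gives two groups with counts 2 and 5, ending at 12:00:03 and 12:01:09, which is the output asked for in the question's EDIT. Two caveats about the SQL itself: MySQL does not guarantee the evaluation order of user variable assignments within a single SELECT, and the derived table has no ORDER BY, so the query depends on rows arriving in start_date order.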
Source: https://stackoverflow.com/questions/32585265/group-by-a-variable-range-in-mysql-mariadb