Question
I have an application that receives SMS messages. I want to build an hourly statistic in MySQL that counts the messages received in each hour. For example, at 7 am I received 10 SMS messages, at 8 am I received 20, and so on. My table has the columns ID, smsText, smsDate ... (the others are not important). When I run this query:
SELECT HOUR(smsDate), COUNT(ID) FROM SMS_MESSAGES GROUP BY HOUR(smsDate)
it shows how many messages I got in each hour. The problem is that when I receive no messages in some hour, for example 5 pm, the statement doesn't return a row for hour 17 with count 0, and I get a result like this:
Hour Count
...
15 10
16 5
18 2
...
and what I want to get is this:
Hour Count
...
15 10
16 5
17 0
18 2
...
I searched the web for a solution and found something involving UNION, but I don't understand how to apply it to my query. I hope someone can help me.
Answer 1:
You could create a table with all hours and join the tables:
CREATE TABLE IF NOT EXISTS `hours` (
`hour` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `hours` (`hour`) VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12), (13), (14), (15), (16), (17), (18), (19), (20), (21), (22), (23);
SELECT hours.hour, count( SMS_MESSAGES.ID )
FROM hours
LEFT JOIN SMS_MESSAGES ON ( hours.hour = HOUR( SMS_MESSAGES.smsDate ) )
GROUP BY 1
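A note on why empty hours come out as 0 rather than 1: COUNT(SMS_MESSAGES.ID) only counts non-NULL values, and the LEFT JOIN fills ID with NULL when an hour has no matching message. A minimal sketch of the difference, assuming the hours table above exists (the aliases h, m, messages and joined_rows are only illustrative):

```sql
-- COUNT(column) ignores NULLs, so hours with no matching messages report 0.
-- COUNT(*) counts the joined row itself, so those same hours would report 1.
SELECT h.hour,
       COUNT(m.ID) AS messages,     -- 0 for hours without messages
       COUNT(*)    AS joined_rows   -- 1 for hours without messages
FROM hours AS h
LEFT JOIN SMS_MESSAGES AS m ON h.hour = HOUR(m.smsDate)
GROUP BY h.hour
ORDER BY h.hour;
```

So when adapting the query, keep the count on a column from SMS_MESSAGES rather than switching to COUNT(*).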
Answer 2:
Creating a new table that contains the hour values, as hellocode answered, is a good approach. Here is another way to achieve this, using UNION:
select t.`hour`,count(s.ID) from (
select 0 as `hour`
union
select 1 as `hour`
union
select 2 as `hour`
union
.
.
.
select 23 as `hour`
) t
left join SMS_MESSAGES s on(t.`hour` = hour(s.smsDate))
group by t.`hour`
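On MySQL 8.0 or later (an assumption, since the question does not state a version), a recursive common table expression can generate the 24 rows without writing out every UNION branch. A sketch of the same idea, where the name hours_cte is made up for this example:

```sql
-- Build the hours 0..23 on the fly, then left join the messages onto them.
WITH RECURSIVE hours_cte (`hour`) AS (
    SELECT 0
    UNION ALL
    SELECT `hour` + 1 FROM hours_cte WHERE `hour` < 23
)
SELECT t.`hour`, COUNT(s.ID) AS messages
FROM hours_cte AS t
LEFT JOIN SMS_MESSAGES AS s ON t.`hour` = HOUR(s.smsDate)
GROUP BY t.`hour`
ORDER BY t.`hour`;
```

On older MySQL versions recursive CTEs are not available, so the UNION subquery or the helper table is still needed there.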
Answer 3:
Observation: HOUR() simply extracts the hour from a timestamp, so messages from different days that fall in the same hour are counted together. You may want both date and hour in your query. This answer provides date and hour.
You need a way to get a virtual table containing all the hourly timestamps in the appropriate range. You then need to join that table to your aggregate query.
First things first: Here’s a query that will get the timestamps in the range.
SELECT mintime + INTERVAL seq.seq HOUR AS msghour
FROM (
SELECT MIN(DATE(smsDate) + INTERVAL HOUR(smsDate) HOUR) AS mintime,
MAX(DATE(smsDate) + INTERVAL HOUR(smsDate) HOUR) AS maxtime
FROM SMS_MESSAGES
) AS minmax
JOIN seq_0_to_999999 AS seq ON seq.seq <= TIMESTAMPDIFF(HOUR,mintime,maxtime)
What’s going on here? Three things.
First: DATE(smsDate) + INTERVAL HOUR(smsDate) HOUR converts any arbitrary timestamp into a timestamp at the top of the hour. This lets us fetch the first and last hourly timestamp in your table.
Second, we have a subquery which determines the first and last hour (min and max smsDate) we care about reporting.
Third, we have a table called seq_0_to_999999. It contains a sequence of cardinal numbers: the integers starting at zero. More about this in a moment.
Joining these two tables together, then using the expression
mintime + INTERVAL seq.seq HOUR AS msghour
we can fetch a table that has a continuous run of hourly timestamps.
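To see the top-of-the-hour truncation on its own, here is a small sketch that uses a made-up literal timestamp in place of smsDate:

```sql
-- Truncate an arbitrary timestamp to the top of its hour.
-- The literal below is an illustrative value, not data from the question.
SELECT DATE('2014-06-19 17:42:13')
       + INTERVAL HOUR('2014-06-19 17:42:13') HOUR AS top_of_hour;
```

The minutes and seconds are dropped, so every message received between 17:00:00 and 17:59:59 on that date maps to the same 17:00:00 timestamp.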
Then we join that to your query. Here's where it starts to look more complex than it is. We're doing this, in outline:
-- Start from the hour sequence and LEFT JOIN the messages,
-- so hours without any messages are kept with a count of 0.
SELECT sq.msghour, COUNT(m.ID)
FROM ( /* the query above with the sequence of timestamps */ ) AS sq
LEFT JOIN SMS_MESSAGES AS m
ON DATE(m.smsDate) + INTERVAL HOUR(m.smsDate) HOUR = sq.msghour
GROUP BY sq.msghour
ORDER BY sq.msghour
Putting it all together, it looks like this:
SELECT sq.msghour, COUNT(m.ID)
FROM (
    SELECT mintime + INTERVAL seq.seq HOUR AS msghour
    FROM (
        SELECT MIN(DATE(smsDate) + INTERVAL HOUR(smsDate) HOUR) AS mintime,
               MAX(DATE(smsDate) + INTERVAL HOUR(smsDate) HOUR) AS maxtime
        FROM SMS_MESSAGES
    ) AS minmax
    JOIN seq_0_to_999999 AS seq ON seq.seq <= TIMESTAMPDIFF(HOUR,mintime,maxtime)
) AS sq
LEFT JOIN SMS_MESSAGES AS m
    ON DATE(m.smsDate) + INTERVAL HOUR(m.smsDate) HOUR = sq.msghour
GROUP BY sq.msghour
ORDER BY sq.msghour
That will give you a result set with timestamp and count for every hour in the range.
Finally, what about this seq_0_to_999999 sequence table? Where do we get those integers starting with zero? The answer is this: we have to arrange to do that; those numbers aren't built in to MySQL (MariaDB v10+ does have them).
The simple way is to create a table with a whole lot of integers in it. That will take up storage, though, so we'll skip that.
Another way is to create a short table with the integers from 0-9 in it, like so:
DROP TABLE IF EXISTS seq_0_to_9;
CREATE TABLE seq_0_to_9 AS
SELECT 0 AS seq UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4
UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9;
Then we can create a view that joins that table with itself to generate 1000 combinations like this:
DROP VIEW IF EXISTS seq_0_to_999;
CREATE VIEW seq_0_to_999 AS (
SELECT (a.seq + 10 * (b.seq + 10 * c.seq)) AS seq
FROM seq_0_to_9 a
JOIN seq_0_to_9 b
JOIN seq_0_to_9 c
);
Finally, we can join that view of 1000 numbers with itself to create a view that will generate a million combinations like this:
DROP VIEW IF EXISTS seq_0_to_999999;
CREATE VIEW seq_0_to_999999 AS (
SELECT (a.seq + (1000 * b.seq)) AS seq
FROM seq_0_to_999 a
JOIN seq_0_to_999 b
);
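As a quick sanity check on the views (a sketch that assumes the table and views above were created exactly as shown), you can confirm that the smaller view covers 0 through 999 with no gaps or duplicates:

```sql
-- Expect 1000 rows, 1000 distinct values, a minimum of 0 and a maximum of 999.
SELECT COUNT(*)            AS row_count,
       COUNT(DISTINCT seq) AS distinct_values,
       MIN(seq)            AS first_seq,
       MAX(seq)            AS last_seq
FROM seq_0_to_999;
```

The same check works on seq_0_to_999999, though it has to materialize a million rows, so it will take noticeably longer.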
Here's a writeup providing more information about all this. http://www.plumislandmedia.net/mysql/filling-missing-data-sequences-cardinal-integers/
Source: https://stackoverflow.com/questions/24305085/including-missing-zero-count-rows-when-using-group-by