I have this MySQL query:
SELECT DAYOFYEAR(`date`) AS d, COUNT(*)
FROM `orders`
WHERE `hasPaid` > 0
GROUP BY d
ORDER BY d
Which re
Perhaps a simpler solution for you, and one that saves the database from having to run a ton of queries. This executes just one query, then does a little math on the results in a single pass.
SET @runtot:=0;
SELECT
q1.d,
q1.c,
(@runtot := @runtot + q1.c) AS rt
FROM
(SELECT
DAYOFYEAR(`date`) AS d,
COUNT(*) AS c
FROM `orders`
WHERE `hasPaid` > 0
GROUP BY d
ORDER BY d) AS q1
This will give you an additional rt (running total) column. Don't miss the SET statement at the top, which initializes the running total variable; without it you will just get a column of NULL values.
Starting with MySQL 8, you can use window functions for this kind of query:
SELECT dayofyear(`date`) AS d, count(*), sum(count(*)) OVER (ORDER BY dayofyear(`date`))
FROM `orders`
WHERE `hasPaid` > 0
GROUP BY d
ORDER BY d
In the above query, the aggregate function count(*) is nested inside of the window function sum(..) OVER (..), which is possible because of the logical order of operations in SQL. If that's too confusing, you can easily resort to using a derived table or a WITH clause to better structure your query:
WITH daily (d, c) AS (
SELECT dayofyear(`date`) AS d, count(*)
FROM `orders`
WHERE `hasPaid` > 0
GROUP BY d
)
SELECT d, c, sum(c) OVER (ORDER BY d)
FROM daily
ORDER BY d
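The WITH-clause form can be tried out without a MySQL server. The following is only a sketch, with several assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL 8 (window functions need SQLite 3.25 or later), it replaces DAYOFYEAR(`date`) with SQLite's strftime('%j', ...), and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL 8 (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE orders (id INTEGER PRIMARY KEY, "date" TEXT, hasPaid INTEGER)'
)

# Hypothetical sample data: two paid orders on day 1, one paid and one
# unpaid order on day 2, three paid orders on day 4.
conn.executemany(
    'INSERT INTO orders ("date", hasPaid) VALUES (?, ?)',
    [
        ("2023-01-01", 1),
        ("2023-01-01", 1),
        ("2023-01-02", 0),
        ("2023-01-02", 1),
        ("2023-01-04", 1),
        ("2023-01-04", 1),
        ("2023-01-04", 1),
    ],
)

# strftime('%j', ...) yields the day of the year as text ('001'..'366'),
# so it is cast to an integer to mimic MySQL's DAYOFYEAR().
query = """
WITH daily (d, c) AS (
    SELECT CAST(strftime('%j', "date") AS INTEGER) AS d, COUNT(*)
    FROM orders
    WHERE hasPaid > 0
    GROUP BY d
)
SELECT d, c, SUM(c) OVER (ORDER BY d) AS running_total
FROM daily
ORDER BY d
"""

rows = conn.execute(query).fetchall()
for d, c, running_total in rows:
    print(d, c, running_total)
```

Each output row holds the day of the year, the number of paid orders on that day, and the cumulative count up to and including that day.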
This is one of the only places where cursors are faster than set-based queries. If performance is critical, I would either
Unless you have no other option but to do this in SQL, I'd sum the results in the programming language that is making the query. A nested query like this will become very slow as the table grows.
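As a minimal sketch of summing on the application side, assuming the client is written in Python and the plain GROUP BY query has already been fetched as (day, count) tuples ordered by day. The function name with_running_total and the sample data are hypothetical.

```python
from itertools import accumulate


def with_running_total(daily_counts):
    """Append a running total to (day, count) rows that are sorted by day."""
    days = [day for day, _ in daily_counts]
    counts = [count for _, count in daily_counts]
    # accumulate() yields the cumulative sums in a single pass.
    return list(zip(days, counts, accumulate(counts)))


# Hypothetical result of the GROUP BY query: (day of year, paid orders).
daily_counts = [(1, 2), (2, 1), (4, 3)]
result = with_running_total(daily_counts)
print(result)
```

This keeps the database work to one grouped query and makes the cumulative sum a linear pass over at most 366 rows.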
It is possible to calculate a running total using temporary tables in MySQL. The following queries should work:
-- Daily counts; the day of the year d identifies each row
CREATE TEMPORARY TABLE orders_temp1 (SELECT DAYOFYEAR(`date`) AS d, COUNT(*) AS total FROM `orders` WHERE `hasPaid` > 0 GROUP BY d ORDER BY d);
CREATE TEMPORARY TABLE orders_temp2 (SELECT * FROM orders_temp1);
-- For each day, sum the totals of that day and all earlier days
SELECT d, total, (SELECT SUM(t2.total) FROM orders_temp2 t2 WHERE t2.d <= t1.d) AS running_total FROM orders_temp1 t1 ORDER BY d;
Temporary tables are used to organize the query. Note that a temporary table only exists for the duration of the connection to the MySQL server. Two copies are needed because MySQL does not allow the same temporary table to be referenced more than once in a single query.
The final query uses a subquery that returns the sum over all rows in the temporary table up to and including the current row. That sum is reported as the running total for the current row.
SELECT
DAYOFYEAR(O.`date`) AS d,
COUNT(*),
(select count(*) from `orders`
where DAYOFYEAR(`date`) <= d and `hasPaid` > 0)
FROM
`orders` as O
WHERE
O.`hasPaid` > 0
GROUP BY d
ORDER BY d
This will require some syntactical tuning (I don't have MySQL to test it), but it shows you the idea. The subquery has to go back and add up, from scratch, everything that the outer query has already included, and it has to do that for every row.
Take a look at this question for how to use joins to accomplish the same.
To address concerns about performance degradation with growing data: since there are at most 366 days in a year, and I assume that you are not running this query against multiple years, the subquery will be evaluated at most 366 times. With proper indexes on the date and the hasPaid flag, you'll be OK.