I want to optimize this query:
select location_id, dept_id,
round(sum(sales),0), sum(qty),
count(distinct tran_id),
now()
from tra
None of the suggestions so far will help much, because...
KEY(tran_date) -- a waste; it is better to use the PK, which starts with tran_date.
PARTITIONing -- No. That is likely to be slower.
Moving tran_date (or otherwise rearranging the PK) -- This will hurt. The filtering (WHERE) is on tran_date; it is usually best to have that first.
Why is COUNT(*) fast? Well, start by looking at the EXPLAIN. It will show that it used KEY(tran_date) instead of scanning the table. Less data to scan, hence faster.
The real issue is that you have millions of rows to scan, and it takes time to touch millions of rows.
How to speed it up? Create and maintain a summary table. Then query that table (with thousands of rows) instead of the original table (millions of rows). Total count is SUM(counts); total sum is SUM(sums); average is SUM(sums)/SUM(counts), etc.
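As a rough sketch of the summary-table idea, something like the following could work. The table name tran_sales_daily, its column names, and all column types are assumptions made for illustration; it also assumes tran_date is a DATE and that the summary is refreshed once per finished day.

```sql
-- Hypothetical summary table: one row per (day, location, department).
-- Column types are guesses; match them to the real tran_sales columns.
CREATE TABLE tran_sales_daily (
    tran_date   DATE          NOT NULL,
    location_id INT           NOT NULL,
    dept_id     INT           NOT NULL,
    sum_sales   DECIMAL(14,2) NOT NULL,
    sum_qty     DECIMAL(14,2) NOT NULL,
    row_count   INT UNSIGNED  NOT NULL,
    PRIMARY KEY (tran_date, location_id, dept_id)
) ENGINE=InnoDB;

-- Refresh one day's worth of rows (for example, from a nightly job).
-- Re-running it for the same day overwrites the earlier subtotals.
INSERT INTO tran_sales_daily
    (tran_date, location_id, dept_id, sum_sales, sum_qty, row_count)
SELECT tran_date, location_id, dept_id,
       SUM(sales), SUM(qty), COUNT(*)
FROM tran_sales
WHERE tran_date = '2016-12-24'
GROUP BY tran_date, location_id, dept_id
ON DUPLICATE KEY UPDATE
    sum_sales = VALUES(sum_sales),
    sum_qty   = VALUES(sum_qty),
    row_count = VALUES(row_count);

-- Report from the summary table: sum the subtotals, sum the counts.
SELECT location_id, dept_id,
       ROUND(SUM(sum_sales), 0) AS total_sales,
       SUM(sum_qty)             AS total_qty,
       SUM(row_count)           AS total_rows,
       NOW()
FROM tran_sales_daily
WHERE tran_date <= '2016-12-24'
GROUP BY location_id, dept_id;
```

Note that a distinct count is not additive the way sums and plain counts are: per-day distinct counts of tran_id add up to the overall COUNT(DISTINCT tran_id) only if a given tran_id never appears on more than one day.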
For this query:
select location_id, dept_id,
round(sum(sales), 0), sum(qty), count(distinct tran_id),
now()
from tran_sales
where tran_date <= '2016-12-24'
group by location_id, dept_id;
There is not much you can do. One attempt would be a covering index: (tran_date, location_id, dept_id, sales, qty), but I don't think that will help much.
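If you want to try the covering index anyway, a minimal sketch might look like this. The index name idx_cover is made up, and the sketch assumes an InnoDB table whose PK includes tran_id (InnoDB secondary indexes implicitly carry the PK columns); if tran_id is not in the PK, it would have to be appended to the index for the index to cover the query.

```sql
-- Hypothetical index name; the column list is the one suggested above.
ALTER TABLE tran_sales
    ADD INDEX idx_cover (tran_date, location_id, dept_id, sales, qty);

-- Check whether the optimizer actually uses it for the report query.
EXPLAIN
SELECT location_id, dept_id,
       ROUND(SUM(sales), 0), SUM(qty), COUNT(DISTINCT tran_id),
       NOW()
FROM tran_sales
WHERE tran_date <= '2016-12-24'
GROUP BY location_id, dept_id;
```

If the Extra column of the EXPLAIN output shows "Using index", the query is being answered from the index alone. It still has to touch every index entry up to the cutoff date, which is why the gain is likely to be modest.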