Question
There is a table called transactions with ~6 million rows. The query below computes the current user's balance. Here is the log after I enabled slow_query_log = 'ON':
# Time: 170406 9:51:48
# User@Host: root[root] @ [xx.xx.xx.xx]
# Thread_id: 13 Schema: main_db QC_hit: No
# Query_time: 38.924823 Lock_time: 0.000034 Rows_sent: 1 Rows_examined: 773550
# Rows_affected: 0
SET timestamp=1491456108;
SELECT SUM(`Transaction`.`amount`) as total
FROM `main_db`.`transactions` AS `Transaction`
WHERE `Transaction`.`user_id` = 1008
AND `Transaction`.`confirmed` = 1
LIMIT 1;
As you can see, it took ~38 seconds!
Here is the transactions table EXPLAIN:
This query sometimes runs fast (about ~1 second) and sometimes really slow! Any help would be greatly appreciated.
P.S.: It's InnoDB, and the transactions table has frequent INSERT and SELECT operations. I tried running the query with SQL_NO_CACHE, but it is still sometimes fast, sometimes slow.
transactions table schema:
CREATE TABLE `transactions` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`user_id` int(10) unsigned NOT NULL,
`ref_id` varchar(40) COLLATE utf8_persian_ci NOT NULL,
`payment_id` tinyint(3) unsigned NOT NULL,
`amount` decimal(10,1) NOT NULL,
`created` datetime NOT NULL,
`private_note` varchar(6000) COLLATE utf8_persian_ci NOT NULL,
`public_note` varchar(200) COLLATE utf8_persian_ci NOT NULL,
`confirmed` tinyint(3) NOT NULL,
PRIMARY KEY (`id`),
KEY `user_id` (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=13133663 DEFAULT CHARSET=utf8 COLLATE=utf8_persian_ci;
MySQL is running on a VPS with 12GB RAM and 9 Logical CPU cores.
Here is part of my my.cnf:
# * InnoDB
#
# InnoDB is enabled by default with a 10MB datafile in /var/lib/mysql/.
# Read the manual for more InnoDB related options. There are many!
default_storage_engine = InnoDB
# you can't just change log file size, requires special procedure
innodb_buffer_pool_size = 9G
innodb_log_buffer_size = 8M
innodb_file_per_table = 1
innodb_open_files = 400
innodb_io_capacity = 400
innodb_flush_method = O_DIRECT
innodb_thread_concurrency = 0
innodb_read_io_threads = 64
innodb_write_io_threads = 64
# Instead of skip-networking the default is now to listen only on
# localhost which is more compatible and is not less secure.
#bind-address = 127.0.0.1
#
# * Fine Tuning
#
max_connections = 500
connect_timeout = 5
wait_timeout = 600
max_allowed_packet = 16M
thread_cache_size = 128
sort_buffer_size = 4M
bulk_insert_buffer_size = 16M
tmp_table_size = 32M
max_heap_table_size = 32M
Answer 1:
(Yes, I am adding another answer. Justification: It addresses the underlying problem a different way.)
The underlying problem seems to be that there is an ever-growing "transactions" table from which various statistics, such as SUM(amount), are derived. The performance of this will only get worse and worse as the table grows.
The basis for this Answer is to look at the data in two ways: "History" and "Current". Transactions is the History; a new table would hold the Current totals for each User. I see multiple ways to do that. Each involves some form of subtotal(s) so as to avoid adding up 773K rows to get the answer.
- The traditional banking way... Each night, tally up the day's Transactions and add them to Current.
- The Materialized View way... Each time a row is added to Transactions, increment Current.
- Hybrid: Keep daily subtotals in a "Summary Table" (sketched below). Sum those subtotals to get the SUM through last night.
More discussion in my blog on Summary Tables.
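For concreteness, here is a minimal sketch of the hybrid approach. The table name transactions_daily, its columns, and the nightly job are my assumptions, not something from the original post:

CREATE TABLE transactions_daily (
  user_id int unsigned NOT NULL,
  day date NOT NULL,
  amount_sum decimal(14,1) NOT NULL,  -- wider than amount, to hold large sums
  PRIMARY KEY (user_id, day)
) ENGINE=InnoDB;

-- Hypothetical nightly job: fold yesterday's confirmed rows into the summary.
INSERT INTO transactions_daily (user_id, day, amount_sum)
SELECT user_id, DATE(created), SUM(amount)
FROM transactions
WHERE confirmed = 1
  AND created >= CURDATE() - INTERVAL 1 DAY
  AND created <  CURDATE()
GROUP BY user_id, DATE(created)
ON DUPLICATE KEY UPDATE amount_sum = VALUES(amount_sum);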
Note that the up-to-the-second balance for the banking or hybrid way is a little tricky (a sketch follows the list):
- Get last night's amount.
- Add any Transactions that occurred during the day.
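A sketch of that up-to-the-second balance, assuming the hypothetical transactions_daily table above:

SELECT
    (SELECT COALESCE(SUM(amount_sum), 0)
       FROM transactions_daily
      WHERE user_id = 1008)
  + (SELECT COALESCE(SUM(amount), 0)
       FROM transactions
      WHERE user_id = 1008
        AND confirmed = 1
        AND created >= CURDATE()) AS total;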
Any of the approaches will be a lot faster than scanning all 773K rows for the user, but it will be more complex code.
Answer 2:
(Sorry to step on all the good comments. I hope I have added enough to justify claiming an "Answer".)
Are there 6M rows in the table, but 773K rows with that user_id?
A 9GB buffer_pool? The table is about 4GB of data? So it fits in the buffer_pool if there is not much else to bump it out. (Run SHOW TABLE STATUS and check "Data_length".)
The existing INDEX(user_id) might be 20MB, easily cacheable.
If the user_ids are sufficiently scattered around the table, the query probably needs to fetch virtually every 16KB block of the data. So, the original query with the original index will go something like this:
- Scan the index for the given user_id. This will be a minor part of the total effort.
- For each entry in the index, look up (randomly) the record. This happens 773K times. With a "cold" cache, this could easily take 38 seconds or more. Were the "slow" times soon after a restart? Or after something else blew out the cache? With a "warm" cache, it is all CPU (no I/O), hence 1 second is reasonable.
If you change to the optimal "covering" index, INDEX(user_id, confirmed, amount), things change some (see the sketch after this list)...
- "Covering" means that the entire query would be performed in the index. (This composite index might be more like 40MB, but this is still very small compared to the data.)
- In a "cold" cache, only 40MB would need fetching -- expect much better than 38s.
- In a "warm" cache (this time only the 40MB), it might run in half a second.
If there were also a date range in the WHERE clause, I would push for building and maintaining a "Summary table". This might speed up similar queries by a factor of 10.
If you do add a composite index starting with user_id, you should (not must) DROP the index on just user_id as redundant. (If you don't drop it, it will mostly waste disk space.)
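For example (assuming the composite index above has already been added):

ALTER TABLE transactions DROP INDEX user_id;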
As for doing it in production...
- If you have a new enough version of MySQL, use ALTER TABLE ... ALGORITHM=INPLACE ... (example below), which is feasible for adding/dropping indexes with minimal impact.
- For older versions, see pt-online-schema-change. It requires that there be no other Triggers, and it takes only a very short downtime. Its Trigger takes care of the 200 writes/minute 'transparently'.
ALGORITHM=INPLACE was added in MySQL 5.6 and MariaDB 10.0.
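A sketch of the online variant, using the index from earlier (with LOCK=NONE, MySQL fails the statement with an error instead of blocking if an in-place build is not possible):

ALTER TABLE transactions
  ADD INDEX user_confirmed_amount (user_id, confirmed, amount),
  ALGORITHM=INPLACE, LOCK=NONE;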
Answer 3:
One thing you might try is adding a composite index to see if it speeds up the SELECT part of the query:
ALTER TABLE `transactions` ADD INDEX `user_confirmed` (`user_id`, `confirmed`);
Also, as @wajeeh pointed out in a comment, the LIMIT clause is unnecessary here since you're already calling an aggregate function.
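That is, the query from the question can simply be:

SELECT SUM(`Transaction`.`amount`) AS total
FROM `main_db`.`transactions` AS `Transaction`
WHERE `Transaction`.`user_id` = 1008
  AND `Transaction`.`confirmed` = 1;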
It would be helpful if you could post the table schema in your question as well.
Answer 4:
Take a look at this answer: Any way to select without causing locking in MySQL?
And this article: Consistent Nonlocking Reads.
In your case, as I think and as @billynoah mentioned, the table has to handle many write operations since it is a log-like table, so this may help you.
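One pattern from the linked answer is to run the read at a lower isolation level so it never waits on concurrent writers; treat this as a sketch, since a dirty read may include uncommitted rows:

SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT SUM(amount) AS total
FROM transactions
WHERE user_id = 1008
  AND confirmed = 1;
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;  -- restore the default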
Source: https://stackoverflow.com/questions/43246532/mysql-sum-query-is-extremely-slow