Should I COUNT(*) or not?

爱一瞬间的悲伤 2021-01-30 15:49

I know it's generally a bad idea to do queries like this:

SELECT * FROM `group_relations`

But when I just want the count, should I go for this

14 Answers
  •  清酒与你
    2021-01-30 16:20

    Seek Alternatives

    As you've seen, when tables grow large, COUNT queries get slow. I think the most important thing is to consider the nature of the problem you're trying to solve. For example, many developers use COUNT queries when generating pagination for large sets of records in order to determine the total number of pages in the result set.

    Knowing that COUNT queries will grow slow, you could consider an alternative way to display pagination controls that simply allows you to side-step the slow query. Google's pagination is an excellent example.
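
    One common way to side-step the count when paginating is to fetch one row more than the page size and use the extra row only to decide whether a "next" link is needed. This is a minimal sketch of that idea, not a description of how Google does it. It assumes Python's built-in sqlite3 with an in-memory table so that it runs as-is; the `fetch_page` helper and the `id` column are hypothetical.

```python
import sqlite3

def fetch_page(conn, page, page_size):
    """Return (rows, has_next) for a 1-based page without running COUNT(*)."""
    offset = (page - 1) * page_size
    # Ask for one extra row; if it comes back, another page exists.
    rows = conn.execute(
        "SELECT id, group_id FROM group_relations ORDER BY id LIMIT ? OFFSET ?",
        (page_size + 1, offset),
    ).fetchall()
    has_next = len(rows) > page_size
    return rows[:page_size], has_next

# Demo data: 25 rows in an in-memory table (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE group_relations (id INTEGER PRIMARY KEY, group_id INTEGER)")
conn.executemany("INSERT INTO group_relations (group_id) VALUES (?)", [(1,)] * 25)

rows, has_next = fetch_page(conn, page=1, page_size=10)
print(len(rows), has_next)  # 10 True
rows, has_next = fetch_page(conn, page=3, page_size=10)
print(len(rows), has_next)  # 5 False
```

    The trade-off is that the UI can show "previous" and "next" but not a total page count. Large OFFSET values can also be slow, so this removes the COUNT cost, not every pagination cost.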

    Denormalize

    If you absolutely must know the number of records matching a specific condition, consider the classic technique of data denormalization. Instead of counting rows at lookup time, increment a counter on record insertion and decrement it on record deletion.

    If you decide to do this, use atomic, transactional operations to keep those denormalized values in sync, so that the row change and the counter update either both succeed or both fail.

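    -- Insert the row and bump the counter atomically: both succeed or neither does.
    START TRANSACTION;
    INSERT INTO `group_relations` (`group_id`) VALUES (1);
    -- Assumes `group_relations_count` holds a single row for the whole table.
    UPDATE `group_relations_count` SET `count` = `count` + 1;
    COMMIT;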
    

    Alternatively, you could use database triggers if your RDBMS supports them.
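
    As a rough illustration of the trigger approach, the sketch below keeps the counter in step with inserts and deletes. It assumes SQLite (through Python's built-in sqlite3) purely so that it runs as-is; trigger syntax differs between databases, so check your RDBMS documentation before adapting it. The trigger names and the single-row layout of `group_relations_count` are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE group_relations (id INTEGER PRIMARY KEY, group_id INTEGER);

-- Assumed layout: one row holding the count for the whole table.
CREATE TABLE group_relations_count ("count" INTEGER NOT NULL);
INSERT INTO group_relations_count ("count") VALUES (0);

-- Hypothetical trigger names; each fires once per affected row.
CREATE TRIGGER group_relations_after_insert AFTER INSERT ON group_relations
BEGIN
    UPDATE group_relations_count SET "count" = "count" + 1;
END;

CREATE TRIGGER group_relations_after_delete AFTER DELETE ON group_relations
BEGIN
    UPDATE group_relations_count SET "count" = "count" - 1;
END;
""")

# Three inserts and one delete leave two rows behind.
conn.executemany("INSERT INTO group_relations (group_id) VALUES (?)", [(1,), (1,), (2,)])
conn.execute("DELETE FROM group_relations WHERE group_id = 2")

# Reading the counter is a single-row lookup instead of a table scan.
(count,) = conn.execute('SELECT "count" FROM group_relations_count').fetchone()
print(count)  # 2
```

    With triggers the application code no longer has to remember to update the counter, at the price of extra work hidden inside every write.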

    Depending on your architecture, it might make sense to use a caching layer like memcached to store, increment and decrement the denormalized value, and simply fall through to the slow COUNT query when the cache key is missing. This can reduce overall write contention if you have very volatile data, though in cases like this you'll want to consider solutions to the dog-pile effect, where many clients miss the same key at once and all run the slow query together.
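
    The fall-through pattern can be sketched as follows. This is a minimal, single-process illustration. `CountCache` is a hypothetical in-memory stand-in for a real cache such as memcached, and the lock is just one simple way to keep several callers from recomputing a missing value at the same time. A real deployment would rely on the cache server's own atomic operations instead.

```python
import sqlite3
import threading

class CountCache:
    """In-process stand-in for a cache such as memcached (hypothetical helper)."""

    def __init__(self, conn):
        self._conn = conn
        self._value = None         # None means "cache key is missing"
        self._lock = threading.Lock()
        self.slow_queries = 0      # how often we fell through to COUNT(*)

    def get(self):
        # Only one caller at a time may rebuild a missing value, which is
        # one simple guard against the dog-pile effect.
        with self._lock:
            if self._value is None:
                self.slow_queries += 1
                row = self._conn.execute(
                    "SELECT COUNT(*) FROM group_relations"
                ).fetchone()
                self._value = row[0]
            return self._value

    def adjust(self, delta):
        # Increment or decrement only if the key is present; a missing
        # key is rebuilt from the table on the next get().
        with self._lock:
            if self._value is not None:
                self._value += delta

    def evict(self):
        # Simulate the cache dropping the key.
        with self._lock:
            self._value = None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE group_relations (id INTEGER PRIMARY KEY, group_id INTEGER)")
conn.executemany("INSERT INTO group_relations (group_id) VALUES (?)", [(1,)] * 3)

cache = CountCache(conn)
print(cache.get())            # 3 (cache miss, runs the slow query)
conn.execute("INSERT INTO group_relations (group_id) VALUES (1)")
cache.adjust(+1)
print(cache.get())            # 4 (served from the cache)
print(cache.slow_queries)     # 1
```

    Holding one lock around every read serializes callers, which is acceptable for a sketch but would be a bottleneck under real load.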
