Should I COUNT(*) or not?

爱一瞬间的悲伤 2021-01-30 15:49

I know it's generally a bad idea to do queries like this:

SELECT * FROM `group_relations`

But when I just want the count, should I go for this

14 answers
  •  不思量自难忘°
    2021-01-30 16:34

    I was curious about this myself. It's all fine to read documentation and theoretical answers, but I like to balance those with empirical evidence.

    I have a MySQL table (InnoDB) with 5,607,997 records in it. The table is in my own private sandbox, so I know the contents are static and nobody else is using the server. I think this effectively removes all outside effects on performance. The table has an auto_increment primary key field (Id) that I know will never be null, which I use for my WHERE clause test (WHERE Id IS NOT NULL).
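
    The never-null guarantee matters because COUNT(column) only counts rows where that column is not NULL, while COUNT(*) and COUNT(1) count every row. A minimal sketch of the difference, assuming a small hypothetical table `count_demo` with a nullable `note` column (neither name comes from the real table):

```sql
-- Hypothetical demo table; the names are illustrative only.
CREATE TABLE count_demo (
    Id   INT AUTO_INCREMENT PRIMARY KEY,
    note VARCHAR(20) NULL
);

-- Three rows, one of which has a NULL note.
INSERT INTO count_demo (note) VALUES ('a'), (NULL), ('c');

SELECT COUNT(*)    AS all_rows,        -- counts every row
       COUNT(Id)   AS non_null_ids,    -- Id is never NULL, so this matches COUNT(*)
       COUNT(1)    AS constant_rows,   -- a constant is never NULL, so this matches too
       COUNT(note) AS non_null_notes   -- skips the row where note IS NULL
FROM count_demo;
```

    On a non-null primary key the first three expressions return the same number, so any difference between them is purely about speed.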

    The only other possible glitch I see in running tests is the cache. The first run of a query will always be slower than subsequent runs that use the same indexes. I'll refer to that first run below as the cache seeding call. Just to mix it up a little, I also ran each query with a WHERE clause that I know will always evaluate to true regardless of any data (TRUE = TRUE).
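
    For reference, the nine combinations described above would look roughly like this. The table name `my_table` is a placeholder (an assumption for illustration, not the real table name); each statement would be run several times, with the first run treated as the cache seeding call.

```sql
-- `my_table` is a placeholder for the real table name.

-- 1. No WHERE clause
SELECT COUNT(*)  FROM my_table;
SELECT COUNT(Id) FROM my_table;
SELECT COUNT(1)  FROM my_table;

-- 2. WHERE clause on the never-null primary key
SELECT COUNT(*)  FROM my_table WHERE Id IS NOT NULL;
SELECT COUNT(Id) FROM my_table WHERE Id IS NOT NULL;
SELECT COUNT(1)  FROM my_table WHERE Id IS NOT NULL;

-- 3. WHERE clause that is always true regardless of the data
SELECT COUNT(*)  FROM my_table WHERE TRUE = TRUE;
SELECT COUNT(Id) FROM my_table WHERE TRUE = TRUE;
SELECT COUNT(1)  FROM my_table WHERE TRUE = TRUE;
```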

    That said, here are my results:

    QueryType  | w/o WHERE          | WHERE Id IS NOT NULL | WHERE TRUE = TRUE
    -----------+--------------------+----------------------+--------------------
    COUNT(*)   | 9 min 30.13 sec ++ | 6 min 16.68 sec ++   | 2 min 21.80 sec ++
               | 6 min 13.34 sec    | 1 min 36.02 sec      | 2 min 0.11 sec
               | 6 min 10.06 sec    | 1 min 33.47 sec      | 1 min 50.54 sec
    -----------+--------------------+----------------------+--------------------
    COUNT(Id)  | 5 min 59.87 sec    | 1 min 34.47 sec      | 2 min 3.96 sec
               | 5 min 44.95 sec    | 1 min 13.09 sec      | 2 min 6.48 sec
    -----------+--------------------+----------------------+--------------------
    COUNT(1)   | 6 min 49.64 sec    | 2 min 0.80 sec       | 2 min 11.64 sec
               | 6 min 31.64 sec    | 1 min 41.19 sec      | 1 min 43.51 sec

    ++ This is the cache seeding call. It is expected to be slower than the rest.

    I'd say the results speak for themselves. COUNT(Id) usually edges out the others. Adding a WHERE clause dramatically decreases the access time, even if it's a clause you know will always evaluate to true. The sweet spot appears to be COUNT(Id) ... WHERE Id IS NOT NULL.
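
    The timings alone don't show why a WHERE clause changes things. One way to investigate, sketched here with a placeholder table name `my_table` (an assumption, not the real name), would be to compare the execution plans MySQL reports for the two forms. The `key` column of the EXPLAIN output shows which index, if any, the optimizer chose.

```sql
-- `my_table` is a placeholder for the real table name.
-- Compare the plan without a WHERE clause...
EXPLAIN SELECT COUNT(Id) FROM my_table;

-- ...against the plan with the always-satisfied WHERE clause.
EXPLAIN SELECT COUNT(Id) FROM my_table WHERE Id IS NOT NULL;
```

    If the two plans differ in the index used or in the estimated rows, that would be a starting point for explaining the gap; if they are identical, the difference may lie elsewhere, such as in caching.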

    I would love to see other people's results, perhaps with smaller tables or with WHERE clauses against fields other than the one you're counting. I'm sure there are other variations I haven't taken into account.
