SQL count(*) performance

I have a SQL table BookChapters with over 20 million rows. It has a clustered primary key (bookChapterID) and doesn't have any other keys or indexes. It takes milliseconds

5 Answers

太阳男子

2020-11-28 06:59
Mikael Eriksson has a good explanation below of why the first query is fast:

SQL Server optimizes it into the same kind of existence check as if exists(select * from BookChapters). So it goes looking for the presence of one row instead of counting all the rows in the table.

For the other two queries, SQL Server would use the following rule. To perform a query like SELECT COUNT(*), SQL Server will use the narrowest non-clustered index to count the rows. If the table does not have any non-clustered index, it will have to scan the table.
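
A minimal sketch of that rule, assuming it is acceptable to add an index to the table from the question: a non-clustered index on the narrow key column gives the optimizer something smaller than the clustered index to scan. The index name IX_BookChapters_bookChapterID is made up for illustration, and building the index on a table of this size costs time and disk space.

```sql
-- Hypothetical index name; bookChapterID is the key column named in the question.
CREATE NONCLUSTERED INDEX IX_BookChapters_bookChapterID
    ON BookChapters (bookChapterID);

-- With a narrower index available, the optimizer can scan it
-- instead of the clustered index to count the rows.
SELECT COUNT(*) AS TotalRows
FROM BookChapters;
```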

Also, if your table has a clustered index, you can get your count even faster using the following query (borrowed from the article "Get Row Counts Fast!"):
```
--SQL Server 2005/2008
SELECT OBJECT_NAME(i.id) [Table_Name], i.rowcnt [Row_Count]
FROM sys.sysindexes i WITH (NOLOCK)
WHERE i.indid in (0,1)
ORDER BY i.rowcnt desc

--SQL Server 2000
SELECT OBJECT_NAME(i.id) [Table_Name], i.rows [Row_Count]
FROM sysindexes i (NOLOCK)
WHERE i.indid in (0,1)
ORDER BY i.rows desc
```
It uses the sysindexes system table. More information is available in the documentation for SQL Server 2000, SQL Server 2005, SQL Server 2008, and SQL Server 2012.

Another article, "Why is my SELECT COUNT(*) running so slow?", offers another solution. It shows the technique that Microsoft uses to quickly display the number of rows when you right-click on the table and select Properties.
```
select sum (spart.rows)
from sys.partitions spart
where spart.object_id = object_id('YourTable')
and spart.index_id < 2
```
You should find that this returns very quickly no matter how many rows the table has.

If you are still using SQL Server 2000, you can use the sysindexes table to get the number.
```
select max(ROWS)
from sysindexes
where id = object_id('YourTable')
```
This number may be slightly off depending on how often SQL Server updates the sysindexes table, but it's usually correct (or at least close enough).

情歌与酒

2020-11-28 07:02

If you look at the execution plans for your queries, you will see what is going on.

Your first query, if (select count(*) from BookChapters) = 0, is recognized by the query optimizer as the same kind of existence check as if exists(select * from BookChapters). SQL Server knows that the result is decided as soon as at least one row is found, so it goes looking for the presence of one row instead of counting all the rows in the table.

For your other queries it can't be that smart and has to count the number of rows in the table before it can decide whether the expression evaluates to true or false.
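
The difference can be sketched as follows, assuming the same BookChapters table. The threshold of 10 is made up purely to show a comparison that cannot be answered from a single row; it is not one of the queries from the question.

```sql
-- Emptiness test: per the explanation above, this is planned as an existence check.
IF (SELECT COUNT(*) FROM BookChapters) = 0
    PRINT 'BookChapters is empty';

-- The same intent written explicitly; it can stop at the first row found.
IF NOT EXISTS (SELECT * FROM BookChapters)
    PRINT 'BookChapters is empty';

-- Hypothetical threshold of 10: every row has to be counted
-- before the comparison can be evaluated.
IF (SELECT COUNT(*) FROM BookChapters) > 10
    PRINT 'BookChapters has more than 10 rows';
```

Comparing the actual execution plans of the three statements is a straightforward way to check this on your own server.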

走了就别回头了

2020-11-28 07:15
Did you consider the query select count(BookChapterId) from BookChapterTable, where BookChapterId has a non-clustered index? That should make it run much faster.

Depending on how the table is used and how its rows are accessed, querying against a non-clustered index might be the key point. Here are some points taken from MSDN:

Before you create nonclustered indexes, understand how your data will be accessed. Consider using nonclustered indexes for:
- Columns that contain a large number of distinct values, such as a combination of last name and first name (if a clustered index is used for other columns). If there are very few distinct values, such as only 1 and 0, most queries will not use the index because a table scan is usually more efficient.
- Queries that do not return large result sets.
- Columns frequently involved in search conditions of a query (WHERE clause) that return exact matches.
- Decision-support-system applications for which joins and grouping are frequently required. Create multiple nonclustered indexes on columns involved in join and grouping operations, and a clustered index on any foreign key columns.
- Covering all columns from one table in a given query. This eliminates accessing the table or clustered index altogether.

予麋鹿

2020-11-28 07:16
Try this if you only want to know the row count:
```
exec sp_spaceused [TABLE_NAME]
```
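
As a rough usage sketch, assuming the table from the question lives in the dbo schema (the schema is an assumption, not stated in the question), the call could look like this. The rows column of the result holds the count, and the optional @updateusage argument asks SQL Server to correct stale usage figures first, which can be slow on a large table.

```sql
-- Assumes the BookChapters table is in the dbo schema.
EXEC sp_spaceused @objname = N'dbo.BookChapters';

-- Optionally refresh the usage figures before reporting them.
EXEC sp_spaceused @objname = N'dbo.BookChapters', @updateusage = N'TRUE';
```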
温柔的废话

2020-11-28 07:19
Try this if you need to detect whether the table has more than one row:
```
IF (SELECT COUNT(*) FROM (SELECT TOP 2 * FROM BookChapters) AS b) > 1
    PRINT 'BookChapters has more than one row'; -- TOP 2 caps the work at two rows
```