Just wondering if any of you use COUNT(1) over COUNT(*), and whether there is a noticeable difference in performance or if this is just a legacy habit?
COUNT(*) and COUNT(1) are the same in both result and performance.
There is an article showing that COUNT(1) on Oracle is just an alias for COUNT(*), with proof.
I will quote some parts:
There is a part of the database software that is called “The Optimizer”, which is defined in the official documentation as “Built-in database software that determines the most efficient way to execute a SQL statement“.
One of the components of the optimizer is called “the transformer”, whose role is to determine whether it is advantageous to rewrite the original SQL statement into a semantically equivalent SQL statement that could be more efficient.
Would you like to see what the optimizer does when you write a query using COUNT(1)?
As a user with the ALTER SESSION privilege, you can set a tracefile_identifier, enable optimizer tracing and run the COUNT(1) select, like this: SELECT /* test-1 */ COUNT(1) FROM employees;
After that, you need to locate the trace files, which can be done with: SELECT VALUE FROM V$DIAG_INFO WHERE NAME = 'Diag Trace';
Further down in the trace file, you will find:
SELECT COUNT(*) "COUNT(1)" FROM "COURSE"."EMPLOYEES" "EMPLOYEES"
As you can see, it's just an alias for COUNT(*).
Another important comment: COUNT(*) really was faster two decades ago on Oracle, before Oracle 7.3:
Count(1) has been rewritten in count(*) since 7.3 because Oracle like to Auto-tune mythic statements. In earlier Oracle7, oracle had to evaluate (1) for each row, as a function, before DETERMINISTIC and NON-DETERMINISTIC exist.
So two decades ago, count(*) was faster
For other databases, such as SQL Server, this should be researched individually for each one.
I know that this question is specific to SQL Server, but the other questions on SO about the same subject that do not mention a database were closed and marked as duplicates of this one.
Clearly, COUNT(*) and COUNT(1) will always return the same result. Therefore, if one were slower than the other it would effectively be due to an optimiser bug. Since both forms are used very frequently in queries, it would make no sense for a DBMS to allow such a bug to remain unfixed. Hence you will find that the performance of both forms is (probably) identical in all major SQL DBMSs.
In SQL Server, these statements yield the same plans.
Contrary to the popular opinion, in Oracle they do too.
SYS_GUID() in Oracle is quite a computation-intensive function.
In my test database, t_even is a table with 1,000,000 rows.
This query:
SELECT COUNT(SYS_GUID())
FROM t_even
runs for 48 seconds, since the function needs to evaluate each SYS_GUID() returned to make sure it's not a NULL.
However, this query:
SELECT COUNT(*)
FROM (
SELECT SYS_GUID()
FROM t_even
)
runs for only 2 seconds, since it doesn't even try to evaluate SYS_GUID() (despite * being the argument to COUNT(*)).
There is no difference.
Reason:
Books Online says: "COUNT ( { [ [ ALL | DISTINCT ] expression ] | * } )"
"1" is a non-null expression, so it's the same as COUNT(*).
The optimizer recognizes it for what it is: trivial.
The same goes for EXISTS (SELECT * ... versus EXISTS (SELECT 1 ...
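To make the "non-null expression" reasoning concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in engine. The table t and its contents are made up for illustration, and SQLite is an assumption here, not SQL Server or Oracle. It only checks results, it says nothing about plans or timings.

```python
import sqlite3

# In-memory SQLite database; "t" is a hypothetical table, not from the thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER)")
# Three rows whose only column is NULL, to show the rows are still counted.
conn.executemany("INSERT INTO t (a) VALUES (?)", [(None,), (None,), (None,)])

# A constant such as 1 or 'x' is never NULL, so every row is counted.
star, one, text_literal = conn.execute(
    "SELECT COUNT(*), COUNT(1), COUNT('x') FROM t"
).fetchone()

# EXISTS only cares whether a row comes back, not what the select list is.
exists_star, exists_one = conn.execute(
    "SELECT EXISTS (SELECT * FROM t), EXISTS (SELECT 1 FROM t)"
).fetchone()

print(star, one, text_literal, exists_star, exists_one)
conn.close()
```

All three counts come out the same, and both EXISTS forms agree, which is what you would expect from the syntax definition quoted above.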
Example:
SELECT COUNT(1) FROM dbo.tab800krows
SELECT COUNT(1),FKID FROM dbo.tab800krows GROUP BY FKID
SELECT COUNT(*) FROM dbo.tab800krows
SELECT COUNT(*),FKID FROM dbo.tab800krows GROUP BY FKID
Same IO, same plan, the works
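If you want to check at least the "same results" part without an 800k-row table, the four queries can be mirrored on a tiny table. This is only a sketch: the table name tab and its seven rows are hypothetical, and it runs on SQLite through Python's sqlite3 module, so it cannot confirm the IO or plan claims for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical miniature stand-in for dbo.tab800krows.
conn.execute("CREATE TABLE tab (id INTEGER PRIMARY KEY, FKID INTEGER)")
rows = [(1, 10), (2, 10), (3, 20), (4, None), (5, None), (6, 20), (7, 10)]
conn.executemany("INSERT INTO tab (id, FKID) VALUES (?, ?)", rows)


def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


total_one = run("SELECT COUNT(1) FROM tab")
total_star = run("SELECT COUNT(*) FROM tab")
grouped_one = run("SELECT COUNT(1), FKID FROM tab GROUP BY FKID ORDER BY FKID")
grouped_star = run("SELECT COUNT(*), FKID FROM tab GROUP BY FKID ORDER BY FKID")

print(total_one == total_star, grouped_one == grouped_star)
conn.close()
```

The NULL FKID rows form their own group and are counted by both forms, so the grouped results match as well as the totals.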
Edit, Aug 2011
Similar question on DBA.SE.
Edit, Dec 2011
COUNT(*) is mentioned specifically in ANSI-92 (look for "Scalar expressions 125")
Case:
a) If COUNT(*) is specified, then the result is the cardinality of T.
That is, the ANSI standard recognizes it as bleeding obvious what you mean. COUNT(1) has been optimized out by RDBMS vendors because of this superstition. Otherwise it would be evaluated as per ANSI:
b) Otherwise, let TX be the single-column table that is the result of applying the <value expression> to each row of T and eliminating null values. If one or more null values are eliminated, then a completion condition is raised: warning-
In the SQL-92 Standard, COUNT(*) specifically means "the cardinality of the table expression" (could be a base table, VIEW, derived table, CTE, etc.).
I guess the idea was that COUNT(*) is easy to parse. Using any other expression requires the parser to ensure it doesn't reference any columns (COUNT('a') where a is a literal and COUNT(a) where a is a column can yield different results).
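The literal-versus-column point is easy to see on a small example. The sketch below assumes a made-up table demo with a nullable column a and uses SQLite via Python's sqlite3 module purely for convenience; other DBMSs should follow the same NULL-elimination rule, but that is not verified here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: column "a" is NULL in two of the four rows.
conn.execute("CREATE TABLE demo (a TEXT)")
conn.executemany(
    "INSERT INTO demo (a) VALUES (?)", [("x",), (None,), ("y",), (None,)]
)

# 'a' in single quotes is a string literal: never NULL, so all rows count.
literal_count = conn.execute("SELECT COUNT('a') FROM demo").fetchone()[0]
# a without quotes is the column: rows where it is NULL are eliminated.
column_count = conn.execute("SELECT COUNT(a) FROM demo").fetchone()[0]
# COUNT(*) is the cardinality of the table, regardless of column values.
star_count = conn.execute("SELECT COUNT(*) FROM demo").fetchone()[0]

print(literal_count, column_count, star_count)
conn.close()
```

Here the literal form matches COUNT(*) while the column form is smaller, which is exactly the distinction a parser (or a human reader) has to make for anything other than COUNT(*).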
In the same vein, COUNT(*) can be easily picked out by a human coder familiar with the SQL Standards, a useful skill when working with more than one vendor's SQL offering.
Also, in the special case SELECT COUNT(*) FROM MyPersistedTable; the thinking is that the DBMS is likely to hold statistics for the cardinality of the table.
Therefore, because COUNT(1) and COUNT(*) are semantically equivalent, I use COUNT(*).