I am wondering if there is a good-performing query to select distinct dates (ignoring times) from a table with a datetime field in SQL Server.
My problem isn't getting
Every option that involves CAST, TRUNCATE, or DATEPART manipulation of the datetime field has the same problem: the query has to scan the entire result set (the 40k) in order to find the distinct dates. Performance may vary marginally between the various implementations.
What you really need is an index that can produce the response in a blink. You can either add a persisted computed column and index it (requires table structure changes) or create an indexed view (requires Enterprise Edition for the query optimizer to consider the index out of the box).
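For illustration, this is the kind of query the paragraph above has in mind. It is a minimal sketch that assumes the same placeholder names used in the rest of this answer (`dbo.foo`, `[datetimecolumn]`). The expression has to be evaluated for every row before the duplicates can be removed:

```sql
-- The date-only value is computed per row, so the whole table
-- (or an index covering datetimecolumn) must be read first.
select distinct convert(char(8), [datetimecolumn], 112) as date_only
from dbo.foo;
```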
Persisted computed column:
alter table foo add date_only as convert(char(8), [datetimecolumn], 112) persisted;
create index idx_foo_date_only on foo(date_only);
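Once the column and index exist, the distinct dates can be selected from the computed column directly. A minimal sketch, assuming the `date_only` column and `idx_foo_date_only` index created above:

```sql
-- Can be answered from the narrow idx_foo_date_only index
-- instead of the wider base table.
select distinct date_only
from dbo.foo
order by date_only;
```

This still reads one index entry per row, but the index is much narrower than the table and the values are already computed.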
Indexed view:
create view v_foo_with_date_only
with schemabinding as
select id
, convert(char(8), [datetimecolumn], 112) as date_only
from dbo.foo;
go
-- "go" ends the batch: create view must be the only statement in its batch
create unique clustered index idx_v_foo on v_foo_with_date_only(date_only, id);
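On editions where the optimizer does not match indexed views automatically, the view can be queried by name with the NOEXPAND table hint so that its index is used. A sketch, assuming the view was created in the `dbo` schema as above:

```sql
-- NOEXPAND tells the optimizer to use the view's own index
-- rather than expanding the view into its base-table query.
select distinct date_only
from dbo.v_foo_with_date_only with (noexpand)
order by date_only;
```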
Update
To completely eliminate the scan, one could use a GROUP BY trick in an indexed view, like this:
create view v_foo_with_date_only
with schemabinding as
select
convert(char(8), [datetimecolumn], 112) as date_only
, count_big(*) as [dummy]
from dbo.foo
group by convert(char(8), [datetimecolumn], 112);
go
-- "go" ends the batch: create view must be the only statement in its batch
create unique clustered index idx_v_foo on v_foo_with_date_only(date_only);
The query select distinct date_only from foo
will use this indexed view instead. It is still technically a scan, but on an already 'distinct' index, so only the needed records are scanned. It's a hack, I reckon, and I would not recommend it for live production code.
AFAIK, SQL Server does not have the capability of scanning a true index while skipping repeats, i.e. seek the top, then seek greater than the top, then successively seek greater than the last value found.
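The seek pattern described above can be emulated by hand with a recursive CTE. This is a commonly used workaround rather than a built-in skip scan. The sketch below assumes the persisted `date_only` column and the `idx_foo_date_only` index from the first option exist. Whether each step actually becomes an index seek depends on the plan the optimizer picks, so check the execution plan before relying on it.

```sql
with distinct_dates as (
    -- Anchor: find the smallest value in the index.
    select min(f.date_only) as date_only
    from dbo.foo as f
    union all
    -- Recursive step: find the first value greater than the last one found.
    select n.date_only
    from (
        select f.date_only,
               row_number() over (order by f.date_only) as rn
        from dbo.foo as f
        join distinct_dates as d
            on f.date_only > d.date_only
    ) as n
    where n.rn = 1
)
select date_only
from distinct_dates
option (maxrecursion 0);  -- lift the default limit of 100 recursion levels
```

The recursion stops on its own once no larger value exists. Each iteration does one lookup per distinct date, so this only pays off when there are few distinct dates relative to the number of rows.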