How to quickly select DISTINCT dates from a Date/Time field, SQL Server

I am wondering if there is a good-performing query to select distinct dates (ignoring times) from a table with a datetime field in SQL Server.

My problem isn't getting

10 answers

失恋的感觉

2021-01-31 22:54
I'm not sure why your existing query would take over 5s for 40,000 rows.

I just tried the following query against a table with 100,000 rows and it returned in less than 0.1s.
```
SELECT DISTINCT DATEADD(day, 0, DATEDIFF(day, 0, your_date_column))
FROM your_table
```
(Note that this query probably won't be able to take advantage of any indexes on the date column, but it should be reasonably quick, assuming that you're not executing it dozens of times per second.)
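On SQL Server 2008 and later, which introduced the `date` type, the same truncation can be written as a plain cast. A minimal sketch, reusing the placeholder names `your_table` and `your_date_column` from the query above; compare execution plans before assuming it is faster on your data:

```sql
-- Assumes SQL Server 2008 or later (the date type does not exist in 2005).
-- your_table and your_date_column are placeholders, as in the query above.
SELECT DISTINCT CAST(your_date_column AS date) AS date_only
FROM your_table;
```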

轻奢々

2021-01-31 22:55
I've used the following:
```
CAST(FLOOR(CAST(@date as FLOAT)) as DateTime);
```
This removes the time from the date by converting it to a float and truncating the fractional part, which holds the time of day.

Looks a little clunky but works well on a large dataset (~100,000 rows) I use repeatedly throughout the day.
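As a sketch of how this expression might be applied to the original problem, the snippet below first shows the truncation on a single value and then uses it with `DISTINCT`. The names `your_table` and `your_date_column` are placeholders, not names from the question.

```sql
-- Single value: the whole part of the float is the day, the fraction is the time of day.
DECLARE @date datetime;
SET @date = '2021-01-31T22:55:00';
SELECT CAST(FLOOR(CAST(@date AS float)) AS datetime) AS date_only;
-- date_only is 2021-01-31 00:00:00.000

-- Same expression applied per row (placeholder table and column names).
SELECT DISTINCT CAST(FLOOR(CAST(your_date_column AS float)) AS datetime) AS date_only
FROM your_table;
```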

说谎

2021-01-31 22:58
Update:

The solution below was tested for efficiency on a 2M-row table and takes only 40 ms.

Plain DISTINCT on an indexed computed column took 9 seconds.

See this entry in my blog for performance details:
- SQL Server: efficient DISTINCT on dates
Unfortunately, SQL Server's optimizer can do neither Oracle's SKIP SCAN nor MySQL's INDEX FOR GROUP-BY.

It's always Stream Aggregate that takes long.

You can build a list of possible dates using a recursive CTE and join it with your table:
```
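-- Recursive CTE: one row per calendar day between the earliest and latest value
WITH    rows AS (
        -- Anchor: earliest value truncated to midnight, plus the latest value
        SELECT  CAST(CAST(CAST(MIN(date) AS FLOAT) AS INTEGER) AS DATETIME) AS mindate, MAX(date) AS maxdate
        FROM    mytable
        UNION ALL
        -- Recursive step: advance one day while mindate is still before maxdate
        SELECT  mindate + 1, maxdate
        FROM    rows
        WHERE   mindate < maxdate
        )
SELECT  mindate
FROM    rows
-- Keep only the days that have at least one row in the table
WHERE   EXISTS
        (
        SELECT  NULL
        FROM    mytable
        WHERE   date >= mindate
                AND date < mindate + 1
        )
-- Lift the default limit of 100 recursion levels
OPTION  (MAXRECURSION 0)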
```
This will be more efficient than Stream Aggregate.
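
The `EXISTS` probe runs once per candidate day, so this approach presumably relies on an index on the datetime column to turn each probe into a seek. A hedged sketch, assuming no such index exists yet; the index name is hypothetical:

```sql
-- Hypothetical index name; mytable and date are the names used in the query above.
CREATE INDEX IX_mytable_date ON mytable ([date]);
```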

悲哀的现实

2021-01-31 23:01

If you want to avoid extracting or reformatting the date at query time, which is presumably the main cause of the delay (it forces a full table scan), you have no alternative but to store the date-only part of the datetime, which unfortunately requires a change to the database structure.

If you're using SQL Server 2005 or later, then a persisted computed field is the way to go:

Unless otherwise specified, computed columns are virtual columns that are
not physically stored in the table. Their values are recalculated every 
time they are referenced in a query. The Database Engine uses the PERSISTED 
keyword in the CREATE TABLE and ALTER TABLE statements to physically store 
computed columns in the table. Their values are updated when any columns 
that are part of their calculation change. By marking a computed column as 
PERSISTED, you can create an index on a computed column that is deterministic
but not precise.
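
A minimal sketch of what such a column might look like, assuming a table `mytable` with a datetime column named `date`. The column name `date_only` and the index name are hypothetical, and the truncation expression is one common choice rather than the only one:

```sql
-- Add a persisted computed column holding the datetime truncated to midnight.
-- date_only is a hypothetical column name.
ALTER TABLE mytable
ADD date_only AS DATEADD(day, DATEDIFF(day, 0, [date]), 0) PERSISTED;
GO  -- batch separator for SSMS/sqlcmd, so later statements see the new column

-- Index the computed column (hypothetical index name).
CREATE INDEX IX_mytable_date_only ON mytable (date_only);
GO

-- The distinct dates can then be read from the stored column.
SELECT DISTINCT date_only
FROM mytable;
```

Because the value is stored and indexed, the query no longer has to compute the truncation for every row, although the `DISTINCT` itself still has to aggregate the index entries.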

予麋鹿

2021-01-31 23:03

What is your predicate on that other filtered column? Have you checked whether you get an improvement from an index on that other filtered column, followed by the datetime field?

I'm largely guessing here, but 5 seconds to filter a set of perhaps 100,000 rows down to 40,000 and then do a sort (which is presumably what goes on) doesn't seem like an unreasonable time to me. Why do you say it's too slow? Because it doesn't match expectations?
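
The composite index suggested above could be sketched as follows. The question does not name the filtered column, so `filter_column`, `mytable`, `date`, and the index name are all hypothetical placeholders:

```sql
-- Hypothetical names throughout: the filtered column comes first so the
-- predicate can seek on it, followed by the datetime column.
CREATE INDEX IX_mytable_filter_date
ON mytable (filter_column, [date]);
```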

青春惊慌失措

2021-01-31 23:03

Just convert the date: dateadd(dd,0, datediff(dd,0,[Some_Column]))
