How to quickly select DISTINCT dates from a Date/Time field, SQL Server

别跟我提以往 2021-01-31 22:42

I am wondering if there is a good-performing query to select distinct dates (ignoring times) from a table with a datetime field in SQL Server.

My problem isn't getting

10 Answers
  • 2021-01-31 22:54

    I'm not sure why your existing query would take over 5s for 40,000 rows.

    I just tried the following query against a table with 100,000 rows and it returned in less than 0.1s.

    SELECT DISTINCT DATEADD(day, 0, DATEDIFF(day, 0, your_date_column))
    FROM your_table
    

    (Note that this query probably won't be able to take advantage of any indexes on the date column, but it should be reasonably quick, assuming that you're not executing it dozens of times per second.)
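
    For reference, a self-contained sketch of the same truncation, using a hypothetical table variable `@your_table` with made-up sample rows (neither is from the question). `DATEDIFF(day, 0, x)` counts whole days since the base date 1900-01-01, and adding that count back to the base date yields midnight of the same day:

    ```sql
    -- Hypothetical sample data; the table variable and values are for illustration only.
    DECLARE @your_table TABLE (your_date_column datetime NOT NULL);

    INSERT INTO @your_table (your_date_column) VALUES ('2021-01-31T22:42:00');
    INSERT INTO @your_table (your_date_column) VALUES ('2021-01-31T08:15:00');
    INSERT INTO @your_table (your_date_column) VALUES ('2021-02-01T00:00:00');

    -- Whole days since day 0 (1900-01-01), added back to day 0: the time part is dropped.
    SELECT DISTINCT DATEADD(day, DATEDIFF(day, 0, your_date_column), 0) AS date_only
    FROM @your_table;
    -- Two distinct dates come back: 2021-01-31 and 2021-02-01.
    ```

    On SQL Server 2008 and later, `CAST(your_date_column AS date)` is a shorter way to express the same truncation.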

  • 2021-01-31 22:55

    I've used the following:

    CAST(FLOOR(CAST(@date as FLOAT)) as DateTime);
    

    This removes the time from the date by converting it to a float and truncating off the "time" part, which is the decimal of the float.

    Looks a little clunky but works well on a large dataset (~100,000 rows) I use repeatedly throughout the day.
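
    A small sketch of what the intermediate values look like, assuming a `datetime` variable (the sample value is made up). The integer part of the float is the number of days since 1900-01-01 and the fraction is the time of day. As far as I know this cast only works for `datetime` and `smalldatetime`; the newer `date` and `datetime2` types do not allow an explicit conversion to `float`.

    ```sql
    -- Hypothetical sample value, for illustration only.
    DECLARE @date datetime;
    SET @date = '2021-01-31T22:42:00';

    SELECT CAST(@date AS float)                          AS as_float,    -- days since 1900-01-01, with a fractional time part
           FLOOR(CAST(@date AS float))                   AS whole_days,  -- fraction (time of day) removed
           CAST(FLOOR(CAST(@date AS float)) AS datetime) AS date_only;   -- midnight of the same day
    ```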

  • 2021-01-31 22:58

    Update:

    The solution below was tested for efficiency on a 2M-row table and takes only 40 ms.

    Plain DISTINCT on an indexed computed column took 9 seconds.

    See this entry in my blog for performance details:

    • SQL Server: efficient DISTINCT on dates

    Unfortunately, SQL Server's optimizer can do neither Oracle's SKIP SCAN nor MySQL's INDEX FOR GROUP-BY.

    It's always the Stream Aggregate that takes a long time.

    You can build a list of possible dates using a recursive CTE and join it with your table:

    WITH    rows AS (
            SELECT  CAST(CAST(CAST(MIN(date) AS FLOAT) AS INTEGER) AS DATETIME) AS mindate, MAX(date) AS maxdate
            FROM    mytable
            UNION ALL
            SELECT  mindate + 1, maxdate
            FROM    rows
            WHERE   mindate < maxdate
            )
    SELECT  mindate
    FROM    rows
    WHERE   EXISTS
            (
            SELECT  NULL
            FROM    mytable
            WHERE   date >= mindate
                    AND date < mindate + 1
            )
    OPTION  (MAXRECURSION 0)
    

    This will be more efficient than a Stream Aggregate.
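
    The `EXISTS` probe runs once per candidate day, so this approach presumably depends on an index on the date column that lets each probe be a range seek. A minimal sketch, where the index name `ix_mytable_date` is made up for illustration:

    ```sql
    -- Hypothetical index name; without an index like this, each EXISTS probe
    -- would have to scan the table instead of seeking to one day's range.
    CREATE INDEX ix_mytable_date ON mytable (date);
    ```

    With such an index the work is roughly one seek per calendar day between the minimum and maximum date, rather than one pass over every row. That should pay off when there are many rows per day; for sparse data spread over a long date range, a plain DISTINCT may well be the better choice.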

  • 2021-01-31 23:01

    If you want to avoid the step of extracting or reformatting the date, which is presumably the main cause of the delay (by forcing a full table scan), you have no alternative but to store the date-only part of the datetime, which unfortunately requires altering the database structure.

    If you're using SQL Server 2005 or later, then a persisted computed column is the way to go:

    Unless otherwise specified, computed columns are virtual columns that are
    not physically stored in the table. Their values are recalculated every 
    time they are referenced in a query. The Database Engine uses the PERSISTED 
    keyword in the CREATE TABLE and ALTER TABLE statements to physically store 
    computed columns in the table. Their values are updated when any columns 
    that are part of their calculation change. By marking a computed column as 
    PERSISTED, you can create an index on a computed column that is deterministic
    but not precise. 
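
    A minimal sketch of that approach, assuming a table `mytable` with a `datetime` column named `date`; the column name `date_only` and the index name are made up for illustration. The `GO` lines are batch separators as understood by SSMS and sqlcmd, so that the new column exists before later statements are compiled.

    ```sql
    -- Hypothetical names: mytable, date, date_only, ix_mytable_date_only.
    -- Store midnight of each row's day as a persisted computed column.
    ALTER TABLE mytable
        ADD date_only AS DATEADD(day, DATEDIFF(day, 0, [date]), 0) PERSISTED;
    GO

    -- Index the stored value so DISTINCT can read the narrow index instead of the table.
    CREATE INDEX ix_mytable_date_only ON mytable (date_only);
    GO

    SELECT DISTINCT date_only
    FROM mytable;
    ```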
    
  • 2021-01-31 23:03

    What is your predicate on that other filtered column? Have you checked whether you get an improvement from an index on that other filtered column, followed by the datetime field?

    I'm largely guessing here, but 5 seconds to filter a set of perhaps 100,000 rows down to 40,000 and then sort them (which is presumably what goes on) doesn't seem like an unreasonable time to me. Why do you say it's too slow? Because it doesn't match expectations?
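
    As a sketch of the index suggested above: the table and column names here (`your_table`, `other_column`, `date_column`) are placeholders, since the question's actual schema isn't shown.

    ```sql
    -- Hypothetical names throughout. The filtered column leads, so the WHERE clause
    -- can seek on it; within each value the rows are then already ordered by date.
    CREATE INDEX ix_your_table_filter_date
        ON your_table (other_column, date_column);
    ```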

  • 2021-01-31 23:03

    Just convert the date: dateadd(dd,0, datediff(dd,0,[Some_Column]))
