Fastest way for this query (What is the best strategy) given a date range

半阙折子戏 2021-01-02 11:35

I have a table A that has a startDate and an endDate as two datetime columns, besides some other columns. I have another table B that has one datetime column; call it dates.
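The query being tuned is presumably a range join along these lines (table and column names are taken from the question and the answers below; the SELECT list is a guess):

```sql
-- Hypothetical sketch of the query in question: for each date in B,
-- find the rows in A whose [startDate, endDate] range contains it.
SELECT a.*, b.dates
FROM   A AS a
JOIN   B AS b
  ON   b.dates BETWEEN a.startDate AND a.endDate;
```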

10 answers
  • 2021-01-02 12:23

    The following script will list possible missing indices (you may filter the statement by t.name).

    SELECT     t.name AS 'affected_table',
               'Create NonClustered Index IX_' + t.name + '_missing_' + CAST(ddmid.index_handle AS VARCHAR(10)) + ' On ' + ddmid.STATEMENT + ' (' + ISNULL(ddmid.equality_columns, '') +
               CASE
                          WHEN ddmid.equality_columns IS NOT NULL
                                 AND ddmid.inequality_columns IS NOT NULL
                          THEN ','
                          ELSE ''
               END + ISNULL(ddmid.inequality_columns, '') + ')' + ISNULL(' Include (' + ddmid.included_columns + ');', ';') AS sql_statement,
               ddmigs.user_seeks,
               ddmigs.user_scans,
               CAST((ddmigs.user_seeks + ddmigs.user_scans) * ddmigs.avg_user_impact AS INT) AS 'est_impact',
               ddmigs.last_user_seek
    FROM       sys.dm_db_missing_index_groups      AS ddmig
    INNER JOIN sys.dm_db_missing_index_group_stats AS ddmigs
    ON         ddmigs.group_handle = ddmig.index_group_handle
    INNER JOIN sys.dm_db_missing_index_details AS ddmid
    ON         ddmig.index_handle = ddmid.index_handle
    INNER JOIN sys.tables AS t
    ON         ddmid.OBJECT_ID = t.OBJECT_ID
    WHERE      ddmid.database_id = DB_ID()
           AND CAST((ddmigs.user_seeks + ddmigs.user_scans) * ddmigs.avg_user_impact AS INT) > 100
    ORDER BY   CAST((ddmigs.user_seeks + ddmigs.user_scans) * ddmigs.avg_user_impact AS INT) DESC;
    
  • 2021-01-02 12:23

    I would just add a clustered index on B.dates. If you add indexes on startDate and endDate, it won't buy you anything, because you'll get index scans on A anyway. The clustered index on B at least gets you an index seek in B. A table scan and a clustered index scan are effectively the same thing, so there's no point in adding indexes just to get the words "Table Scan" out of your execution plan :)

    I'd mock it up a few ways or see if you can redo your query to not require a table scan on A which I'm guessing isn't really possible.
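    A sketch of this suggestion in T-SQL (the index name is made up, and it assumes B has no existing clustered index):

```sql
-- Clustered index on B.dates: turns the lookup on B into an index seek.
CREATE CLUSTERED INDEX CIX_B_dates ON B (dates);
```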

  • 2021-01-02 12:29

    A little more information is needed. How many other columns are in the tables? Are these existing tables with lots of queries already going against them, or all new tables? What kind of performance issue are you seeing that led you to ask the question?

    I'm assuming that all three columns are NOT NULL (not just for the query syntax, but for the index usefulness).

    I would start with a compound index on A.startDate + A.endDate, and another index on B.dates (but this is likely not needed). Unless these dates are the primary purpose of the tables, I would avoid creating clustered indices on these columns. This is doubly true if these tables are existing tables with other queries running against them. Previous queries may be written expecting the existing clustered indices.
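    A sketch of those two indexes in T-SQL (index names are made up):

```sql
-- Compound index on A's range columns, and a supporting index on B.dates.
-- Both are nonclustered, so any existing clustered indexes are untouched.
CREATE NONCLUSTERED INDEX IX_A_startDate_endDate ON A (startDate, endDate);
CREATE NONCLUSTERED INDEX IX_B_dates ON B (dates);
```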

  • 2021-01-02 12:34

    I have worked at two companies (both doing time and attendance management systems) that have lots of tables with startDate and endDate columns. In my experience there are no indexes that always work well with date ranges.

    Try indexes like (startDate, -endDate) and (-endDate, startDate) to see if they help; a lot depends on what the data in the table is like. E.g. if you tend to have lots of old rows with an endDate before the dates you are looking for, forcing SQL to use an index based on (endDate, startDate) may help.

    Also try using an index that covers all the columns in your WHERE clause, so SQL does not need to read the main table until it has worked out which rows to return.
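    In T-SQL, the descending variants and a covering index could look like the following (index names are made up, and `someOtherColumn` stands in for whatever non-key columns the query actually selects):

```sql
-- (startDate, -endDate): endDate sorted descending within each startDate.
CREATE INDEX IX_A_start_endDesc ON A (startDate, endDate DESC);

-- (-endDate, startDate): lead on descending endDate instead.
CREATE INDEX IX_A_endDesc_start ON A (endDate DESC, startDate);

-- Covering index: INCLUDE the non-key columns the query returns,
-- so the base table never has to be read.
CREATE INDEX IX_A_covering ON A (startDate, endDate)
    INCLUDE (someOtherColumn);
```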

    You may have to use index hints, as it is unlikely that the query processor knows enough about the data to make a good choice of indexes; this is one of the very few cases where I have had to consider index hints.
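    In SQL Server an index hint is given with a table hint; a sketch, assuming an index like the hypothetical `IX_A_endDesc_start` above exists:

```sql
-- WITH (INDEX (...)) forces the optimizer to use the named index on A.
SELECT a.*
FROM   A AS a WITH (INDEX (IX_A_endDesc_start))
JOIN   B AS b
  ON   b.dates BETWEEN a.startDate AND a.endDate;
```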

    Expanding the data, so you have a table that contains (date, rowId) with a row for each date within the date range, may be needed. However, keeping this "index" table updated is a pain.
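    A sketch of that expansion table (names are made up, and it assumes B.dates holds whole dates; the hard part, keeping it in sync with A on every write, is not shown):

```sql
-- One row per (date, rowId) for every date covered by a range in A.
CREATE TABLE A_dates
(
    d     DATE NOT NULL,
    rowId INT  NOT NULL,  -- key of the corresponding row in A
    PRIMARY KEY (d, rowId)
);

-- The range predicate then becomes an equality join:
SELECT ad.rowId, b.dates
FROM   A_dates AS ad
JOIN   B       AS b ON b.dates = ad.d;
```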

    If you know that some of your date ranges don't overlap, have a look at "Using CROSS APPLY to optimize joins on BETWEEN conditions" (e.g. an employee's sickness records may not be allowed to overlap).
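    A sketch of that CROSS APPLY pattern, assuming the ranges in A never overlap: for each date in B there is then at most one containing range, namely the latest one that starts on or before it.

```sql
-- For each b.dates, seek the last range in A starting at or before it
-- (a single index seek per row, given an index leading on A.startDate),
-- then check the end of that one candidate range.
SELECT b.dates, a.*
FROM   B AS b
CROSS APPLY
(
    SELECT TOP (1) *
    FROM   A
    WHERE  A.startDate <= b.dates
    ORDER BY A.startDate DESC
) AS a
WHERE  a.endDate >= b.dates;
```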

    At the end of the day, if you only have several thousand records, a full table scan is not too bad.

    Quassnoi suggests using SPATIAL indexes. I have no experience with "abusing" spatial indexes in this way, but I think it is worth trying. However, be very careful if you will ever have to support multiple database vendors, as spatial indexes are rather new. Also, you may still need the date columns for reporting tools etc.

    (Sooner or later you will need to be able to find all rows that overlap a date range; then it becomes even harder to find indexes that give good results.)
