While loop in SQL Server 2008 iterating through a date-range and then INSERT

醉梦人生 2020-12-31 05:37

I have a table with a few columns, one of which is a Timestamp column. But currently in this table there is not a record for each day. Meaning, there are records for January

1 answer
  • 2020-12-31 06:28

    SQL is a set-based language, and loops should be a last resort. The set-based approach is to first generate all the dates you require and then insert them in one go, rather than looping and inserting one at a time. Aaron Bertrand has written a great series on generating a set or sequence without loops:

    • Generate a set or sequence without loops – part 1
    • Generate a set or sequence without loops – part 2
    • Generate a set or sequence without loops – part 3

    Part 3 is specifically relevant as it deals with dates.

    Assuming you don't have a Calendar table, you can use the stacked CTE method to generate a list of dates between your start and end dates.

    DECLARE @StartDate DATE = '2015-01-01',
            @EndDate DATE = GETDATE();
    
    WITH N1 (N) AS (SELECT 1 FROM (VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) n (N)),
    N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
    N3 (N) AS (SELECT 1 FROM N2 AS N1 CROSS JOIN N2 AS N2)
    SELECT TOP (DATEDIFF(DAY, @StartDate, @EndDate) + 1)
            Date = DATEADD(DAY, ROW_NUMBER() OVER(ORDER BY N) - 1, @StartDate)
    FROM N3;
    

    I have skipped some detail on how this works, as it is covered in the linked article. In essence, it starts with a hard-coded table of 10 rows, joins this table with itself to get 100 rows (10 x 10), then joins that table of 100 rows to itself to get 10,000 rows. I stopped at this point, but if you require more rows you can add further joins.

    At each step the output is a single column called N with a value of 1 (to keep things simple). At the same time as defining how to generate 10,000 rows, I actually tell SQL Server to only generate the number needed by using TOP and the difference between your start and end date - TOP(DATEDIFF(DAY, @StartDate, @EndDate) + 1). This avoids unnecessary work. I had to add 1 to the difference to ensure both dates were included.

    Using the ranking function ROW_NUMBER() I add an incremental number to each of the rows generated, then I add this incremental number to your start date to get the list of dates. Since ROW_NUMBER() begins at 1, I need to deduct 1 from this to ensure the start date is included.
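
    As a minimal sketch of the "add further joins" remark, the query below adds a fourth level, N4 (a name assumed here, not taken from the linked articles), and returns the row number next to each date so the effect of subtracting 1 is visible. The fixed end date is chosen purely for illustration.

    ```sql
    -- Sketch only: N4 and the fixed end date are assumptions for illustration.
    DECLARE @StartDate DATE = '2015-01-01',
            @EndDate   DATE = '2015-01-10';

    WITH N1 (N) AS (SELECT 1 FROM (VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) n (N)), -- 10 rows
    N2 (N) AS (SELECT 1 FROM N1 AS a CROSS JOIN N1 AS b),  -- 10 x 10 = 100 rows
    N3 (N) AS (SELECT 1 FROM N2 AS a CROSS JOIN N2 AS b),  -- 100 x 100 = 10,000 rows
    N4 (N) AS (SELECT 1 FROM N3 AS a CROSS JOIN N3 AS b)   -- 10,000 x 10,000 = 100,000,000 rows
    SELECT TOP (DATEDIFF(DAY, @StartDate, @EndDate) + 1)   -- 9 days difference + 1 = 10 rows
            RowNum = ROW_NUMBER() OVER (ORDER BY N),        -- starts at 1
            Date   = DATEADD(DAY, ROW_NUMBER() OVER (ORDER BY N) - 1, @StartDate) -- row 1 maps to @StartDate
    FROM N4;
    ```

    With these values the query should return 10 rows, because TOP limits the output to the number of days required rather than the 100,000,000 rows that N4 could produce.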

    Then it would just be a case of excluding dates that already exist using NOT EXISTS. I have enclosed the results of the above query in their own CTE called dates:

    DECLARE @StartDate DATE = '2015-01-01',
            @EndDate DATE = GETDATE();
    
    WITH N1 (N) AS (SELECT 1 FROM (VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) n (N)),
    N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
    N3 (N) AS (SELECT 1 FROM N2 AS N1 CROSS JOIN N2 AS N2),
    Dates AS
    (   SELECT TOP (DATEDIFF(DAY, @StartDate, @EndDate) + 1)
                Date = DATEADD(DAY, ROW_NUMBER() OVER(ORDER BY N) - 1, @StartDate)
        FROM N3
    )
    INSERT INTO MyTable ([Timestamp])
    SELECT  d.Date
    FROM    Dates AS d
    WHERE   NOT EXISTS (SELECT 1 FROM MyTable AS t WHERE d.Date = t.[Timestamp]);
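
    To see the gap-filling behaviour in isolation, here is a small self-contained sketch. It uses a temporary table, #MyTable, as a hypothetical stand-in for the real table (only the date column is modelled) and a short, fixed date range.

    ```sql
    -- Hypothetical stand-in for MyTable; only the [Timestamp] column is modelled.
    CREATE TABLE #MyTable ([Timestamp] DATE NOT NULL);

    -- Existing rows: 2 January and 5 January are missing.
    INSERT INTO #MyTable ([Timestamp])
    VALUES ('2015-01-01'), ('2015-01-03'), ('2015-01-04');

    DECLARE @StartDate DATE = '2015-01-01',
            @EndDate   DATE = '2015-01-05';

    WITH N1 (N) AS (SELECT 1 FROM (VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) n (N)),
    N2 (N) AS (SELECT 1 FROM N1 AS a CROSS JOIN N1 AS b),
    Dates AS
    (   SELECT TOP (DATEDIFF(DAY, @StartDate, @EndDate) + 1)
                Date = DATEADD(DAY, ROW_NUMBER() OVER (ORDER BY N) - 1, @StartDate)
        FROM N2
    )
    INSERT INTO #MyTable ([Timestamp])
    SELECT  d.Date
    FROM    Dates AS d
    WHERE   NOT EXISTS (SELECT 1 FROM #MyTable AS t WHERE t.[Timestamp] = d.Date);

    -- Every day from 1 to 5 January should now be present exactly once.
    SELECT [Timestamp] FROM #MyTable ORDER BY [Timestamp];

    DROP TABLE #MyTable;
    ```

    Only the two missing dates are inserted. Running the INSERT a second time before dropping the table would add nothing, since NOT EXISTS then filters out every generated date.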
    

    Example on SQL Fiddle


    If you were to create a calendar table (as described in the linked articles), it may not be necessary to insert these extra rows at all. You could just generate your result set on the fly, something like:

    SELECT  [Timestamp] = c.Date,
            t.[FruitType],
            t.[NumOffered],
            t.[NumTaken],
            t.[NumAbandoned],
            t.[NumSpoiled]
    FROM    dbo.Calendar AS c
            LEFT JOIN dbo.MyTable AS t
                ON t.[Timestamp] = c.[Date]
    WHERE   c.Date >= @StartDate
    AND     c.Date < @EndDate;
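
    The query above assumes that dbo.Calendar already exists. As a rough sketch, a single-column version could be created and filled with the same stacked CTE. The date range here is an assumption chosen for illustration, and the calendar tables in the linked articles may well be structured differently.

    ```sql
    -- Sketch only: a single-column calendar table; the range is an assumed example.
    CREATE TABLE dbo.Calendar ([Date] DATE NOT NULL PRIMARY KEY);

    DECLARE @StartDate DATE = '2015-01-01',
            @EndDate   DATE = '2030-12-31';  -- fewer than 10,000 days, so N3 is enough

    WITH N1 (N) AS (SELECT 1 FROM (VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) n (N)),
    N2 (N) AS (SELECT 1 FROM N1 AS a CROSS JOIN N1 AS b),
    N3 (N) AS (SELECT 1 FROM N2 AS a CROSS JOIN N2 AS b)
    INSERT INTO dbo.Calendar ([Date])
    SELECT TOP (DATEDIFF(DAY, @StartDate, @EndDate) + 1)
            DATEADD(DAY, ROW_NUMBER() OVER (ORDER BY N) - 1, @StartDate)
    FROM N3;
    ```

    Because the table is filled once and then reused, the date generation no longer has to be repeated in every query that needs a continuous range.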
    

    ADDENDUM

    To answer your actual question, your loop would be written as follows:

    DECLARE @StartDate AS DATETIME
    DECLARE @EndDate AS DATETIME
    DECLARE @CurrentDate AS DATETIME
    
    SET @StartDate = '2015-01-01'
    SET @EndDate = GETDATE()
    SET @CurrentDate = @StartDate
    
    WHILE (@CurrentDate < @EndDate)
    BEGIN
        IF NOT EXISTS (SELECT 1 FROM MyTable WHERE MyTable.[Timestamp] = @CurrentDate)
        BEGIN
            INSERT INTO MyTable ([Timestamp])
            VALUES (@CurrentDate);
        END
    
        SET @CurrentDate = DATEADD(DAY, 1, @CurrentDate); /*increment current date*/
    END
    

    Example on SQL Fiddle

    I do not advocate this approach. Just because something is only being done once does not mean that I should not demonstrate the correct way of doing it.


    FURTHER EXPLANATION

    Since the stacked CTE method may have overcomplicated the set-based approach, I will simplify it by using the undocumented system table master..spt_values. If you run:

    SELECT Number
    FROM master..spt_values
    WHERE Type = 'P';
    

    You will see that you get all the numbers from 0 to 2047.

    Now if you run:

    DECLARE @StartDate DATE = '2015-01-01',
            @EndDate DATE = GETDATE();
    
    
    SELECT Date = DATEADD(DAY, number, @StartDate)
    FROM master..spt_values
    WHERE type = 'P';
    

    You get every date from your start date through 2047 days after it. If you add a further WHERE clause, you can limit this to dates on or before your end date:

    DECLARE @StartDate DATE = '2015-01-01',
            @EndDate DATE = GETDATE();
    
    
    SELECT Date = DATEADD(DAY, number, @StartDate)
    FROM master..spt_values
    WHERE type = 'P'
    AND DATEADD(DAY, number, @StartDate) <= @EndDate;
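
    One consequence of the 0 to 2047 range is that this table can only cover a span of up to 2047 days after the start date; a longer range would simply come back short, with no error. As a hedged sketch, a guard like the following could make that visible. The error message wording is my own, and the stacked CTE is the alternative for longer ranges.

    ```sql
    DECLARE @StartDate DATE = '2015-01-01',
            @EndDate   DATE = GETDATE();

    -- Type 'P' rows supply the numbers 0 to 2047, so offsets above 2047 cannot be generated.
    IF DATEDIFF(DAY, @StartDate, @EndDate) > 2047
    BEGIN
        RAISERROR('Date range is longer than master..spt_values (type P) can cover.', 16, 1);
    END
    ELSE
    BEGIN
        SELECT Date = DATEADD(DAY, number, @StartDate)
        FROM master..spt_values
        WHERE type = 'P'
        AND DATEADD(DAY, number, @StartDate) <= @EndDate;
    END
    ```

    With a start date of 2015-01-01, the last date this table can reach falls in August 2020, so for later end dates the guard raises the error instead of returning a truncated list.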
    

    Now that you have all the dates you need in a single set-based query, you can eliminate the rows that already exist in your table using NOT EXISTS:

    DECLARE @StartDate DATE = '2015-01-01',
            @EndDate DATE = GETDATE();
    
    
    SELECT Date = DATEADD(DAY, number, @StartDate)
    FROM master..spt_values
    WHERE type = 'P'
    AND DATEADD(DAY, number, @StartDate) <= @EndDate
    AND NOT EXISTS (SELECT 1 FROM MyTable AS t WHERE t.[Timestamp] = DATEADD(DAY, number, @StartDate));
    

    Finally, you can insert these dates into your table using INSERT:

    DECLARE @StartDate DATE = '2015-01-01',
            @EndDate DATE = GETDATE();
    
    INSERT INTO MyTable ([Timestamp])
    SELECT Date = DATEADD(DAY, number, @StartDate)
    FROM master..spt_values
    WHERE type = 'P'
    AND DATEADD(DAY, number, @StartDate) <= @EndDate
    AND NOT EXISTS (SELECT 1 FROM MyTable AS t WHERE t.[Timestamp] = DATEADD(DAY, number, @StartDate));
    

    Hopefully this goes some way to showing that the set-based approach is not only much more efficient but also simpler.
