I have a table with a few columns, one of which is a Timestamp column. But currently in this table there is not a record for each day. Meaning, there are records for January
SQL is a set based language and loops should be a last resort. So the set based approach would be to first generate all the dates you require and insert them in one go, rather than looping and inserting one at a time. Aaron Bertrand has written a great series on generating a set or sequence without loops:
Part 3 is specifically relevant as it deals with dates.
Assuming you don't have a Calendar table you can use the stacked CTE method to generate a list of dates between your start and end dates.
DECLARE @StartDate DATE = '2015-01-01',
@EndDate DATE = GETDATE();
WITH N1 (N) AS (SELECT 1 FROM (VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) n (N)),
N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
N3 (N) AS (SELECT 1 FROM N2 AS N1 CROSS JOIN N2 AS N2)
SELECT TOP (DATEDIFF(DAY, @StartDate, @EndDate) + 1)
Date = DATEADD(DAY, ROW_NUMBER() OVER(ORDER BY N) - 1, @StartDate)
FROM N3;
I have skipped some detail on how this works as it is covered in the linked article, in essence it starts with a hard coded table of 10 rows, then joins this table with itself to get 100 rows (10 x 10) then joins this table of 100 rows to itself to get 10,000 rows (I stopped at this point but if you require further rows you can add further joins).
At each step the output is a single column called N
with a value of 1 (to keep things simple). At the same time as defining how to generate 10,000 rows, I actually tell SQL Server to only generate the number needed by using TOP
and the difference between your start and end date - TOP(DATEDIFF(DAY, @StartDate, @EndDate) + 1)
. This avoids unnecessary work. I had to add 1 to the difference to ensure both dates were included.
Using the ranking function ROW_NUMBER() I add an incremental number to each of the rows generated, then I add this incremental number to your start date to get the list of dates. Since ROW_NUMBER()
begins at 1, I need to deduct 1 from this to ensure the start date is included.
Then it would just be a case of excluding dates that already exist using NOT EXISTS
. I have enclosed the results of the above query in their own CTE called dates
:
DECLARE @StartDate DATE = '2015-01-01',
@EndDate DATE = GETDATE();
WITH N1 (N) AS (SELECT 1 FROM (VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) n (N)),
N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
N3 (N) AS (SELECT 1 FROM N2 AS N1 CROSS JOIN N2 AS N2),
Dates AS
( SELECT TOP (DATEDIFF(DAY, @StartDate, @EndDate) + 1)
Date = DATEADD(DAY, ROW_NUMBER() OVER(ORDER BY N) - 1, @StartDate)
FROM N3
)
INSERT INTO MyTable ([TimeStamp])
SELECT Date
FROM Dates AS d
WHERE NOT EXISTS (SELECT 1 FROM MyTable AS t WHERE d.Date = t.[TimeStamp])
Example on SQL Fiddle
If you were to create a calendar table (as described in the linked articles) then it may not be necessary to insert these extra rows, you could just generate your result set on the fly, something like:
SELECT [Timestamp] = c.Date,
t.[FruitType],
t.[NumOffered],
t.[NumTaken],
t.[NumAbandoned],
t.[NumSpoiled]
FROM dbo.Calendar AS c
LEFT JOIN dbo.MyTable AS t
ON t.[Timestamp] = c.[Date]
WHERE c.Date >= @StartDate
AND c.Date < @EndDate;
ADDENDUM
To answer your actual question your loop would be written as follows:
DECLARE @StartDate AS DATETIME
DECLARE @EndDate AS DATETIME
DECLARE @CurrentDate AS DATETIME
SET @StartDate = '2015-01-01'
SET @EndDate = GETDATE()
SET @CurrentDate = @StartDate
WHILE (@CurrentDate < @EndDate)
BEGIN
IF NOT EXISTS (SELECT 1 FROM myTable WHERE myTable.Timestamp = @CurrentDate)
BEGIN
INSERT INTO MyTable ([Timestamp])
VALUES (@CurrentDate);
END
SET @CurrentDate = DATEADD(DAY, 1, @CurrentDate); /*increment current date*/
END
I do not advocate this approach, just because something is only being done once does not mean that I should not demonstrate the correct way of doing it.
FURTHER EXPLANATION
Since the stacked CTE method may have over complicated the set based approach I will simplify it by using the undocumented system table master..spt_values
. If you run:
SELECT Number
FROM master..spt_values
WHERE Type = 'P';
You will see that you get all the numbers from 0 -2047.
Now if you run:
DECLARE @StartDate DATE = '2015-01-01',
@EndDate DATE = GETDATE();
SELECT Date = DATEADD(DAY, number, @StartDate)
FROM master..spt_values
WHERE type = 'P';
You get all the dates from your start date to 2047 days in the future. If you add a further where clause you can limit this to dates before your end date:
DECLARE @StartDate DATE = '2015-01-01',
@EndDate DATE = GETDATE();
SELECT Date = DATEADD(DAY, number, @StartDate)
FROM master..spt_values
WHERE type = 'P'
AND DATEADD(DAY, number, @StartDate) <= @EndDate;
Now you have all the dates you need in a single set based query you can eliminate the rows that already exist in your table using NOT EXISTS
DECLARE @StartDate DATE = '2015-01-01',
@EndDate DATE = GETDATE();
SELECT Date = DATEADD(DAY, number, @StartDate)
FROM master..spt_values
WHERE type = 'P'
AND DATEADD(DAY, number, @StartDate) <= @EndDate
AND NOT EXISTS (SELECT 1 FROM MyTable AS t WHERE t.[Timestamp] = DATEADD(DAY, number, @StartDate));
Finally you can insert these dates into your table using INSERT
DECLARE @StartDate DATE = '2015-01-01',
@EndDate DATE = GETDATE();
INSERT YourTable ([Timestamp])
SELECT Date = DATEADD(DAY, number, @StartDate)
FROM master..spt_values
WHERE type = 'P'
AND DATEADD(DAY, number, @StartDate) <= @EndDate
AND NOT EXISTS (SELECT 1 FROM MyTable AS t WHERE t.[Timestamp] = DATEADD(DAY, number, @StartDate));
Hopefully this goes some way to showing that the set based approach is not only much more efficient it is simpler too.