I\'ve seen many different ways to create and populate a numbers table. However, what is the best way to create and populate one? With \"best\" being defined from most to l
For anyone looking for an Azure solution
SET NOCOUNT ON
CREATE TABLE Numbers (n bigint PRIMARY KEY)
GO
DECLARE @numbers table(number int);
WITH numbers(number) as (
SELECT 1 AS number
UNION all
SELECT number+1 FROM numbers WHERE number<10000
)
INSERT INTO @numbers(number)
SELECT number FROM numbers OPTION(maxrecursion 10000)
INSERT INTO Numbers(n) SELECT number FROM @numbers
Sourced from the sql azure team blog http://azure.microsoft.com/blog/2010/09/16/create-a-numbers-table-in-sql-azure/
Here is a couple of extra methods:
Method 1
IF OBJECT_ID('dbo.Numbers', 'U') IS NOT NULL
DROP TABLE dbo.Numbers
GO
CREATE TABLE Numbers (Number int NOT NULL PRIMARY KEY);
GO
DECLARE @i int = 1;
INSERT INTO dbo.Numbers (Number)
VALUES (1),(2);
WHILE 2*@i < 1048576
BEGIN
INSERT INTO dbo.Numbers (Number)
SELECT Number + 2*@i
FROM dbo.Numbers;
SET @i = @@ROWCOUNT;
END
GO
SELECT COUNT(*) FROM Numbers AS RowCownt --1048576 rows
Method 2
IF OBJECT_ID('dbo.Numbers', 'U') IS NOT NULL
DROP TABLE dbo.Numbers
GO
CREATE TABLE dbo.Numbers (Number int NOT NULL PRIMARY KEY);
GO
DECLARE @i INT = 0;
INSERT INTO dbo.Numbers (Number)
VALUES (1);
WHILE @i <= 9
BEGIN
INSERT INTO dbo.Numbers (Number)
SELECT N.Number + POWER(4, @i) * D.Digit
FROM dbo.Numbers AS N
CROSS JOIN (VALUES(1),(2),(3)) AS D(Digit)
ORDER BY D.Digit, N.Number
SET @i = @i + 1;
END
GO
SELECT COUNT(*) FROM dbo.Numbers AS RowCownt --1048576 rows
Method 3
IF OBJECT_ID('dbo.Numbers', 'U') IS NOT NULL
DROP TABLE dbo.Numbers
GO
CREATE TABLE Numbers (Number int identity NOT NULL PRIMARY KEY, T bit NULL);
WITH
T1(T) AS (SELECT T FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10)) AS T(T)) --10 rows
,T2(T) AS (SELECT A.T FROM T1 AS A CROSS JOIN T1 AS B CROSS JOIN T1 AS C) --1,000 rows
,T3(T) AS (SELECT A.T FROM T2 AS A CROSS JOIN T2 AS B CROSS JOIN T2 AS C) --1,000,000,000 rows
INSERT INTO dbo.Numbers(T)
SELECT TOP (1048576) NULL
FROM T3;
ALTER TABLE Numbers
DROP COLUMN T;
GO
SELECT COUNT(*) FROM dbo.Numbers AS RowCownt --1048576 rows
Method 4, taken from Defensive Database Programming book by Alex Kuznetsov
IF OBJECT_ID('dbo.Numbers', 'U') IS NOT NULL
DROP TABLE dbo.Numbers
GO
CREATE TABLE Numbers (Number int NOT NULL PRIMARY KEY);
GO
DECLARE @i INT = 1 ;
INSERT INTO dbo.Numbers (Number)
VALUES (1);
WHILE @i < 524289 --1048576
BEGIN;
INSERT INTO dbo.Numbers (Number)
SELECT Number + @i
FROM dbo.Numbers;
SET @i = @i * 2 ;
END
GO
SELECT COUNT(*) FROM dbo.Numbers AS RowCownt --1048576 rows
Method 5, taken from Arrays and Lists in SQL Server 2005 and Beyond article by Erland Sommarskog
IF OBJECT_ID('dbo.Numbers', 'U') IS NOT NULL
DROP TABLE dbo.Numbers
GO
CREATE TABLE Numbers (Number int NOT NULL PRIMARY KEY);
GO
WITH digits (d) AS (
SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL
SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL
SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9 UNION ALL
SELECT 0)
INSERT INTO Numbers (Number)
SELECT Number
FROM (SELECT i.d + ii.d * 10 + iii.d * 100 + iv.d * 1000 +
v.d * 10000 + vi.d * 100000 AS Number
FROM digits i
CROSS JOIN digits ii
CROSS JOIN digits iii
CROSS JOIN digits iv
CROSS JOIN digits v
CROSS JOIN digits vi) AS Numbers
WHERE Number > 0
GO
SELECT COUNT(*) FROM dbo.Numbers AS RowCownt --999999 rows
Summary:
Among those 5 methods, method 3 seems to be the fastest.
I start with the following template, which is derived from numerous printings of Itzik Ben-Gan's routine:
;WITH
Pass0 as (select 1 as C union all select 1), --2 rows
Pass1 as (select 1 as C from Pass0 as A, Pass0 as B),--4 rows
Pass2 as (select 1 as C from Pass1 as A, Pass1 as B),--16 rows
Pass3 as (select 1 as C from Pass2 as A, Pass2 as B),--256 rows
Pass4 as (select 1 as C from Pass3 as A, Pass3 as B),--65536 rows
Pass5 as (select 1 as C from Pass4 as A, Pass4 as B),--4,294,967,296 rows
Tally as (select row_number() over(order by C) as Number from Pass5)
select Number from Tally where Number <= 1000000
The "WHERE N<= 1000000" clause limits the output to 1 to 1 million, and can easily be adjusted to your desired range.
Since this is a WITH clause, it can be worked into an INSERT... SELECT... like so:
-- Sample use: create one million rows
CREATE TABLE dbo.Example (ExampleId int not null)
DECLARE @RowsToCreate int
SET @RowsToCreate = 1000000
-- "Table of numbers" data generator, as per Itzik Ben-Gan (from multiple sources)
;WITH
Pass0 as (select 1 as C union all select 1), --2 rows
Pass1 as (select 1 as C from Pass0 as A, Pass0 as B),--4 rows
Pass2 as (select 1 as C from Pass1 as A, Pass1 as B),--16 rows
Pass3 as (select 1 as C from Pass2 as A, Pass2 as B),--256 rows
Pass4 as (select 1 as C from Pass3 as A, Pass3 as B),--65536 rows
Pass5 as (select 1 as C from Pass4 as A, Pass4 as B),--4,294,967,296 rows
Tally as (select row_number() over(order by C) as Number from Pass5)
INSERT Example (ExampleId)
select Number
from Tally
where Number <= @RowsToCreate
Indexing the table after it's built will be the fastest way to index it.
Oh, and I'd refer to it as a "Tally" table. I think this is a common term, and you can find loads of tricks and examples by Googling it.
i use this which is fast as hell:
insert into Numbers(N)
select top 1000000 row_number() over(order by t1.number) as N
from master..spt_values t1
cross join master..spt_values t2
I use numbers tables for primarily dummying up reports in BIRT without having to fiddle around with dynamic creation of recordsets.
I do the same with dates, having a table spanning from 10 years in the past to 10 years in the future (and hours of the day for more detailed reporting). It's a neat trick to be able to get values for all dates even if your 'real' data tables don't have data for them.
I have a script which I use to create these, something like (this is from memory):
drop table numbers; commit;
create table numbers (n integer primary key); commit;
insert into numbers values (0); commit;
insert into numbers select n+1 from numbers; commit;
insert into numbers select n+2 from numbers; commit;
insert into numbers select n+4 from numbers; commit;
insert into numbers select n+8 from numbers; commit;
insert into numbers select n+16 from numbers; commit;
insert into numbers select n+32 from numbers; commit;
insert into numbers select n+64 from numbers; commit;
The number of rows doubles with each line so it doesn't take a lot to produce truly huge tables.
I'm not sure I agree with you that it's important to be created fast since you only create it once. The cost of that is amortized over all the accesses to it, rendering that time fairly insignificant.