Using cursors to read time series data from SQL Server using C#?

问题

I have a large database (50 million rows) containing time series data. There is a clustered index on the [datetime] column which ensures that that the table is always sorted in chronological order.

What is the most performant way to read the rows of the table out into a C# app, on a row-by-row basis?

回答1:

You should try this and find out. I just did, and saw no performance issues.

USE [master]
GO
/****** Object:  Database [HugeDatabase]    Script Date: 06/27/2011 13:27:50 ******/
CREATE DATABASE [HugeDatabase] ON  PRIMARY 
( NAME = N'HugeDatabase', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL10_50.SQL2K8R2\MSSQL\DATA\HugeDatabase.mdf' , SIZE = 1940736KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB )
 LOG ON 
( NAME = N'HugeDatabase_log', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL10_50.SQL2K8R2\MSSQL\DATA\HugeDatabase_log.LDF' , SIZE = 395392KB , MAXSIZE = 2048GB , FILEGROWTH = 10%)
GO

USE [HugeDatabase]
GO
/****** Object:  Table [dbo].[HugeTable]    Script Date: 06/27/2011 13:27:53 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[HugeTable](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [PointInTime] [datetime] NULL,
PRIMARY KEY NONCLUSTERED 
(
    [ID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE CLUSTERED INDEX [IX_HugeTable_PointInTime] ON [dbo].[HugeTable] 
(
    [PointInTime] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

Populate:

DECLARE @t datetime
SET @t = '2011-01-01'

DECLARE @i int
SET @i=0

SET NOCOUNT ON

WHILE (@i < 50000000)
BEGIN
    INSERT INTO HugeTable(PointInTime) VALUES(@t)
    SET @t = DATEADD(ss, 1, @t)

    SET @i = @i + 1
END

Test:

using System;
using System.Data.SqlClient;
using System.Diagnostics;

namespace ConsoleApplication1
{
    internal class Program
    {
        private static void Main()
        {
            TimeSpan firstRead = new TimeSpan();
            TimeSpan readerOpen = new TimeSpan();
            TimeSpan commandOpen = new TimeSpan();
            TimeSpan connectionOpen = new TimeSpan();
            TimeSpan secondRead = new TimeSpan();

            try
            {

                Stopwatch sw1 = new Stopwatch();
                sw1.Start();
                using (
                    var conn =
                        new SqlConnection(
                            @"Data Source=.\sql2k8r2;Initial Catalog=HugeDatabase;Integrated Security=True"))
                {
                    conn.Open(); connectionOpen = sw1.Elapsed;

                    using (var cmd = new SqlCommand(
                        "SELECT * FROM HugeTable ORDER BY PointInTime", conn))
                    {
                        commandOpen = sw1.Elapsed;

                        var reader = cmd.ExecuteReader(); readerOpen = sw1.Elapsed;

                        reader.Read(); firstRead = sw1.Elapsed;
                        reader.Read(); secondRead = sw1.Elapsed;
                    }
                }
                sw1.Stop();
            }
            catch (Exception e)
            {
                Console.WriteLine(e);
            }
            finally
            {

                Console.WriteLine(
                    "Connection: {0}, command: {1}, reader: {2}, read: {3}, second read: {4}",
                    connectionOpen,
                    commandOpen - connectionOpen,
                    readerOpen - commandOpen,
                    firstRead - readerOpen,
                    secondRead - firstRead);

                Console.Write("Enter to exit: ");
                Console.ReadLine();
            }
        }
    }
}

回答2:

I would use a SqlDataReader as it streams its results. You'll still have to specify the ordering but if you're using the clustered index to ORDER BY it should be a (relatively) cheap operation.

using (var db = new SqlConnection(connStr)) {
    using (var rs = new SqlCommand(someQuery, db).ExecuteReader()) {
        while (rs.Read()) {
            // do interesting things!
        }
    }
}

回答3:

See Row Offset in SQL Server, which mentions:

Efficiently Paging Through Large Result Sets in SQL Server 2000
A More Efficient Method for Paging Through Large Result Sets
Paging Records Using SQL Server 2005 Database - ROW_NUMBER Function

来源：https://stackoverflow.com/questions/6468925/using-cursors-to-read-time-series-data-from-sql-server-using-c

标签

.net

tsql

sql-server-2008

smo