How to speed up dumping a DataTable into an Excel worksheet?

孤者浪人 提交于 2020-01-09 19:23:26

问题


I have the following routine that dumps a DataTable into an Excel worksheet.

    private void RenderDataTableOnXlSheet(DataTable dt, Excel.Worksheet xlWk, 
                                    string [] columnNames, string [] fieldNames)
    {
        // render the column names (e.g. headers)
        for (int i = 0; i < columnNames.Length; i++)
            xlWk.Cells[1, i + 1] = columnNames[i];

        // render the data 
        for (int i = 0; i < fieldNames.Length; i++)
        {
            for (int j = 0; j < dt.Rows.Count; j++)
            {
                xlWk.Cells[j + 2, i + 1] = dt.Rows[j][fieldNames[i]].ToString();
            }
        }
    }

For whatever reason, dumping DataTable of 25 columns and 400 rows takes about 10-15 seconds on my relatively modern PC. Takes even longer testers' machines.

Is there anything I can do to speed up this code? Or is interop just inherently slow?

SOLUTION: Based on suggestions from Helen Toomik, I've modified the method and it should now work for several common data types (int32, double, datetime, string). Feel free to extend it. The speed for processing my dataset went from 15 seconds to under 1.

    private void RenderDataTableOnXlSheet(DataTable dt, Excel.Worksheet xlWk, string [] columnNames, string [] fieldNames)
    {
        Excel.Range rngExcel = null;
        Excel.Range headerRange = null;

        try
        {
            // render the column names (e.g. headers)
            for (int i = 0; i < columnNames.Length; i++)
                xlWk.Cells[1, i + 1] = columnNames[i];

            // for each column, create an array and set the array 
            // to the excel range for that column.
            for (int i = 0; i < fieldNames.Length; i++)
            {
                string[,] clnDataString = new string[dt.Rows.Count, 1];
                int[,] clnDataInt = new int[dt.Rows.Count, 1];
                double[,] clnDataDouble = new double[dt.Rows.Count, 1];

                string columnLetter = char.ConvertFromUtf32("A".ToCharArray()[0] + i);
                rngExcel = xlWk.get_Range(columnLetter + "2", Missing.Value);
                rngExcel = rngExcel.get_Resize(dt.Rows.Count, 1);

                string dataTypeName = dt.Columns[fieldNames[i]].DataType.Name;

                for (int j = 0; j < dt.Rows.Count; j++)
                {
                    if (fieldNames[i].Length > 0)
                    {
                        switch (dataTypeName)
                        {
                            case "Int32":
                                clnDataInt[j, 0] = Convert.ToInt32(dt.Rows[j][fieldNames[i]]);
                                break;
                            case "Double":
                                clnDataDouble[j, 0] = Convert.ToDouble(dt.Rows[j][fieldNames[i]]);
                                break;
                            case "DateTime":
                                if (fieldNames[i].ToLower().Contains("time"))
                                    clnDataString[j, 0] = Convert.ToDateTime(dt.Rows[j][fieldNames[i]]).ToShortTimeString();
                                else if (fieldNames[i].ToLower().Contains("date"))
                                    clnDataString[j, 0] = Convert.ToDateTime(dt.Rows[j][fieldNames[i]]).ToShortDateString();
                                else 
                                    clnDataString[j, 0] = Convert.ToDateTime(dt.Rows[j][fieldNames[i]]).ToString();

                                break;
                            default:
                                clnDataString[j, 0] = dt.Rows[j][fieldNames[i]].ToString();
                                break;
                        }
                    }
                    else
                        clnDataString[j, 0] = string.Empty;
                }

                // set values in the sheet wholesale.
                if (dataTypeName == "Int32") 
                    rngExcel.set_Value(Missing.Value, clnDataInt);
                else if (dataTypeName == "Double")
                    rngExcel.set_Value(Missing.Value, clnDataDouble);                             
                else
                    rngExcel.set_Value(Missing.Value, clnDataString);
            }


            // figure out the letter of the last column (supports 1 letter column names)
            string lastColumn = char.ConvertFromUtf32("A".ToCharArray()[0] + columnNames.Length - 1);

            // make the header range bold
            headerRange = xlWk.get_Range("A1", lastColumn + "1");
            headerRange.Font.Bold = true;

            // autofit for better view
            xlWk.Columns.AutoFit();

        }
        finally
        {
            ReleaseObject(headerRange);
            ReleaseObject(rngExcel);
        }
    }

    private void ReleaseObject(object obj)
    {
        try
        {
            System.Runtime.InteropServices.Marshal.ReleaseComObject(obj);
            obj = null;
        }
        catch
        {
            obj = null;
        }
        finally
        {
            GC.Collect();
        }
    }

回答1:


Instead of setting cell values one by one, do it in a batch.

Step 1. Transfer the data from your DataTable into an array with the same dimensions.

Step 2. Define an Excel Range object that spans the appropriate range.

Step 3. Set the Range.Value to the array.

This will be a lot faster because you will have a total two calls across the Interop boundary (one to get the Range object, one to set its value), instead of two per cell (get cell, set value).

There is some sample code at MSDN KB article 302096.




回答2:


Interop is inherently very slow. There is a large overhead associated with each call. To speed it up try writing back an object array of data to a range of cells in one assignment statement.

Or if this is a serious problem try using one of the Managed Code Excel extensions that can read/write data using managed code via the XLL interface. (Addin Express, Managed XLL etc.)




回答3:


If you have a recordset, the fastest way to write to Excel is CopyFromRecordset.




回答4:


Do you have a specific requirement to go the COM automation route? If not, you have a few other options.

  1. Use the OLEDB provider to create/write to an Excel file
    http://support.microsoft.com/kb/316934

  2. Use a third party library to write to Excel. Depending on your licensing requirements there are a few options. Update: A good free library is NPOI http://npoi.codeplex.com/

  3. Write the data to a csv file, and load that into Excel

  4. Write the data as XML which can be loaded into Excel.

  5. Use the Open XML SDK
    http://www.microsoft.com/downloads/details.aspx?familyid=C6E744E5-36E9-45F5-8D8C-331DF206E0D0&displaylang=en




回答5:


You could create an Excel add-in, with VBA code to do all your db heavy lifting. from .NET, all you'd need to do is instantiate Excel, add the add-in, and call the Excel VBA routine, passing any parameters to it that it needs to execute your SQL statements.




回答6:


I agree with Charles. Interop is really slow. But try this:

private void RenderDataTableOnXlSheet(DataTable dt, Excel.Worksheet xlWk, 
                                    string [] columnNames, string [] fieldNames)
{
    // render the column names (e.g. headers)
    int columnLength = columnNames.Length;
    for (int i = 0; i < columnLength; i++)
        xlWk.Cells[1, i + 1] = columnNames[i];

    // render the data 
        int fieldLength = fieldNames.Length;
        int rowCount = dt.Rows.Count;
        for (int j = 0; j < rowCount; j++)
        { 
            for (int i = 0; i < fieldLength; i++)
            {
                xlWk.Cells[j + 2, i + 1] = dt.Rows[j][fieldNames[i]].ToString();
            }
        }
}

HTH



来源:https://stackoverflow.com/questions/2692979/how-to-speed-up-dumping-a-datatable-into-an-excel-worksheet

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!