Question
I am trying to write a DataTable with a large number of records to Excel. I want to use a divide-and-conquer strategy in which each thread writes to its own sheet of the Excel workbook, but I get the message "file is readonly, click OK to override the file".
using System;
using System.Data;
using System.Threading;
using Excel = Microsoft.Office.Interop.Excel;

class Program
{
    int processorCount = 2;
    static volatile bool processing = true;
    DataTable employeeTable = new DataTable("Employee");
    ManualResetEvent mre = new ManualResetEvent(false);
    AutoResetEvent ar = new AutoResetEvent(true);
    int record_count;

    static void Main(string[] args)
    {
        Program p = new Program();
        // Create an Employee DataTable
        p.employeeTable.Columns.Add("Employee ID");
        p.employeeTable.Columns.Add("Employee Name");
        for (int i = 0; i <= 2; i++)
        {
            p.employeeTable.Rows.Add(i.ToString(), "ABC");
        }
        p.record_count = p.employeeTable.Rows.Count / p.processorCount;
        Excel.Application excelApp = new Excel.Application();
        // Create an Excel workbook instance and open it from the predefined location
        Excel.Workbook excelWorkBook1 = excelApp.Workbooks.Open(@"F:\Org.xlsx");
        Thread[] threads = new Thread[3];
        for (int i = 0; i < 3; i++)
        {
            // p.ExportDataSetToExcel(i);
            ParameterizedThreadStart ps = new ParameterizedThreadStart(p.ExportDataSetToExcel);
            threads[i] = new Thread(ps);
            threads[i].Start(new Custom() { sheetNo = i, excelWorkBook = excelWorkBook1 });
        }
        for (int j = 0; j < 3; j++)
        {
            threads[j].Join();
        }
        Console.WriteLine("Success");
        Console.ReadKey();
    }

    private void ExportDataSetToExcel(object sheet1)
    {
        lock (this)
        {
            bool found = false;
            Excel.Worksheet excelWorkSheet;
            int sheetNo = ((Custom)sheet1).sheetNo;
            Excel.Workbook excelWorkBook = ((Custom)sheet1).excelWorkBook;
            excelWorkSheet = (excelWorkBook).Sheets["Sheet" + ((int)sheetNo + 1).ToString()];
            for (int i = 1; i < employeeTable.Columns.Count + 1; i++)
            {
                excelWorkSheet.Cells[1, i] = employeeTable.Columns[i - 1].ColumnName;
            }
            int baseIndex = (int)sheetNo * record_count;
            for (int j = baseIndex; j < baseIndex + record_count; j++)
            {
                for (int k = 0; k < employeeTable.Columns.Count; k++)
                {
                    excelWorkSheet.Cells[j + 2, k + 1] = employeeTable.Rows[j].ItemArray[k].ToString();
                }
            }
            Console.WriteLine(sheetNo.ToString());
            Console.WriteLine("\n");
            (excelWorkBook).Save();
            (excelWorkBook).Close();
        }
    }
}

public class Custom
{
    public int sheetNo;
    public Excel.Workbook excelWorkBook;
}
Answer 1:
Instead of using interop, either through OLE or VSTO, use a library like EPPlus or NPOI, or use the Open XML SDK directly to create the Excel file.
Interop forces you to work on a single thread, and you always pay for the CPU cost of the interop calls, the CPU and memory wasted on running Excel, and finally the CPU and IO needed to save the file.
The Open XML SDK and the other libraries, on the other hand, don't even need Excel. All operations are in-memory and you only pay the CPU and IO cost to save the file. As a result they are orders of magnitude faster.
Because they don't need Excel, you can also use them in web and server applications, where Interop and VSTO are not a realistic option.
EPPlus has some nice features, like creating Excel tables from a DataTable (LoadFromDataTable) or from LINQ queries (LoadFromCollection), which makes exporting data very easy, e.g.:
using (var excelFile = new ExcelPackage(targetFile))
{
    var worksheet = excelFile.Workbook.Worksheets.Add("Sheet1");
    var tableRange = worksheet.Cells["A1"].LoadFromCollection(employees, true);
    excelFile.Save();
}
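Since the question starts from a DataTable rather than a collection, the same idea can be sketched with LoadFromDataTable, writing one worksheet per chunk of rows on a single thread. This is only a sketch under assumptions: the names EmployeeExport, Export and rowsPerSheet are hypothetical, the EPPlus NuGet package is assumed to be installed, the target file is assumed not to exist yet, and recent EPPlus versions additionally need a license setting that is left out here because the call differs between versions.

```csharp
using System;
using System.Data;
using System.IO;
using OfficeOpenXml; // EPPlus NuGet package (assumed to be installed)

static class EmployeeExport // hypothetical helper, not part of EPPlus
{
    // Writes the table to a new workbook, one worksheet per chunk of rows.
    // Assumes the table has at least one row, so at least one sheet is created,
    // and that no file exists at 'path' yet.
    public static void Export(DataTable table, string path, int rowsPerSheet)
    {
        if (rowsPerSheet <= 0)
            throw new ArgumentOutOfRangeException(nameof(rowsPerSheet));

        using (var package = new ExcelPackage(new FileInfo(path)))
        {
            int sheetNo = 1;
            for (int start = 0; start < table.Rows.Count; start += rowsPerSheet)
            {
                // Clone() copies only the schema; ImportRow copies the rows of this chunk.
                DataTable chunk = table.Clone();
                int end = Math.Min(start + rowsPerSheet, table.Rows.Count);
                for (int i = start; i < end; i++)
                    chunk.ImportRow(table.Rows[i]);

                var sheet = package.Workbook.Worksheets.Add("Sheet" + sheetNo++);
                sheet.Cells["A1"].LoadFromDataTable(chunk, true); // true = write column headers
            }
            package.Save();
        }
    }
}
```

A call such as EmployeeExport.Export(employeeTable, @"F:\Employees.xlsx", 100000) would then replace the threads from the question; the path and chunk size are placeholders.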
UPDATE
I just read in a comment that the OP wants to export a large number of rows and thinks Excel has some limit. That's not true, but the scenario is completely different to begin with.
Since 2010, Excel doesn't restrict the number of rows you can load through PowerPivot/PowerQuery (a single worksheet is still capped at 1,048,576 rows). It can handle multiple sources with several million rows each, as long as the machine has enough memory. In 2010 there was an artificial limit of 2GB on the file size (to accommodate SharePoint), but I think that was removed in 2013. That is a huge size, because PowerPivot uses the same column compression as Analysis Services.
The best option in this case is to create an Excel file with a PowerPivot connection, give it to the users and have them refresh the data whenever they want.
Unfortunately, this is a feature of Excel, not the file format. This means that you can't use the SDK to create a file with column-compressed data but have to resort to interop/VSTO again. In this case though, it's Excel that does the heavy lifting of pulling and compressing the data.
Answer 2:
Unfortunately, Excel isn't designed to be multi-threaded. What I recommend instead is making the writes more efficient. Writing cell by cell accounts for most of the slowdown.
Eliminating those two factors (organizing the data and writing it) will reduce the actual write time to the point where writing concurrently may no longer be necessary.
I had an old VSTO project in which I had to write datasets from the database. I distilled the data into a two-dimensional array and then wrote the whole array to a range on the sheet, like this:
Microsoft.Office.Tools.Excel.Worksheet TheSheet;

private void PublishToSheet(int totalRows, int maxColumns, ref string[,] OutputArray)
{
    // Select the block from A1 to the last row and column to be written
    Excel.Range Range = TheSheet.Range["A1", TheSheet.Cells[totalRows, maxColumns]];
    Range.NumberFormat = "@";   // format the cells as text
    Range.Value2 = OutputArray; // write the whole array in a single call
    LastRow = totalRows;
    LastColumn = maxColumns;
}
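The "distill into a two-dimensional array" step is not shown above. A minimal sketch of it for the DataTable from the question might look like the following; TableArrays and ToArray are hypothetical names introduced only for illustration, and the first array row is assumed to hold the column headers.

```csharp
using System;
using System.Data;

static class TableArrays // hypothetical helper for illustration
{
    // Copies the column names plus rows [startRow, startRow + rowCount) of the
    // table into a 2D array; row 0 of the array holds the headers.
    public static object[,] ToArray(DataTable table, int startRow, int rowCount)
    {
        int columns = table.Columns.Count;
        var result = new object[rowCount + 1, columns];

        for (int c = 0; c < columns; c++)
            result[0, c] = table.Columns[c].ColumnName;

        for (int r = 0; r < rowCount; r++)
            for (int c = 0; c < columns; c++)
                result[r + 1, c] = table.Rows[startRow + r][c];

        return result;
    }
}
```

The resulting array can then be assigned to a range with the same number of rows and columns through Range.Value2, as PublishToSheet does, so each sheet costs one interop call instead of one call per cell.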
Source: https://stackoverflow.com/questions/28344273/write-to-excel-file-with-multiple-threads