Question
I have an ASP.NET C# application that reads the contents of a spreadsheet using OleDbConnection. I use the following line of code to read from the Excel spreadsheet.
OleDbConnection con = new OleDbConnection(@"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + fullFilePath + ";Extended Properties='Excel 8.0;HDR=Yes;IMEX=1'");
One of my columns contains data in various formats (strings, numbers, dates, etc.) in different rows. When the format of a cell differs from the rest of the column, its value is not read from the Excel file. I searched the net a lot and found that the IMEX property has to be specified in the connection string. I added it, but it made no difference.
After a lot of searching, I found that any built-in Excel driver will query the first 8 rows of a sheet and then make a determination (without your permission or knowledge) as to what type of column it is, thereby ignoring anything that doesn't meet this data type later in the sheet.
http://www.mattjwilson.com/blog/2009/02/13/microsoft-excel-drivers-and-imex/
Is there any way to get around this problem?
Answer 1:
You are running into one of the many fun features of the JET engine. It samples the first few rows of each column and tries to guess the column's data format from them. If you want your code to "just work", there is a registry setting that will help with this. However, be forewarned that this registry setting affects how JET works with all imports on the system, not just your particular import.
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel]
"ImportMixedTypes"="Text"
"TypeGuessRows"=dword:00000000
This registry setting tells JET to check the format of every row in a column before guessing a format, instead of only the first few. If it finds mixed content, it imports the column as text. Note that ImportMixedTypes is only honored when IMEX=1 is set in the connection string, as it is in the question.
By default JET tests the first 8 rows when type guessing.
Alternatively, you can change TypeGuessRows to 1 and JET will check only the first row when type guessing. That means if the first row is a number and the second row is a string, JET will assume the whole column is numeric, and the string values will come back as null in ADO.NET.
Another caveat: be careful when editing your registry. You can damage your system very quickly if you do not take care.
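Before changing anything, it can help to see what the machine is currently configured with. The following is a minimal read-only sketch, assuming the key path shown above; the class and method names (JetExcelSettings, Read, Describe) are made up for illustration. It reads the registry view of the calling process, so a 64-bit process would need to look under SOFTWARE\Wow6432Node to see the settings used by the 32-bit Jet provider.

```csharp
using System;
using Microsoft.Win32;

static class JetExcelSettings
{
    // Key path taken from the registry snippet in this answer.
    public const string KeyPath = @"SOFTWARE\Microsoft\Jet\4.0\Engines\Excel";

    // Returns the named value under the key, or null if the key or value is missing.
    // The key is opened read-only, so nothing on the system is modified.
    public static object Read(string valueName)
    {
        using (RegistryKey key = Registry.LocalMachine.OpenSubKey(KeyPath))
        {
            return key == null ? null : key.GetValue(valueName);
        }
    }

    // Hypothetical helper: formats both settings for a log line.
    public static string Describe(object typeGuessRows, object importMixedTypes)
    {
        return "TypeGuessRows=" + (typeGuessRows ?? "(not set)") +
               ", ImportMixedTypes=" + (importMixedTypes ?? "(not set)");
    }
}
```

A call such as JetExcelSettings.Describe(JetExcelSettings.Read("TypeGuessRows"), JetExcelSettings.Read("ImportMixedTypes")) could be logged at application start to confirm which settings the import will run under.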
Answer 2:
Update: it seems that Microsoft really does not recommend using Excel COM services on servers. Still, many developers do, in both non-.NET (as my employer does) and .NET (see here) environments, as the alternatives are costly. Most problems are solvable (aside from potential scalability and performance problems in high-volume applications and, in some cases, licensing problems). The costly alternatives are third-party solutions like this.
You should not use OleDbConnection when you have data of different data types in one column. You can try to read from Excel using the Excel COM/OLE API, for example (compiled from here, may contain errors):
Add the following references to the project:
Microsoft Excel 10.0 Object Library
Microsoft Office 10.0 Object Library
Import the Excel namespace.
using Excel;
...
// Start a hidden Excel instance.
Excel.Application xl = new Excel.Application();
xl.Visible = false;
xl.UserControl = false;
// Open the workbook read-only (third argument).
Excel.Workbook theWorkbook = xl.Workbooks.Open(
    fileName, 0, true, 5,
    "", "", true, Excel.XlPlatform.xlWindows, "\t", false, false,
    0, true);
Excel.Sheets sheets = theWorkbook.Worksheets;
Excel.Worksheet worksheet = (Excel.Worksheet)sheets.get_Item(1);
// Read cells A1:E1 of the first worksheet into an array.
System.Array myvalues;
Excel.Range range = worksheet.get_Range("A1", "E1");
myvalues = (System.Array)range.Cells.Value;
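The myvalues array holds one object per cell, so mixed strings, numbers and dates in the same column all survive. One thing that is easy to trip over is that arrays returned by Excel through COM are 1-based in both dimensions. Below is a small sketch of walking such an array without hard-coding the bounds; RangeValues and ToStrings are hypothetical names, and converting every cell to text is only one possible way to consume the values.

```csharp
using System;
using System.Collections.Generic;

static class RangeValues
{
    // Flattens a 2-D array (as returned for a multi-cell range) into text,
    // row by row. The bounds are queried rather than assumed, because the
    // array that comes back from Excel starts at index 1, not 0.
    public static List<string> ToStrings(Array values)
    {
        var result = new List<string>();
        for (int r = values.GetLowerBound(0); r <= values.GetUpperBound(0); r++)
        {
            for (int c = values.GetLowerBound(1); c <= values.GetUpperBound(1); c++)
            {
                object cell = values.GetValue(r, c);   // null for an empty cell
                result.Add(cell == null ? "" : Convert.ToString(cell));
            }
        }
        return result;
    }
}
```

With the code above this would be called as RangeValues.ToStrings(myvalues) before the workbook is closed and the COM objects are released.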
Important! You should free the resources used. From here:
// All of the following code is needed to clean up and release every reference!
theWorkbook.Close(null,null,null);
xl.Workbooks.Close();
xl.Quit();
System.Runtime.InteropServices.Marshal.ReleaseComObject (range);
System.Runtime.InteropServices.Marshal.ReleaseComObject (sheets);
System.Runtime.InteropServices.Marshal.ReleaseComObject (xl);
System.Runtime.InteropServices.Marshal.ReleaseComObject (worksheet);
System.Runtime.InteropServices.Marshal.ReleaseComObject (theWorkbook);
worksheet=null;
sheets=null;
theWorkbook=null;
xl = null;
GC.Collect(); // force final cleanup!
Answer 3:
SpreadsheetGear for .NET can read, write, and calculate Excel workbooks, and allows you to access the underlying data (number, text, logical, error) of any cell or the formatted text of any cell using APIs such as IWorksheet.Cells[rowIndex, colIndex].Value or IWorksheet.Cells[rowIndex, colIndex].Text. There is no limitation based on the type of data in each column or cell. SpreadsheetGear is 100% safe .NET code (no COM interop, no unsafe native calls, etc.), so it is easier to deploy than other options, especially in server scenarios.
You can see live samples here and download the free trial here.
Disclaimer: I own SpreadsheetGear LLC
Answer 4:
When everything else failed, this is what I did: while importing from Excel I specified HDR=No in the connection string. This imported the header as the first row, thus making all the column data types text. After that, a simple function assigns the column names of the DataTable from that first row. Something like the code below:
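private DataTable NameHeaderRows(DataTable dt)
{
    // Nothing to promote if the sheet came back empty.
    if (dt.Rows.Count == 0) return dt;
    // Use the values of the first row (the sheet's header) as column names.
    for (int i = 0; i < dt.Columns.Count; i++)
    {
        dt.Columns[i].ColumnName = dt.Rows[0][i].ToString();
    }
    // Drop the header row so that only data rows remain.
    dt.Rows.RemoveAt(0);
    return dt;
}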
I know it is tedious, but I did not find any other feasible solution. Any other suggestions are welcome.
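One thing to watch with NameHeaderRows above: a DataTable rejects duplicate column names (DuplicateNameException), and real spreadsheets often have repeated or blank header cells. The following is a more defensive sketch of the same idea, not a drop-in from any library; HeaderPromotion, PromoteFirstRow, the "ColumnN" fallback and the "_2" suffix scheme are all assumptions chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

static class HeaderPromotion
{
    // Turns the first row of dt into column names, then removes that row.
    public static DataTable PromoteFirstRow(DataTable dt)
    {
        if (dt.Rows.Count == 0) return dt;

        // Pass 1: work out a unique, non-empty name for every column.
        var used = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        var names = new string[dt.Columns.Count];
        for (int i = 0; i < names.Length; i++)
        {
            string name = Convert.ToString(dt.Rows[0][i]).Trim();
            if (name.Length == 0) name = "Column" + (i + 1);   // blank header cell
            string candidate = name;
            for (int n = 2; !used.Add(candidate); n++) candidate = name + "_" + n;
            names[i] = candidate;
        }

        // Pass 2: park every column under a throwaway name first, so a new
        // name can never clash with an old name still held by a later column.
        for (int i = 0; i < names.Length; i++)
            dt.Columns[i].ColumnName = "__tmp" + i + "_" + Guid.NewGuid().ToString("N");
        for (int i = 0; i < names.Length; i++)
            dt.Columns[i].ColumnName = names[i];

        dt.Rows.RemoveAt(0);   // drop the header row, keep only data rows
        return dt;
    }
}
```

The two-pass rename costs a few extra lines, but it means the result does not depend on the placeholder names the provider assigned to the columns when the sheet was read with HDR=No.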
Source: https://stackoverflow.com/questions/1699483/excel-reading-in-asp-net-data-not-being-read-if-column-has-different-data-form