Accessing Excel Spreadsheet with C# occasionally returns blank value for some cells

無奈伤痛 2020-12-11 02:43

I need to access an Excel spreadsheet and insert its data into a SQL database. However, the primary keys are mixed: most are numeric and some are alpha-n

10 answers
  • 2020-12-11 02:45

    Sort the records in the .xls file by ASCII code in descending order so that the alphanumeric values appear at the top, directly below the header row. This ensures that the first data rows the driver reads are text, so the column's data type is defined as "varchar" or "nvarchar".

  • 2020-12-11 02:48

    Solution:

    1. Set HDR=No so that the first row is not treated as the column header. Connection string: Provider=Microsoft.Jet.OLEDB.4.0;Data Source=FilePath;Extended Properties="Excel 8.0;HDR=No;IMEX=1";
    2. Skip the first row and read the data by whatever means you prefer (DataTable, DataReader, etc.). Access the columns by numeric index instead of by column name.

    It worked for me. This way you don't have to modify the registry!
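
A minimal sketch of the two steps above, assuming the Jet 4.0 provider is installed and the process runs as 32-bit. The names `HeaderlessExcelReader`, `BuildConnectionString`, `ReadSheet`, and the `sheetName` parameter are illustrative and not part of the original answer.

```csharp
using System;
using System.Data;
using System.Data.OleDb;

static class HeaderlessExcelReader
{
    // Builds the connection string from the answer; HDR=No makes the driver
    // treat the header row as ordinary data.
    public static string BuildConnectionString(string filePath)
    {
        return "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + filePath +
               ";Extended Properties=\"Excel 8.0;HDR=No;IMEX=1\";";
    }

    // Reads one worksheet into a DataTable and drops the header row.
    public static DataTable ReadSheet(string filePath, string sheetName)
    {
        var table = new DataTable();
        using (var connection = new OleDbConnection(BuildConnectionString(filePath)))
        using (var adapter = new OleDbDataAdapter("SELECT * FROM [" + sheetName + "$]", connection))
        {
            // Fill opens and closes the connection when it is not already open.
            adapter.Fill(table);
        }

        // Step 2: the first row holds the header text, so skip it.
        if (table.Rows.Count > 0)
        {
            table.Rows.RemoveAt(0);
        }

        return table;
    }
}
```

With HDR=No the driver names the columns F1, F2, and so on, so reading by index (for example `row[0]`) is the simplest way to reach the key column.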

  • 2020-12-11 02:49

    The ItemArray is an Object Array. So I assume that the "column" in the DataRow, that I am trying to reference, is of type object.

  • 2020-12-11 02:50

    This isn't completely right! Apparently, Jet/ACE ALWAYS assumes a string type if the first 8 rows are blank, regardless of IMEX=1. Even when I set the number of rows read to 0 in the registry, I still had the same problem. This was the only surefire way to get it to work:

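    try
    {
        // First attempt: treat the cell in column j as a number.
        Console.Write(wsReader.GetDouble(j).ToString());
    }
    catch   // The cell was not typed as a double, so fall back to reading it as text.
    {
        Console.Write(wsReader.GetString(j));
    }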
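
An alternative that avoids using exceptions for control flow is to read the cell as an `object` with `GetValue` and convert it afterwards. The following is a minimal sketch, assuming `wsReader` implements `IDataRecord` (as `OleDbDataReader` does); the `CellText.FromValue` helper is hypothetical and not part of the original answer. It does not recover values that the driver has already returned as null.

```csharp
using System;
using System.Globalization;

static class CellText
{
    // Converts a value returned by IDataRecord.GetValue(i) to text.
    // Blank cells (null or DBNull) become an empty string.
    public static string FromValue(object value)
    {
        if (value == null || value is DBNull)
        {
            return string.Empty;
        }

        // Numbers and dates are formatted with the invariant culture so the
        // result does not depend on the machine's regional settings.
        if (value is IFormattable formattable)
        {
            return formattable.ToString(null, CultureInfo.InvariantCulture);
        }

        return value.ToString();
    }
}
```

Usage inside the read loop would then be `Console.Write(CellText.FromValue(wsReader.GetValue(j)));`.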
    

  • 2020-12-11 02:58

    I answered a similar question elsewhere. I've copied and pasted the same answer here for your convenience:

    I had this same problem, but was able to work around it without resorting to the Excel COM interface or 3rd party software. It involves a little processing overhead, but appears to be working for me.

    1. First read in the data to get the column names
    2. Then create a new DataSet with each of these columns, setting each of their DataTypes to string.
    3. Read the data in again into this new dataset. Voila - the scientific notation is now gone and everything is read in as a string.

    Here's some code that illustrates this, and as an added bonus, it's even StyleCopped!

    // Requires: using System.Data; using System.Data.OleDb; using System.Globalization;
    public void ImportSpreadsheet(string path)
    {
        string extendedProperties = "Excel 12.0;HDR=YES;IMEX=1";
        string connectionString = string.Format(
            CultureInfo.CurrentCulture,
            "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=\"{1}\"",
            path,
            extendedProperties);
    
        using (OleDbConnection connection = new OleDbConnection(connectionString))
        {
            using (OleDbCommand command = connection.CreateCommand())
            {
                command.CommandText = "SELECT * FROM [Worksheet1$]";
                connection.Open();
    
                using (OleDbDataAdapter adapter = new OleDbDataAdapter(command))
                using (DataSet columnDataSet = new DataSet())
                using (DataSet dataSet = new DataSet())
                {
                    columnDataSet.Locale = CultureInfo.CurrentCulture;
                    adapter.Fill(columnDataSet);
    
                    if (columnDataSet.Tables.Count == 1)
                    {
                        var worksheet = columnDataSet.Tables[0];
    
                        // Now that we have a valid worksheet read in, with column names, we can create a
                        // new DataSet with a table that has preset columns that are all of type string.
                        // This fixes a problem where the OLEDB provider is trying to guess the data types
                        // of the cells and strange data appears, such as scientific notation on some cells.
                        dataSet.Tables.Add("WorksheetData");
                        DataTable tempTable = dataSet.Tables[0];
    
                        foreach (DataColumn column in worksheet.Columns)
                        {
                            tempTable.Columns.Add(column.ColumnName, typeof(string));
                        }
    
                        adapter.Fill(dataSet, "WorksheetData");
    
                        if (dataSet.Tables.Count == 1)
                        {
                            worksheet = dataSet.Tables[0];
    
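                            // DataRowCollection is not generic, so declare DataRow
                            // explicitly; "var" would infer object here.
                            foreach (DataRow row in worksheet.Rows)
                            {
                                // TODO: Consume some data.
                            }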
                        }
                    }
                }
            }
        }
    }
    
  • 2020-12-11 02:59

    Solution:

    Connection String:

    Provider=Microsoft.Jet.OLEDB.4.0;Data Source=FilePath;Extended Properties="Excel 8.0;HDR=Yes;IMEX=1";

    1. HDR=Yes; indicates that the first row contains column names, not data. HDR=No; indicates the opposite.

    2. IMEX=1; tells the driver to always read "intermixed" data columns (numbers, dates, strings, etc.) as text. Note that this option might negatively affect write access to the Excel sheet.

    The SQL syntax is SELECT * FROM [sheet1$], i.e. the Excel worksheet name followed by a $ and wrapped in [ ] brackets.

    Important:

    • Check the REG_DWORD value "TypeGuessRows" under the registry key [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel]. That's the key to not letting the driver use only the first 8 rows to guess a column's data type. Set this value to 0 to scan all rows. This might hurt performance.

    • If the Excel workbook is protected by a password, you cannot open it for data access, even by supplying the correct password with your connection string. If you try, you receive the following error message: "Could not decrypt file."
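
To see what the current setting is before changing anything, the value can be read from code. This is a minimal, read-only sketch, assuming a Windows machine and a process that can see the key named above; on 64-bit Windows the Jet key may only be visible from a 32-bit process. The class name `TypeGuessRowsCheck` is illustrative.

```csharp
using System;
using Microsoft.Win32;

static class TypeGuessRowsCheck
{
    // Registry path taken from the answer above (relative to HKEY_LOCAL_MACHINE).
    const string ExcelEngineKey = @"SOFTWARE\Microsoft\Jet\4.0\Engines\Excel";

    static void Main()
    {
        // OpenSubKey(string) opens the key read-only and returns null if it is missing.
        using (RegistryKey key = Registry.LocalMachine.OpenSubKey(ExcelEngineKey))
        {
            object value = key == null ? null : key.GetValue("TypeGuessRows");

            Console.WriteLine(value == null
                ? "TypeGuessRows not found under HKLM\\" + ExcelEngineKey
                : "TypeGuessRows = " + value);
        }
    }
}
```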
