I am using ExcelDataReader to read data from my Excel workbook in C#.
But structure of my Excel sheet is such that data to be read can start from any particular cell and
If you are using ExcelDataReader 3+
you will find that there isn't any method for AsDataSet()
for your reader object, You need to also install another package for ExcelDataReader.DataSet, then you can use the AsDataSet()
method.
Also there is not a property for IsFirstRowAsColumnNames
instead you need to set it inside of ExcelDataSetConfiguration
.
Example:
using (var stream = File.Open(originalFileName, FileMode.Open, FileAccess.Read))
{
IExcelDataReader reader;
// Create Reader - old until 3.4+
////var file = new FileInfo(originalFileName);
////if (file.Extension.Equals(".xls"))
//// reader = ExcelDataReader.ExcelReaderFactory.CreateBinaryReader(stream);
////else if (file.Extension.Equals(".xlsx"))
//// reader = ExcelDataReader.ExcelReaderFactory.CreateOpenXmlReader(stream);
////else
//// throw new Exception("Invalid FileName");
// Or in 3.4+ you can only call this:
reader = ExcelDataReader.ExcelReaderFactory.CreateReader(stream)
//// reader.IsFirstRowAsColumnNames
var conf = new ExcelDataSetConfiguration
{
ConfigureDataTable = _ => new ExcelDataTableConfiguration
{
UseHeaderRow = true
}
};
var dataSet = reader.AsDataSet(conf);
// Now you can get data from each sheet by its index or its "name"
var dataTable = dataSet.Tables[0];
//...
}
You can find row number and column number of a cell reference like this:
var cellStr = "AB2"; // var cellStr = "A1";
var match = Regex.Match(cellStr, @"(?<col>[A-Z]+)(?<row>\d+)");
var colStr = match.Groups["col"].ToString();
var col = colStr.Select((t, i) => (colStr[i] - 64) * Math.Pow(26, colStr.Length - i - 1)).Sum();
var row = int.Parse(match.Groups["row"].ToString());
Now you can use some loops to read data from that cell like this:
for (var i = row; i < dataTable.Rows.Count; i++)
{
for (var j = col; j < dataTable.Columns.Count; j++)
{
var data = dataTable.Rows[i][j];
}
}
Update:
You can filter rows and columns of your Excel sheet at read time with this config:
var i = 0;
var conf = new ExcelDataSetConfiguration
{
UseColumnDataType = true,
ConfigureDataTable = _ => new ExcelDataTableConfiguration
{
FilterRow = rowReader => fromRow <= ++i - 1,
FilterColumn = (rowReader, colIndex) => fromCol <= colIndex,
UseHeaderRow = true
}
};
I found this useful to read from a specific column and row:
FileStream stream = File.Open(@"C:\Users\Desktop\ExcelDataReader.xlsx", FileMode.Open, FileAccess.Read);
IExcelDataReader excelReader = ExcelReaderFactory.CreateOpenXmlReader(stream);
DataSet result = excelReader.AsDataSet();
excelReader.IsFirstRowAsColumnNames = true;
DataTable dt = result.Tables[0];
string text = dt.Rows[1][0].ToString();
One way to do it :
FileStream stream = File.Open(@"c:\working\test.xls", FileMode.Open, FileAccess.Read);
IExcelDataReader excelReader = ExcelReaderFactory.CreateBinaryReader(stream);
excelReader.IsFirstRowAsColumnNames = true;
DataSet result = excelReader.AsDataSet();
The result.Tables
contains the sheets and the result.tables[0].Rows
contains the cell rows.
You could use the .NET library to do the same thing which i believe is more straightforward.
string ConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0; data source={path of your excel file}; Extended Properties=Excel 12.0;";
OleDbConnection objConn = null;
System.Data.DataTable dt = null;
//Create connection object by using the preceding connection string.
objConn = new OleDbConnection(connString);
objConn.Open();
//Get the data table containg the schema guid.
dt = objConn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
string sql = string.Format("select * from [{0}$]", sheetName);
var adapter = new System.Data.OleDb.OleDbDataAdapter(sql, ConnectionString);
var ds = new System.Data.DataSet();
string tableName = sheetName;
adapter.Fill(ds, tableName);
System.Data.DataTable data = ds.Tables[tableName];
After you have your data in the datatable you can access them as you would normally do with a DataTable class.
public static DataTable ConvertExcelToDataTable(string filePath, bool isXlsx = false)
{
System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance);
//open file and returns as Stream
using (var stream = File.Open(filePath, FileMode.Open, FileAccess.Read))
{
using (var reader = ExcelReaderFactory.CreateReader(stream))
{
var conf = new ExcelDataSetConfiguration
{
ConfigureDataTable = _ => new ExcelDataTableConfiguration
{
UseHeaderRow = true
}
};
var dataSet = reader.AsDataSet(conf);
// Now you can get data from each sheet by its index or its "name"
var dataTable = dataSet.Tables[0];
Console.WriteLine("Total no of rows " + dataTable.Rows.Count);
Console.WriteLine("Total no of Columns " + dataTable.Columns.Count);
return dataTable;
}
}
}
Very easy with ExcelReaderFactory 3.1 and up:
using (var openFileDialog1 = new OpenFileDialog { Filter = "Excel Workbook|*.xls;*.xlsx;*.xlsm", ValidateNames = true })
{
if (openFileDialog1.ShowDialog() == DialogResult.OK)
{
var fs = File.Open(openFileDialog1.FileName, FileMode.Open, FileAccess.Read);
var reader = ExcelReaderFactory.CreateBinaryReader(fs);
var dataSet = reader.AsDataSet(new ExcelDataSetConfiguration
{
ConfigureDataTable = _ => new ExcelDataTableConfiguration
{
UseHeaderRow = true // Use first row is ColumnName here :D
}
});
if (dataSet.Tables.Count > 0)
{
var dtData = dataSet.Tables[0];
// Do Something
}
}
}