问题
I found NPOI is very good to write Excel files with C#.
But I want to open, read and modify Excel files in C#.
How can I do this?
回答1:
Simple read example below:
using NPOI.HSSF.UserModel;
using NPOI.SS.UserModel;
//.....
private void button1_Click(object sender, EventArgs e)
{
HSSFWorkbook hssfwb;
using (FileStream file = new FileStream(@"c:\test.xls", FileMode.Open, FileAccess.Read))
{
hssfwb= new HSSFWorkbook(file);
}
ISheet sheet = hssfwb.GetSheet("Arkusz1");
for (int row = 0; row <= sheet.LastRowNum; row++)
{
if (sheet.GetRow(row) != null) //null is when the row only contains empty cells
{
MessageBox.Show(string.Format("Row {0} = {1}", row, sheet.GetRow(row).GetCell(0).StringCellValue));
}
}
}
By the way: on NPOI website here in Download section there is example package - a pack of C# examples. Try it, if you haven't yet. :)
回答2:
It might be helpful to rely on the Workbook factory to instantiate the workbook object since the factory method will do the detection of xls or xlsx for you. Reference: http://apache-poi.1045710.n5.nabble.com/How-to-check-for-valid-excel-files-using-POI-without-checking-the-file-extension-td2341055.html
IWorkbook workbook = WorkbookFactory.Create(inputStream);
If you're not sure of the Sheet's name but you are sure of the index (0 based), you can grab the sheet like this:
ISheet sheet = workbook.GetSheetAt(sheetIndex);
You can then iterate through the rows using code supplied by the accepted answer from mj82
回答3:
I find NPOI very usefull for working with Excel Files, here is my implementation (Comments are in Spanish, sorry for that):
This Method Opens an Excel (both xls or xlsx) file and converts it into a DataTable.
/// <summary>Abre un archivo de Excel (xls o xlsx) y lo convierte en un DataTable.
/// LA PRIMERA FILA DEBE CONTENER LOS NOMBRES DE LOS CAMPOS.</summary>
/// <param name="pRutaArchivo">Ruta completa del archivo a abrir.</param>
/// <param name="pHojaIndex">Número (basado en cero) de la hoja que se desea abrir. 0 es la primera hoja.</param>
private DataTable Excel_To_DataTable(string pRutaArchivo, int pHojaIndex)
{
// --------------------------------- //
/* REFERENCIAS:
* NPOI.dll
* NPOI.OOXML.dll
* NPOI.OpenXml4Net.dll */
// --------------------------------- //
/* USING:
* using NPOI.SS.UserModel;
* using NPOI.HSSF.UserModel;
* using NPOI.XSSF.UserModel; */
// AUTOR: Ing. Jhollman Chacon R. 2015
// --------------------------------- //
DataTable Tabla = null;
try
{
if (System.IO.File.Exists(pRutaArchivo))
{
IWorkbook workbook = null; //IWorkbook determina si es xls o xlsx
ISheet worksheet = null;
string first_sheet_name = "";
using (FileStream FS = new FileStream(pRutaArchivo, FileMode.Open, FileAccess.Read))
{
workbook = WorkbookFactory.Create(FS); //Abre tanto XLS como XLSX
worksheet = workbook.GetSheetAt(pHojaIndex); //Obtener Hoja por indice
first_sheet_name = worksheet.SheetName; //Obtener el nombre de la Hoja
Tabla = new DataTable(first_sheet_name);
Tabla.Rows.Clear();
Tabla.Columns.Clear();
// Leer Fila por fila desde la primera
for (int rowIndex = 0; rowIndex <= worksheet.LastRowNum; rowIndex++)
{
DataRow NewReg = null;
IRow row = worksheet.GetRow(rowIndex);
IRow row2 = null;
IRow row3 = null;
if (rowIndex == 0)
{
row2 = worksheet.GetRow(rowIndex + 1); //Si es la Primera fila, obtengo tambien la segunda para saber el tipo de datos
row3 = worksheet.GetRow(rowIndex + 2); //Y la tercera tambien por las dudas
}
if (row != null) //null is when the row only contains empty cells
{
if (rowIndex > 0) NewReg = Tabla.NewRow();
int colIndex = 0;
//Leer cada Columna de la fila
foreach (ICell cell in row.Cells)
{
object valorCell = null;
string cellType = "";
string[] cellType2 = new string[2];
if (rowIndex == 0) //Asumo que la primera fila contiene los titlos:
{
for (int i = 0; i < 2; i++)
{
ICell cell2 = null;
if (i == 0) { cell2 = row2.GetCell(cell.ColumnIndex); }
else { cell2 = row3.GetCell(cell.ColumnIndex); }
if (cell2 != null)
{
switch (cell2.CellType)
{
case CellType.Blank: break;
case CellType.Boolean: cellType2[i] = "System.Boolean"; break;
case CellType.String: cellType2[i] = "System.String"; break;
case CellType.Numeric:
if (HSSFDateUtil.IsCellDateFormatted(cell2)) { cellType2[i] = "System.DateTime"; }
else
{
cellType2[i] = "System.Double"; //valorCell = cell2.NumericCellValue;
}
break;
case CellType.Formula:
bool continuar = true;
switch (cell2.CachedFormulaResultType)
{
case CellType.Boolean: cellType2[i] = "System.Boolean"; break;
case CellType.String: cellType2[i] = "System.String"; break;
case CellType.Numeric:
if (HSSFDateUtil.IsCellDateFormatted(cell2)) { cellType2[i] = "System.DateTime"; }
else
{
try
{
//DETERMINAR SI ES BOOLEANO
if (cell2.CellFormula == "TRUE()") { cellType2[i] = "System.Boolean"; continuar = false; }
if (continuar && cell2.CellFormula == "FALSE()") { cellType2[i] = "System.Boolean"; continuar = false; }
if (continuar) { cellType2[i] = "System.Double"; continuar = false; }
}
catch { }
} break;
}
break;
default:
cellType2[i] = "System.String"; break;
}
}
}
//Resolver las diferencias de Tipos
if (cellType2[0] == cellType2[1]) { cellType = cellType2[0]; }
else
{
if (cellType2[0] == null) cellType = cellType2[1];
if (cellType2[1] == null) cellType = cellType2[0];
if (cellType == "") cellType = "System.String";
}
//Obtener el nombre de la Columna
string colName = "Column_{0}";
try { colName = cell.StringCellValue; }
catch { colName = string.Format(colName, colIndex); }
//Verificar que NO se repita el Nombre de la Columna
foreach (DataColumn col in Tabla.Columns)
{
if (col.ColumnName == colName) colName = string.Format("{0}_{1}", colName, colIndex);
}
//Agregar el campos de la tabla:
DataColumn codigo = new DataColumn(colName, System.Type.GetType(cellType));
Tabla.Columns.Add(codigo); colIndex++;
}
else
{
//Las demas filas son registros:
switch (cell.CellType)
{
case CellType.Blank: valorCell = DBNull.Value; break;
case CellType.Boolean: valorCell = cell.BooleanCellValue; break;
case CellType.String: valorCell = cell.StringCellValue; break;
case CellType.Numeric:
if (HSSFDateUtil.IsCellDateFormatted(cell)) { valorCell = cell.DateCellValue; }
else { valorCell = cell.NumericCellValue; } break;
case CellType.Formula:
switch (cell.CachedFormulaResultType)
{
case CellType.Blank: valorCell = DBNull.Value; break;
case CellType.String: valorCell = cell.StringCellValue; break;
case CellType.Boolean: valorCell = cell.BooleanCellValue; break;
case CellType.Numeric:
if (HSSFDateUtil.IsCellDateFormatted(cell)) { valorCell = cell.DateCellValue; }
else { valorCell = cell.NumericCellValue; }
break;
}
break;
default: valorCell = cell.StringCellValue; break;
}
//Agregar el nuevo Registro
if (cell.ColumnIndex <= Tabla.Columns.Count - 1) NewReg[cell.ColumnIndex] = valorCell;
}
}
}
if (rowIndex > 0) Tabla.Rows.Add(NewReg);
}
Tabla.AcceptChanges();
}
}
else
{
throw new Exception("ERROR 404: El archivo especificado NO existe.");
}
}
catch (Exception ex)
{
throw ex;
}
return Tabla;
}
This Second method does the oposite, saves a DataTable into an Excel File, yeah it can either be xls or the new xlsx, your choise!
/// <summary>Convierte un DataTable en un archivo de Excel (xls o Xlsx) y lo guarda en disco.</summary>
/// <param name="pDatos">Datos de la Tabla a guardar. Usa el nombre de la tabla como nombre de la Hoja</param>
/// <param name="pFilePath">Ruta del archivo donde se guarda.</param>
private void DataTable_To_Excel(DataTable pDatos, string pFilePath)
{
try
{
if (pDatos != null && pDatos.Rows.Count > 0)
{
IWorkbook workbook = null;
ISheet worksheet = null;
using (FileStream stream = new FileStream(pFilePath, FileMode.Create, FileAccess.ReadWrite))
{
string Ext = System.IO.Path.GetExtension(pFilePath); //<-Extension del archivo
switch (Ext.ToLower())
{
case ".xls":
HSSFWorkbook workbookH = new HSSFWorkbook();
NPOI.HPSF.DocumentSummaryInformation dsi = NPOI.HPSF.PropertySetFactory.CreateDocumentSummaryInformation();
dsi.Company = "Cutcsa"; dsi.Manager = "Departamento Informatico";
workbookH.DocumentSummaryInformation = dsi;
workbook = workbookH;
break;
case ".xlsx": workbook = new XSSFWorkbook(); break;
}
worksheet = workbook.CreateSheet(pDatos.TableName); //<-Usa el nombre de la tabla como nombre de la Hoja
//CREAR EN LA PRIMERA FILA LOS TITULOS DE LAS COLUMNAS
int iRow = 0;
if (pDatos.Columns.Count > 0)
{
int iCol = 0;
IRow fila = worksheet.CreateRow(iRow);
foreach (DataColumn columna in pDatos.Columns)
{
ICell cell = fila.CreateCell(iCol, CellType.String);
cell.SetCellValue(columna.ColumnName);
iCol++;
}
iRow++;
}
//FORMATOS PARA CIERTOS TIPOS DE DATOS
ICellStyle _doubleCellStyle = workbook.CreateCellStyle();
_doubleCellStyle.DataFormat = workbook.CreateDataFormat().GetFormat("#,##0.###");
ICellStyle _intCellStyle = workbook.CreateCellStyle();
_intCellStyle.DataFormat = workbook.CreateDataFormat().GetFormat("#,##0");
ICellStyle _boolCellStyle = workbook.CreateCellStyle();
_boolCellStyle.DataFormat = workbook.CreateDataFormat().GetFormat("BOOLEAN");
ICellStyle _dateCellStyle = workbook.CreateCellStyle();
_dateCellStyle.DataFormat = workbook.CreateDataFormat().GetFormat("dd-MM-yyyy");
ICellStyle _dateTimeCellStyle = workbook.CreateCellStyle();
_dateTimeCellStyle.DataFormat = workbook.CreateDataFormat().GetFormat("dd-MM-yyyy HH:mm:ss");
//AHORA CREAR UNA FILA POR CADA REGISTRO DE LA TABLA
foreach (DataRow row in pDatos.Rows)
{
IRow fila = worksheet.CreateRow(iRow);
int iCol = 0;
foreach (DataColumn column in pDatos.Columns)
{
ICell cell = null; //<-Representa la celda actual
object cellValue = row[iCol]; //<- El valor actual de la celda
switch (column.DataType.ToString())
{
case "System.Boolean":
if (cellValue != DBNull.Value)
{
cell = fila.CreateCell(iCol, CellType.Boolean);
if (Convert.ToBoolean(cellValue)) { cell.SetCellFormula("TRUE()"); }
else { cell.SetCellFormula("FALSE()"); }
cell.CellStyle = _boolCellStyle;
}
break;
case "System.String":
if (cellValue != DBNull.Value)
{
cell = fila.CreateCell(iCol, CellType.String);
cell.SetCellValue(Convert.ToString(cellValue));
}
break;
case "System.Int32":
if (cellValue != DBNull.Value)
{
cell = fila.CreateCell(iCol, CellType.Numeric);
cell.SetCellValue(Convert.ToInt32(cellValue));
cell.CellStyle = _intCellStyle;
}
break;
case "System.Int64":
if (cellValue != DBNull.Value)
{
cell = fila.CreateCell(iCol, CellType.Numeric);
cell.SetCellValue(Convert.ToInt64(cellValue));
cell.CellStyle = _intCellStyle;
}
break;
case "System.Decimal":
if (cellValue != DBNull.Value)
{
cell = fila.CreateCell(iCol, CellType.Numeric);
cell.SetCellValue(Convert.ToDouble(cellValue));
cell.CellStyle = _doubleCellStyle;
}
break;
case "System.Double":
if (cellValue != DBNull.Value)
{
cell = fila.CreateCell(iCol, CellType.Numeric);
cell.SetCellValue(Convert.ToDouble(cellValue));
cell.CellStyle = _doubleCellStyle;
}
break;
case "System.DateTime":
if (cellValue != DBNull.Value)
{
cell = fila.CreateCell(iCol, CellType.Numeric);
cell.SetCellValue(Convert.ToDateTime(cellValue));
//Si No tiene valor de Hora, usar formato dd-MM-yyyy
DateTime cDate = Convert.ToDateTime(cellValue);
if (cDate != null && cDate.Hour > 0) { cell.CellStyle = _dateTimeCellStyle; }
else { cell.CellStyle = _dateCellStyle; }
}
break;
default:
break;
}
iCol++;
}
iRow++;
}
workbook.Write(stream);
stream.Close();
}
}
}
catch (Exception ex)
{
throw ex;
}
}
With this 2 methods you can Open an Excel file, load it into a DataTable, do your modifications and save it back into an Excel file.
Hope you guys find this usefull.
回答4:
private DataTable GetDataTableFromExcel(String Path)
{
XSSFWorkbook wb;
XSSFSheet sh;
String Sheet_name;
using (var fs = new FileStream(Path, FileMode.Open, FileAccess.Read))
{
wb = new XSSFWorkbook(fs);
Sheet_name= wb.GetSheetAt(0).SheetName; //get first sheet name
}
DataTable DT = new DataTable();
DT.Rows.Clear();
DT.Columns.Clear();
// get sheet
sh = (XSSFSheet)wb.GetSheet(Sheet_name);
int i = 0;
while (sh.GetRow(i) != null)
{
// add neccessary columns
if (DT.Columns.Count < sh.GetRow(i).Cells.Count)
{
for (int j = 0; j < sh.GetRow(i).Cells.Count; j++)
{
DT.Columns.Add("", typeof(string));
}
}
// add row
DT.Rows.Add();
// write row value
for (int j = 0; j < sh.GetRow(i).Cells.Count; j++)
{
var cell = sh.GetRow(i).GetCell(j);
if (cell != null)
{
// TODO: you can add more cell types capatibility, e. g. formula
switch (cell.CellType)
{
case NPOI.SS.UserModel.CellType.Numeric:
DT.Rows[i][j] = sh.GetRow(i).GetCell(j).NumericCellValue;
//dataGridView1[j, i].Value = sh.GetRow(i).GetCell(j).NumericCellValue;
break;
case NPOI.SS.UserModel.CellType.String:
DT.Rows[i][j] = sh.GetRow(i).GetCell(j).StringCellValue;
break;
}
}
}
i++;
}
return DT;
}
回答5:
private static ISheet GetFileStream(string fullFilePath)
{
var fileExtension = Path.GetExtension(fullFilePath);
string sheetName;
ISheet sheet = null;
switch (fileExtension)
{
case ".xlsx":
using (var fs = new FileStream(fullFilePath, FileMode.Open, FileAccess.Read))
{
var wb = new XSSFWorkbook(fs);
sheetName = wb.GetSheetAt(0).SheetName;
sheet = (XSSFSheet) wb.GetSheet(sheetName);
}
break;
case ".xls":
using (var fs = new FileStream(fullFilePath, FileMode.Open, FileAccess.Read))
{
var wb = new HSSFWorkbook(fs);
sheetName = wb.GetSheetAt(0).SheetName;
sheet = (HSSFSheet) wb.GetSheet(sheetName);
}
break;
}
return sheet;
}
private static DataTable GetRequestsDataFromExcel(string fullFilePath)
{
try
{
var sh = GetFileStream(fullFilePath);
var dtExcelTable = new DataTable();
dtExcelTable.Rows.Clear();
dtExcelTable.Columns.Clear();
var headerRow = sh.GetRow(0);
int colCount = headerRow.LastCellNum;
for (var c = 0; c < colCount; c++)
dtExcelTable.Columns.Add(headerRow.GetCell(c).ToString());
var i = 1;
var currentRow = sh.GetRow(i);
while (currentRow != null)
{
var dr = dtExcelTable.NewRow();
for (var j = 0; j < currentRow.Cells.Count; j++)
{
var cell = currentRow.GetCell(j);
if (cell != null)
switch (cell.CellType)
{
case CellType.Numeric:
dr[j] = DateUtil.IsCellDateFormatted(cell)
? cell.DateCellValue.ToString(CultureInfo.InvariantCulture)
: cell.NumericCellValue.ToString(CultureInfo.InvariantCulture);
break;
case CellType.String:
dr[j] = cell.StringCellValue;
break;
case CellType.Blank:
dr[j] = string.Empty;
break;
}
}
dtExcelTable.Rows.Add(dr);
i++;
currentRow = sh.GetRow(i);
}
return dtExcelTable;
}
catch (Exception e)
{
throw;
}
}
回答6:
As Janoulle pointed out, you don't need to detect which extension it is if you use the WorkbookFactory, it will do it for you. I recently had to implement a solution using NPOI to read Excel files and import email addresses into a sql database. My main problem was that I was probably going to receive about 12 different Excel layouts from different customers so I needed something that could be changed quickly without much code. I ended up using Npoi.Mapper which is an awesome tool! Highly recommended!
Here is my complete solution:
using System.IO;
using System.Linq;
using Npoi.Mapper;
using Npoi.Mapper.Attributes;
using NPOI.SS.UserModel;
namespace JobCustomerImport.Processors
{
public class ExcelEmailProcessor
{
private UserManagementServiceContext DataContext { get; }
public ExcelEmailProcessor(int customerNumber)
{
DataContext = new UserManagementServiceContext();
}
public void Execute(string localPath, int sheetIndex)
{
IWorkbook workbook;
using (FileStream file = new FileStream(localPath, FileMode.Open, FileAccess.Read))
{
workbook = WorkbookFactory.Create(file);
}
var importer = new Mapper(workbook);
var items = importer.Take<MurphyExcelFormat>(sheetIndex);
foreach(var item in items)
{
var row = item.Value;
if (string.IsNullOrEmpty(row.EmailAddress))
continue;
UpdateUser(row);
}
DataContext.SaveChanges();
}
private void UpdateUser(MurphyExcelFormat row)
{
//LOGIC HERE TO UPDATE A USER IN DATABASE...
}
private class MurphyExcelFormat
{
[Column("District")]
public int District { get; set; }
[Column("DM")]
public string FullName { get; set; }
[Column("Email Address")]
public string EmailAddress { get; set; }
[Column(3)]
public string Username { get; set; }
public string FirstName
{
get
{
return Username.Split('.')[0];
}
}
public string LastName
{
get
{
return Username.Split('.')[1];
}
}
}
}
}
I am so happy with NPOI + Npoi.Mapper (from Donny Tian) as an Excel import solution that I wrote a blog post about it, going in to more detail about this code above. You can read it here if you wish: Easiest way to import excel files. The best thing about this solution is that it runs perfectly in a serverless azure/cloud environment which I couldn't get with other Excel tools/libraries.
回答7:
Since you've asked to read and modify the xls file I have changed @mj82's answer to correspond your needs.
HSSFWorkbook
does not have Save
method, but it does have Write
to a stream.
static void Main(string[] args)
{
string filepath = @"C:\test.xls";
HSSFWorkbook hssfwb;
using (FileStream file = new FileStream(filepath, FileMode.Open, FileAccess.Read))
{
hssfwb = new HSSFWorkbook(file);
}
ISheet sheet = hssfwb.GetSheetAt(0);
for (int row = 0; row <= sheet.LastRowNum; row++)
{
if (sheet.GetRow(row) != null) //null is when the row only contains empty cells
{
// Set new cell value
sheet.GetRow(row).GetCell(0).SetCellValue("foo");
Console.WriteLine("Row {0} = {1}", row, sheet.GetRow(row).GetCell(0).StringCellValue);
}
}
// Save the file
using (FileStream file = new FileStream(filepath, FileMode.Open, FileAccess.Write))
{
hssfwb.Write(file);
}
Console.ReadLine();
}
来源:https://stackoverflow.com/questions/5855813/how-to-read-file-using-npoi