How to read file using NPOI

天涯浪子 提交于 2019-11-26 06:27:21

问题


I found NPOI is very good to write Excel files with C#.

But I want to open, read and modify Excel files in C#.

How can I do this?


回答1:


Simple read example below:

using NPOI.HSSF.UserModel;
using NPOI.SS.UserModel;

//.....

private void button1_Click(object sender, EventArgs e)
{
    HSSFWorkbook hssfwb;
    using (FileStream file = new FileStream(@"c:\test.xls", FileMode.Open, FileAccess.Read))
    {
        hssfwb= new HSSFWorkbook(file);
    }

    ISheet sheet = hssfwb.GetSheet("Arkusz1");
    for (int row = 0; row <= sheet.LastRowNum; row++)
    {
        if (sheet.GetRow(row) != null) //null is when the row only contains empty cells 
        {
            MessageBox.Show(string.Format("Row {0} = {1}", row, sheet.GetRow(row).GetCell(0).StringCellValue));
        }
    }
}  

By the way: on NPOI website here in Download section there is example package - a pack of C# examples. Try it, if you haven't yet. :)




回答2:


It might be helpful to rely on the Workbook factory to instantiate the workbook object since the factory method will do the detection of xls or xlsx for you. Reference: http://apache-poi.1045710.n5.nabble.com/How-to-check-for-valid-excel-files-using-POI-without-checking-the-file-extension-td2341055.html

IWorkbook workbook = WorkbookFactory.Create(inputStream);

If you're not sure of the Sheet's name but you are sure of the index (0 based), you can grab the sheet like this:

ISheet sheet = workbook.GetSheetAt(sheetIndex);

You can then iterate through the rows using code supplied by the accepted answer from mj82




回答3:


I find NPOI very usefull for working with Excel Files, here is my implementation (Comments are in Spanish, sorry for that):

This Method Opens an Excel (both xls or xlsx) file and converts it into a DataTable.

        /// <summary>Abre un archivo de Excel (xls o xlsx) y lo convierte en un DataTable.
    /// LA PRIMERA FILA DEBE CONTENER LOS NOMBRES DE LOS CAMPOS.</summary>
    /// <param name="pRutaArchivo">Ruta completa del archivo a abrir.</param>
    /// <param name="pHojaIndex">Número (basado en cero) de la hoja que se desea abrir. 0 es la primera hoja.</param>
    private DataTable Excel_To_DataTable(string pRutaArchivo, int pHojaIndex)
    {
        // --------------------------------- //
        /* REFERENCIAS:
         * NPOI.dll
         * NPOI.OOXML.dll
         * NPOI.OpenXml4Net.dll */
        // --------------------------------- //
        /* USING:
         * using NPOI.SS.UserModel;
         * using NPOI.HSSF.UserModel;
         * using NPOI.XSSF.UserModel; */
        // AUTOR: Ing. Jhollman Chacon R. 2015
        // --------------------------------- //
        DataTable Tabla = null;
        try
        {
            if (System.IO.File.Exists(pRutaArchivo))
            {

                IWorkbook workbook = null;  //IWorkbook determina si es xls o xlsx              
                ISheet worksheet = null;
                string first_sheet_name = "";

                using (FileStream FS = new FileStream(pRutaArchivo, FileMode.Open, FileAccess.Read))
                {
                    workbook = WorkbookFactory.Create(FS);          //Abre tanto XLS como XLSX
                    worksheet = workbook.GetSheetAt(pHojaIndex);    //Obtener Hoja por indice
                    first_sheet_name = worksheet.SheetName;         //Obtener el nombre de la Hoja

                    Tabla = new DataTable(first_sheet_name);
                    Tabla.Rows.Clear();
                    Tabla.Columns.Clear();

                    // Leer Fila por fila desde la primera
                    for (int rowIndex = 0; rowIndex <= worksheet.LastRowNum; rowIndex++)
                    {
                        DataRow NewReg = null;
                        IRow row = worksheet.GetRow(rowIndex);
                        IRow row2 = null;
                        IRow row3 = null;

                        if (rowIndex == 0)
                        {
                            row2 = worksheet.GetRow(rowIndex + 1); //Si es la Primera fila, obtengo tambien la segunda para saber el tipo de datos
                            row3 = worksheet.GetRow(rowIndex + 2); //Y la tercera tambien por las dudas
                        }

                        if (row != null) //null is when the row only contains empty cells 
                        {
                            if (rowIndex > 0) NewReg = Tabla.NewRow();

                            int colIndex = 0;
                            //Leer cada Columna de la fila
                            foreach (ICell cell in row.Cells)
                            {
                                object valorCell = null;
                                string cellType = "";
                                string[] cellType2 = new string[2];

                                if (rowIndex == 0) //Asumo que la primera fila contiene los titlos:
                                {
                                    for (int i = 0; i < 2; i++)
                                    {
                                        ICell cell2 = null;
                                        if (i == 0) { cell2 = row2.GetCell(cell.ColumnIndex); }
                                        else { cell2 = row3.GetCell(cell.ColumnIndex); }

                                        if (cell2 != null)
                                        {
                                            switch (cell2.CellType)
                                            {
                                                case CellType.Blank: break;
                                                case CellType.Boolean: cellType2[i] = "System.Boolean"; break;
                                                case CellType.String: cellType2[i] = "System.String"; break;
                                                case CellType.Numeric:
                                                    if (HSSFDateUtil.IsCellDateFormatted(cell2)) { cellType2[i] = "System.DateTime"; }
                                                    else
                                                    {
                                                        cellType2[i] = "System.Double";  //valorCell = cell2.NumericCellValue;
                                                    }
                                                    break;

                                                case CellType.Formula:                                                      
                                                    bool continuar = true;
                                                    switch (cell2.CachedFormulaResultType)
                                                    {
                                                        case CellType.Boolean: cellType2[i] = "System.Boolean"; break;
                                                        case CellType.String: cellType2[i] = "System.String"; break;
                                                        case CellType.Numeric:
                                                            if (HSSFDateUtil.IsCellDateFormatted(cell2)) { cellType2[i] = "System.DateTime"; }
                                                            else
                                                            {
                                                                try
                                                                {
                                                                    //DETERMINAR SI ES BOOLEANO
                                                                    if (cell2.CellFormula == "TRUE()") { cellType2[i] = "System.Boolean"; continuar = false; }
                                                                    if (continuar && cell2.CellFormula == "FALSE()") { cellType2[i] = "System.Boolean"; continuar = false; }
                                                                    if (continuar) { cellType2[i] = "System.Double"; continuar = false; }
                                                                }
                                                                catch { }
                                                            } break;
                                                    }
                                                    break;
                                                default:
                                                    cellType2[i] = "System.String"; break;
                                            }
                                        }
                                    }

                                    //Resolver las diferencias de Tipos
                                    if (cellType2[0] == cellType2[1]) { cellType = cellType2[0]; }
                                    else
                                    {
                                        if (cellType2[0] == null) cellType = cellType2[1];
                                        if (cellType2[1] == null) cellType = cellType2[0];
                                        if (cellType == "") cellType = "System.String";
                                    }

                                    //Obtener el nombre de la Columna
                                    string colName = "Column_{0}";
                                    try { colName = cell.StringCellValue; }
                                    catch { colName = string.Format(colName, colIndex); }

                                    //Verificar que NO se repita el Nombre de la Columna
                                    foreach (DataColumn col in Tabla.Columns)
                                    {
                                        if (col.ColumnName == colName) colName = string.Format("{0}_{1}", colName, colIndex);
                                    }

                                    //Agregar el campos de la tabla:
                                    DataColumn codigo = new DataColumn(colName, System.Type.GetType(cellType));
                                    Tabla.Columns.Add(codigo); colIndex++;
                                }
                                else
                                {
                                    //Las demas filas son registros:
                                    switch (cell.CellType)
                                    {
                                        case CellType.Blank: valorCell = DBNull.Value; break;
                                        case CellType.Boolean: valorCell = cell.BooleanCellValue; break;
                                        case CellType.String: valorCell = cell.StringCellValue; break;
                                        case CellType.Numeric:
                                            if (HSSFDateUtil.IsCellDateFormatted(cell)) { valorCell = cell.DateCellValue; }
                                            else { valorCell = cell.NumericCellValue; } break;
                                        case CellType.Formula:
                                            switch (cell.CachedFormulaResultType)
                                            {
                                                case CellType.Blank: valorCell = DBNull.Value; break;
                                                case CellType.String: valorCell = cell.StringCellValue; break;
                                                case CellType.Boolean: valorCell = cell.BooleanCellValue; break;
                                                case CellType.Numeric:
                                                    if (HSSFDateUtil.IsCellDateFormatted(cell)) { valorCell = cell.DateCellValue; }
                                                    else { valorCell = cell.NumericCellValue; }
                                                    break;
                                            }
                                            break;
                                        default: valorCell = cell.StringCellValue; break;
                                    }
                                    //Agregar el nuevo Registro
                                    if (cell.ColumnIndex <= Tabla.Columns.Count - 1) NewReg[cell.ColumnIndex] = valorCell;
                                }
                            }
                        }
                        if (rowIndex > 0) Tabla.Rows.Add(NewReg);
                    }
                    Tabla.AcceptChanges();
                }
            }
            else
            {
                throw new Exception("ERROR 404: El archivo especificado NO existe.");
            }
        }
        catch (Exception ex)
        {
            throw ex;
        }
        return Tabla;
    }

This Second method does the oposite, saves a DataTable into an Excel File, yeah it can either be xls or the new xlsx, your choise!

        /// <summary>Convierte un DataTable en un archivo de Excel (xls o Xlsx) y lo guarda en disco.</summary>
    /// <param name="pDatos">Datos de la Tabla a guardar. Usa el nombre de la tabla como nombre de la Hoja</param>
    /// <param name="pFilePath">Ruta del archivo donde se guarda.</param>
    private void DataTable_To_Excel(DataTable pDatos, string pFilePath)
    {
        try
        {
            if (pDatos != null && pDatos.Rows.Count > 0)
            {
                IWorkbook workbook = null;
                ISheet worksheet = null;

                using (FileStream stream = new FileStream(pFilePath, FileMode.Create, FileAccess.ReadWrite))
                {
                    string Ext = System.IO.Path.GetExtension(pFilePath); //<-Extension del archivo
                    switch (Ext.ToLower())
                    {
                        case ".xls":
                            HSSFWorkbook workbookH = new HSSFWorkbook();
                            NPOI.HPSF.DocumentSummaryInformation dsi = NPOI.HPSF.PropertySetFactory.CreateDocumentSummaryInformation();
                            dsi.Company = "Cutcsa"; dsi.Manager = "Departamento Informatico";
                            workbookH.DocumentSummaryInformation = dsi;
                            workbook = workbookH;
                            break;

                        case ".xlsx": workbook = new XSSFWorkbook(); break;
                    }

                    worksheet = workbook.CreateSheet(pDatos.TableName); //<-Usa el nombre de la tabla como nombre de la Hoja

                    //CREAR EN LA PRIMERA FILA LOS TITULOS DE LAS COLUMNAS
                    int iRow = 0;
                    if (pDatos.Columns.Count > 0)
                    {
                        int iCol = 0;
                        IRow fila = worksheet.CreateRow(iRow);
                        foreach (DataColumn columna in pDatos.Columns)
                        {
                            ICell cell = fila.CreateCell(iCol, CellType.String);
                            cell.SetCellValue(columna.ColumnName);
                            iCol++;
                        }
                        iRow++;
                    }

                    //FORMATOS PARA CIERTOS TIPOS DE DATOS
                    ICellStyle _doubleCellStyle = workbook.CreateCellStyle();
                    _doubleCellStyle.DataFormat = workbook.CreateDataFormat().GetFormat("#,##0.###");

                    ICellStyle _intCellStyle = workbook.CreateCellStyle();
                    _intCellStyle.DataFormat = workbook.CreateDataFormat().GetFormat("#,##0");

                    ICellStyle _boolCellStyle = workbook.CreateCellStyle();
                    _boolCellStyle.DataFormat = workbook.CreateDataFormat().GetFormat("BOOLEAN");

                    ICellStyle _dateCellStyle = workbook.CreateCellStyle();
                    _dateCellStyle.DataFormat = workbook.CreateDataFormat().GetFormat("dd-MM-yyyy");

                    ICellStyle _dateTimeCellStyle = workbook.CreateCellStyle();
                    _dateTimeCellStyle.DataFormat = workbook.CreateDataFormat().GetFormat("dd-MM-yyyy HH:mm:ss");

                    //AHORA CREAR UNA FILA POR CADA REGISTRO DE LA TABLA
                    foreach (DataRow row in pDatos.Rows)
                    {
                        IRow fila = worksheet.CreateRow(iRow);
                        int iCol = 0;
                        foreach (DataColumn column in pDatos.Columns)
                        {
                            ICell cell = null; //<-Representa la celda actual                               
                            object cellValue = row[iCol]; //<- El valor actual de la celda

                            switch (column.DataType.ToString())
                            {
                                case "System.Boolean":
                                    if (cellValue != DBNull.Value)
                                    {
                                        cell = fila.CreateCell(iCol, CellType.Boolean);

                                        if (Convert.ToBoolean(cellValue)) { cell.SetCellFormula("TRUE()"); }
                                        else { cell.SetCellFormula("FALSE()"); }

                                        cell.CellStyle = _boolCellStyle;
                                    }
                                    break;

                                case "System.String":
                                    if (cellValue != DBNull.Value)
                                    {
                                        cell = fila.CreateCell(iCol, CellType.String);
                                        cell.SetCellValue(Convert.ToString(cellValue));
                                    }
                                    break;

                                case "System.Int32":
                                    if (cellValue != DBNull.Value)
                                    {
                                        cell = fila.CreateCell(iCol, CellType.Numeric);
                                        cell.SetCellValue(Convert.ToInt32(cellValue));
                                        cell.CellStyle = _intCellStyle;
                                    }
                                    break;
                                case "System.Int64":
                                    if (cellValue != DBNull.Value)
                                    {
                                        cell = fila.CreateCell(iCol, CellType.Numeric);
                                        cell.SetCellValue(Convert.ToInt64(cellValue));
                                        cell.CellStyle = _intCellStyle;
                                    }
                                    break;
                                case "System.Decimal":
                                    if (cellValue != DBNull.Value)
                                    {
                                        cell = fila.CreateCell(iCol, CellType.Numeric);
                                        cell.SetCellValue(Convert.ToDouble(cellValue));
                                        cell.CellStyle = _doubleCellStyle;
                                    }
                                    break;
                                case "System.Double":
                                    if (cellValue != DBNull.Value)
                                    {
                                        cell = fila.CreateCell(iCol, CellType.Numeric);
                                        cell.SetCellValue(Convert.ToDouble(cellValue));
                                        cell.CellStyle = _doubleCellStyle;
                                    }
                                    break;

                                case "System.DateTime":
                                    if (cellValue != DBNull.Value)
                                    {
                                        cell = fila.CreateCell(iCol, CellType.Numeric);
                                        cell.SetCellValue(Convert.ToDateTime(cellValue));

                                        //Si No tiene valor de Hora, usar formato dd-MM-yyyy
                                        DateTime cDate = Convert.ToDateTime(cellValue);
                                        if (cDate != null && cDate.Hour > 0) { cell.CellStyle = _dateTimeCellStyle; }
                                        else { cell.CellStyle = _dateCellStyle; }
                                    }
                                    break;
                                default:
                                    break;
                            }
                            iCol++;
                        }
                        iRow++;
                    }

                    workbook.Write(stream);
                    stream.Close();
                }
            }
        }
        catch (Exception ex)
        {
            throw ex;
        }
    }

With this 2 methods you can Open an Excel file, load it into a DataTable, do your modifications and save it back into an Excel file.

Hope you guys find this usefull.




回答4:


private DataTable GetDataTableFromExcel(String Path)
    {
        XSSFWorkbook wb;
        XSSFSheet sh;
        String Sheet_name;

        using (var fs = new FileStream(Path, FileMode.Open, FileAccess.Read))
        {
            wb = new XSSFWorkbook(fs);

               Sheet_name= wb.GetSheetAt(0).SheetName;  //get first sheet name
        }
        DataTable DT = new DataTable();
        DT.Rows.Clear();
        DT.Columns.Clear();

        // get sheet
        sh = (XSSFSheet)wb.GetSheet(Sheet_name);

        int i = 0;
        while (sh.GetRow(i) != null)
        {
            // add neccessary columns
            if (DT.Columns.Count < sh.GetRow(i).Cells.Count)
            {
                for (int j = 0; j < sh.GetRow(i).Cells.Count; j++)
                {
                    DT.Columns.Add("", typeof(string));
                }
            }

            // add row
            DT.Rows.Add();

            // write row value
            for (int j = 0; j < sh.GetRow(i).Cells.Count; j++)
            {
                var cell = sh.GetRow(i).GetCell(j);

                if (cell != null)
                {
                    // TODO: you can add more cell types capatibility, e. g. formula
                    switch (cell.CellType)
                    {
                        case NPOI.SS.UserModel.CellType.Numeric:
                            DT.Rows[i][j] = sh.GetRow(i).GetCell(j).NumericCellValue;
                            //dataGridView1[j, i].Value = sh.GetRow(i).GetCell(j).NumericCellValue;

                            break;
                        case NPOI.SS.UserModel.CellType.String:
                            DT.Rows[i][j] = sh.GetRow(i).GetCell(j).StringCellValue;

                            break;
                    }
                }
            }

            i++;
        }

        return DT;
    }



回答5:


private static ISheet GetFileStream(string fullFilePath)
    {
        var fileExtension = Path.GetExtension(fullFilePath);
        string sheetName;
        ISheet sheet = null;
        switch (fileExtension)
        {
            case ".xlsx":
                using (var fs = new FileStream(fullFilePath, FileMode.Open, FileAccess.Read))
                {
                    var wb = new XSSFWorkbook(fs);
                    sheetName = wb.GetSheetAt(0).SheetName;
                    sheet = (XSSFSheet) wb.GetSheet(sheetName);
                }
                break;
            case ".xls":
                using (var fs = new FileStream(fullFilePath, FileMode.Open, FileAccess.Read))
                {
                    var wb = new HSSFWorkbook(fs);
                    sheetName = wb.GetSheetAt(0).SheetName;
                    sheet = (HSSFSheet) wb.GetSheet(sheetName);
                }
                break;
        }
        return sheet;
    }

    private static DataTable GetRequestsDataFromExcel(string fullFilePath)
    {
        try
        {
            var sh = GetFileStream(fullFilePath);
            var dtExcelTable = new DataTable();
            dtExcelTable.Rows.Clear();
            dtExcelTable.Columns.Clear();
            var headerRow = sh.GetRow(0);
            int colCount = headerRow.LastCellNum;
            for (var c = 0; c < colCount; c++)
                dtExcelTable.Columns.Add(headerRow.GetCell(c).ToString());
            var i = 1;
            var currentRow = sh.GetRow(i);
            while (currentRow != null)
            {
                var dr = dtExcelTable.NewRow();
                for (var j = 0; j < currentRow.Cells.Count; j++)
                {
                    var cell = currentRow.GetCell(j);

                    if (cell != null)
                        switch (cell.CellType)
                        {
                            case CellType.Numeric:
                                dr[j] = DateUtil.IsCellDateFormatted(cell)
                                    ? cell.DateCellValue.ToString(CultureInfo.InvariantCulture)
                                    : cell.NumericCellValue.ToString(CultureInfo.InvariantCulture);
                                break;
                            case CellType.String:
                                dr[j] = cell.StringCellValue;
                                break;
                            case CellType.Blank:
                                dr[j] = string.Empty;
                                break;
                        }
                }
                dtExcelTable.Rows.Add(dr);
                i++;
                currentRow = sh.GetRow(i);
            }
            return dtExcelTable;
        }
        catch (Exception e)
        {
            throw;
        }
    }



回答6:


As Janoulle pointed out, you don't need to detect which extension it is if you use the WorkbookFactory, it will do it for you. I recently had to implement a solution using NPOI to read Excel files and import email addresses into a sql database. My main problem was that I was probably going to receive about 12 different Excel layouts from different customers so I needed something that could be changed quickly without much code. I ended up using Npoi.Mapper which is an awesome tool! Highly recommended!

Here is my complete solution:

using System.IO;
using System.Linq;
using Npoi.Mapper;
using Npoi.Mapper.Attributes;
using NPOI.SS.UserModel;

namespace JobCustomerImport.Processors
{
    public class ExcelEmailProcessor
    {
        private UserManagementServiceContext DataContext { get; }

        public ExcelEmailProcessor(int customerNumber)
        {
            DataContext = new UserManagementServiceContext();
        }

        public void Execute(string localPath, int sheetIndex)
        {
            IWorkbook workbook;
            using (FileStream file = new FileStream(localPath, FileMode.Open, FileAccess.Read))
            {
                workbook = WorkbookFactory.Create(file);
            }

            var importer = new Mapper(workbook);
            var items = importer.Take<MurphyExcelFormat>(sheetIndex);
            foreach(var item in items)
            {
                var row = item.Value;
                if (string.IsNullOrEmpty(row.EmailAddress))
                    continue;

                UpdateUser(row);
            }

            DataContext.SaveChanges();
        }

        private void UpdateUser(MurphyExcelFormat row)
        {
            //LOGIC HERE TO UPDATE A USER IN DATABASE...
        }

        private class MurphyExcelFormat
        {
            [Column("District")]
            public int District { get; set; }

            [Column("DM")]
            public string FullName { get; set; }

            [Column("Email Address")]
            public string EmailAddress { get; set; }

            [Column(3)]
            public string Username { get; set; }

            public string FirstName
            {
                get
                {
                    return Username.Split('.')[0];
                }
            }

            public string LastName
            {
                get
                {
                    return Username.Split('.')[1];
                }
            }
        }
    }
}

I am so happy with NPOI + Npoi.Mapper (from Donny Tian) as an Excel import solution that I wrote a blog post about it, going in to more detail about this code above. You can read it here if you wish: Easiest way to import excel files. The best thing about this solution is that it runs perfectly in a serverless azure/cloud environment which I couldn't get with other Excel tools/libraries.




回答7:


Since you've asked to read and modify the xls file I have changed @mj82's answer to correspond your needs.

HSSFWorkbook does not have Save method, but it does have Write to a stream.

static void Main(string[] args)
        {
            string filepath = @"C:\test.xls";

            HSSFWorkbook hssfwb;

            using (FileStream file = new FileStream(filepath, FileMode.Open, FileAccess.Read))
            {
                hssfwb = new HSSFWorkbook(file);
            }

            ISheet sheet = hssfwb.GetSheetAt(0);

            for (int row = 0; row <= sheet.LastRowNum; row++)
            {
                if (sheet.GetRow(row) != null) //null is when the row only contains empty cells 
                {
                    // Set new cell value
                    sheet.GetRow(row).GetCell(0).SetCellValue("foo");
                    Console.WriteLine("Row {0} = {1}", row, sheet.GetRow(row).GetCell(0).StringCellValue);
                }
            }

            // Save the file
            using (FileStream file = new FileStream(filepath, FileMode.Open, FileAccess.Write))
            {
                hssfwb.Write(file);
            }

            Console.ReadLine();
        }


来源:https://stackoverflow.com/questions/5855813/how-to-read-file-using-npoi

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!