Can't read Excel 2010 file with Apache POI. First Row number is -1

小蘑菇 2021-01-16 04:08

I am trying to read this testfile with the Apache POI API (current version 3.10-FINAL). The following test code

import java.io.FileInputStream;
import org.apach         


        
1 answer
  • 2021-01-16 04:37

    The testfile.xlsx was created with "SpreadsheetGear 7.1.1.120". To see this, open the XLSX file with any tool that can handle ZIP archives and look into /xl/workbook.xml. In the worksheets/sheet?.xml files, note that none of the row elements has a row number. If I put a row number into the first row tag, like <row r="1">, then Apache POI can read that row.
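
    The check described above can also be scripted with the JDK alone. The following is a minimal sketch, not part of the original answer: the class name RowAttributeCheck and the method countRowsWithoutNumber are hypothetical, and it assumes the sheets are stored under xl/worksheets/ inside the archive, as in the layout mentioned above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper: reports how many <row> elements lack the r attribute.
class RowAttributeCheck {

    // Counts the <row> elements in one worksheet XML stream that carry no "r" attribute.
    static int countRowsWithoutNumber(InputStream sheetXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(sheetXml);
            // "*" matches any namespace, so the SpreadsheetML namespace need not be spelled out.
            NodeList rows = doc.getElementsByTagNameNS("*", "row");
            int missing = 0;
            for (int i = 0; i < rows.getLength(); i++) {
                Element row = (Element) rows.item(i);
                if (!row.hasAttribute("r")) {
                    missing++;
                }
            }
            return missing;
        } catch (ParserConfigurationException | SAXException | IOException ex) {
            throw new IllegalArgumentException("Not a readable worksheet XML stream", ex);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            // No file given: run on a tiny inline sample with one numbered and one unnumbered row.
            String sample = "<worksheet xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\">"
                + "<sheetData><row r=\"1\"/><row/></sheetData></worksheet>";
            System.out.println(countRowsWithoutNumber(
                new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8))));
            return;
        }
        // An XLSX file is a ZIP archive; assumption: the sheets live under xl/worksheets/.
        try (ZipFile zip = new ZipFile(args[0])) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                String name = entry.getName();
                if (name.startsWith("xl/worksheets/") && name.endsWith(".xml")) {
                    try (InputStream in = zip.getInputStream(entry)) {
                        System.out.println(name + ": " + countRowsWithoutNumber(in) + " row(s) without r");
                    }
                }
            }
        }
    }
}
```

    Run it as java RowAttributeCheck testfile.xlsx. A non-zero count for a sheet would point to the situation described here.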

    As for who is to blame for this, the answer is definitely both Apache POI and SpreadsheetGear ;-). Apache POI, because the r attribute of the row element is optional, so a reader should cope with its absence. But SpreadsheetGear too, because there is no reason to omit the r attribute when Excel itself always writes it.
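
    To illustrate what coping with an optional r attribute could look like, here is a small sketch. The fallback rule it uses (a row without r directly follows the previous row, and the first row is row 1) is an assumption made for illustration, not something POI 3.10 implements; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of a tolerant reader's row numbering.
class ImpliedRowNumbers {

    // declared holds the r value of each <row> in document order, or null where r is absent.
    // Returns the 1-based row number assigned to each row under the assumed fallback rule.
    static List<Integer> resolve(List<Integer> declared) {
        List<Integer> resolved = new ArrayList<Integer>();
        int next = 1; // number given to the next row if it declares none
        for (Integer r : declared) {
            int rowNumber = (r != null) ? r : next;
            resolved.add(rowNumber);
            next = rowNumber + 1;
        }
        return resolved;
    }

    public static void main(String[] args) {
        // Two unnumbered rows, then an explicit row 5, then another unnumbered row.
        System.out.println(resolve(Arrays.asList(null, null, 5, null))); // prints [1, 2, 5, 6]
    }
}
```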

    If you cannot get the testfile.xlsx in a format that Apache POI can read directly, then you must work with the underlying objects. The following works with your testfile.xlsx:

    import org.apache.poi.xssf.usermodel.*;
    import org.apache.poi.ss.usermodel.*;
    import org.apache.poi.ss.util.*;
    import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
    
    import java.io.FileNotFoundException;
    import java.io.IOException;
    import java.io.FileInputStream;
    import java.io.InputStream;
    
    import org.openxmlformats.schemas.spreadsheetml.x2006.main.CTWorksheet;
    import org.openxmlformats.schemas.spreadsheetml.x2006.main.CTSheetData;
    import org.openxmlformats.schemas.spreadsheetml.x2006.main.CTRow;
    
    import java.util.List;
    
    class Testfile {
    
     public static void main(String[] args) {
      try {
    
       InputStream inp = new FileInputStream("testfile.xlsx");
       Workbook wb = WorkbookFactory.create(inp);
    
       Sheet sheet = wb.getSheetAt(0);
    
       // Prints -1 for this file because the row elements carry no r attribute.
       System.out.println(sheet.getFirstRowNum());
    
       // Go below the usermodel API and read the row elements from the raw sheet XML beans.
       CTWorksheet ctWorksheet = ((XSSFSheet)sheet).getCTWorksheet();
    
       CTSheetData ctSheetData = ctWorksheet.getSheetData();
    
       List<CTRow> ctRowList = ctSheetData.getRowList();
    
       Row row = null;
       Cell[] cell = new Cell[2];
    
       for (CTRow ctRow : ctRowList) {
        // Wrap each raw row in an XSSFRow subclass so the usual cell accessors work.
        row = new MyRow(ctRow, (XSSFSheet)sheet);
        cell[0] = row.getCell(0);
        cell[1] = row.getCell(1);
        // Check string content with isEmpty(); != "" would only compare object references.
        if (cell[0] != null && cell[1] != null && !cell[0].toString().isEmpty() && !cell[1].toString().isEmpty())
           System.out.println(cell[0].toString()+"\t"+cell[1].toString());
       }
    
      } catch (InvalidFormatException ifex) {
       ifex.printStackTrace();
      } catch (FileNotFoundException fnfex) {
       fnfex.printStackTrace();
      } catch (IOException ioex) {
       ioex.printStackTrace();
      }
     }
    }
    
    class MyRow extends XSSFRow {
     MyRow(org.openxmlformats.schemas.spreadsheetml.x2006.main.CTRow row, XSSFSheet sheet) {
      super(row, sheet);
     }
    }
    

    I have used:

    • org.openxmlformats.schemas.spreadsheetml.x2006.main.CTWorksheet
    • org.openxmlformats.schemas.spreadsheetml.x2006.main.CTSheetData
    • org.openxmlformats.schemas.spreadsheetml.x2006.main.CTRow

    These are part of the Apache POI binary distribution poi-bin-3.10.1-20140818, where they are contained in poi-ooxml-schemas-3.10.1-20140818.jar.

    For documentation, see http://grepcode.com/snapshot/repo1.maven.org/maven2/org.apache.poi/ooxml-schemas/1.1/

    I have extended XSSFRow because its constructor has protected access, so we cannot call it directly.
