java.lang.OutOfMemoryError: GC overhead limit exceeded when loading an xlsx file

天大地大妈咪最大 posted on 2019-12-23 02:32:54

Question


I understand what the error means: my program is consuming too much memory, and for a long time the garbage collector has not been able to reclaim it.

My program is just reading a 6.2 MB xlsx file when the memory issue occurs.

When I monitor the program, it very quickly reaches 1.2 GB of memory consumption and then crashes. How can it reach 1.2 GB when reading a 6.2 MB file?

Is there a way to open the file in chunks, so that it doesn't have to be loaded into memory all at once? Or is there any other solution?

The part shown below is exactly what causes it. But since it is a library, shouldn't this be handled in some smart way? The file has only 200,000 rows with only 3 columns. In the future, I need it to work with approx. 1 million records and more columns...

CODE:

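import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

Workbook myWorkBook;
Sheet mySheet;
if (filePath.contains(".xlsx")) {
    // Creates the workbook instance for the XLSX file; this loads the whole file into memory
    myWorkBook = new XSSFWorkbook(fis);
    // Get the first sheet from the XLSX workbook
    mySheet = myWorkBook.getSheetAt(0);
    myWorkBook.close(); // Should I close myWorkBook before I get data from it?
}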

Answer 1:


If you wish to work with large XLSX files, you need to use the streaming XSSFReader class. Since the data is XML, you can use StAX to effectively process the contents.

Here's one way to get the InputStream for a sheet from the xlsx.

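import java.io.InputStream;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xssf.eventusermodel.XSSFReader;
import org.apache.poi.xssf.model.SharedStringsTable;

// Fragment from inside a method; `file`, `sheetName` and the field `in` are defined elsewhere.
OPCPackage opc = OPCPackage.open(file);
XSSFReader xssfReader = new XSSFReader(opc);
// Cells marked t="s" store an index into this table instead of the text itself
SharedStringsTable sst = xssfReader.getSharedStringsTable();
XSSFReader.SheetIterator itr = (XSSFReader.SheetIterator)xssfReader.getSheetsData();
while(itr.hasNext()) {
    InputStream sheetStream = itr.next();
    if(itr.getSheetName().equals(sheetName)) {  // Or you can keep track of sheet numbers
        in = sheetStream;  // keep the raw XML stream of the wanted sheet
        return;
    } else {
        sheetStream.close();  // not the sheet we want, release it
    }
}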

The relevant elements are <row> and <c> (for cell). You can create a small xlsx file, unzip it and examine the XML inside for more information.
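If no unzip tool is at hand, the same inspection can be done with the JDK alone, since an xlsx file is a zip archive. The following is a minimal sketch, not part of the original answer: the class name XlsxPeek and its two methods are made up for illustration, and the entry name xl/worksheets/sheet1.xml mentioned below is only where the first sheet typically lives, so list the entries first to confirm.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper for looking inside an xlsx archive without extracting it.
class XlsxPeek {

    /** Lists the names of all entries in the archive. */
    static List<String> entryNames(Path xlsx) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(xlsx.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    /** Returns at most maxBytes from the start of one entry, decoded as UTF-8. */
    static String head(Path xlsx, String entryName, int maxBytes) throws IOException {
        try (ZipFile zip = new ZipFile(xlsx.toFile())) {
            ZipEntry entry = zip.getEntry(entryName);
            if (entry == null) {
                throw new IOException("No such entry: " + entryName);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                byte[] buf = new byte[maxBytes];
                int total = 0;
                // read() may return fewer bytes than requested, so loop until full or EOF
                while (total < maxBytes) {
                    int n = in.read(buf, total, maxBytes - total);
                    if (n < 0) {
                        break;
                    }
                    total += n;
                }
                return new String(buf, 0, total, StandardCharsets.UTF_8);
            }
        }
    }
}
```

Calling XlsxPeek.entryNames(path) and then XlsxPeek.head(path, "xl/worksheets/sheet1.xml", 2000) should be enough to see how the <row> and <c> elements are laid out. The cut at maxBytes is byte-based, so the last character may be truncated if it falls on a multi-byte boundary.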

Edit: There are some examples on processing the data with SAX, but using StAX is a lot nicer and just as efficient.
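To make the StAX suggestion concrete, here is a minimal sketch of reading the sheet stream row by row. It is an illustration under assumptions rather than the answer's own code: the class SheetStaxSketch and the method readRows are invented names, only the raw text of each <v> (value) element is collected, and cells marked t="s" would still need their index looked up in the shared strings table. Cells without a <v> child are skipped, so column positions are not preserved.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: streams a worksheet XML and hands each row to a callback,
// so only one row is held in memory at a time.
class SheetStaxSketch {

    static void readRows(InputStream sheetStream, Consumer<List<String>> rowHandler)
            throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(sheetStream);
        List<String> current = null; // values of the row being read, null outside a <row>
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if ("row".equals(name)) {
                        current = new ArrayList<>();
                    } else if ("v".equals(name) && current != null) {
                        // Raw cell value; for t="s" cells this is a shared-string index
                        current.add(reader.getElementText());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "row".equals(reader.getLocalName())
                        && current != null) {
                    rowHandler.accept(current); // row complete, pass it on
                    current = null;
                }
            }
        } finally {
            reader.close();
        }
    }
}
```

With the sheet stream obtained from XSSFReader, a call such as SheetStaxSketch.readRows(sheetStream, row -> System.out.println(row)) would process the rows one at a time instead of building the whole workbook in memory.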



Source: https://stackoverflow.com/questions/31873931/java-lang-outofmemoryerror-gc-overhead-limit-exceeded-when-loading-an-xlsx-file
