XSSFWorkbook takes a lot of time to load


First up, don't load an XSSFWorkbook from an InputStream when you have a file! Using an InputStream requires buffering everything into memory, which eats up space and takes time. Since you don't need to do that buffering, don't!

If you're running a version of POI that provides OPCPackage.open(File) (at the time of writing, the latest nightly builds), then it's very easy. Your code becomes:

File file = new File("C:\\D\\Data Book.xlsx");
OPCPackage opcPackage = OPCPackage.open(file);
XSSFWorkbook workbook = new XSSFWorkbook(opcPackage);

Otherwise, it's very similar, except that you pass the file's path as a String:

File file = new File("C:\\D\\Data Book.xlsx");
OPCPackage opcPackage = OPCPackage.open(file.getAbsolutePath());
XSSFWorkbook workbook = new XSSFWorkbook(opcPackage);
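
To illustrate why the File route is cheaper: an .xlsx file is a ZIP container, and a ZIP opened from a file can be read by random access, while a ZIP arriving as a stream has to be consumed from start to end. The sketch below is not POI code and does not claim to mirror POI's internals. It uses only java.util.zip (plus InputStream.readAllBytes, so Java 9 or later), and the class name ZipAccessDemo, its helper methods, and the entry names are all made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class ZipAccessDemo {

    // Random access: the ZIP directory lets us jump straight to one entry.
    static String readViaFile(Path zip, String name) throws IOException {
        try (ZipFile zf = new ZipFile(zip.toFile())) {
            ZipEntry entry = zf.getEntry(name);
            if (entry == null) {
                return null; // no such entry in the archive
            }
            try (InputStream in = zf.getInputStream(entry)) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        }
    }

    // Sequential access: every earlier entry has to be read past first.
    // Returns how many entries were visited before the target was found, or -1.
    static int entriesScannedViaStream(Path zip, String name) throws IOException {
        int scanned = 0;
        try (ZipInputStream zin = new ZipInputStream(Files.newInputStream(zip))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                scanned++;
                if (entry.getName().equals(name)) {
                    return scanned;
                }
            }
        }
        return -1;
    }

    // Builds a tiny ZIP with three made-up entries for the demo.
    static Path createSampleZip() throws IOException {
        Path zip = Files.createTempFile("demo", ".zip");
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(zip))) {
            for (String name : new String[] {"a.xml", "b.xml", "c.xml"}) {
                out.putNextEntry(new ZipEntry(name));
                out.write(("content of " + name).getBytes(StandardCharsets.UTF_8));
                out.closeEntry();
            }
        }
        return zip;
    }

    public static void main(String[] args) throws IOException {
        Path zip = createSampleZip();
        System.out.println(readViaFile(zip, "c.xml"));
        System.out.println(entriesScannedViaStream(zip, "c.xml"));
        Files.delete(zip);
    }
}
```

With the file-based variant only the requested entry is read. The stream-based variant has to walk past a.xml and b.xml before it reaches c.xml, which is the kind of extra work (and buffering) an InputStream forces on a reader that needs several parts of the package.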
Author: London guy


Updated on July 12, 2020

Comments

  • London guy
    London guy almost 4 years

    I am using the following code:

    File file = new File("abc.xlsx");
    InputStream st = new FileInputStream(file);
    XSSFWorkbook wb = new XSSFWorkbook(st);
    

    The xlsx file itself has 25,000 rows, and each row has content in 500 columns. During debugging, I saw that the third line, where I create an XSSFWorkbook, takes a very long time (1 hour!) to complete.

    Is there a better way to access the values of the original xlsx file?

  • Howard Schutzman
    Howard Schutzman about 12 years
    My impression is the streaming version of POI only applies to writing files, not reading files.
  • Howard Schutzman
    Howard Schutzman about 12 years
    If that does not completely solve the problem, you can use the POI event API as a low-memory way to read a large file. The POI documentation contains an example here: poi.apache.org/spreadsheet/how-to.html#xssf_sax_api
  • Gagravarr
    Gagravarr about 12 years
    Correct, SXSSF is for writing only. To do low-memory reading, you need the event (SAX) processing.
  • London guy
    London guy about 12 years
    Thanks, will try this out. Just curious to know how this will solve the problem. Won't it buffer the contents into memory? Or will it just access the data using the original references, by any chance?
  • Gagravarr
    Gagravarr about 12 years
    If you open it with a File, less will be buffered than if you open it with an InputStream.
  • akaushik
    akaushik about 2 years
    This approach is not working for me. Execution stalls for 5 minutes at the line ' XSSFWorkbook workbook = new XSSFWorkbook(); ' and only then moves on to the next line.
  • Gagravarr
    Gagravarr about 2 years
    @akaushik It works for almost everyone else, so it's probably a bug with your system. Do a Java thread dump and see where it is blocking, then fix that
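
A thread dump of the kind suggested in the last comment is usually taken from outside the process with the JDK's jstack tool (jstack followed by the process id). As a rough in-process alternative, here is a minimal sketch that uses only Thread.getAllStackTraces(). It is not part of POI; the class name ThreadDumpDemo, the helper dumpAllThreads, and the 30-second watchdog delay are all made up for illustration.

```java
import java.util.Map;

final class ThreadDumpDemo {

    // Formats the current stack trace of every live thread as text.
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical watchdog: print a dump if the slow call is still running after 30 s.
        Thread watchdog = new Thread(() -> {
            try {
                Thread.sleep(30_000);
                System.err.println(dumpAllThreads());
            } catch (InterruptedException ignored) {
                // The work finished in time, so no dump is needed.
            }
        }, "watchdog");
        watchdog.setDaemon(true);
        watchdog.start();

        // ... the slow call goes here, for example constructing the workbook ...

        // Cancel the watchdog once the slow call has returned.
        watchdog.interrupt();
    }
}
```

In the dump, look at the frames of the thread that is constructing the workbook: the topmost frames show what it is waiting on or busy with.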