Error reading Excel .XLSX with Apache POI

46,903

Solution 1

The method using the above code has a single parameter - FileInputStream. The first line in the code snippet is very much part of the code, but part of the invoking method. Since the method in question did not have knowledge of the Excel format or even a file extension to make an educated guess, I decided that I would first try to read the FileInputStream using HSSF API as below:

Sheet sheet = null;
try {

    POIFSFileSystem poifs = new POIFSFileSystem(inputFS);
    Workbook workbook = new HSSFWorkbook(poifs);
    sheet = workbook.getSheetAt(0);
}
catch (Exception e) {
}

if (sheet == null) {

    try {

        Workbook workbook = new XSSFWorkbook(inputFS);
        sheet = workbook.getSheetAt(0);
    }
    catch (Exception e) {
    }
}

The problem with the above code is that the state of the inputFS object during the second attempt of opening it via the XSSF API is unknown. And this yielded a read error. I replaced the above with the following code, which works fine and the issue appears to be resolved:

Sheet sheet = null;
try {

    Workbook workbook = WorkbookFactory.create(inputFS);
    sheet = workbook.getSheetAt(0);
}
catch (Exception e) {
}

I tested this with both XLS (older, binary) and XLSX (newer, XML-based) formats and it works. Thanks for everyone's help and input!

Solution 2

The exception indicates that there's something up with your InputStream. However, if you have a file, then pass that in to POI directly!. Using an InputStream requires buffering of everything into memory, which eats up space. Since you don't need to do that buffering, don't! Avoiding that buffering ought to fix your problem anyway

If you're running with the latest nightly builds of POI, then it's very easy. Your code becomes:

File file = new File("test.xlsx");
OPCPackage opcPackage = OPCPackage.open(file);
XSSFWorkbook workbook = new XSSFWorkbook(opcPackage);

Otherwise, it's very similar:

File file = new File("test.xlsx");
OPCPackage opcPackage = OPCPackage.open(file.getAbsolutePath());
XSSFWorkbook workbook = new XSSFWorkbook(opcPackage);

If you're not sure if your file is a HSSFWorkbook or XSSFWorkbook, then you can use the WorkbookFactory to open the appropriate one for you:

File file = new File("test.xlsx");
Workbook workbook = WorkbookFactory.create(file);

Solution 3

it looks like you need to use their XSSF API

Solution 4

I have same error, I have just updated the pom dependencies with same version. It worked.

        <!-- https://mvnrepository.com/artifact/org.apache.poi/poi -->
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi</artifactId>
            <version>4.1.0</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.poi/poi-ooxml -->
        <dependency>
            <groupId>org.apache.poi</groupId>
            <artifactId>poi-ooxml</artifactId>
            <version>4.1.0</version>
        </dependency>
Share:
46,903
Web User
Author by

Web User

Updated on January 07, 2021

Comments

  • Web User
    Web User over 3 years

    I am using Apache POI 3.8 libraries to read an XLSX file in a web application. The following code works perfectly fine from a Java console app:

    InputStream inputFS = new FileInputStream("test.xlsx");
    Workbook workbook = new XSSFWorkbook(inputFS); // below exception is thrown on this line
    Sheet sheet = workbook.getSheetAt(0);
    

    but throws a "read error" when used in the web application. A relevant extract of the stack trace is pasted below:

    java.io.IOException: Read error
    at java.io.FileInputStream.readBytes(Native Method) ~[na:1.6.0_31]
    at java.io.FileInputStream.read(Unknown Source) ~[na:1.6.0_31]
    at java.io.FilterInputStream.read(Unknown Source) ~[na:1.6.0_31]
    at java.io.PushbackInputStream.read(Unknown Source) ~[na:1.6.0_31]
    at java.util.zip.ZipInputStream.readFully(Unknown Source) ~[na:1.6.0_31]
    at java.util.zip.ZipInputStream.readLOC(Unknown Source) ~[na:1.6.0_31]
    at java.util.zip.ZipInputStream.getNextEntry(Unknown Source) ~[na:1.6.0_31]
    at org.apache.poi.openxml4j.util.ZipInputStreamZipEntrySource.<init>(ZipInputStreamZipEntrySource.java:51) ~[poi-ooxml-3.8-20120326.jar:3.8]
    at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:83) ~[poi-ooxml-3.8-20120326.jar:3.8]
    at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:228) ~[poi-ooxml-3.8-20120326.jar:3.8]
    at org.apache.poi.util.PackageHelper.open(PackageHelper.java:39) ~[poi-ooxml-3.8-20120326.jar:3.8]
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:187) ~[poi-ooxml-3.8-20120326.jar:3.8]
    at com.corp.ReportManager.parseExcelReport(ReportManager.java:575) [ReportManager.class:na]
    

    The following JARs are included in the classpath (in the same order):

    poi-3.8-20120326.jar
    poi-ooxml-3.8-20120326.jar
    poi-ooxml-schemas-3.8-20120326.jar
    xbean.jar
    dom4j-1.6.1.jar
    

    There does not seem to be an memory related issues since I gathered some heap utilization stats just before invoking the above code. The XLSX file is 1.15 MB in size.

    ##### Heap utilization statistics [MB] #####
    Used Memory:13 MB
    Free Memory:9 MB
    Total Memory:23 MB
    Max Memory:247 MB
    
  • Web User
    Web User almost 12 years
    I am using their XSSF API through the SS API. Like I said earlier, the exact same code works just fine in a console application. It is the web app that has a problem while executing the second line in the code.
  • John Kane
    John Kane almost 12 years
    And you can work with the older format from a web application? It seems like it would have to be either that you cannot access the file for whatever reason, or you cannot read the format it is using.
  • John Kane
    John Kane almost 12 years
    It looks like they are using a XSSFReader in their example code, which you are not doing. However, that doesnt explain why it would work in a console app... Which gives more support for it being a permission issue. Maybe try to see if your app can see/read it?
  • Web User
    Web User almost 12 years
    Yes, the web app reads the older format just fine. I know I can access the file, because inputFS (FileInputStream) object is very much available for use.
  • John Kane
    John Kane almost 12 years
    Maybe POI requires write permissions? It would only fail if you couldnt read it? It just seems really odd that the exact same file could be read from a console app and not a web app with the same backing code, unless there was some sort of security type issue. Where is the file stored?
  • Andy
    Andy almost 12 years
    @WebUser Any chance your file is opened in some other program?
  • Web User
    Web User almost 12 years
    You are both right... actually what happened is that I first try to open the file using HSSF (assuming it is XLS format). It throws an exception and I used the same FileInputStream object in the constructor of XSSFWorkbook. That seemed to have messed up the state of the FileInputStream object. Thanks for pointing me in the right direction!
  • Web User
    Web User almost 12 years
    Yes, I ended up using WorkbookFactory.create(...) since I do not know if the file is HSSFWorkbook or XSSFWorkbook.
  • Gagravarr
    Gagravarr almost 12 years
    Make sure you use the File version rather than the InputStream version if you have a real File object - it'll be quicker and lower memory!
  • victorio
    victorio over 10 years
    why do I get java.lang.ClassNotFoundException: org.apache.poi.ss.usermodel.WorkbookFactory, when I have all the jars in the lib? (NetBeans)
  • Gagravarr
    Gagravarr over 10 years
    Make sure you have both the POI jars and their dependencies on your classpath, and make sure you don't have an older copy on their getting in the way. The POI components page details what you need!