How to process old Excel .xls files using POI?

15,418

For old Excel format files, you have the following alternatives:

  1. HSSF, the POI implementation of the Excel '97(-2007) file format.
    • If you just want to extract the textual content, you can use OldExcelExtractor, which pulls only the text and numbers from the file.
    • If you need the values from specific cells, you'll need to take an approach similar to OldExcelExtractor: process the file at the record level and check the coordinates on OldStringRecord, NumberRecord, OldFormulaRecord and friends.
  2. As you already mentioned, JXL can handle some cases too.
  3. Use a JDBC/ODBC driver. It is not as flexible as HSSF but for some old formats it is the only way to extract the information.
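
To make "process the file at the record level" more concrete, here is a minimal sketch in plain Java (no POI) that walks raw BIFF records and collects numeric cells with their coordinates. It rests on several assumptions that are not part of the answer above: the class name `BiffNumberScanner` is hypothetical, the input is the workbook stream already extracted from the OLE2 container (not the raw .xls file), the NUMBER record uses id 0x0203 with a 2-byte row, 2-byte column, 2-byte format index and 8-byte double, and CONTINUE records are ignored. Treat it as an illustration of the record layout, not as a replacement for POI's own record classes.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: walks raw BIFF records and collects NUMBER cells. */
public class BiffNumberScanner {
    // Assumed record id of NUMBER in BIFF3 and later.
    static final int NUMBER_SID = 0x0203;

    /** One numeric cell: zero-based row and column plus its value. */
    public static final class NumberCell {
        public final int row;
        public final int col;
        public final double value;

        NumberCell(int row, int col, double value) {
            this.row = row;
            this.col = col;
            this.value = value;
        }
    }

    /** Scans an already-extracted workbook stream (assumption, see above). */
    public static List<NumberCell> scan(byte[] stream) {
        ByteBuffer buf = ByteBuffer.wrap(stream).order(ByteOrder.LITTLE_ENDIAN);
        List<NumberCell> cells = new ArrayList<>();
        // Every record starts with a 2-byte id and a 2-byte payload length.
        while (buf.remaining() >= 4) {
            int sid = buf.getShort() & 0xFFFF;
            int len = buf.getShort() & 0xFFFF;
            if (len > buf.remaining()) {
                break; // truncated record, stop scanning
            }
            int next = buf.position() + len;
            if (sid == NUMBER_SID && len >= 14) {
                int row = buf.getShort() & 0xFFFF;
                int col = buf.getShort() & 0xFFFF;
                buf.getShort(); // cell format index, unused here
                cells.add(new NumberCell(row, col, buf.getDouble()));
            }
            buf.position(next); // skip whatever is left of this record
        }
        return cells;
    }
}
```

Filtering the returned list by `row` and `col` gives the value of one specific cell. String and formula records would need similar handling under their own record ids.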

Author: Wael

I am a programmer with the following skills: Java, Scripting, HTML, etc Database: SQL Server, Oracle, etc.

Updated on July 03, 2022

Comments

  • Wael
    Wael almost 2 years

    I switched from jxl to poi since POI has more features. However, I wasn't able to process the xls files that were generated in the old format. Now I am getting this error:

    org.apache.poi.hssf.OldExcelFormatException: The supplied spreadsheet seems to be Excel 5.0/7.0 (BIFF5) format. POI only supports BIFF8 format (from Excel versions 97/2000/XP/2003)

    Now I am thinking of using both JXL and POI, depending on the xls version: JXL for old-format xls files and POI for newer versions. Is this a good solution? Are there any alternatives?

    • Łukasz Rżanek
      Łukasz Rżanek over 11 years
      Is that, in fact, an Excel 5.0/7.0 file?
    • Wael
      Wael over 11 years
      Yes I validated that it is an Excel 5/7 file (Office 95)
    • Nimble Fungus
      Nimble Fungus over 8 years
      Using a single API would definitely be better, as it would reduce the complexity a lot. But these two are the most mature APIs for reading Excel, so in my opinion it is the best way of doing it.
  • James
    James over 8 years
    The link referenced by the text "JDBC/ODBC" does not appear to point to any relevant content.
  • dan
    dan over 8 years
    @James Thanks, I updated the link to a new url, it seems that the previous page was removed :(
  • JHDev
    JHDev over 8 years
    Hi, thanks for your answers. I would just like to know whether there is a way to detect that an Excel file is in BIFF5 format?
  • dan
    dan over 8 years
    @esprittn You can check the BOF (Beginning of File) record. See page 43 from here: download.microsoft.com/download/0/B/E/… for more details.
  • Volksman
    Volksman almost 4 years
    I think you are referring to issues going from the old binary .xls format to the newer XML-based .xlsx formats, but I think the question is about very old .xls binary formats that POI can't read. POI can read newer .xls files, so this has nothing to do with binary versus XML-based formats; it is just the older .xls files that don't seem to be supported by POI.
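
The BOF check that dan suggests in the comments can be sketched in plain Java as below. This is only an illustration under stated assumptions: the class name `BiffVersionSniffer` is hypothetical, and the record ids and version values (0x0009, 0x0209, 0x0409, 0x0809 with 0x0500 or 0x0600) come from the commonly documented BIFF layout and should be verified against the specification linked in that comment. Note also that BIFF5 and BIFF8 workbooks are normally wrapped in an OLE2 container, so the bytes passed in must be the start of the workbook stream, not the start of the .xls file itself.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper: guesses the BIFF version from a BOF record. */
public class BiffVersionSniffer {

    /**
     * @param streamStart first bytes of the workbook stream (assumption:
     *                    already extracted from any OLE2 container)
     * @return "BIFF2", "BIFF3", "BIFF4", "BIFF5", "BIFF8" or "unknown"
     */
    public static String classify(byte[] streamStart) {
        if (streamStart.length < 6) {
            return "unknown"; // not enough bytes for id, length and version
        }
        ByteBuffer buf = ByteBuffer.wrap(streamStart).order(ByteOrder.LITTLE_ENDIAN);
        int sid = buf.getShort() & 0xFFFF;     // record id
        buf.getShort();                        // record length, unused here
        int version = buf.getShort() & 0xFFFF; // first field of the BOF payload

        switch (sid) {
            case 0x0009: return "BIFF2";
            case 0x0209: return "BIFF3";
            case 0x0409: return "BIFF4";
            case 0x0809:
                // BIFF5 and BIFF8 share a record id; the version field differs.
                if (version == 0x0500) return "BIFF5";
                if (version == 0x0600) return "BIFF8";
                return "unknown";
            default:
                return "unknown";
        }
    }
}
```

With a result like this, the dispatch described in the question (JXL or OldExcelExtractor for the old versions, HSSF for BIFF8) becomes a simple branch on the returned string.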