Reading string value from Excel with HSSF but it's double

Solution 1

You mean HSSF-POI says

cell.getCellType() == Cell.CELL_TYPE_NUMERIC

NOT

Cell.CELL_TYPE_STRING as it should be?

I would think it's a bug in POI, but every cell contains a Variant, and Variant has a type. It's kind of hard to make a bug there, so instead I think Excel uses some extra data or heuristic to report the field as text. Usual MS way, alas.

P.S. You cannot use any getString() on a Variant containing a number, as the binary representation of the Variant data depends on its type, and trying to get a string from what is actually a number would result in garbage -- hence the exception.

Solution 2

The class you're looking for in POI is DataFormatter

When Excel writes the file, some cells are stored as literal Strings, while others are stored as numbers. For the latter, a floating point value representing the cell is stored in the file, so when you ask POI for the value of the cell that's what it actually has.

Sometimes though, especially when doing Text Extraction (but not always), you want to make the cell value look like it does in Excel. It isn't always possible to get that exactly in a String (non full space padding for example), but the DataFormatter class will get you close.

If you're after a String of the cell, looking much as you had it looking in Excel, just do:

 // Create a formatter, do this once
 DataFormatter formatter = new DataFormatter(Locale.US);

 .....

 for(Cell cell : row) {
     CellReference ref = new CellReference(cell);
     // eg "The value of B12 is 12.4%"
     System.out.println("The value of " + ref.formatAsString() + " is " + formatter.formatCellValue(cell));
 }

The formatter will return String cells as-is, and for Numeric cells it will apply the formatting rules from the cell's style to the cell's number.
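
To illustrate the idea of "a stored double plus a display format" described above, here is a minimal sketch that uses only the JDK's java.text.DecimalFormat. It is an analogy, not POI's DataFormatter, and it does not read a spreadsheet. The class name StoredNumberDemo, the method render, and the sample patterns are all made up for this illustration.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical demo class: mimics "stored double + format pattern = displayed text".
// This is an analogy built on the JDK, not how POI implements DataFormatter.
class StoredNumberDemo {

    // Renders a raw double with a DecimalFormat pattern, using US symbols
    // so the output does not depend on the machine's default locale.
    static String render(double stored, String pattern) {
        DecimalFormat format =
                new DecimalFormat(pattern, DecimalFormatSymbols.getInstance(Locale.US));
        return format.format(stored);
    }

    public static void main(String[] args) {
        double stored = 0.124; // what a numeric cell would hold internally
        System.out.println(Double.toString(stored));     // raw value: 0.124
        System.out.println(render(stored, "0.0%"));      // displayed: 12.4%
        System.out.println(render(1234.5, "#,##0.00"));  // displayed: 1,234.50
    }
}
```

The same stored number can produce very different text depending on the pattern, which is why asking for the raw numeric value alone does not reproduce what the sheet shows.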

Solution 3

If the documents you are parsing are always in a specific layout, you can change the cell type to "string" on the fly and then retrieve the value. For example, if column 2 should always be string data, set its cell type to string and then read it with the string-type get methods.

cell.setCellType(Cell.CELL_TYPE_STRING);

In my testing, changing the cell type did not modify the contents of the cell, but did allow it to be retrieved with either of the following approaches:

cell.getStringCellValue();

cell.getRichStringCellValue().getString();

Without an example of a value that is not converting properly, it is difficult to know whether this will behave any differently from the cell.toString() approach you described in the question.

Solution 4

The code below reads the value as long as the cell actually contains a numeric value:

new BigDecimal(cell.getNumericCellValue());

e.g.

ase.setGss(new BigDecimal(hssfRow.getCell(3).getNumericCellValue()));

where variable gss is of BigDecimal type.
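
One caveat with the constructor used above: new BigDecimal(double) keeps the exact binary value of the double, so a value such as 0.1 prints as a long string of digits. A possible alternative, sketched below using only the JDK, is BigDecimal.valueOf(double), which goes through Double.toString. The class NumericCellText and the method toPlainText are hypothetical names for this illustration, and the doubles stand in for whatever getNumericCellValue() would return.

```java
import java.math.BigDecimal;

// Hypothetical helper, not part of POI: turns a double (such as the one
// returned by getNumericCellValue()) into plain decimal text.
class NumericCellText {

    // BigDecimal.valueOf goes through Double.toString, so it keeps the
    // shortest decimal form of the double instead of its exact binary expansion.
    // stripTrailingZeros + toPlainString drop the ".0" and avoid exponent notation.
    static String toPlainText(double value) {
        return BigDecimal.valueOf(value).stripTrailingZeros().toPlainString();
    }

    public static void main(String[] args) {
        System.out.println(new BigDecimal(0.1));   // exact binary expansion, many digits
        System.out.println(toPlainText(0.1));      // 0.1
        System.out.println(toPlainText(36843.0));  // 36843
        System.out.println(toPlainText(1.0E10));   // 10000000000
    }
}
```

Note that this only recovers a plain rendering of the number. Anything the original text carried that a number cannot hold, such as leading zeros, is not recoverable this way.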

Solution 5

Excel will convert anything entered as a string that looks like a number, date, or time into that type. See the MS Knowledge Base article, which basically suggests entering the number with an extra character that makes it a string.

Author by

egaga

Updated on April 08, 2020

Comments

  • egaga
    egaga about 4 years

    I'm using HSSF-POI for reading excel data. The problem is I have values in a cell that look like a number but really are strings. If I look at the format cell in Excel, it says the type is "text". Still the HSSF Cell thinks it's numeric. How can I get the value as a string?

    If I try to use cell.getRichStringValue, I get an exception; if I use cell.toString, the result is not exactly the same value as in the Excel sheet.

    Edit: until this gets resolved, I'll use

    new BigDecimal(cell.getNumericCellValue()).toString()