How to convert xlsx file to csv?

48,332

Solution 1

The text extractors will dump a TSV of the entire workbook. Performance depends on the implementation chosen and your memory availability.

You can then pipe that into a CSVPrinter to get correct CSV output. I don't think Excel cells can ever contain tab characters, so this should be safe. If you have newlines in your cells I'm not sure whether the TSV output will be valid, but if it is you can use a CSVParser to read it instead of lines().

XSSFWorkbook input = new XSSFWorkbook(new File("input.xlsx"));
CSVPrinter output = new CSVPrinter(new FileWriter("output.csv"), CSVFormat.DEFAULT); 

String tsv = new XSSFExcelExtractor(input).getText();
BufferedReader reader = new BufferedReader(new StringReader(tsv));
reader.lines().map(line -> line.split("\t").forEach(output::printRecord);

If buffering the whole this as a String is too inefficient, copy the implementation of one of the event-based extractors and write directly to the CSVPrinter instead.

Solution 2

Simple way to convert xls/xlsx into csv by using apache POI.

public class XLSXToCSVConverter {

    public InputStream convertxlstoCSV(InputStream inputStream) throws IOException, InvalidFormatException {

        Workbook wb = WorkbookFactory.create(inputStream);

        return  csvConverter(wb.getSheetAt(0));
    }

    private InputStream csvConverter(Sheet sheet) {
        Row row = null;
        String str = new String();
        for (int i = 0; i < sheet.getLastRowNum()+1; i++) {
            row = sheet.getRow(i);
            String rowString = new String();
            for (int j = 0; j < 3; j++) {
                if(row.getCell(j)==null) {
                    rowString = rowString + Utility.BLANK_SPACE + Utility.COMMA;
                }
                else {
                    rowString = rowString + row.getCell(j)+ Utility.COMMA;
                }
            }
            str = str + rowString.substring(0,rowString.length()-1)+ Utility.NEXT_LINE_OPERATOR;
        }
        System.out.println(str);
        return new ByteArrayInputStream(str.getBytes(StandardCharsets.UTF_8));
    }
}

Hope this will help you , Thanks.

Share:
48,332

Related videos on Youtube

Tharani
Author by

Tharani

Updated on October 27, 2021

Comments

  • Tharani
    Tharani over 2 years

    Currently I am using below code to convert XLSX file to CSV using Java. I need a faster solution because this is too slow.

    public class Test1 {
        static void convert(File inputFile, File outputFile) {
            try {
                FileOutputStream fos = new FileOutputStream(outputFile);
                // Get the workbook object for XLSX file
                XSSFWorkbook wBook = new XSSFWorkbook(
                        new FileInputStream(inputFile));
                // Get first sheet from the workbook
                XSSFSheet sheet = wBook.getSheetAt(0);
                Row row;
                Cell cell;
                // Iterate through each rows from first sheet
                Iterator<Row> rowIterator = sheet.iterator();
    
                while (rowIterator.hasNext()) {
                    row = rowIterator.next();
    
                    // For each row, iterate through each columns
                    Iterator<Cell> cellIterator = row.cellIterator();
                    while (cellIterator.hasNext()) {
    
                        cell = cellIterator.next();
    
                        switch (cell.getCellType()) {
                        case Cell.CELL_TYPE_BOOLEAN:
                            data.append(cell.getBooleanCellValue() + ",");
    
                            break;
                        case Cell.CELL_TYPE_NUMERIC:
                            data.append(cell.getNumericCellValue() + ",");
    
                            break;
                        case Cell.CELL_TYPE_STRING:
                            data.append(cell.getStringCellValue() + ",");
                            break;
    
                        case Cell.CELL_TYPE_BLANK:
                            data.append("" + ",");
                            break;
                        default:
                            data.append(cell + ",");
    
                        }
                    }
                }
    
                fos.write(data.toString().getBytes());
                fos.close();
    
            } catch (Exception ioe) {
                ioe.printStackTrace();
            }
        }
    
        // testing the application
    
        public static void main(String[] args) {
            // reading file from desktop
            File inputFile = new File("D:\\Test.xlsx");
            // writing excel data to csv
            File outputFile = new File("D:\\Test1.csv");
            convert(inputFile, outputFile);
        }
    }
    
    • Tharani
      Tharani over 7 years
      yes sir. process should be fast
  • Tharani
    Tharani over 7 years
    My scenario's: I need to read huge records of Xlsx file and store into oracle database. For this process, I have found better solution is converting xlsx to csv and then from csv to oracle database using preparedstatement. In your solution iteration is available. Instead of iterate from xlsx to csv,I can directly read from xlsx using Apache Poi(XSSF)and write into database. I need any other direct way to convert xlsx to csv using java sir. Regards, Tharanya B
  • intruder
    intruder about 6 years
    @Tharani I don't think you can read huge xlsx sheets using XSSF. You'll get a memory exception. Try using SXSSF (based on some other posts)
  • gene b.
    gene b. over 5 years
    The link you provided has bugs in the code, if you look at the comments.
  • hdmiimdh
    hdmiimdh about 3 years
    what's this for (int j = 0; j < 3; j++) {?
  • OrangeDog
    OrangeDog about 3 years
    This will break if any cells contain commas or newlines