Converting an Excel (xls) file to a comma-separated (csv) file without the GUI
Solution 1
Use a Perl script. The Spreadsheet::ParseExcel module from CPAN can parse the xls file, and the parsed cells can then be written out as CSV.
http://search.cpan.org/dist/Spreadsheet-ParseExcel
You could also try using VBScript.
Solution 2
If you're on Debian/Ubuntu, you can use xls2csv from the catdoc package.
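A minimal sketch of driving this from a script, written in Python. It assumes `xls2csv` is installed and on the PATH, and that it writes the CSV to standard output when given only the input file. The helper names `xls2csv_command` and `convert_with_xls2csv` are made up for illustration.

```python
import subprocess
from pathlib import Path


def xls2csv_command(xls_path):
    """Build the argument list for catdoc's xls2csv (no options, input file only)."""
    return ["xls2csv", str(xls_path)]


def convert_with_xls2csv(xls_path, csv_path):
    """Run xls2csv and save its standard output to csv_path.

    Assumes xls2csv is installed and on the PATH; subprocess raises
    CalledProcessError if the tool exits with a non-zero status.
    """
    with open(csv_path, "wb") as out:
        subprocess.run(xls2csv_command(xls_path), stdout=out, check=True)
    return Path(csv_path)
```

Calling `convert_with_xls2csv("Book1.xls", "Book1.csv")` would then behave like running `xls2csv Book1.xls > Book1.csv` in a shell.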
Solution 3
From Gnumeric docs:
Gnumeric can convert files automatically without needing user intervention. This allows a large number of files to be converted using a script. Gnumeric is distributed along with a program called ssconvert, which is the program used to convert files automatically. All of the file formats supported by Gnumeric can be used except for the Postscript and PDF file formats, which operate through the printing system. This application is used from the command line by specifying any desired options, an input file and an output file. For example,
ssconvert myfile.xls myfile.gnumeric
would convert an Excel format file to a Gnumeric format file.
The available import and export file formats which ssconvert can read can be listed using
ssconvert --list-importers
or
ssconvert --list-exporters
respectively.
Like other GNU command line applications, ssconvert includes a manual page. This page can be accessed by typing:
man ssconvert
which will open the manual page. This page can be navigated by typing the space bar or using the Page Up and Page Down buttons. The man program can be dismissed by typing the q key.
I'm using it and it works well.
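The "large number of files" case mentioned in the docs could be scripted roughly as follows. This is a Python sketch, assuming `ssconvert` is on the PATH and that it selects the CSV exporter from the `.csv` extension of the output file. The function names `plan_conversions` and `convert_all` are hypothetical.

```python
import subprocess
from pathlib import Path


def plan_conversions(folder):
    """Pair every .xls file in folder with a .csv target of the same name."""
    sources = sorted(Path(folder).glob("*.xls"))
    return [(src, src.with_suffix(".csv")) for src in sources]


def convert_all(folder):
    """Call `ssconvert input.xls output.csv` for each planned pair.

    Assumes ssconvert is installed; check=True makes a failed conversion
    raise CalledProcessError instead of being silently skipped.
    """
    for src, dst in plan_conversions(folder):
        subprocess.run(["ssconvert", str(src), str(dst)], check=True)
```

Keeping the planning step separate from the subprocess call makes it easy to print or inspect the list of conversions before running anything.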
Solution 4
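In the Java world you can use Apache POI. You could start from the following Groovy snippet.
import org.apache.poi.hssf.usermodel.HSSFWorkbook
import org.apache.poi.ss.usermodel.*

// HSSFWorkbook reads the binary .xls format; take the first sheet only.
FileInputStream fis = new FileInputStream(filename);
Workbook wb = new HSSFWorkbook(fis);
Sheet sheet = wb.getSheetAt(0);
for (Row row : sheet) {
    for (Cell cell : row) {
        // doSomething is a placeholder, e.g. append the value to a CSV line
        doSomething(cell.toString())
    }
}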
Solution 5
Excel can be used as a data source, and there are drivers available to access an Excel file as a database.
1.) Create and open a connection to the Excel file that you want to convert into CSV.
2.) Run a query like "SELECT * FROM [Sheet1$]" (the OLE DB providers address a worksheet by its name followed by $), which will load all the data of Sheet1 into a recordset or DataTable.
3.) Since I'm using .NET, I can hold those records in a DataTable and convert them into CSV using the following extension method.
public static string ToCSV(this DataTable _dataTable)
{
    StringBuilder csv = new StringBuilder();
    StringWriter sw = new StringWriter(csv);
    int icolcount = _dataTable.Columns.Count;

    // Header row: column names separated by commas.
    for (int i = 0; i < icolcount; i++)
    {
        sw.Write(_dataTable.Columns[i]);
        if (i < icolcount - 1)
        {
            sw.Write(",");
        }
    }
    sw.Write(sw.NewLine);

    // Data rows: DBNull values become empty fields.
    // Note: values are written as-is, so fields containing commas,
    // quotes, or line breaks are not escaped.
    foreach (DataRow drow in _dataTable.Rows)
    {
        for (int i = 0; i < icolcount; i++)
        {
            if (!Convert.IsDBNull(drow[i]))
            {
                sw.Write(drow[i].ToString());
            }
            if (i < icolcount - 1)
            {
                sw.Write(",");
            }
        }
        sw.Write(sw.NewLine);
    }
    sw.Close();
    return csv.ToString();
}
You can apply this approach on the platform you're working on.
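One caveat when joining values by hand on any platform: a field that contains a comma, a quote, or a line break has to be quoted to remain a single CSV column. Here is a minimal Python sketch of the same header-plus-rows export, using the standard `csv` module to handle the quoting. The `to_csv` name and the sample data are made up for illustration, and `None` stands in for a database NULL.

```python
import csv
import io


def to_csv(columns, rows):
    """Serialize a header and rows to CSV text, writing None (NULL) as an empty field."""
    buffer = io.StringIO()
    # The default dialect quotes only the fields that need it.
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    for row in rows:
        writer.writerow(["" if value is None else value for value in row])
    return buffer.getvalue()


print(to_csv(["name", "note"], [["Smith, J.", None], ["Lee", 'said "hi"']]))
```

In this sample, the value `Smith, J.` is emitted inside double quotes, so a CSV reader still sees two columns on that row.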
Thanks.
Comments
-
nik about 4 years
Is there a simple way to translate an XLS to a CSV formatted file without starting the Excel windowed application?
I need to process some Excel XLS workbooks with scripts. For this I need to convert the xls file into a csv file. This can be done with a save-as from the Excel application, but I would like to automate this (so, without opening the Excel application window).
It will suffice if the first sheet from the workbook gets translated to the CSV format. I need to just process data in that sheet.
I have Cygwin and Excel installed on my system -- if that helps.
Edit: OK, I have a working solution with Perl. Updating for future use by others.
I installed the Spreadsheet::ParseExcel module and then used the read-excel.pl sample.
My code is a slight variation of this sample code, as below.
#!/usr/bin/perl -w
# For each tab (worksheet) in a file (workbook),
# spit out columns separated by ",",
# and rows separated by c/r.

use Spreadsheet::ParseExcel;
use strict;

my $filename = shift || "Book1.xls";
my $e = new Spreadsheet::ParseExcel;
my $eBook = $e->Parse($filename);
my $sheets = $eBook->{SheetCount};
my ($eSheet, $sheetName);

foreach my $sheet (0 .. $sheets - 1) {
    $eSheet = $eBook->{Worksheet}[$sheet];
    $sheetName = $eSheet->{Name};
    print "#Worksheet $sheet: $sheetName\n";
    next unless (exists ($eSheet->{MaxRow}) and (exists ($eSheet->{MaxCol})));
    foreach my $row ($eSheet->{MinRow} .. $eSheet->{MaxRow}) {
        foreach my $column ($eSheet->{MinCol} .. $eSheet->{MaxCol}) {
            if (defined $eSheet->{Cells}[$row][$column]) {
                print $eSheet->{Cells}[$row][$column]->Value . ",";
            } else {
                print ",";
            }
        }
        print "\n";
    }
}
Update: Here is a PowerShell script that might also be easy to work with, taken as-is from this MSDN blog and an SO reference.
$excel = New-Object -comobject Excel.Application
$workbooks = $excel.Workbooks.Open("C:\test.xlsx")
$worksheets = $workbooks.Worksheets
$worksheet = $worksheets.Item(1)
$range = $worksheet.UsedRange
foreach($row in $range.Rows) {
    foreach($col in $row.Columns) {
        echo $col.Text
    }
}
Update: I recently came across a Windows tool CSVed at this Superuser answer which might be useful to some people.
-
nik about 15 years
This is probably a good way to go too. I got the Perl solution working for me fast, so I stopped there.
-
nik over 14 years
That sounds interesting, I'll check on Ubuntu.
-
Admin about 12 years
This is explored a little further in a duplicate question: stackoverflow.com/questions/1858195/…