Can I use a batch file to remove blank columns in a .csv?

Solution 1

for /F can parse a .csv file. As a one-liner, to be run directly from the command prompt:

for /F "tokens=1,2,3* delims=," %i in (input.csv) do @echo %i,%l>>output.csv

This traverses input.csv line by line and, using a comma as the delimiter, stores the first value in %i, the second in %j, the third in %k and the rest of the line (requested by the asterisk in the list of tokens) in %l. By printing everything except the second and third values, you effectively remove those two columns.

However, for /F does not count empty strings as tokens: it treats a run of consecutive delimiters as a single delimiter, so the command cannot tell two consecutive commas from one. This solution works if the empty columns contain a whitespace character, but may skip the wrong columns otherwise.

To work around this restriction, we can write a batch file that uses string replacement to insert a space between two consecutive commas. The replacement is run twice because a single pass only partly fixes a run of three commas (,,, becomes , ,,). The quotes in ^"!line!^" tell for /F to operate on a string instead of a file; they are escaped to prevent !line! from being evaluated as a literal string instead of a variable.

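@echo off
setlocal EnableDelayedExpansion
rem Read input.csv one whole line at a time.
for /F "delims=" %%a in (input.csv) do (
    set "line=%%a"
    rem Pad empty fields with a space; two passes cover a run of three commas.
    set "line=!line:,,=, ,!"
    set "line=!line:,,=, ,!"
    rem Keep token 1 and everything from token 4 onward.
    for /F "tokens=1,2,3* delims=," %%i in (^"!line!^") do (
        echo %%i,%%l>>output.csv
    )
)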

The result is a new .csv file that matches input.csv except that the second and third columns are omitted. Note that any empty fields in the remaining columns now contain a single space.
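
The need for two replacement passes can be checked outside cmd. The following is only a rough shell analogue, not part of the batch solution: it assumes a POSIX shell with sed, and the function names pad_once and pad_twice are made up for illustration. sed's s/,,/, ,/g replaces non-overlapping matches from left to right, which is the behavior that makes a second pass necessary.

```shell
#!/bin/sh
# Rough shell analogue of the batch padding step (assumption: POSIX sh + sed).
# pad_once and pad_twice are hypothetical helper names.

# One pass: put a space between each non-overlapping pair of commas.
pad_once() {
    sed 's/,,/, ,/g'
}

# Two passes, mirroring the two string replacements in the batch file.
pad_twice() {
    pad_once | pad_once
}

printf 'a,,,b\n' | pad_once    # prints: a, ,,b   (one empty field is still unpadded)
printf 'a,,,b\n' | pad_twice   # prints: a, , ,b
```

After the first pass the third comma still sits directly after the second one, so only the second pass separates them.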

Solution 2

You did not specify which OS. If yours comes with cut, use this.

cut -d',' -f1,4-9

-d sets the delimiter, in this case to a comma. Since a comma is not a special character, you can omit the quotes.
-f selects the fields, in this case field 1 and fields 4 through 9 (skipping fields 2 and 3).

Test file "example.csv":
foo1,   ,   , ba1, baz1, blah1, wibble1, wubble1
foo2,   ,   , ba2, baz2, blah2, wibble2, wubble2
foo2,   ,   , ba3,     , blah3, wibble3, wubble3
foo2,   ,   , ba4, baz4, blah4, wibble4, wubble4

An actual test of the command:

> cut -d',' -f1,4-9 example.csv

foo1, ba1, baz1, blah1, wibble1, wubble1
foo2, ba2, baz2, blah2, wibble2, wubble2
foo2, ba3,     , blah3, wibble3, wubble3
foo2, ba4, baz4, blah4, wibble4, wubble4
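
When the empty columns are truly empty rather than padded with spaces, cut still counts them as fields, and an open-ended range such as 4- avoids hard-coding the last column. A minimal sketch, assuming a POSIX shell with cut; the function name drop_cols_2_3 and the sample line are made up for illustration:

```shell
#!/bin/sh
# Sketch only: drop_cols_2_3 is a hypothetical helper name.
# Keeps field 1 and every field from 4 onward; reads stdin or the given files.
drop_cols_2_3() {
    cut -d, -f1,4- "$@"
}

# Empty fields still count as fields, so the empty 5th column is preserved.
printf 'a,,,d,,f\n' | drop_cols_2_3    # prints: a,d,,f
```

Like the command above, this splits on every comma, so it assumes that no field contains a quoted comma.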

cut comes preinstalled on most operating systems (e.g., OS X, Linux distributions, the BSDs, Solaris; just about every one except DOS and Windows).

If you have Windows, you can:

  1. Open the CSV file in a spreadsheet (e.g., OpenOffice Calc or MS Excel), delete the two columns, and save the file. This is hard to automate, so use it only if you need to do it once or rarely.
  2. Or download Unix tools for Windows and put the cut command in a batch file.
batchnoober
Author by

batchnoober

Updated on September 18, 2022

Comments

  • batchnoober
    batchnoober almost 2 years

    I have nine columns in my current .csv file, however, the 2nd and 3rd columns are blank and I want to get rid of them. Is there a way to use batch commands to remove these empty columns from my .csv or export a new .csv without those blank columns?

    • Gord Thompson
      Gord Thompson over 11 years
      Please edit your question to confirm: (1) Are you running Windows? (2) If so, are you looking for a pure Windows batch solution, or are you open to other alternatives?
  • Gord Thompson
    Gord Thompson over 11 years
    That will corrupt the structure of the file if any of the other columns have any empty values.
  • batchnoober
    batchnoober over 11 years
    My OS is Windows. Unfortunately, I would want this as an automated process since this is a .csv that's created every day in the same format. Are there other alternatives for Windows?
  • ganesh
    ganesh over 11 years
    You can install a cut command (e.g. from gnuwin32.sourceforge.net/packages/coreutils.htm ), or you can try something with for. An example of using for is for /f "tokens=5 delims= " %i in (file.txt) DO echo %i
  • Scott - Слава Україні
    Scott - Слава Україні about 11 years
    Note that this will add blanks to empty fields in columns other than 2 and 3. For example, The,quick,brown,fox,,,the,lazy,dog will become The,,,fox, , ,the,lazy,dog.
  • Scott - Слава Україні
    Scott - Слава Україні about 11 years
    Unless you know the maximum number of columns, use -f1,4- to include all columns after the third. 4- means 4-∞.
  • ganesh
    ganesh about 11 years
    True. I based my -f4-9 on the first sentence of the OP's post, "I have nine columns", but -f4- is more flexible and less typing. Another option would have been to use sed.
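
As a rough illustration of the sed route mentioned above (a sketch, not ganesh's actual command): assuming a sed that supports -E for extended regular expressions, as GNU and BSD sed do, the second and third fields can be removed with one substitution. The function name strip_cols_2_3 is made up for illustration.

```shell
#!/bin/sh
# Sketch only: strip_cols_2_3 is a hypothetical helper name.
# Capture field 1, match fields 2 and 3 with their trailing commas, keep only field 1.
strip_cols_2_3() {
    sed -E 's/^([^,]*),[^,]*,[^,]*,/\1,/' "$@"
}

printf 'a,,,d,,f\n' | strip_cols_2_3    # prints: a,d,,f
```

Lines with fewer than four fields do not match the pattern and pass through unchanged.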