How can I convert tab delimited data to comma delimited data?
Solution 1
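#!/usr/bin/awk -f
# Read tab-separated fields, write comma-separated ones.
BEGIN { FS = "\t"; OFS = "," }
{
    for (i = 1; i <= NF; i++) {
        if ($i + 0 == $i) { $i = "=" $i }   # numeric value: prefix with =
        else gsub(/"/, "\"\"", $i);         # otherwise double embedded quotes
        $i = "\"" $i "\""                   # wrap every field in quotes
    }
    print
}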
Assuming you name this convert.awk, you can call it with either
ec2-describe-snapshots -H --hide-tags | awk -f convert.awk > snapshots.csv
or (after adding execute permissions with chmod a+x convert.awk)
ec2-describe-snapshots -H --hide-tags | ./convert.awk > snapshots.csv
This starts a new column at every tab, which keeps the description column together (unless it contains tabs) but produces an empty column wherever two tabs are adjacent (though that is how your sample output looks, so maybe you actually do want that).
If you want to split on all whitespace instead (this collapses extra tabs within the table but puts each word of the description in its own column), take out the FS = "\t"; statement.
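As a quick check of the behaviour described above, here is a minimal sketch that wraps the same awk program in a shell function and feeds it the two sample rows from the question. The function name tsv_to_csv is an illustrative choice, not part of the original answer, and printf stands in for ec2-describe-snapshots.

```shell
#!/bin/sh
# tsv_to_csv is a hypothetical wrapper name; the awk program matches convert.awk.
tsv_to_csv() {
    awk 'BEGIN { FS = "\t"; OFS = "," }
    {
        for (i = 1; i <= NF; i++) {
            if ($i + 0 == $i) { $i = "=" $i }   # numeric field: prefix with =
            else gsub(/"/, "\"\"", $i)          # double any embedded quotes
            $i = "\"" $i "\""                   # wrap the field in quotes
        }
        print
    }'
}

# printf stands in for ec2-describe-snapshots here.
printf 'SnapshotId\tVolumeId\tStartTime\tOwnerId\tVolumeSize\tDescription\n' | tsv_to_csv
printf 'snap-00b66464\tvol-b99a38d0\t2012-01-05\t5098939\t160\tmy backup\n' | tsv_to_csv
```

With these two rows the numeric OwnerId and VolumeSize values should come out as "=5098939" and "=160", while the date and the description are only quoted.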
For future generations: if you don't need the "s, the =s, or embedded whitespace, you can make it a one-liner:
awk -v OFS=, '{$1=$1;print}'
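One caveat is worth illustrating: with no explicit field separator, awk splits on any run of whitespace, so a value such as "my backup" becomes two columns. The sketch below contrasts that with tab-only splitting, assuming the input really is tab-delimited. The -F'\t' option is the only change; the function names and the sample line are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper names, used only to compare the two behaviours.
any_whitespace_to_csv() {
    awk -v OFS=, '{$1=$1;print}'            # default FS: spaces and tabs both split
}
tabs_only_to_csv() {
    awk -F'\t' -v OFS=, '{$1=$1;print}'     # FS is a tab: spaces stay inside a field
}

printf 'snap-00b66464\tmy backup\n' | any_whitespace_to_csv   # snap-00b66464,my,backup
printf 'snap-00b66464\tmy backup\n' | tabs_only_to_csv        # snap-00b66464,my backup
```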
Solution 2
Here's a Perl solution. This might be possible with sed/awk, but testing for the numeric part would likely make it pretty ugly.
ec2-describe-snapshots -H --hide-tags | \
perl -e 'use Scalar::Util qw(looks_like_number);
    # Test for end of input explicitly so a final line without a newline is not dropped.
    while (defined($line = <STDIN>)) {
        chomp($line);
        print(join(",", map { "\"" . (looks_like_number($_) ? "=$_" :
                                      do {s/"/""/g; $_}) . "\"" }
                        split(/\t/, $line)) . "\n");
    }' \
> snapshots.csv
Solution 3
If you're just lazy like me and want to do it all on one command line without writing a script, here's how I'd do it.
ec2-describe-snapshots -H --hide-tags | sed -e 's/^I/","/g' | sed -e 's/^/"/' | sed -e 's/$/"/'> snapshots.csv
The ^I is a literal tab, entered by pressing Ctrl+V and then Ctrl+I (or Tab).
The first sed swaps every tab for ",". The second sed inserts a " at the beginning of each line, and the last sed inserts a closing " at the end of each line.
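If typing a literal tab is awkward (inside a script file, for example), one possible alternative is to let printf produce the tab and splice it into the sed expression. This is a sketch rather than part of the original answer: the variable name tab and the function name quote_fields are illustrative, and the two anchored substitutions are merged into a single s/.*/"&"/ for brevity.

```shell
#!/bin/sh
# Hold a literal tab in a variable so the script does not depend on Ctrl+V input.
tab=$(printf '\t')

# quote_fields is a hypothetical name for the three-step sed pipeline.
quote_fields() {
    # First expression: every tab becomes ","
    # Second expression: the whole line is wrapped in quotes.
    sed -e "s/${tab}/\",\"/g" -e 's/.*/"&"/'
}

printf 'snap-00b66464\tvol-b99a38d0\tmy backup\n' | quote_fields
```

Note that this approach quotes fields but does not add the = prefix or escape quotes that are already inside a value.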
Solution 4
Another Perl solution:
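#!/usr/bin/perl -wln
use strict;
my ($n, $s);
chomp();
for $s ( split(/\t/, $_) )
{
    $s = '='.$s if ($s =~ /^\d+$/);   # all-digit value: prefix with =
    $n .= '"'.$s.'",';                # quote the field and append a comma
}
$n =~ s/(.*),/$1/;                    # drop the trailing comma
print $n;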
Invoke it with ec2-describe-snapshots -H --hide-tags | /var/tmp/script.pl > output.txt
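The /^\d+$/ test is stricter than a general numeric check: it matches only values made up entirely of digits, which is what the question asks for. For comparison, the same rule can be sketched in awk with a bracket expression. The function name digits_only_csv is hypothetical, and this is an illustration rather than a replacement for the Perl script.

```shell
#!/bin/sh
# digits_only_csv is a hypothetical name; it applies the all-digits rule in awk.
digits_only_csv() {
    awk 'BEGIN { FS = "\t"; OFS = "," }
    {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^[0-9]+$/) $i = "=" $i   # digits only: prefix with =
            $i = "\"" $i "\""                  # quote every field
        }
        print
    }'
}

# 160 is all digits and gets the = prefix; 1.5 and -3 are only quoted.
printf '160\t1.5\t-3\tmy backup\n' | digits_only_csv
```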
Solution 5
sed is the most useful Linux utility I have ever encountered.
sed 's/\t/","/g' TabSeparatedValues.txt > CommaSeparatedValues.csv
sed -i 's/.*/"&"/' CommaSeparatedValues.csv
The first command replaces every tab with "," (the \t escape is a GNU sed extension, so other sed implementations may need a literal tab instead). The second command, editing the file in place with -i, inserts quotes at the beginning and end of each line, so that every value ends up surrounded by quotes, which allows commas to be part of a value.
cwd
Updated on September 18, 2022
Comments
-
cwd over 1 year
I'm requesting a list of EC2 snapshots via Amazon's EC2 command line tool:
ec2-describe-snapshots -H --hide-tags > snapshots.csv
The data looks something like this:
SnapshotId      VolumeId        StartTime    OwnerId    VolumeSize    Description
snap-00b66464   vol-b99a38d0    2012-01-05   5098939    160           my backup
How can I intercept the data before redirecting it to snapshots.csv and do the following things:
- replace "tabs" with commas
- encapsulate values with quotations
- if a value is all numbers, prefix it with an = so that Excel will treat it as text; for example, OwnerId should be "=5098939" (this one is not necessary if it cannot be done inline and would instead require a script file or function)
Desired output:
"SnapshotId","VolumeId","StartTime","OwnerId","VolumeSize","Description"
"snap-00b66464","vol-b99a38d0","2012-01-05","=5098939","=160","my backup"
-
Ignacio Vazquez-Abrams over 12 years
This is where someone tells you to import using tabs. Or they would, if Excel wasn't on crack.
-
cwd over 12 years
Yeah, I'm trying to help Excel out a little bit since it doesn't seem to be doing so hot on its own. Also, having a CSV file that can just be opened instead of having to use the import menu command is always nice. I already tried changing the extension to ".tsv" with no luck.
-
phemmer over 12 years
I think your desired output is a bit off. You have a lot of empty fields in there (the empty quotes).
-
phemmer over 12 years
Nice clean solution. Thought it would end up a lot uglier than that, but then I'm not an awk person :-)
-
cwd over 12 years
So do I save this into a file such as ./convert.sh, chmod +x, and then pipe the input into it so that it will print the output? I'm getting an error: /usr/bin/awk: syntax error at source line 1 context is >>> . <<< /convert.sh.
-
Kevin over 12 years
@cwd You can save it in a file; I'd suggest convert.awk to indicate it's an awk script and not a bash one. I updated the post with the full command line, and note that I added a -f flag I had forgotten to the first line (that tells it to interpret the file as commands).
-
Stylex over 12 years
How did you get the ctrl + v i to show up like that?
-
jw013 over 12 years
@burhan The syntax is <kbd>text</kbd>.
-
Arcege over 12 years
Or in one line: sed -e 's/^I/","/g' -e 's/.*/"&"/' or even shorter: sed -e 's/^I/","/g;s/.*/"&"/'.
-
phemmer over 12 years
Scalar::Util isn't an outside module; it comes with standard Perl.
-
Jim over 12 years
True. Apologies for poorly wording my intended comment. Thank you for the correction.
-
Paul_Pedant about 4 years
The one-liner version treats any whitespace as a field separator, not just tabs. It needs a -F'\t' before the -v.