Count the number of columns in a pipe-delimited file
Solution 1
awk -F\| '{print NF}'
gives the correct result (one field count per input line).
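As a rough illustration of how the awk command might be run, here is a minimal sketch. The sample file is hypothetical and is created in a temporary location; the variable names `sample`, `per_line`, and `first_line` are not from the original answer.

```shell
# Hypothetical sample file, created in a temporary location for illustration.
sample=$(mktemp)
printf '%s\n' '1|2|3|34' '4534|23442|1121|334434' > "$sample"

# Field count for every line of the file.
per_line=$(awk -F'|' '{ print NF }' "$sample")

# Field count for the first line only (stop reading after it).
first_line=$(awk -F'|' 'NR == 1 { print NF; exit }' "$sample")

echo "$per_line"
echo "$first_line"
rm -f "$sample"
```

Printing the count for every line can help spot rows whose field count differs from the rest, while the first-line variant is enough when the file is known to be uniform.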
Solution 2
Pure Unix solution (without awk/Perl):
$ cat /tmp/x1
1|2|3|34
4534|23442|1121|334434
$ head -1 /tmp/x1 | tr "|" "\012" | wc -l
4
Perl solution - 1-liner:
$ perl5.8 -naF'\|' -e 'print scalar(@F)."\n";exit;' /tmp/x1
4
BUT!!!! IMPORTANT!!!
Every one of these solutions - as well as those in other answers - does NOT work 100% of the time!
Namely, they all break when it's a REAL "pipe-separated" file, with a pipe being a valid character in the field (and the field being quoted), the way real CSV files work.
E.g.
$ cat /tmp/x2
"0|1"|2|3|34
4534|23442|1121|334434
$ perl5.8 -naF'\|' -e 'print scalar(@F)."\n";exit;' /tmp/x2
5 <----- BROKEN!!! There are only 4 fields, first field is "0|1"
To fix that, a proper CSV (or delimited file) parser should be used, such as one in Perl:
$ perl5.8 -MText::CSV_XS \
-ne '$csv=Text::CSV_XS->new({sep_char => "|"}); $csv->parse($_);
@fields = $csv->fields(); print scalar(@fields)."\n"; exit;' /tmp/x2
This prints the correct value:
4
As a note, simply fixing an awk or sed solution with a convoluted RegEx won't work easily, since on top of pipe-containing-and-quoted PSV fields, the spec also allows quotes as part of the field as well. That does NOT lend itself to a nice RegEx solution.
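If a CSV module is not available, one possible alternative to a RegEx is to scan each line character by character and keep track of whether the scanner is currently inside a quoted field. The sketch below is only that, a sketch: the function name `count_fields` is made up for illustration, and it assumes that fields never span multiple lines and that a literal quote inside a quoted field is written as a doubled quote (""). It is not a full delimited-file parser.

```shell
# count_fields: print the number of pipe-separated fields on each input line,
# treating pipes inside double-quoted fields as data rather than separators.
# Assumptions: no field spans multiple lines; a quote inside a quoted field
# is doubled (""), so toggling on every quote still ends in the right state.
count_fields() {
    awk '{
        n = 1        # a line with no separators still has one field
        inq = 0      # 1 while the scanner is inside a quoted field
        len = length($0)
        for (i = 1; i <= len; i++) {
            c = substr($0, i, 1)
            if (c == "\"")
                inq = !inq               # toggle quoted state
            else if (c == "|" && !inq)
                n++                      # separator only outside quotes
        }
        print n
    }' "$@"
}

# Usage: reads the named files, or standard input when none are given.
printf '%s\n' '"0|1"|2|3|34' '4534|23442|1121|334434' | count_fields
```

Note that, unlike awk's NF, this sketch reports 1 for an empty line. For anything beyond simple cases, a real parser such as Text::CSV_XS remains the safer choice.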
Maulzey
Updated on July 09, 2022
Comments
Maulzey almost 2 years
I have a pipe (|) delimited file.
File:
106232145|"medicare"|"medicare,medicaid"|789
I would like to count the number of fields in each line. I tried the code below.
Code:
awk -F '|' '{print NF-1}'
This returns 5 instead of 4, because awk treats "medicare|medicaid" as two separate fields instead of one.