Sort CSV based on a certain column?
Solution 1
As CSV is a pretty complex format, it is better to use a module that does the work for us.
Following is an example using the Text::CSV module:
#!/usr/bin/env perl
use strict;
use warnings;
use constant AGE => 1;
use Text::CSV;
my $csv = Text::CSV->new();
my @rows;
while ( my $row_ref = $csv->getline( \*DATA ) ) {
push @rows, $row_ref;
}
@rows = sort { $a->[AGE] <=> $b->[AGE] } @rows;
for my $row_ref (@rows) {
$csv->combine(@$row_ref);
print $csv->string(), "\n";
}
__DATA__
name,25,female
name,24,male
name,27,female
name,21,male
Solution 2
In the spirit of there always being another way to do it, bear in mind that plain old GNU sort might be enough.
$ sort -t, -k2 -n unsorted.txt
name,21,male
name,24,male
name,25,female
name,27,female
Where the command line args are:
-t, # use comma as the record separator
-k2 # sort on the second key (record) in the line
-n # sort using numerical comparison (like using <=> instead of cmp in perl)
If you want a Perl solution, wrap it in qx() ;-)
Solution 3
There is also DBD::CSV:
#!/usr/bin/perl
use strict; use warnings;
use DBI;
my $dbh = DBI->connect('dbi:CSV:', undef, undef, {
RaiseError => 1,
f_ext => '.csv',
csv_tables => { test => { col_names => [qw' name age sex '] } },
});
my $sth = $dbh->prepare(q{
SELECT name, age, sex FROM test ORDER BY age
});
$sth->execute;
while ( my @row = $sth->fetchrow_array ) {
print join(',' => @row), "\n";
}
$sth->finish;
$dbh->disconnect;
Output:
name,21,male name,24,male name,25,female name,27,female
Solution 4
The original poster asked for no third-party modules (which I take to mean nothing from CPAN). Whilst this is restriction that will horribly limit your ability to write good modern Perl code, in this instance it's possible using the (core) Text::ParseWords module in place of the (non-core) Text::CSV. So, borrowing heavily from Alan's example, we get:
#!/usr/bin/env perl
use strict;
use warnings;
use Text::ParseWords;
my @rows;
while (<DATA>) {
push @rows, [ parse_line(',', 0, $_) ];
}
@rows = sort { $a->[1] <=> $b->[1] } @rows;
foreach (@rows) {
print join ',', @$_;
}
__DATA__
name,25,female
name,24,male
name,27,female
name,21,male
Admin
Updated on June 19, 2022Comments
-
Admin almost 2 years
I'm sure I've done this in the past and there is something small I'm forgetting, but how can I sort a CSV file on a certain column? I'm interested in answers with and without 3rd party Perl modules. Mainly methods without, since I don't always have access to install additional modules.
Example data:
name,25,female name,24,male name,27,female name,21,male
desired end result after sorting on the 2nd numeric column:
name,21,male name,24,male name,25,female name,27,female