Sort CSV based on a certain column?

10,364

Solution 1

As CSV is a pretty complex format, it is better to use a module that does the work for us.

Following is an example using the Text::CSV module:

#!/usr/bin/env perl

use strict;
use warnings;

use constant AGE => 1;

use Text::CSV;

my $csv = Text::CSV->new();

my @rows;
while ( my $row_ref = $csv->getline( \*DATA ) ) {
    push @rows, $row_ref;
}

@rows = sort { $a->[AGE] <=> $b->[AGE] } @rows;

for my $row_ref (@rows) {
    $csv->combine(@$row_ref);
    print $csv->string(), "\n";
}

__DATA__
name,25,female
name,24,male
name,27,female
name,21,male

Solution 2

In the spirit of there always being another way to do it, bear in mind that plain old GNU sort might be enough.

$ sort -t, -k2 -n unsorted.txt
name,21,male
name,24,male
name,25,female
name,27,female

Where the command line args are:

-t, # use comma as the record separator
-k2 # sort on the second key (record) in the line
-n  # sort using numerical comparison (like using <=> instead of cmp in perl)

If you want a Perl solution, wrap it in qx() ;-)

Solution 3

There is also DBD::CSV:

#!/usr/bin/perl

use strict; use warnings;
use DBI;

my $dbh = DBI->connect('dbi:CSV:', undef, undef, {
    RaiseError => 1,
    f_ext => '.csv',
    csv_tables => { test => { col_names => [qw' name age sex '] } },
});

my $sth = $dbh->prepare(q{
    SELECT name, age, sex FROM test ORDER BY age
});

$sth->execute;

while ( my @row = $sth->fetchrow_array ) {
    print join(',' => @row), "\n";
}

$sth->finish;
$dbh->disconnect;

Output:

name,21,male
name,24,male
name,25,female
name,27,female

Solution 4

The original poster asked for no third-party modules (which I take to mean nothing from CPAN). Whilst this is restriction that will horribly limit your ability to write good modern Perl code, in this instance it's possible using the (core) Text::ParseWords module in place of the (non-core) Text::CSV. So, borrowing heavily from Alan's example, we get:

#!/usr/bin/env perl

use strict;
use warnings;

use Text::ParseWords;

my @rows;

while (<DATA>) {
    push @rows, [ parse_line(',', 0, $_) ];
}

@rows = sort { $a->[1] <=> $b->[1] } @rows;

foreach (@rows) {
    print join ',', @$_;
}

__DATA__
name,25,female
name,24,male
name,27,female
name,21,male
Share:
10,364
Admin
Author by

Admin

Updated on June 19, 2022

Comments

  • Admin
    Admin almost 2 years

    I'm sure I've done this in the past and there is something small I'm forgetting, but how can I sort a CSV file on a certain column? I'm interested in answers with and without 3rd party Perl modules. Mainly methods without, since I don't always have access to install additional modules.

    Example data:

    name,25,female
    name,24,male
    name,27,female
    name,21,male

    desired end result after sorting on the 2nd numeric column:

    name,21,male
    name,24,male
    name,25,female
    name,27,female