Remove trailing commas at the end of the string using Perl


Solution 1

I'd use juanrpozo's solution for counting, but if you still want to do it your way, remove the trailing commas with a regex substitution:

$line =~ s/,+$//;
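As a quick check of what that substitution does, here is a minimal sketch. The helper name strip_trailing_commas and the shortened sample line are made up for illustration; only the s/,+$// pattern comes from the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper, named only for this illustration.
sub strip_trailing_commas {
    my ($line) = @_;
    $line =~ s/,+$//;    # one or more commas anchored at the end of the string
    return $line;
}

# Shortened version of the sample line from the question.
my $line = '10998,4499,SLC27A5,Q9Y2P5,GO:0000166,GO:0032403,,,,,';
print strip_trailing_commas($line), "\n";
# prints "10998,4499,SLC27A5,Q9Y2P5,GO:0000166,GO:0032403"
```

Commas in the middle of the line are left alone because the pattern is anchored with $. If the file happened to have Windows line endings, a "\r" left behind after chomp would sit between the commas and the end of the string and stop the match; a pattern such as s/,+\s*$// is one way to hedge against that.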

Solution 2

I suggest this more concise way of coding your program.

Note that the line my @data = split /,/, $line discards trailing empty fields (@data has only 11 fields with your sample data), so it will produce the same result whether or not the trailing commas are removed beforehand.

use strict;
use warnings;

open my $in, '<', 'test.csv' or die "Cannot open file for input: $!";
open my $out, '>', 'GO_MF_counts_Genes.csv' or die "Cannot open file for output: $!";

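# Read one line at a time instead of loading the whole file into a list
while (my $line = <$in>) {
  chomp $line;
  my @data = split /,/, $line;
  # grep in scalar context returns the number of fields that start with "GO:"
  printf $out "%s,%d\n", $data[0], scalar grep /^GO:/, @data;
}

close $out or die "Cannot close output file: $!";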
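The point about split dropping trailing empty fields can be checked with a small sketch. The sample line is a shortened, made-up variant of the data in the question; the negative LIMIT argument is standard split behaviour and is shown only for contrast.

```perl
use strict;
use warnings;

# Four real fields followed by four trailing commas.
my $line = '10998,4499,GO:0000166,GO:0032403,,,,';

my @default = split /,/, $line;        # trailing empty fields are dropped
my @keep    = split /,/, $line, -1;    # a negative LIMIT keeps them

printf "default:  %d fields\n", scalar @default;    # prints "default:  4 fields"
printf "limit -1: %d fields\n", scalar @keep;       # prints "limit -1: 8 fields"
```

Only trailing empty fields are discarded by default; an empty field in the middle of the line is still returned as an empty string.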

Solution 3

You can apply grep to @array in scalar context to count the matching elements:

my $mf = grep { /^GO:/ } @array;

This assumes $array[0] never matches /^GO:/.
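A minimal sketch of that counting approach on a shortened sample line follows. The variable $mf_rest and the array slice are additions for illustration, showing one way to drop the assumption about $array[0].

```perl
use strict;
use warnings;

# Shortened sample line; split drops the trailing empty fields by default.
my @array = split /,/, '10998,4499,SLC27A5,Q9Y2P5,GO:0000166,GO:0032403,GO:0005524,,,';

# Assigning to a scalar puts grep in scalar context, so it returns a count.
my $mf = grep { /^GO:/ } @array;

# Skipping the first column explicitly removes the assumption that
# $array[0] never looks like a GO term.
my $mf_rest = grep { /^GO:/ } @array[1 .. $#array];

print "$array[0],$mf\n";    # prints "10998,3"
```

In list context the same grep would return the matching elements themselves rather than their count.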

Author by

Jordan

Updated on June 04, 2022

Comments

  • Jordan, almost 2 years ago

    I'm parsing a CSV file in which each line looks something like this:

    10998,4499,SLC27A5,Q9Y2P5,GO:0000166,GO:0032403,GO:0005524,GO:0016874,GO:0047747,GO:0004467,GO:0015245,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

    There seem to be trailing commas at the end of each line.

    I want to get the first term, in this case "10998", and the number of GO terms related to it. So my output in this case should be:

    Output:

    10998,7

    But instead it shows 299. I realized that there are 303 commas in each line overall, and I can't figure out an easy way to remove the trailing ones. Can anyone help me solve this issue?

    Thanks!

    My Code:

    use strict;
    use warnings;
    
    open my $IN, '<', 'test.csv' or die "can't find file: $!";
    open(CSV, ">GO_MF_counts_Genes.csv") or die "Error!! Cannot create the file: $!\n";
    my @genes = ();
    
    my $mf;
    foreach my $line (<$IN>) {
        chomp $line;
        my @array = split(/,/, $line);
        my @GO = splice(@array, 4);
        my $GO = join(',', @GO);
        $mf = count($GO);
        print CSV "$array[0],$mf\n";
    }
    
    sub count {
        my $go = shift @_;
        my $count = my @go = split(/,/, $go);
        return $count;
    }