Perl: remove part of a string after a pattern

Solution 1

Use a capture group to save the first run of digits, and use .* to delete everything from there to the end of the line:

#!/usr/bin/env perl

use warnings;
use strict;

while ( <DATA> ) { 
    s/(\d+).*$/$1/ && print;
}

__DATA__
trn_425374_1_94_-
trn_12_1_200_+
trn_2003_2_198_+

It yields:

trn_425374
trn_12
trn_2003
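
If the strings come from somewhere other than a filehandle, the same substitution can be wrapped in a small subroutine. The following is only a sketch: the name keep_through_first_number is hypothetical, and it assumes each string holds at least one run of digits (a string without digits is returned unchanged, apart from the chomp).

```perl
#!/usr/bin/env perl

use warnings;
use strict;

# Hypothetical helper (not part of the original answer): returns a copy of
# the string cut off right after its first run of digits.
sub keep_through_first_number {
    my ($line) = @_;
    chomp $line;                  # drop a trailing newline, if any
    $line =~ s/(\d+).*$/$1/;      # keep the digits, delete the rest
    return $line;
}

my @input = ( 'trn_425374_1_94_-', 'trn_12_1_200_+', 'trn_2003_2_198_+' );
my @ids   = map { keep_through_first_number($_) } @input;

print "$_\n" for @ids;
```

Because the subroutine works on a copy of its argument, the original strings in @input are left untouched.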

Solution 2

If you are running Perl 5 version 10 or later, you have access to the \K ("keep") regular expression escape. Everything matched before the \K is kept out of the text that gets replaced, so this removes everything after the first sequence of digits (except newlines):

s/\d+\K.+//;

With earlier versions of Perl, you will have to capture the part of the string you want to keep and put it back in the replacement:

s/(\D*\d+).+/$1/;

Note that neither of these will remove a trailing newline character, because . does not match a newline by default. If you want to strip that as well, either chomp the string first or add the /s modifier to the substitution, like this:

s/\d+\K.+//s;

or

s/(\D*\d+).+/$1/s;
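
The chomp alternative mentioned above might look like the sketch below. The sample lines are the ones from the question; the @lines array and the loop around it are assumptions made only for illustration.

```perl
use strict;
use warnings;
use 5.010;    # \K needs Perl 5.10 or later; this also enables say

my @lines = ( "trn_425374_1_94_-\n", "trn_12_1_200_+\n", "trn_2003_2_198_+\n" );

for my $line (@lines) {
    chomp $line;              # strip the trailing newline first
    $line =~ s/\d+\K.+//;     # then delete everything after the first number
    say $line;
}
```

One caveat worth checking against your own data: .+ needs at least one character after the digits. If a chomped string ends right after the number (for example trn_12), the regex backtracks and removes the last digit instead; using .* in place of .+ avoids that.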
Author by

user2245731

Updated on June 04, 2022

Comments

  • user2245731
    user2245731 almost 2 years

    I have strings like this:

    trn_425374_1_94_-
    trn_12_1_200_+
    trn_2003_2_198_+
    

    And I want to remove everything after the first number, like this:

    trn_425374
    trn_12
    trn_2003
    

    I tried the following code:

    $string =~ s/(?<=trn_\d)\d+//gi;
    

    But it returns the same string as the input. I have been following examples from similar questions, but I don't know what I'm doing wrong. Any suggestions?