Perl match only returning "1". Booleans? Why?


Solution 1

You need to use capturing parentheses to actually capture:

if ($record =~ /(Defect)/ ) {
    print "$1\n";
}

Solution 2

I think what you really want is to wrap the pattern in capturing parentheses and assign the result in list context:

my($foo) = $record=~ /(Defect)/;

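In list context, a successful match returns the captured groups, not the matched text itself. Your original pattern has no groups, and in that case a successful match returns the list (1), which is why you saw "1".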
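
The difference between the contexts can be seen in a small sketch. The sample $record value below is an assumption modeled on the record format in the question, not the asker's real data:

```perl
use strict;
use warnings;

# Hypothetical sample record, modeled on the format shown in the question.
my $record = "[3]IMAGE\n[4]Defect\n[5]3\n";

# Scalar context: the match returns true (1) or false (empty string).
my $matched = $record =~ /Defect/;

# List context, no capture groups: a successful match returns the list (1).
my ($no_group) = $record =~ /Defect/;

# List context, one capture group: the captured text is returned.
my ($with_group) = $record =~ /(Defect)/;

print "scalar: $matched\n";
print "list, no group: $no_group\n";
print "list, with group: $with_group\n";
```

The parentheses around the variable on the left are what put the match into list context; the parentheses inside the pattern are what create the group.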

Solution 3

The Perl =~ operator takes a string (left operand) and a regular expression (right operand) and matches the string against the RE. In scalar context it returns a boolean value (true or false) depending on whether the RE matches.

Now Perl doesn't really have a boolean type -- instead every value (of any type) is treated as either 'true' or 'false' when in a boolean context -- most things are 'true', but the number 0, the strings '0' and '' (empty string), and the special 'undef' value for undefined things are false. So when returning a boolean, it generally uses '1' for true and '' (empty string) for false.
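
As a rough illustration of those truth rules, the sketch below prints what a successful and a failed match return, then tests a few literal values. The variable names are made up for the example:

```perl
use strict;
use warnings;

# A successful match yields 1; a failed match yields the empty string.
my $hit  = 'Defect'  =~ /Defect/;
my $miss = 'Feature' =~ /Defect/;

printf "hit: '%s', miss: '%s'\n", $hit, $miss;

# Values Perl treats as false: undef, 0, '0', and ''. The others here are true.
for my $value (undef, 0, '0', '', '0.0', 'a', 1) {
    my $label = defined $value ? "'$value'" : 'undef';
    print "$label is ", ($value ? 'true' : 'false'), "\n";
}
```

Note that the string '0.0' counts as true, because only the exact string '0' is treated as false.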

Now as to your last question, where trying to print $1 prints nothing. Whenever a regular expression matches, Perl sets $1, $2 ... to the values of the parenthesized subexpressions within the RE. In your example, however, there are NO parenthesized subexpressions, so $1 is undefined. If you change it to

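if ($record =~ /(Defect)/) {
    print STDOUT $1;
}

You'll get something more like what you expect: Defect if it matches and nothing if it doesn't. The if matters because a failed match leaves $1 set to whatever the last successful match captured.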

The most common idiom for regexp matching I generally see is something like:

if ($string =~ /regexp with () subexpressions/) {
    # ... code that uses $1 etc. for the subexpressions matched
} else {
    # ... code for when the expression doesn't match at all
}
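
Applied to the records in the question, that idiom might look like the sketch below. The @fields array and the line-by-line split are assumptions about how one could organize the parsing, not the asker's actual code:

```perl
use strict;
use warnings;

# Hypothetical sample record, copied from the format in the question.
my $record = <<'END';
Row:1 DATA:
[0]37755442
[1]DDG00000010
[2]FALLS
[3]IMAGE
[4]Defect
[5]3
[6]CLOSED
END

my @fields;

# Walk the record line by line and capture the index and value of each field.
for my $line (split /\n/, $record) {
    if ($line =~ /^\[(\d+)\](.*)$/) {
        # $1 is the number in brackets, $2 is the text after it.
        $fields[$1] = $2;
    } else {
        # Lines such as "Row:1 DATA:" do not match and are skipped.
        next;
    }
}

print "Field 4 is $fields[4]\n";
```

Because $1 and $2 are only read inside the if branch, a line that fails to match can never leak captures from an earlier line.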

Solution 4

my($foo) = $record=~ /Defect/;
print STDOUT $foo;

Rather than this, you should do:

my $foo;
if ($record =~ /Defect/) {
    $foo = $&; # Matched portion of $record.
}

Your goal seems to be to get the matched portion, and $& holds it after a successful match. The return value of the match itself only indicates whether the match succeeded.

You may find http://perldoc.perl.org/perlreref.html handy.
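
If you want the whole matched text without adding capture groups, one alternative to $& is the /p modifier together with ${^MATCH}, both available since Perl 5.10. The perlvar documentation notes that before Perl 5.20, using $& anywhere in a program could slow down every regex match. A minimal sketch, with an assumed sample $record:

```perl
use strict;
use warnings;

# Hypothetical sample record, modeled on the question's format.
my $record = "[4]Defect\n[5]3\n";

my $foo;
# /p preserves the matched text in ${^MATCH}; check success before reading it.
if ($record =~ /Def\w+/p) {
    $foo = ${^MATCH};
}

print defined($foo) ? "$foo\n" : "no match\n";
```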

Author by

ManAnimal

Updated on August 03, 2022

Comments

  • ManAnimal
    ManAnimal almost 2 years

    This has got to be obvious but I'm just not seeing it.

    I have a document containing thousands of records just like the one below:

    Row:1 DATA:
    [0]37755442
    [1]DDG00000010
    [2]FALLS
    [3]IMAGE
    [4]Defect
    [5]3
    [6]CLOSED
    

    I've managed to get each record separated and I'm now trying to parse out each field.

    I'm trying to match the numbered headers so that I can pull out the data that follows them, but the problem is that my matches only return "1" when they succeed and nothing when they don't. This happens for any match I try to apply.

    For instance, applied to a simple word within each record:

    my($foo) = $record=~ /Defect/;
    print STDOUT $foo;
    

    prints out a "1" for each record if it contains "Defect" and nothing if it contains something else.

    Alternatively:

    $record =~ /Defect/;
    print STDOUT $1;
    

    prints absolutely nothing.

    $record =~ s/Defect/Blefect/
    

    will replace "Defect" with "Blefect" perfectly fine on the other hand.

    I'm really confused as to why the returns on my matches are so screwy. Any help would be much appreciated.

  • ManAnimal
    ManAnimal over 12 years
    Brilliant. That did it. I scoured and scoured and never once did I run across capturing parentheses. I must be blind. Thank you very much.
  • smithfarm
    smithfarm about 11 years
    This was very helpful - thanks. I had forgotten the binding operator's different behavior in scalar/list context.