How can I extract the numbers in the file using sed or any other tool?


Solution 1

sed uses Basic Regular Expressions by default and BREs don't know about \d. Here are some other approaches:

  1. sed

    sed -r 's/.* ([0-9]+\.*[0-9]*).*/\1/' logfile.txt > outfile.txt
    

    The -r is needed to avoid having to escape the parentheses.

  2. perl

    perl -pe 's/.* (\d+\.*\d*).*/$1/' logfile.txt > outfile.txt
    
  3. grep

    grep -Po '.* \K\d+\.*\d*' logfile.txt > outfile.txt
    

These all use your basic approach, which will find the last set of digits on the line that is preceded by a space. Because other numbers can appear on the line, if your input lines are always of the format you show, a safer approach would be:

grep -Po 'took \K\d+\.*\d*' logfile.txt 
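
To see why anchoring on "took" is safer, here is a small sketch that compares the two patterns using sed -E (equivalent to -r in GNU sed), which also works where grep lacks -P. The log line and the function names last_number and took_number are made up for illustration; they are not from the question.

```shell
# Hypothetical log line with a second number after the duration.
line='[ 2014/05/01 10:48:26 | 13963 | DEBUG ] It took 11.16 seconds to complete 3 steps'

# Unanchored: the greedy .* runs to the last space that is followed by a digit.
last_number() {
    sed -E 's/.* ([0-9]+\.*[0-9]*).*/\1/'
}

# Anchored on "took": only the number right after that word is captured.
# -n with the p flag suppresses lines that do not match at all.
took_number() {
    sed -nE 's/.*took ([0-9]+\.*[0-9]*).*/\1/p'
}

printf '%s\n' "$line" | last_number   # prints 3
printf '%s\n' "$line" | took_number   # prints 11.16
```

The anchored version also drops lines that contain no "took" at all, instead of passing them through unchanged.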

Solution 2

Grouping parentheses must be backslashed in sed. Also, sed doesn't support \d. Moreover, you should also remove the words after the number:

sed -e 's/^.* \([0-9]\+\.[0-9]*\) .*/\1/g'

BTW, are you sure the dot is always present, but the decimal digits are optional? 12. doesn't seem like an expected value.
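
If the fractional part should be optional as a whole (so 12 and 12.5 are accepted but 12. is not), one possible tightening is sketched below. It assumes a sed that supports -E; the function name strict_number and the test lines are hypothetical.

```shell
# Accept an integer or a decimal, but never a bare trailing dot.
# The group (\.[0-9]+)? makes the dot and the digits after it optional together.
strict_number() {
    sed -nE 's/^.* ([0-9]+(\.[0-9]+)?) .*/\1/p'
}

printf '%s\n' 'value 12.5 end' | strict_number   # prints 12.5
printf '%s\n' 'value 12 end'   | strict_number   # prints 12
printf '%s\n' 'value 12. end'  | strict_number   # prints nothing
```

Like the command above, this still relies on the number being followed by a space, so a number at the very end of a line is not matched.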

Author by

Jim

Updated on September 18, 2022

Comments

  • Jim
    Jim over 1 year

    I have a file that has this format

    [ 2014/05/01 10:48:26 | 13963 | DEBUG ] It took 11.16837501525879 seconds to complete the process

    So I have thousands of lines like this and I would like to "extract" the 11.16837501525879 part
    I tried:

     sed -e 's/^.* (\d+\.\d*)/\1/g' logfile.txt > out.txt  
    

    but I get:

    sed: -e expression #1, char 21: invalid reference \1 on `s' command's RHS  
    

    What can I do here?

  • Jim
    Jim about 10 years
    It worked +1! BTW what does -e do then if I need to escape everything?
  • choroba
    choroba about 10 years
    @Jim: Nothing. That's why you can omit it in some versions of sed. Also, some versions support -r in which you don't have to backslash () and + etc.
  • Jeff Hewitt
    Jeff Hewitt about 10 years
    Actually, without /g this will find the last set of digits preceded by a space. The grep one is the one that will find them all.
  • terdon
    terdon about 10 years
    @JosephR. yes, even with the g it will not do what the OP wanted, hence the grep one.