Extract substring according to regexp with sed or grep

Solution 1

Try this:

sed -nE 's/^pass2:.*<(.*)>.*$/\1/p'

Or POSIXly (-E has not made it to the POSIX standard yet as of 2019):

sed -n 's/^pass2:.*<\(.*\)>.*$/\1/p'

Output:

$ printf '%s\n' 'pass2: <Marvell Console 1.01> Removable Processor SCSI device' | sed -nE 's/^pass2:.*<(.*)>.*$/\1/p'
Marvell Console 1.01

If a line contains several <...> groups, only the last one is printed, because the leading .* is greedy and runs up to the last <.
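To make the greedy behaviour concrete, here is a small sketch using a made-up line with two bracketed groups. The sample line and the helper names last_group and first_group are illustrative assumptions, not part of the original answer. The second helper uses negated bracket expressions, as suggested in the comments, to pick up the first group instead.

```shell
# Hypothetical sample line containing two bracketed groups.
line='pass2: <first> and <second> device'

# Greedy: the leading .* runs up to the last <, so the last group is captured.
last_group() { sed -n 's/^pass2:.*<\(.*\)>.*$/\1/p'; }

# [^<]* stops at the first <, and [^>]* stops at the first >.
first_group() { sed -n 's/^pass2:[^<]*<\([^>]*\)>.*$/\1/p'; }

printf '%s\n' "$line" | last_group    # prints: second
printf '%s\n' "$line" | first_group   # prints: first
```

Both variants use only basic regular expressions, so they should not depend on the -E option.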

Solution 2

How about grep -o, which prints only the matching part? The < and > still need to be removed, but tr can do that. (grep -E replaces the older egrep, and a dot inside a bracket expression needs no backslash.)

$ dmesg | grep -Eo '<[a-zA-Z.0-9 ]+>' | tr -d '<>'
Marvell Console 1.01
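Note that this pipeline matches any <...> in the dmesg output, not just the pass2 line. A minimal sketch that filters on the line prefix first might look like the following. The helper name extract_pass2 and the second sample line are assumptions made for illustration, and -o is an extension found in GNU and BSD grep rather than a POSIX option.

```shell
# extract_pass2 is a hypothetical helper; it reads dmesg-style text on stdin.
# Step 1: keep only lines that start with "pass2:".
# Step 2: print just the <...> part of each remaining line.
# Step 3: delete the angle brackets.
extract_pass2() {
    grep '^pass2:' | grep -Eo '<[^>]+>' | tr -d '<>'
}

printf '%s\n' \
    'pass2: <Marvell Console 1.01> Removable Processor SCSI device' \
    'da0: <Other Disk 2.0> Fixed Direct Access device' |
    extract_pass2
# prints: Marvell Console 1.01
```

With real input this would be called as: dmesg | extract_pass2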

Solution 3

I tried three methods, using sed, awk, and Python.

sed command

echo "pass2: <Marvell Console 1.01> Removable Processor SCSI device" | sed "s/.*<//g"|sed "s/>.*//g"

output

Marvell Console 1.01
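One caveat: this two-step sed passes lines without < or > through unchanged, which the question wanted to avoid. The sketch below contrasts it with a -n ... p variant. The helper names strip_brackets and strip_brackets_quiet are hypothetical.

```shell
# The two-step sed from above (the g flags are dropped; they make no difference here).
# Lines without < or > are printed unchanged.
strip_brackets() { sed 's/.*<//' | sed 's/>.*//'; }

# With -n and the p flag, only lines where the substitution succeeded are printed.
strip_brackets_quiet() { sed -n 's/.*<\([^>]*\)>.*/\1/p'; }

printf '%s\n' 'no brackets here' | strip_brackets        # prints the line as-is
printf '%s\n' 'no brackets here' | strip_brackets_quiet  # prints nothing
```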

awk command

echo "pass2: <Marvell Console 1.01> Removable Processor SCSI device" | awk -F "[<>]" '{print $2}'

output

Marvell Console 1.01
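As written, the awk command prints the second field of every input line, including lines that have no brackets at all. A hedged variant that only considers lines starting with pass2: and containing at least two bracket characters could look like this. The helper name pass2_model is an assumption for illustration.

```shell
# pass2_model is a hypothetical helper; it reads dmesg-style text on stdin.
# /^pass2:/ restricts the action to the pass2 device line.
# NF >= 3 means the line was split on at least two < or > characters.
pass2_model() {
    awk -F '[<>]' '/^pass2:/ && NF >= 3 { print $2 }'
}

printf '%s\n' 'pass2: <Marvell Console 1.01> Removable Processor SCSI device' | pass2_model
# prints: Marvell Console 1.01
```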

python

#!/usr/bin/env python3
import re

# Print the text between < and > for every line of l.txt that contains it.
with open('l.txt') as f:
    for line in f:
        m = re.search(r'<([^>]*)>', line)
        if m:
            print(m.group(1))

output

Marvell Console 1.01

Author: Steiner

Updated on September 18, 2022

Comments

  • Steiner
    Steiner over 1 year

    In a (BSD) UNIX environment, I would like to capture a specific substring using a regular expression.

    Assume that the dmesg command output would include the following line:

    pass2: <Marvell Console 1.01> Removable Processor SCSI device
    

    I would like to capture the text between the < and > characters, like

    dmesg | <sed command>

    should output:

    Marvell Console 1.01
    

    However, it should not output anything if the regex does not match. Many solutions, including sed -e 's/$regex/\1/', print the whole input line when no match is found, which is not what I want.

    The corresponding regexp could be: regex="^pass2\: \<(.*)\>"

    How would I properly do a regex match using sed or grep? Note that the grep -P option is unavailable in my BSD UNIX distribution. The sed -E option is available, however.

    • JdeBP
      JdeBP about 5 years
      It's possibly better to parse the output of camcontrol devlist than the output of dmesg.
  • Steiner
    Steiner about 5 years
    This works for me, with both the -n option and the /p suffix on the substitution. The full command I used: dmesg | sed -nE 's/^pass2: <(.*)>.*$/\1/p'
  • Rich
    Rich about 5 years
    Why not use <([^>]+)>? That is, one or more characters that are not >.
  • jwm
    jwm about 5 years
    I was thinking the awk approach too. Should you constrain your print to lines beginning with "pass2:"? The OP didn't provide sufficient detail, but I can imagine that a naive pattern match would not be quite what was wanted.
  • D. Ben Knoble
    D. Ben Knoble about 5 years
    Python can read from standard input, though Perl specializes in this kind of text processing if you’re moving into higher-level scripting languages.
  • AdminBee
    AdminBee about 4 years
    Welcome to the site, and thank you for your contribution. The reason that + doesn't seem to work is that, by default, grep interprets the pattern as a basic regular expression, which doesn't include the + operator. You will have to use the -E option to enable it (at least on GNU grep), or use egrep instead.