Print back-reference in regular expression


Solution 1

I asked this question on SO as well, and got this answer from potong that does what I was looking for.

sed '/'"$regex"'/!b;s//\n\1\n/;s/.*\n\(.*\)\n.*/\1/' file

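Note that it works without knowing anything about what's in $regex, as long as the pattern contains a capture group. The first command, /$regex/!b, skips lines that don't match and prints them unchanged. The empty pattern in s// reuses that same regex and wraps the back-reference in newlines. Since a line read by sed never contains a newline of its own, the newlines serve as sentinel values, and the final substitution replaces the entire line with just the text between them.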
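To see the command at work on the example from the question, here is a minimal sketch. It assumes GNU sed (which accepts \n in the replacement text) and a regex that contains no unescaped / character. The sample values are the ones from the question, with the character class written in its bracketed form, [[:alpha:]].

```shell
# Sample values from the question; [[:alpha:]] is the bracketed POSIX class.
regex='\([[:alpha:]]*\)day'
phrase='it is Saturday tomorrow'

# Matching line: the whole line is replaced by the back-reference.
result=$(printf '%s\n' "$phrase" |
  sed '/'"$regex"'/!b;s//\n\1\n/;s/.*\n\(.*\)\n.*/\1/')
printf '%s\n' "$result"    # prints: Satur

# Non-matching line: the !b branch leaves it untouched.
unchanged=$(printf '%s\n' 'nothing to see here' |
  sed '/'"$regex"'/!b;s//\n\1\n/;s/.*\n\(.*\)\n.*/\1/')
printf '%s\n' "$unchanged"
```

Because $regex is spliced into the sed script as-is, a pattern containing the / delimiter would need escaping first.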

Solution 2

I don't know sed well enough to answer, but if you are open to using grep instead:

grep --only-matching "complex_regex" file

or

grep -o "complex_regex" file

The --only-matching flag (or its short form, -o) tells grep to print just the part of each line that matches the pattern, not the whole line. Lines that don't match are not printed at all.
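As a rough sketch of how this behaves on the question's example: grep -o prints the entire match rather than the back-reference, so a follow-up sed substitution (the approach the asker describes in the comments) is still needed to reduce it to the captured group. The sample values below are taken from the question, with the character class written as [[:alpha:]]; a grep that supports -o (such as GNU grep) is assumed.

```shell
regex='\([[:alpha:]]*\)day'
phrase='it is Saturday tomorrow'

# grep -o prints the whole match, including the literal "day".
matched=$(printf '%s\n' "$phrase" | grep -o -e "$regex")
printf '%s\n' "$matched"

# A follow-up substitution reduces the match to the back-reference.
backref=$(printf '%s\n' "$matched" | sed "s/$regex/\1/")
printf '%s\n' "$backref"
```

Unlike the sed-only approach, this pipeline drops non-matching lines instead of passing them through unchanged.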


Author: jakesandlund

Updated on September 18, 2022

Comments

  • jakesandlund
    jakesandlund over 1 year

    I was hoping for a way to make sed replace the entire line with the replacement (rather than just the match) so I could do something like this:

    sed -e "/$some_complex_regex_with_a_backref/\1/"
    

    and have it only print the back-reference.

    From this question, it seems like the way to do it is to mess around with the regex so that it matches the entire line, or to use some other tool (like perl). Simply changing the regex to .*regex.* doesn't always work (as mentioned in that question). For example:

    $ echo "$regex"
    \([[:alpha:]]*\)day
    
    $ echo $phrase
    it is Saturday tomorrow
    
    $ echo $phrase | sed "s/$regex/\1/"
    it is Satur tomorrow
    
    $ echo $phrase | sed "s/.*$regex.*/\1/"
    
    $ # what I'd like to have happen
    $ echo $phrase | [[[some command or string of commands]]]
    Satur
    

    I'm looking for the most concise way to do this assuming the following:

    • The regex is in a variable, so it can't be changed on a case-by-case basis.
    • I'd like to do this without using perl or other beefier languages.
  • jakesandlund
    jakesandlund about 12 years
    This works in this case, yes. However, I'm looking for a programmatic way to do this that works on any $regex variable. If, for example, $regex was ^something, that space before it would make it fail to match anything.
  • jakesandlund
    jakesandlund about 12 years
    This looks like it would be useful in cases where ignoring non-matching lines is ok. After the grep -o, a sed "s/$regex/\1/" would give me what I want. It's not a drop-in replacement for the fictitious sed command that replaces the entire line with the replacement text, and leaves non-matching lines alone.
  • mgjk
    mgjk about 12 years
    See the edit... I added some stuff about \b .