sed -r vs. sed - exactly how are the regex possibilities extended?


According to info sed: "Extended regexps are those that 'egrep' accepts; they can be clearer because they usually have less backslashes, but are a GNU extension and hence scripts that use them are not portable." (egrep is a synonym for grep -E.)

This is indeed the case. Without -r:

echo "abcdef" | sed 's/\([cd]\+\)/\U\1/'
abCDef

With -r:

echo "abcdef" | sed -r 's/([cd]+)/\U\1/'
abCDef

Some expressions are valid in both syntaxes, but in many cases they are interpreted differently: a character such as + or ( is literal in one syntax and special in the other. The character escaping logic in regular, POSIX-compliant sed totally escapes me.
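
To make the "valid in both, interpreted differently" point concrete, here is a minimal sketch, assuming GNU sed (for the -r flag). The sample strings and variable names are made up for illustration and are not from the original answer.

```shell
# Assumes GNU sed; the inputs below are hypothetical examples.

# '+' is a literal plus sign in basic syntax, but means "one or more" with -r.
plus_basic=$(echo 'a+b' | sed 's/a+b/X/')        # matches the literal text a+b
plus_extended=$(echo 'a+b' | sed -r 's/a+b/X/')  # wants one or more a's, then b: no match

# '(' and ')' are literal in basic syntax, but form a group with -r.
paren_basic=$(echo 'f(x)' | sed 's/(x)/[x]/')        # replaces the text (x)
paren_extended=$(echo 'f(x)' | sed -r 's/(x)/[x]/')  # replaces only the x inside

printf '%s\n' "$plus_basic" "$plus_extended" "$paren_basic" "$paren_extended"
# prints:
# X
# a+b
# f[x]
# f([x])
```

The same script text is accepted in both modes, yet each pair gives a different result, so adding -r to an existing script can silently change what it does.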



Author: Lew Rockwell Fan

Updated on September 18, 2022

Comments

  • Lew Rockwell Fan
    Lew Rockwell Fan over 1 year

    In bash, as I understand it, I can use characters like . & ^ * in regular expressions with sed, but the -r option changes the nature of how regular expressions are, uh, expressed, kinda like grep vs. grep -E. But I can't find any summary of exactly HOW the syntax changes. Is there a list somewhere? Am I being naive in thinking this is the kind of thing that it ought to be possible to summarize in a table that could be printed on a couple of pages?

    Do the characters that work with plain old non-extended sed regex expressions, still work the same way with the -r option? In other words are expressions that are valid WITHOUT the -r option, still valid, and still mean the same thing, WITH the -r option? Like they were a subset of the expressions valid WITH the -r option?

    I keep thinking there must be a pithy summary of the difference with examples somewhere.

    • Hubert Grzeskowiak
      Hubert Grzeskowiak almost 6 years
      FYI, on macOS the option for sed is -E. If you want cross-platform scripts, do not use this option.
    • Noam Manos
      Noam Manos almost 6 years
      The only difference between basic and extended regular expressions is in the behavior of a few characters: ‘?’, ‘+’, parentheses, and braces (‘{}’). See Extended regular expressions
  • Lew Rockwell Fan
    Lew Rockwell Fan almost 7 years
    @xenoid Thanks, I forgot about "info". I'll look at that and man and info for grep and egrep & get back to this after some study. So you're saying valid regexes for sed sans -r are NOT a subset of those valid for sed -r?
  • xenoid
    xenoid almost 7 years
    Yes, they are different (they basically change the characters to escape)
  • Lew Rockwell Fan
    Lew Rockwell Fan almost 7 years
    OK, studied it. Got it. Or at least enough of it for now. There are pretty good summaries in info sed and info grep. It looks like there are at least some situations where you can do things with extended that you can't with, ahem, "regular" regular expressions, but not vice versa. So, I'm going to try to make a habit of using grep with -E and sed with -r from now on, and learn that syntax more thoroughly. I don't really care about POSIX.
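
The comments above describe the change as mostly a matter of which characters need a backslash. A rough sketch of that correspondence, assuming GNU sed: each pair below writes the same pattern once in basic and once in extended syntax. Note that \? and \| in basic syntax are GNU extensions rather than POSIX, while \{ \} is standard. The sample strings and variable names are hypothetical.

```shell
# Assumes GNU sed; each pair should produce the same result.

# Optional character: \? in basic syntax, ? with -r.
opt_basic=$(echo 'color colour' | sed 's/colou\?r/X/g')
opt_extended=$(echo 'color colour' | sed -r 's/colou?r/X/g')

# Repetition count: \{2\} in basic syntax, {2} with -r.
count_basic=$(echo 'aaa' | sed 's/a\{2\}/X/')
count_extended=$(echo 'aaa' | sed -r 's/a{2}/X/')

# Alternation: \| in basic syntax (GNU only), | with -r.
alt_basic=$(echo 'cat dog' | sed 's/cat\|dog/pet/g')
alt_extended=$(echo 'cat dog' | sed -r 's/cat|dog/pet/g')

printf '%s | %s\n' "$opt_basic" "$opt_extended" \
                   "$count_basic" "$count_extended" \
                   "$alt_basic" "$alt_extended"
# prints:
# X X | X X
# Xa | Xa
# pet pet | pet pet
```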