Replacing text from a list of replacements. Added complication: backslashes

Solution 1

You'll need to escape all characters that are special in regexps: not just backslashes but also [, ., *, ^, $ and the delimiter of the s command (for sed). In Perl, use the quotemeta function.
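
The escaping described above can be sketched as two small shell helpers. This is only an illustration under stated assumptions: the function names escape_pattern and escape_replacement are hypothetical (they are not part of sed or of the original answer), the strings are single-line, and / is the delimiter of the s command. The pattern side and the replacement side need different treatment, because in a replacement only the backslash, & and the delimiter are special.

```shell
#!/bin/sh
# Hypothetical helpers; the names are made up for this sketch.

# Escape a string so sed treats it as a literal basic-regexp pattern
# when "/" is the delimiter of the s command.
escape_pattern() {
    printf '%s\n' "$1" | sed -e 's|[][\\/.*^$]|\\&|g'
}

# Escape a string for the replacement side: only the backslash, "&"
# and the delimiter are special there.
escape_replacement() {
    printf '%s\n' "$1" | sed -e 's|[\\/&]|\\&|g'
}

from=$(escape_pattern '\old.1')
to=$(escape_replacement '\new&1')

# Only the literal text \old.1 is replaced; \oldx1 is left alone.
printf '%s\n' 'keep \old.1 but not \oldx1' | sed -e "s/$from/$to/g"
```

The helpers use | as their own delimiter so that / can appear inside the bracket expression without any special handling.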

A further issue with your attempt is that when you run set -- $line, the shell performs its own expansion: it performs globbing in addition to word splitting, so if your line contains a* b* and there are files called a1 and a2 in the current directory then you'll be replacing a1 with a2. You need to turn off globbing with set -f in this approach.
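
The globbing problem is easy to reproduce in a scratch directory. The sketch below is only a demonstration: the temporary directory and the variable names with_glob and without_glob are assumptions made for the example, and the files a1 and a2 mirror the ones mentioned above.

```shell
#!/bin/sh
# Work in a scratch directory so no real files are touched.
old_dir=$PWD
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
touch a1 a2

line='a* b*'

# Globbing on: a* expands to the matching file names, b* matches nothing.
set -- $line
with_glob="$*"

# Globbing off: only word splitting is applied to $line.
set -f
set -- $line
without_glob="$*"
set +f

printf 'with globbing:    %s\n' "$with_glob"
printf 'without globbing: %s\n' "$without_glob"

cd "$old_dir" && rm -rf "$dir"
```

With globbing enabled, $1 and $2 end up as a1 and a2, which is exactly the accidental replacement described above.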

Here's a solution that mangles the replacement list directly into a list of sed arguments. It assumes that the source and replacement texts contain no space characters; anything other than a space or a newline should be handled correctly, with one exception: an & in a replacement text is not escaped, so sed would expand it to the matched text. The first replacement adds a \ before the characters that need protecting, and the second replacement turns each line from foo bar into -e s/foo/bar/g. Warning, untested.

set -f
sed_args=$(<replacement sed -e 's~[/.*[\\^$]~\\&~g' \
                            -e 's~^\([^ ]*\)  *\([^ ]*\).*~-e s/\1/\2/g~')
sed -i $sed_args target

In Perl, you'll have fewer issues with quoting if you just let Perl read the replacement file directly. Again, untested.

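perl -i -pe 'BEGIN {
   # Read the "from to" pairs once, before any input line is processed.
   open R, "<replacement" or die;
   while (<R>) {
       chomp;
       ($from, $to, @ignored) = split / +/;
       $s{$from} = $to;
   }
   close R;
   # One alternation of all source strings, each quoted to match literally.
   $regexp = join("|", map {quotemeta} keys %s);
}
# Replace every match with its mapped text; /o compiles the regexp only once.
s/($regexp)/$s{$1}/ego' target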

Solution 2

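Simple cases have simple solutions. If your words are plain apart from the leading backslash, that is, they contain none of .?+*{}()[]/ or other characters that are special to sed, you can turn the list of pairs into a sed command file with sed itself. Note that the generated commands drop the leading backslash (\old1 \new1 becomes s/old1/new1/g), so they also match old1 where no backslash precedes it: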

sed -re 's,(^\\| \\|$),/,g;s/^/s/;s/$/g/' pairs.txt > pairs.sed
sed -f pairs.sed input > output
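
A variant of the same idea that keeps the backslashes in the generated commands might look like the sketch below. It assumes, like the commands above, that backslashes are the only special characters in the words, and it uses -E for extended regexps (the -r spelling above is the older GNU form). The scratch directory and the sample contents of pairs.txt and input are made up for the example.

```shell
#!/bin/sh
# Scratch files so the sketch is self-contained; file names mirror the answer.
work=$(mktemp -d) || exit 1
printf '%s\n' '\old1 \new1' '\old2 \new2' > "$work/pairs.txt"
printf '%s\n' 'use \old1 and \old2, not old1' > "$work/input"

# 1st command: double every backslash so sed later reads it as a literal one.
# 2nd command: rewrite each "from to" line as "s/from/to/g".
sed -E -e 's,\\,\\\\,g' \
       -e 's,^([^ ]+) +([^ ]+).*,s/\1/\2/g,' "$work/pairs.txt" > "$work/pairs.sed"

# Apply the generated command file; old1 without a backslash stays untouched.
result=$(sed -f "$work/pairs.sed" "$work/input")
printf '%s\n' "$result"

rm -rf "$work"
```

For the sample pair \old1 \new1 the generated line is s/\\old1/\\new1/g, so only occurrences that include the backslash are replaced.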

Solution 3

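This is an attempt to escape the backslash using parameter expansion with pattern substitution, a bash feature that plain POSIX sh lacks. The single-slash form used here replaces only the first backslash in each word.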

$ set -- \\foo \\bar
$ echo $1
\foo
$ echo ${1/\\/\\\\}
\\foo
$ echo "This is \foo to me"
This is \foo to me
$ echo "This is \foo to me" | sed s/${1/\\/\\\\}/${2/\\/\\\\}/
This is \bar to me
$ 

Author: Leo Alekseyev

Updated on September 18, 2022

Comments

  • Leo Alekseyev over 1 year

    I have a file A that contains pairs of strings, one per line:

    \old1 \new1
    \old2 \new2
    .....
    

    I would like to iterate over file A and, for each line, perform the replacement (e.g. "\old1 -> \new1") globally in some file B. I had no trouble getting it to work without backslashes, with sed or perl -pi -e, using something like the following:

    while read -r line
    do
     set -- $line
     sed -i -e s/$1/$2/g target
    done < replacements
    

    However, I can't figure out how to make either sed or perl treat the backslashes verbatim in the replacement strings. Is there a clean solution for this?

  • Leo Alekseyev about 13 years
    Perl snippet works great; thanks for the tip about quotemeta. I'm still a little surprised that running verbatim string replacements from a list didn't have some simple canned solution, but I'm happy with the Perl code.
  • Michael Mrozek about 13 years
    There's a pending edit suggestion about adding chomp