How can I find and replace only if a match forms a whole word?


Solution 1

You need to write your regular expression in a way that only matches whole words. With GNU sed, you can use \b which matches at word boundaries:

sed -i "s/\b$word\b/$replace/g" test.txt
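As a quick check against the sample line from the question, here is a minimal sketch. It assumes GNU sed (for \b) and a search word whose dots have been escaped by hand; the function name replace_word is hypothetical and only used for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace whole-word occurrences of $1 with $2 on stdin.
# Assumes GNU sed (\b is a GNU extension) and that $1 is already a valid regex.
replace_word() {
    sed "s/\b$1\b/$2/g"
}

printf '1.1.1.14 1.1.1.14567\n' | replace_word '1\.1\.1\.14' '1.1.1.3'
# prints: 1.1.1.3 1.1.1.14567
```

The second field is left alone because there is no word boundary between the 4 and the 5 in 1.1.1.14567.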

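If you know the word will always have a space on each side, you could match the spaces explicitly and put them back in the replacement (note that this misses a word at the very start or end of a line):

sed -i "s/ $word / $replace /g" test.txt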
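The space-delimited form can be extended to also accept the start or end of the line as a delimiter. The following is only a sketch under stated assumptions: it uses sed -E (extended regular expressions), the function name replace_spaced is hypothetical, and $1 is expected to be a valid regex already.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace $1 with $2 on stdin only when the match is
# delimited by a space or by the start/end of the line.
# \1 and \2 put the captured delimiters back around the replacement.
replace_spaced() {
    sed -E "s/(^| )$1( |\$)/\\1$2\\2/g"
}

printf '1.1.1.14 1.1.1.14567\n' | replace_spaced '1\.1\.1\.14' '1.1.1.3'
# prints: 1.1.1.3 1.1.1.14567
```

One caveat of this approach: each match consumes its trailing space, so when the same word occurs twice in a row separated by a single space, only the first occurrence is replaced in one pass.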

Now, there are also some issues with your script. Your if ... break statement is useless; the while condition already takes care of that. All you need is:

#!/usr/bin/env bash
n="y"
while [ "$n" = "y" ]
do
    echo "n is $n"
    read -p "Enter the word to find = " word
    read -p "Enter word to replace = " replace
    echo "$word n $replace"
    sed -i "s/\b$word\b/$replace/g" test.txt
    echo "do you have further replacement? n or y"
    read temp
    n="$temp"
done

Solution 2

Replace the following line in the script

sed -i "s/$word/$replace/g" "test.txt"

with

sed -i "s/\b$word\b/$replace/g" test.txt

For details on word boundaries, please refer to the following link: http://www.rexegg.com/regex-boundaries.html#wordboundary

Solution 3

Here, I'd use perl.

WORD=$word REPLACE=$replace perl -pi -e '
  s/\b\Q$ENV{WORD}\E\b/$ENV{REPLACE}/g' file

sed (even GNU sed) has no equivalent of \Q...\E, which you need here so that $word is not interpreted as a regexp. Also, most sed implementations don't support -i (or support it with a different syntax), and many don't support \b.
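If you have to stay with sed, one common workaround is to backslash-escape the special characters yourself before building the s command. The sketch below assumes GNU sed, single-line values, and the default basic regular expression syntax; the helper names escape_pattern and escape_replacement are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, assuming GNU sed and values without newlines.

# Backslash-escape the characters that are special in a basic regular
# expression, plus the / delimiter used by the s command below.
escape_pattern() {
    printf '%s\n' "$1" | sed 's|[][\\.*^$/]|\\&|g'
}

# Backslash-escape the characters that are special in the replacement
# part of s///: backslash, & and the / delimiter.
escape_replacement() {
    printf '%s\n' "$1" | sed 's|[\\&/]|\\&|g'
}

word='1.1.1.3'
replace='a/b&c'
printf '1.1.1.3 1.1.123\n' |
    sed "s/\b$(escape_pattern "$word")\b/$(escape_replacement "$replace")/g"
# prints: a/b&c 1.1.123
```

With the dots escaped, 1.1.123 is no longer matched, and the / and & in the replacement are inserted literally.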

\b matches a transition between a word character and a non-word character.

So \b\Q1.1.2.3\E\b would still match in 1.1.2.3.4, as . is a non-word character.

You could also do:

WORD=$word REPLACE=$replace perl -pi -e '
  s/(?<!\S)\Q$ENV{WORD}\E(?!\S)/$ENV{REPLACE}/g' file

This matches $word only when it is neither preceded nor followed by a non-whitespace character, using the (?<!...) and (?!...) negative lookbehind and lookahead operators.

Note that perl will by default work with ASCII characters. For instance, a word character would only be one of _a-zA-Z0-9 (so \b\Q1.2.3\E\b would match in 1.2.3é), and \S would match the individual bytes of a multi-byte Unicode spacing character. For non-ASCII data, you'd probably want to add the -CLSD option to perl.

Some examples:

$ export WORD=1.1.1.3 REPLACE=REPLACE
$ printf '1.1.1.3-x 1.1.1.3\u2006 1.1.1.3.4 1.1.123 1.1.1.3\u20dd 1.1.1.3\ue9\n' > f
$ cat f
1.1.1.3-x 1.1.1.3  1.1.1.3.4 1.1.123 1.1.1.3⃝ 1.1.1.3é
$ perl -pe 's/\b\Q$ENV{WORD}\E\b/$ENV{REPLACE}/g' f
REPLACE-x REPLACE  REPLACE.4 1.1.123 REPLACE⃝ REPLACEé
$ perl -CLSD -pe 's/\b\Q$ENV{WORD}\E\b/$ENV{REPLACE}/g' f
REPLACE-x REPLACE  REPLACE.4 1.1.123 1.1.1.3⃝ 1.1.1.3é
$ perl -pe 's/(?<!\S)\Q$ENV{WORD}\E(?!\S)/$ENV{REPLACE}/g' f
1.1.1.3-x 1.1.1.3  1.1.1.3.4 1.1.123 1.1.1.3⃝ 1.1.1.3é
$ perl -CLSD -pe 's/(?<!\S)\Q$ENV{WORD}\E(?!\S)/$ENV{REPLACE}/g' f
1.1.1.3-x REPLACE  1.1.1.3.4 1.1.123 1.1.1.3⃝ 1.1.1.3é

$ sed "s/\b$WORD\b/$REPLACE/g" f
REPLACE-x REPLACE  REPLACE.4 REPLACE REPLACE⃝ 1.1.1.3é

Author: dilshan

Updated on September 18, 2022

Comments

  • dilshan
    dilshan over 1 year

    My script is:

    n="y"
    while [ "{n}" = "y" ]
    if [ $n == "n" ];
    then
      break;
    fi
    echo "n is $n"
    do
            read -p "Enter the word to find = " word
            read -p "Enter word to replace = " replace
            echo "$word n $replace"
            #sed -i r.text.bak 's/$word/$replace/g' r.txt
            sed -i "s/$word/$replace/g" "test.txt"
    echo "do you have further replacement? n or y"
    read temp
    n=$temp
    done
    

    My problem is that I am also replacing partial matches. For example, for a line like this:

    1.1.1.14 1.1.1.14567
    

    I get this output:

    1.1.1.3  1.1.1.3567
    

    but I expected:

    1.1.1.3 1.1.1.14567
    

    How can I solve this?

    • Admin
      Admin about 9 years
      Is the second line "{n}" = "y" or "${n}" = "y"?
    • Admin
      Admin about 9 years
      ohh great. it's worked well.
    • Admin
      Admin about 9 years
      sed -i "s/(^\| )$word( \|$)/\\1$replace\1/g" "test.txt" and i tried with this. its also worked. thanks a lot hatter for quick solution from you. it is the easiest way.
    • Admin
      Admin about 9 years
      @Costas please don't answer questions in comments, that way the question will never be marked as answered and will just stay there.
    • Admin
      Admin over 8 years
      Note that it's s/regex/replacement/, it's not s/string/replacement/. For instance 1.1.1.3 matches 1.1.1.3 but also 1.1.123 (as . matches any character).