Replacing HTML tag content using sed


Solution 1

Try this:

sed -i -e "s/\(<span id=\"unlockedCount\">\)\(<\/span>\)/\1${unlockedCount}\2/g" index.html
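As written, this pattern only matches an empty span, so it stops matching once a number has been inserted. A minimal sketch of a variant that also accepts existing digits, assuming the value is always a plain non-negative integer. The sample line and the update_count helper name are made up for illustration, and the sketch reads stdin instead of using -i so that no file is touched:

```shell
# Hypothetical helper: reads HTML on stdin, writes the updated HTML to stdout.
# $1 is the new count; [0-9]* matches an empty span as well as an old number.
update_count() {
  sed -e "s/\(<span id=\"unlockedCount\">\)[0-9]*\(<\/span>\)/\1$1\2/g"
}

# First run: the span is empty. Second run: it already holds a number.
first=$(printf '%s\n' 'Unlocked <span id="unlockedCount"></span> achievements' | update_count 5)
second=$(printf '%s\n' "$first" | update_count 12)
printf '%s\n' "$first" "$second"
```

Both runs should leave the span tags in place and change only the number between them.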

Solution 2

What you say you want to do is not what you're telling sed to do.

You want to insert a number into a tag, or replace the number if one is already present. What you're telling sed to do is replace the span tag and its contents (empty or a number) with the value of a shell variable.

You're also employing a lot of complex, annoying and error-prone escape sequences which are just not necessary.

Here's what you want:

sed -r -i -e 's|<span id="unlockedCount">([0-9]{0,})</span>|<span id="unlockedCount">'"${unlockedCount}"'</span>|g' index.html

Note the differences:

  • Added -r to turn on extended expressions without which your capture pattern would not work.
  • Used | instead of / as the delimiter for the substitution so that escaping / would not be necessary.
  • Single-quoted the sed expression so that escaping things inside it from the shell would not be necessary.
  • Included the matched span tag in the replacement section so that it would not get deleted.
  • In order to expand the unlockedCount variable, closed the single-quoted expression, then later re-opened it.
  • Omitted cat | which was useless here.
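The quoting described in the list can be checked without modifying a file. A minimal sketch, assuming a sed that accepts -E for extended expressions (recent GNU sed and BSD/macOS sed both do); the sample line and the value 7 are made up for illustration:

```shell
unlockedCount=7

# Three adjacent pieces form one shell word:
# 'single-quoted' "double-quoted variable" 'single-quoted'
script='s|<span id="unlockedCount">([0-9]{0,})</span>|<span id="unlockedCount">'"${unlockedCount}"'</span>|g'

# Show the script sed will actually receive, then apply it to a sample line.
printf '%s\n' "$script"
result=$(printf '%s\n' '<span id="unlockedCount">3</span>/<span id="totalCount"></span>' | sed -E -e "$script")
printf '%s\n' "$result"
```

Printing the script first is a cheap way to confirm that the variable was expanded by the shell and that nothing else in the expression was altered.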

I also put double quotes around the shell variable expansion. This is good practice, although it is not strictly necessary if the value contains no spaces.

It was not, strictly speaking, necessary for me to add -r. Plain old sed will work if you say \([0-9]\{0,\}\), but the idea here was to simplify.
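To see that the two spellings are equivalent, the same input can be run through both. A small sketch on a made-up sample line, assuming a sed that supports -E:

```shell
line='<span id="unlockedCount">41</span>'
unlockedCount=42

# Basic regular expressions (the default): group and interval are backslashed.
bre=$(printf '%s\n' "$line" | sed -e 's|<span id="unlockedCount">\([0-9]\{0,\}\)</span>|<span id="unlockedCount">'"${unlockedCount}"'</span>|g')

# Extended regular expressions (-E; -r on GNU sed): no backslashes needed.
ere=$(printf '%s\n' "$line" | sed -E -e 's|<span id="unlockedCount">([0-9]{0,})</span>|<span id="unlockedCount">'"${unlockedCount}"'</span>|g')

[ "$bre" = "$ere" ] && printf 'same result: %s\n' "$bre"
```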

Solution 3

sed -i -e 's%\(<span id="unlockedCount">\)[0-9]*\(</span>\)%\1'"${unlockedCount}"'\2%g' index.html

I removed the Useless Use of Cat, took out a bunch of unnecessary backslashes, added single quotes around the regex to protect it from shell expansion, used % as the delimiter so that the / in </span> needs no escaping, and fixed the repetition operator. The grouping parentheses are still backslashed; without -r or -E my sed, at least, wants \(...\). The two groups put the span tags back around the new value so that they are not deleted.

Note the use of single and double quotes next to each other. Single quotes protect against shell expansion, so you can't use them around "${unlockedCount}" where you do want the shell to interpolate the variable.
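A quick way to see the difference, using printf instead of sed so that nothing is modified (the value 9 is arbitrary):

```shell
unlockedCount=9

# Single quotes: the shell passes the text through literally.
literal='count=${unlockedCount}'

# Double quotes: the shell substitutes the variable's value.
expanded="count=${unlockedCount}"

# Adjacent quoted pieces are concatenated into a single word.
mixed='count='"${unlockedCount}"'/g'

printf '%s\n' "$literal" "$expanded" "$mixed"
```

Only the second and third lines of output contain the number; the first still shows the unexpanded ${unlockedCount}.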

Author: Revell

Back-end developer from the Netherlands with mainly PHP, Python and Erlang knowledge. Great interest in TDD and automation.

Updated on August 09, 2022

Comments

  • Revell
    Revell over 1 year

    I'm trying to replace the content of some HTML tags in an HTML page using sed in a bash script. For some reason I'm not getting the proper result: nothing is being replaced. It has to be something very simple/stupid I'm overlooking. Anyone care to help me out?

    HTML to search/replace in:

    Unlocked <span id="unlockedCount"></span>/<span id="totalCount"></span> achievements for <span id="totalPoints"></span> points.
    

    sed command used:

    cat index.html | sed -i -e "s/\<span id\=\"unlockedCount\"\>([0-9]\{0,\})\<\/span\>/${unlockedCount}/g" index.html 
    

    The point of this is to parse the HTML page and update the figures according to some external data. On the first run the contents of the tags will be empty; after that they will be filled.


    EDIT:

    I ended up using a combination of the answers which resulted in the following code:

    sed -i -e 's|<span id="unlockedCount">\([0-9]\{0,\}\)</span>|<span id="unlockedCount">'"${unlockedCount}"'</span>|g' index.html
    

    Many thanks to @Sorpigal, @tripleee, @classic for the help!

  • Revell
    Revell over 12 years
    -r doesn't seem to be a valid sed command? On Mac OS at least.
  • sorpigal
    sorpigal over 12 years
    On Mac OS X the switch to enable extended expressions is different: it is -E (BSD style). -r is a GNU sed switch.
  • sorpigal
    sorpigal over 12 years
    This will fail after the first time. You need to match [0-9]\{0,\} in between the span tags.
  • classic
    classic over 12 years
    Yes, if it is supposed to replace an existing value in the span, this needs to be corrected.