Escape dollar sign in regexp for sed

Solution 1

There are other problems with your script, but file names containing $ are not a problem if you properly quote the argument to rm in the resulting script.

echo "rm -f '$i'" >> REMOVEOLDFILES.sh

or using printf, which makes quoting a little nicer and is more portable:

printf "rm -f '%s'\n" "$i" >> REMOVEOLDFILES.sh

(Note that I'm addressing the real problem, not necessarily the question you asked.)
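
A minimal sketch of the same quoting idea, assuming file names might also contain a single quote, which would break the plain '%s' form. The helper name quote_for_sh is hypothetical and not part of the original answer; it wraps the name in single quotes and rewrites each embedded ' as '\''.

```shell
#!/bin/sh
# quote_for_sh is a hypothetical helper: it single-quotes its argument so
# that a POSIX shell reads it back as one literal word.
quote_for_sh() {
    # Replace each ' with '\'' (close quote, escaped quote, reopen quote),
    # then wrap the whole name in single quotes.
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

name='Example$3$1.class'
line="rm -f $(quote_for_sh "$name")"
printf '%s\n' "$line"
```

The printed line could be appended to REMOVEOLDFILES.sh in place of the echo output. The dollar signs need no further escaping because nothing inside single quotes is expanded.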

Solution 2

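To match a literal dollar sign, sed needs to see \$. Inside a double-quoted shell string this is written with two backslashes (\\$), because the shell itself consumes one of them; inside single quotes, a single backslash is enough. To produce the escaped version in the output, the replacement needs some additional backslashes: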

cat filenames.txt | sed "s/\\$/\\\\$/g" > escaped-filenames.txt

Yep, that's four backslashes in a row. With this, a file name like bla$1$2.class becomes bla\$1\$2.class. I can then insert this into the full pipeline:

for i in $(diff -r old new 2>/dev/null | grep "Only in old" | cut -d "/" -f 3- | sed "s/: /\//g" | sed "s/\\$/\\\\$/g"); do echo "rm -f $i" >> REMOVEOLDFILES.sh; done
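
The substitution on its own can be checked in isolation. This is a small sketch using the sample name from the text; the variable names are only for illustration. It also shows the single-quoted spelling, where the shell leaves backslashes alone, so the expression is written exactly as sed should receive it.

```shell
#!/bin/sh
name='bla$1$2.class'

# Double-quoted: the shell halves the backslashes before sed sees them,
# so sed receives s/\$/\\$/g.
dq=$(printf '%s\n' "$name" | sed "s/\\$/\\\\$/g")

# Single-quoted: sed receives the expression exactly as written.
sq=$(printf '%s\n' "$name" | sed 's/\$/\\$/g')

printf '%s\n%s\n' "$dq" "$sq"
```

Both variants should print bla\$1\$2.class. The single-quoted form is usually easier to read because there is only one layer of backslash processing to keep track of.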

Alternative to solve the background problem

chepner posted an alternative that solves the background problem by simply adding single quotes around the file names in the output. This way, bash does not read the $ signs as variables when executing the script, and the files are properly removed:

for i in $(diff -r old new 2>/dev/null | grep "Only in old" | cut -d "/" -f 3- | sed "s/: /\//g"); do echo "rm -f '$i'" >> REMOVEOLDFILES.sh; done

(note the changed echo "rm -f '$i'" in that line)
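
To see the single-quote approach work end to end, here is a rough sketch that runs in a throwaway directory. The file name and the use of mktemp are assumptions made for the demonstration; only the echo line is taken from the loop above.

```shell
#!/bin/sh
# Work in a temporary directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A file named like the GWT output from the question.
touch 'ExampleFile$3$1.class'

# Generate the removal script; the single quotes end up in the output.
i='ExampleFile$3$1.class'
echo "rm -f '$i'" > REMOVEOLDFILES.sh

# Run the generated script without arguments, as in the question.
sh REMOVEOLDFILES.sh

if [ -e 'ExampleFile$3$1.class' ]; then
    result="still there"
else
    result="removed"
fi
echo "$result"
```

Because the generated line reads rm -f 'ExampleFile$3$1.class', the shell running REMOVEOLDFILES.sh does not expand $3 or $1 and the file is deleted.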

Author: Captain Ahab

B.Sc. Medical Informatics (University of Heidelberg, Germany) M.Sc. Biomedical Computing (Technical University of Munich, Germany) Visiting Graduate Student at Laboratory for Computational Sensing and Robotics (Johns Hopkins University, Baltimore, MD, USA)

Updated on September 16, 2022

Comments

  • Captain Ahab
    Captain Ahab over 1 year

    I will introduce what my question is about before actually asking - feel free to skip this section!

    Some background info about my setup

    To update files manually in a software system, I am creating a bash script to remove all files that are not present in the new version, using diff:

    for i in $(diff -r old new 2>/dev/null | grep "Only in old" | cut -d "/" -f 3- | sed "s/: /\//g"); do echo "rm -f $i" >> REMOVEOLDFILES.sh; done
    

    This works fine. However, my files often have a dollar sign ($) in the file name, caused by some permutations of the GWT framework. Here is one example line from the bash script created above:

    rm -f var/lib/tomcat7/webapps/ROOT/WEB-INF/classes/ExampleFile$3$1$1$1$2$1$1.class
    

    Executing this script would not remove the wanted files, because bash reads $3, $1 and so on as positional parameters. Hence I have to escape the dollar signs as "\$".

    My actual question

    I now want to add a sed command to the aforementioned pipeline that escapes this dollar sign. sed also treats the dollar sign as a special character in regular expressions, so I have to escape it there as well. But somehow this doesn't work, and I could not find an explanation after a lot of googling.

    Here are some variations I have tried:

    echo "Bla$bla" | sed "s/\$/2/g"        # Output: Bla2
    echo "Bla$bla" | sed 's/$$/2/g'        # Output: Bla
    echo "Bla$bla" | sed 's/\\$/2/g'       # Output: Bla
    echo "Bla$bla" | sed 's/@"\$"/2/g'     # Output: Bla
    echo "Bla$bla" | sed 's/\\\$/2/g'      # Output: Bla
    

    The desired output in this example should be "Bla2bla". What am I missing? I am using GNU sed 4.2.2

    EDIT

    I just realized that the above example is wrong to begin with: the argument to echo is double-quoted, so the shell already expands $bla as a variable and sed never sees the dollar sign. Here is a proper example:

    1. Create a textfile test with the content bla$bla
    2. cat test gives bla$bla
    3. cat test | sed "s/$/2/g" gives bla$bla2
    4. cat test | sed "s/\$/2/g" gives bla$bla2
    5. cat test | sed "s/\\$/2/g" gives bla2bla

    Hence, the last version is the answer. Remember: when testing, first make sure your test is correct before you question the test object.
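
The corrected experiment can also be reproduced without a separate file by single-quoting the input, so the shell does not expand $bla before sed runs. A quick sketch, with variable names chosen only for illustration:

```shell
#!/bin/sh
input='bla$bla'

# sed receives s/$/2/g: $ is the end-of-line anchor, so 2 is appended.
anchored=$(printf '%s\n' "$input" | sed "s/\$/2/g")

# sed receives s/\$/2/g: \$ is a literal dollar sign.
literal_dq=$(printf '%s\n' "$input" | sed "s/\\$/2/g")

# Same expression in single quotes, where one backslash is enough.
literal_sq=$(printf '%s\n' "$input" | sed 's/\$/2/g')

printf '%s\n' "$anchored" "$literal_dq" "$literal_sq"
```

This should print bla$bla2 for the first variant and bla2bla for the other two, matching steps 4 and 5 of the list above.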

  • Captain Ahab
    Captain Ahab about 8 years
    Thanks for the good idea and also for addressing my actual problem :-) Unfortunately bash will still read something like "$1" as arguments, even if it is within quotes. So this doesn't help...
  • chepner
    chepner about 8 years
    I'm not sure where that would happen. There are no dollar signs in text exposed to the shell, only in the output of the pipeline. In all the attempts you posted, you would need echo 'Bla$bla' | sed ... so that $bla isn't expanded before echo even runs, but you should not need to process the output of the initial pipeline.
  • Captain Ahab
    Captain Ahab about 8 years
    Maybe it's a misunderstanding - the files have $-signs in their name and surely need to be listed in the output bash script. With your version, they are just inside quotes. But when I execute that script (to really delete the files), all un-escaped $ signs are read as variable - since the script is run without arguments, they just expand to an empty string. Then the files "bla$1.class" and "bla$1$2.class" will both be translated to "bla.class" for the rm command
  • chepner
    chepner about 8 years
    Oh, sorry, right. Just change the double-quotes to single-quotes in the output to REMOVEOLDFILES.sh.
  • tripleee
    tripleee over 6 years
    No -- the correct way to escape a dollar sign in a double-quoted string in the shell is to put two backslashes. In single quotes, a single backslash is correct and sufficient, and two backslashes is wrong. Generally, unless you require the shell to interpolate variables and execute command substitutions, use single quotes. Use single quotes always when you can.