How to remove trailing whitespace from multiple files?

Solution 1

You want

sed --in-place 's/[[:space:]]\+$//' file

That deletes every whitespace character in the POSIX [[:space:]] class, including vertical tab and form feed. It also performs a replacement only when trailing whitespace actually exists, unlike the other answers, which use the zero-or-more matcher (*).
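
To see the effect on a small input, here is a minimal sketch. The temporary file and its contents are made up for illustration, and the one-or-more matcher is spelled \{1,\} so that the snippet does not depend on GNU sed; it writes to stdout instead of editing in place.

```shell
# Hypothetical demo file: two lines with trailing whitespace, one clean line.
tmp=$(mktemp)
printf 'alpha  \nbeta\t\ngamma\n' > "$tmp"

# \{1,\} is the portable spelling of "one or more" (GNU sed also accepts \+).
cleaned=$(sed 's/[[:space:]]\{1,\}$//' "$tmp")

printf '%s\n' "$cleaned"
rm -f "$tmp"
```

Only the first two lines are rewritten; the third has nothing to match, so sed passes it through untouched.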

--in-place is simply the long form of -i. I prefer to use the long form in scripts because it tends to be more illustrative of what the flag actually does.

It can be easily integrated with find like so:

find . -type f -name '*.txt' -exec sed --in-place 's/[[:space:]]\+$//' {} \+
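
Before editing anything in place, it can help to list which files would change. A minimal sketch, assuming a throwaway sandbox directory (the directory and file names are hypothetical); it swaps sed for grep -l, which only prints the names of files containing at least one matching line:

```shell
# Hypothetical sandbox so the preview can be tried without touching real files.
dir=$(mktemp -d)
printf 'dirty line  \n' > "$dir/dirty.txt"
printf 'clean line\n' > "$dir/clean.txt"

# Same find invocation as above, but grep -l only reports; it modifies nothing.
dirty_files=$(find "$dir" -type f -name '*.txt' -exec grep -l '[[:space:]]$' {} +)

printf '%s\n' "$dirty_files"
rm -rf "$dir"
```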

If you're on a Mac

As pointed out in the comments, the above doesn't work if you don't have GNU tools installed. If that's the case, you can use the following:

find . -iname '*.txt' -type f -exec sed -i '' 's/[[:space:]]\{1,\}$//' {} \+
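
If the same script has to run on both kinds of system, one option is to pick the -i form at run time. This is only a sketch: the function name strip_in_place is hypothetical, and it assumes that a sed which answers --version successfully accepts -i without a suffix argument (true for GNU sed), while any other sed wants the BSD form.

```shell
# Hypothetical helper: strip trailing whitespace in place with whichever sed is installed.
strip_in_place() {
    if sed --version >/dev/null 2>&1; then
        # Assumed GNU-style sed: -i takes no separate suffix argument.
        sed -i 's/[[:space:]]\{1,\}$//' "$@"
    else
        # Assumed BSD/macOS-style sed: -i needs an explicit (here empty) backup suffix.
        sed -i '' 's/[[:space:]]\{1,\}$//' "$@"
    fi
}

# Demo on a throwaway file.
demo=$(mktemp)
printf 'one  \ntwo\t\n' > "$demo"
strip_in_place "$demo"
result=$(cat "$demo")
rm -f "$demo"
```

Since find -exec cannot call a shell function directly, the function body would have to live in a small script file to be used with find.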

Solution 2

Unlike other solutions which all require GNU sed, this one should work on any Unix system implementing POSIX standard commands.

find . -type f -name "*.txt" -exec sh -c 'for i; do sed "s/[[:space:]]*\$//" "$i" > /tmp/.$$ && mv /tmp/.$$ "$i"; done' arg0 {} +

Edit: this slightly modified version preserves the files' permissions, because it overwrites each original file's contents instead of replacing the file:

find . -type f -name "*.txt" -exec sh -c 'for i; do sed "s/[[:space:]]*\$//" "$i" > /tmp/.$$ && cat /tmp/.$$ > "$i"; done' arg0 {} +
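
The permission-preserving behaviour can be checked on a single file. A minimal sketch, with two assumptions that are not part of the answer: the file name and mode are made up, and mktemp (widely available, though not required by POSIX) stands in for /tmp/.$$ as the scratch file.

```shell
# Hypothetical file with a non-default mode.
tmpdir=$(mktemp -d)
f="$tmpdir/script.txt"
printf 'echo hi   \n' > "$f"
chmod 750 "$f"
before=$(ls -l "$f" | cut -c1-10)

# Write the cleaned text to a scratch file, then copy its contents back.
# Because "$f" is overwritten rather than replaced, its mode stays the same.
tmp=$(mktemp)
sed 's/[[:space:]]*$//' "$f" > "$tmp" && cat "$tmp" > "$f"
rm -f "$tmp"

after=$(ls -l "$f" | cut -c1-10)
content=$(cat "$f")
echo "mode before: $before, mode after: $after"
rm -rf "$tmpdir"
```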

Solution 3

I've been using this to fix whitespace:

while IFS= read -r -d '' -u 9
do
    if [[ "$(file -bs --mime-type -- "$REPLY")" = text/* ]]
    then
        sed -i -e 's/[ \t]\+\(\r\?\)$/\1/;$a\' -- "$REPLY"
    else
        echo "Skipping $REPLY" >&2
    fi
done 9< <(find . \( -type d -regex '^.*/\.\(git\|svn\|hg\)$' -prune -false \) -o -type f -print0)

Features:

  • Keeps carriage returns (unlike [:space:]), so it works fine on Windows/DOS-style files.
  • Only worries about "normal" whitespace. If you have vertical tabs or such in your files, it's probably intentional (test code or raw data).
  • Skips the .git, .svn, and .hg VCS directories.
  • Only modifies files which file thinks are text files.
  • Reports all paths which were skipped.
  • Works with any filename.
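
The VCS-skipping part of the find expression above uses -regex, which is a GNU extension. As a rough sketch of the same pruning idea with plain -name tests (the sandbox layout is hypothetical, and this variant prints newline-separated paths rather than using -print0):

```shell
# Hypothetical sandbox: one file inside .git, one regular source file.
dir=$(mktemp -d)
mkdir -p "$dir/.git" "$dir/src"
printf 'x \n' > "$dir/.git/config"
printf 'y \n' > "$dir/src/a.txt"

# Prune VCS directories by name; print every regular file that remains.
found=$(find "$dir" \( -type d \( -name .git -o -name .svn -o -name .hg \) -prune \) -o -type f -print)

printf '%s\n' "$found"
rm -rf "$dir"
```

Only src/a.txt is listed; find never descends into the pruned .git directory.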

Solution 4

How about this:

sed -i -e 's/[ \t]*$//' file

Btw, this is a handy site: http://sed.sourceforge.net/sed1line.txt

Solution 5

For those who are not sed gurus (myself included), I have created a small script that uses JavaScript regular expressions to replace text in files and does the replacement in place:

http://git.io/pofQnQ

To remove trailing whitespace you can use it as such:

$ node sed.js "/[\t ]+$/gm" "" file

Enjoy

Author by

Mikko Ohtamaa

Building Trading Strategy, a decentralised algorithmic trading protocol

Updated on July 05, 2022

Comments

  • Mikko Ohtamaa almost 2 years

    Are there any tools or UNIX one-liners that remove trailing whitespace from multiple files in place, e.g. one that could be used in conjunction with find?

  • Mikko Ohtamaa about 12 years
    By the way what's the thing with \+ as find exec terminator?
  • Tim Pote about 12 years
    There are two variants of the find -exec command. The first ends with ;. It runs command once for every file find returns. The second ends with +. It runs command as few times as possible by building up a list of files to run command on. Since the ; variant requires a backslash to escape the ;, I also generally put it on the + as well (though I don't think it's strictly necessary for the +).
  • Mikko Ohtamaa about 12 years
    Just to be on the safe side: you probably want to ignore all . files for automatic processing like this (Eclipse .metadata, .bzr, so on)
  • l0b0 about 12 years
    I regularly use dotfiles which should be cleaned up - .bashrc, .gitignore, etc. There's no authority on which files you should always exclude, so it's up to you and the task at hand.
  • seb over 11 years
    Talking about readability (and it's all a matter of taste) but I never use -exec with find because all that {}+ stuff is like line noise. I prefer find . -type f -name '*.txt' | xargs --replace=FILE sed --in-place 's/foo/baz/' FILE but YMMV :)
  • nacho4d about 11 years
    This works almost perfectly. The only thing is that it changes file permissions to (the default?) 100644.
  • David Oliver about 11 years
    It looks like this also converts DOS-style line endings to Unix-style.
  • srcspider almost 11 years
    It looks like this also messes up file permissions on Windows (running from Git Bash); also, the \+ variant doesn't work.
  • amacleod almost 11 years
    On MacOS X, the stock sed does not support long options. I was able to get this recipe working by installing GNU sed with Homebrew (brew install gnu-sed).
  • Michael-O almost 11 years
    Works flawlessly on BSD!
  • Bob Jarvis - Слава Україні over 9 years
    Very helpful on HP-UX.
  • ZPH over 9 years
    On MacOSX, the stock sed will work with the following tweaks find . -type f -name '*.rb' -exec sed -i '' 's/[[:space:]]*$//' {} \+. Note the -i '' and that we've replaced the + with *.
  • Michael Scott Asato Cuthbert almost 7 years
    Is this removing trailing spaces from filenames instead?
  • Labo over 6 years
    I think it is "s/[ \t]+$//g" in perl.
  • Josiah over 4 years
    -e on my version of sed is for adding a script, but you aren't specifying a script. I'm using GNU sed 4.7.
  • CervEd about 3 years
    the sed keeps carriage returns but appears to eat newlines at the end of files :(
  • CervEd about 3 years
    my bad, the sed adds a newline at EOF
  • CervEd about 3 years
    I'm finding that sed -i -e 's/[ \t]\+\(\r\?\)$/\1/' (the same sed without adding a newline at EOF) isn't preserving DOS-style endings. Using GNU sed 4.8. Example: seq 2 | unix2dos | sed -e 's/[ \t]\+\(\r\?\)$/\1/' | xxd -p outputs 310a320a but should output 310d0a320d0a
  • CervEd about 3 years
    With Git for Windows I had to add the -b option to sed to preserve CRLF. The regex preserved CRLF but sed didn't: stackoverflow.com/a/11508669/1507124
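
Tim Pote's description of the two find -exec terminators can be checked with a small experiment. A minimal sketch, assuming a hypothetical sandbox with three files; each invocation of the command prints one line, so counting lines counts invocations:

```shell
# Hypothetical sandbox with three files.
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.txt" "$dir/c.txt"

# ';' runs the command once per file; '+' batches the files into as few runs as possible.
# tr strips the padding that some wc implementations add.
per_file=$(find "$dir" -type f -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')
batched=$(find "$dir" -type f -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')

echo "';' ran the command $per_file times, '+' ran it $batched time(s)"
rm -rf "$dir"
```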