Remove trailing spaces from a file using Windows batch?

Solution 1

The DosTips RTRIM function that Ben Hocking cites can be used to create a script that can right trim each line in a text file. However, the function is relatively slow.

DosTips user (and moderator) aGerman developed a very efficient right trim algorithm. He implemented the algorithm as a batch "macro", an interesting technique that stores complex mini scripts in environment variables so that they can be executed from memory. Macros with arguments are a major discussion topic in their own right and are not relevant to this question.

I have extracted aGerman's algorithm and put it in the following batch script. The script expects the name of a text file as the only parameter and proceeds to right trim the spaces off each line in the file.

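@echo off
setlocal enableDelayedExpansion
rem Build a string of 4096 spaces by doubling a single space 12 times.
set "spcs= "
for /l %%n in (1 1 12) do set "spcs=!spcs!!spcs!"
rem Prefix every line with its line number so that FOR /F keeps blank lines.
findstr /n "^" "%~1" >"%~1.tmp"
setlocal disableDelayedExpansion
(
  for /f "usebackq delims=" %%L in ("%~1.tmp") do (
    rem Read the line while delayed expansion is off so exclamation marks survive.
    set "ln=%%L"
    setlocal enableDelayedExpansion
    rem Remove the line number prefix added by FINDSTR.
    set "ln=!ln:*:=!"
    set /a "n=4096"
    rem Test for 4096, 2048, ... 1 trailing spaces, halving n on each pass.
    for /l %%i in (1 1 13) do (
      if defined ln for %%n in (!n!) do (
        if "!ln:~-%%n!"=="!spcs:~-%%n!" set "ln=!ln:~0,-%%n!"
        set /a "n/=2"
      )
    )
    echo(!ln!
    endlocal
  )
) >"%~1"
del "%~1.tmp" 2>nul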

Assuming the script is called rtrimFile.bat, then it can be called from the command line as follows:

rtrimFile "fileName.txt"

A note about performance
The original DosTips rtrim function performs a linear search and defaults to trimming a maximum of 32 spaces. It has to iterate once per space.

aGerman's algorithm uses a binary search and it is able to trim the maximum string size allowed by batch (up to ~8k spaces) in 13 iterations.
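The difference between the two strategies can be sketched outside of batch. The Python below is illustrative only: Python is not part of the original answer, and the names `rtrim_linear` and `rtrim_halving` are made up for this sketch. It mirrors the logic of the two batch routines, not their syntax.

```python
def rtrim_linear(line: str, max_trim: int = 32) -> str:
    """Strip at most one trailing space per iteration, max_trim iterations."""
    for _ in range(max_trim):
        if line.endswith(" "):
            line = line[:-1]
    return line


def rtrim_halving(line: str) -> str:
    """Test for n trailing spaces with n = 4096, 2048, ..., 1 (13 tests)."""
    n = 4096
    while n >= 1:
        # endswith() is False when the line is shorter than n characters
        if line.endswith(" " * n):
            line = line[:-n]
        n //= 2
    return line
```

With 40 trailing spaces, `rtrim_linear` leaves 8 behind under its default limit of 32, while `rtrim_halving` removes all of them in two successful tests (32 + 8). Thirteen halving steps cover up to 4096 + 2048 + ... + 1 = 8191 trailing spaces.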

Unfortunately, batch is very SLOW when it comes to processing text. Even with the efficient rtrim function, it takes ~70 seconds to trim a 1MB file on my machine. The problem is that just reading and writing the file without any modification takes significant time. This answer uses a FOR loop to read the file, coupled with FINDSTR to prefix each line with the line number so that blank lines are preserved. It toggles delayed expansion to prevent ! from being corrupted, and uses a search and replace operation to remove the line number prefix from each line. All that happens before it even begins to do the rtrim.

Performance could be nearly doubled by using an alternate file read mechanism that uses set /p. However, the set /p method is limited to ~1k bytes per line, and it strips trailing control characters from each line.

If you need to regularly trim large files, then even a doubling of performance is probably not adequate. Time to download (if possible) any one of many utilities that could process the file in the blink of an eye.
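As one possible example of such a tool, here is a minimal Python sketch. It assumes Python is installed, which the original answer does not require, and the function names are hypothetical. It works on raw bytes, so it assumes a single-byte-compatible encoding (ASCII, ANSI code pages, UTF-8) and would not handle UTF-16 files. Like the batch script, it rewrites the file in place and keeps blank lines and the existing line endings.

```python
import os
import tempfile


def rtrim_line(line: bytes) -> bytes:
    """Remove trailing spaces but keep the line ending (CRLF or LF) intact."""
    body = line.rstrip(b"\r\n")
    ending = line[len(body):]
    return body.rstrip(b" ") + ending


def rtrim_file(path: str) -> None:
    """Right trim every line of the file at path, replacing it in place."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same folder, then swap it in.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
        for line in src:
            dst.write(rtrim_line(line))
    os.replace(tmp_path, path)
```

Calling `rtrim_file("fileName.txt")` corresponds to `rtrimFile "fileName.txt"` above. Only spaces are removed; trailing tabs are left alone, as in the batch version.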

If you can't use non-native software, then you can try VBScript or JScript executed via CSCRIPT. Either one would be MUCH faster.

UPDATE - Fast solution with JREPL.BAT

JREPL.BAT is a regular expression find/replace utility that can very efficiently solve the problem. It is pure script (hybrid batch/JScript) that runs natively on any Windows machine from XP onward. No 3rd party exe files are needed.

With JREPL.BAT somewhere within your PATH, you can strip trailing spaces from file "test.txt" with this simple command:

jrepl " +$" "" /f test.txt /o -

If you put the command within a batch script, then you must precede the command with CALL:

call jrepl " +$" "" /f test.txt /o -

Solution 2

Go get yourself a copy of Cygwin or the sed package from GnuWin32.

Then use that with the command:

sed "s/ *$//" inputFile >outputFile

Solution 3

DosTips has an implementation of RTrim that works for batch files:

:rTrim string char max -- strips white spaces (or other characters) from the end of a string
::                     -- string [in,out] - string variable to be trimmed
::                     -- char   [in,opt] - character to be trimmed, default is space
::                     -- max    [in,opt] - maximum number of characters to be trimmed from the end, default is 32
:$created 20060101 :$changed 20080219 :$categories StringManipulation
:$source http://www.dostips.com
SETLOCAL ENABLEDELAYEDEXPANSION
call set string=%%%~1%%
set char=%~2
set max=%~3
if "%char%"=="" set char= &rem one space
if "%max%"=="" set max=32
for /l %%a in (1,1,%max%) do if "!string:~-1!"=="%char%" set string=!string:~0,-1!
( ENDLOCAL & REM RETURN VALUES
    IF "%~1" NEQ "" SET %~1=%string%
)
EXIT /b

If you're not used to using functions in batch files, read this.

Solution 4

There is a nice trick to remove trailing spaces based on this answer of user Aacini; I modified it so that all other spaces occurring in the string are preserved. So here is the code:

@echo off
setlocal EnableDelayedExpansion
rem // This is the input string:
set "x=  This is   a text  string     containing  many   spaces.   "
rem // Ensure there is at least one trailing space; then initialise auxiliary variables:
set "y=%x% " & set "wd=" & set "sp="
rem // Now here is the algorithm:
set "y=%y: =" & (if defined wd (set "y=!y!!sp!!wd!" & set "sp= ") else (set "sp=!sp! ")) & set "wd=%"
rem // Return messages:
echo  input: "%x%"
echo output: "%y%"
endlocal

However, this approach fails when a character of the set ^, !, " occurs in the string.
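The one-line substitution is hard to read, so here is my interpretation of what it does, written as a Python sketch. Python and the name `rtrim_by_tokens` are illustration choices, not part of the original answer. The variables `y`, `wd`, and `sp` play the same roles as in the batch code: each space in the input is replaced by a command block, so the string is effectively processed one space-separated token at a time, and pending spaces are only written out once another word follows.

```python
def rtrim_by_tokens(x: str) -> str:
    # One trailing space is appended, as in the batch code, so the final
    # token is always empty and the last real word is flushed by the loop.
    tokens = (x + " ").split(" ")
    y = tokens[0]   # text before the first space
    wd = ""         # most recently seen token, empty if it was a gap
    sp = ""         # spaces waiting to be emitted before the next word
    for token in tokens[1:]:
        if wd:
            y += sp + wd   # a word was pending: emit it with its spaces
            sp = " "
        else:
            sp += " "      # another consecutive space
        wd = token
    return y
```

Spaces still held in `sp` when the loop ends are never appended, which is what drops the trailing spaces while leaving interior runs of spaces intact.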

Author: HeinrichStack

Updated on July 09, 2022

Comments

  • HeinrichStack 5 months

    How could I trim all trailing spaces from a text file using the Windows command prompt?

    • HeinrichStack almost 11 years
      I forgot to mention that I would like to do this from the command line, possibly w/o any additional software.
    • paxdiablo almost 11 years
      I forgot to mention. I would like to do this by using machine language. Preferably without using an assembler or compiler :-) Use the tools you can, that's what they're for. Otherwise you're wasting time re-inventing wheels (and probably making them square as well).
  • HeinrichStack almost 11 years
    Thanks, but I need to call this from the command line. Any suggestions how?
  • HeinrichStack almost 11 years
    Thanks, but I forgot to mention that I want to trim the trailing whitespace without any additional software, just with the files preinstalled on the OS.
  • Ben Hocking almost 11 years
    @HeinrichStack: Make a batch file that calls this function with its argument…
  • HeinrichStack almost 11 years
    Thank you, I will try this as soon as I'm up to it again.
  • HeinrichStack almost 11 years
    Thanks, it worked. I wish I could understand this script in detail, w/o having to become a batch expert :)
  • HeinrichStack almost 11 years
    PS Just a small bit of feedback on performance: it is trimming a big text file at a rate of a little less than 1 MB / sec on a dual Intel 2.66 GHz, XP SP3, 2GB RAM. I know the above means almost nothing, but just FYI. For me, if I'm to trim a 10MB file, it would mean more than 10 mins ... So, the question is: Could you imagine some limits for the above batch, and some possibility to increase performance? The line set /a "k=4096"%\n% sets some buffer or what is it good for?
  • dbenham almost 11 years
    @HeinrichStack Oops, the set /a k... line was an accidental holdover from aGerman's original code. It was harmless, but not needed - I've eliminated it. I think you have a typo/math error - I wish this was as fast as 1MB/sec ;) I get a bit less than 1MB/min. I'll add an addendum about performance.