How to match whitespace in sed?
Solution 1
The character class \s will match the whitespace characters <tab> and <space>.
For example:
$ sed -e "s/\s\{3,\}/  /g" inputFile
will substitute every sequence of at least 3 whitespace characters with two spaces.
REMARK:
For POSIX compliance, use the character class [[:space:]] instead of \s, since the latter is a GNU sed extension. See the POSIX specifications for sed and BREs.
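As a minimal sketch of the POSIX-portable form, the same substitution can be written with [[:space:]] and a BRE interval. The function name squeeze_ws and the sample input are made up for illustration:

```shell
# Hypothetical helper: collapse runs of 3 or more whitespace characters
# into exactly two spaces, using only POSIX BRE syntax (no \s, no -E/-r).
squeeze_ws() {
    sed 's/[[:space:]]\{3,\}/  /g'
}

# Sample input: "a", three spaces, "b", one tab, "c", four tabs, "d".
printf 'a   b\tc\t\t\t\td\n' | squeeze_ws
```

The single tab between "b" and "c" is left alone because it is shorter than three characters; the two longer runs each become two spaces.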
Solution 2
This works on MacOS 10.8:
sed -E "s/[[:space:]]+/ /g"
Solution 3
sed 's/[ \t]*/"space or tab"/'
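Note that * also accepts zero repetitions, so without the g flag a pattern like this first matches the empty string at the start of the line. Whether \t inside a bracket expression means a tab also depends on the sed implementation, so the sketch below uses only a space. The input string is made up for illustration:

```shell
# '*' allows zero repetitions, so the first match is the empty string
# at the very start of the line.
star=$(printf 'a b\n' | sed 's/[ ]*/X/')

# Requiring at least one space makes the match land on the actual space.
plus=$(printf 'a b\n' | sed 's/[ ][ ]*/X/')

printf '%s\n%s\n' "$star" "$plus"
```

The first line of output is "Xa b" and the second is "aXb".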
Solution 4
Some older versions of sed may not recognize \s as a white space matching token. In that case you can match a sequence of one or more spaces and tabs with '[XZ][XZ]*' where X is a space and Z is a tab.
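One way to get a literal tab into the bracket expression without typing it is to let the shell produce it with printf. This is only a sketch, assuming a POSIX shell; the variable name TAB and the function name collapse_blanks are illustrative:

```shell
# Store a literal tab character in a variable.
TAB=$(printf '\t')

# Hypothetical helper: replace each run of one or more spaces/tabs
# with a single space, without relying on \s, \t, or \+ inside sed.
collapse_blanks() {
    sed "s/[ ${TAB}][ ${TAB}]*/ /g"
}

printf 'one \t two\t\tthree\n' | collapse_blanks
```

The double quotes around the sed script let the shell expand ${TAB} before sed sees the expression.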
Maksim Kondratyuk
Updated on September 17, 2022
Comments
-
Maksim Kondratyuk over 1 year
How can I match whitespace in sed? In my data I want to match all of 3+ subsequent whitespace characters (tab space) and replace them by 2 spaces. How can this be done?
-
Marnix A. van Ammers about 14 years
So for the particular need here, with an older sed, you could do: $ sed 's/[XZ][XZ][XZ][XZ]*/  /g' inputfile where X is a tab and Z is a space.
-
DeboraThaise over 12 years
Aha! It was the missing -e switch that got me.
-
HUB about 12 years
I also had to add the '-r' switch, which enables extended regexes, to make sed recognize '\s' as whitespace.
-
Jared Beck almost 11 years
With Apple's sed I had to use [[:space:]] because \s did not work for me. Perhaps \s is a GNU sed extension?
-
Karthik T over 10 years
@JaredBeck thanks, I was running out of ideas why my simple regex wasn't working. This is lame, I thought \s was standard extended regex. Also, -r doesn't work and -E did squat.
-
bpa over 10 years
Thanks for the feedback. I updated the answer with links to the POSIX standard.
-
amphibient over 10 years
Do you know if this works on all Linux distros?
-
Brad Koch about 10 years
Not generally, GNU sed won't have -E. From the BSD sed man page: "The -E, -a and -i options are non-standard FreeBSD extensions and may not be available on other operating systems."
-
Mokubai almost 10 years
Is this guaranteed to work on any version of sed on any system? If not, it might be worth mentioning where this does work, as the other answers do, so we know the limitations and where it might not have the intended result.
-
Darren Cook almost 10 yearsFor me
-e
stopped it working, but-r
made it work (Mint 16). I.e. changing fromsed -e -r
tosed -r
was what I needed to do. However I was using[[:space:]]
by this point, as I couldn't get\s
to work. -
Nate over 9 years
This RE is what I use to match whitespace. It is simpler than character classes just to match tab or space. It uses only the most basic conventions of regular expressions, so it should work anywhere with a functional implementation of regular expressions.
-
Samuel about 9 years
Why do you need the -E flag? For the + operator? Most expressions would probably be fine with * instead, and then this would work on other platforms.
-
Alien Life Form almost 9 years
On Mac 10.9.5 this matches spaces and 't'. I used Michael Douma's above to match whitespace chars (it also works with -e).
-
Mancika over 8 years
@Samuel If you use *, the regex will match zero or more spaces, and you will get a space between every character, and a space at each end of each line. If you don't have the -E flag, then you want sed "s/[[:space:]]\+/ /g" to match one or more spaces.
-
Mancika over 8 years
Doesn't work sensibly on my SUSE system. It matches the first place on the line where there are zero or more spaces, which is before the first character. I doubt that is the intended function, and it certainly wasn't the requested use case. I believe you want to change the '*' to '\+' (or '\{3,\}' per the question) and maybe put a g at the end of the sed command to match all occurrences of the pattern. Replacing [ \t] with [[:space:]] may also be desirable, in case the line contains some other kind of whitespace.
-
Witiko over 7 years
Much like the POSIX [:space:] character class, \s will not only match <tab> and <space>, but also the <newline> character (try sed 'N;s/\s/x/' <<<$'aaa\nbbb' in bash).
-
jarno over 7 years
The GNU sed manual does not list \s as a GNU extension.
-
stefanct over 6 years
Instead of [[:space:]] one could use [[:blank:]], which matches horizontal tabs and spaces only (but no newlines, vertical tabs etc.).
-
mcandre over 6 years
FWIW, NetBSD's sed supports the -E flag as well.
-
xuhdev about 6 years
@BradKoch The fact that -E is non-standard does not imply GNU sed does not have that option. Your linked document states exactly that the -E option is available for GNU sed as well.
-
Brad Koch about 6 years
@xuhdev You're correct, GNU sed added support for -E in version 4.3, released in 2017. Older versions will still fail with -E.
-
xuhdev about 6 years
@BradKoch OK, I think I know what is confusing. Older versions already support -E, but it was not documented. It was documented later, since it seems that -E is coming to the POSIX standard. See unix.stackexchange.com/a/310454/38242
-
bobpaul about 5 years
For curious readers: GNU sed has had -r since as long as I can remember (prior to 2004 switch to git). -E was added as an undocumented alias to -r in Aug 2006 (rev 3a8e165). They documented -E in Oct 2013 (rev 8b65e079, prior to v4.1; they didn't git tag prior releases). All v4.3 added w/re to -E was examples in the HTML documentation. Regardless, any GNU sed running in 2010 shouldn't have had any problems with -E, but it was undocumented at the time... git://git.sv.gnu.org/sed
-
NeilG almost 5 years
On my platforms -e is optional.
-
Jerry Green over 3 years
Doesn't work on my macOS Catalina.
-
NYCeyes almost 3 years
But how do you specify \s in the destination part (i.e. the replace-with part) of the regular expression? I want to avoid using keyboard spaces and/or tabs there, as well.