Removing leading, trailing and multiple spaces within a string
Solution 1
You can use something like:
s/^\s+|\s+$|\s+(?=\s)//g
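The \s+(?=\s) alternative matches every whitespace character in a run except the last one, so exactly one is left between words. The lookahead requires a Perl-compatible regex engine.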
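As a rough illustration, the same pattern can be tried in JavaScript, whose regex engine also supports lookahead. This is only a sketch: the function name normalizeSpaces and the sample string are made up for the example.

```javascript
// Hypothetical helper (name chosen for this example only).
// One pass, three alternatives:
//   ^\s+       leading whitespace
//   \s+$       trailing whitespace
//   \s+(?=\s)  all but the last character of an inner whitespace run
function normalizeSpaces(text) {
  return text.replace(/^\s+|\s+$|\s+(?=\s)/g, '');
}

console.log(JSON.stringify(normalizeSpaces('  word1   word2  word3 word4  ')));
// prints "word1 word2 word3 word4"
```

Note that the character kept is the last one of each run, so a run that ends in a tab leaves a tab rather than a space.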
Solution 2
In JavaScript, the String prototype has two methods that can handle this:
str.trim().replace(/\s+/g, ' ')
str.trim() removes leading and trailing whitespace.
str.replace(regex, replacement) returns a new string (the original str is not modified) in which the first match of regex is replaced by replacement.
Important thing to note: when the first parameter of .replace is a regex literal, it should not be wrapped in quotes. A regex literal is delimited with slashes (/regex/), and g is appended to replace globally (every match) rather than only the first match. You can read more about the g flag, lastIndex, and everything else I've mentioned at the second link provided.
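To make the role of the g flag concrete, here is a small sketch comparing the same pattern with and without it, plus the RegExp constructor form for building the pattern from a string. The variable names and the sample text are made up for illustration.

```javascript
const sample = 'a   b   c';

// Without g, only the first run of whitespace is replaced.
const firstOnly = sample.replace(/\s+/, ' ');

// With g, every run of whitespace is replaced.
const everyRun = sample.replace(/\s+/g, ' ');

// The same global pattern built from a string with the RegExp constructor.
// The backslash must be doubled inside the string literal.
const fromString = sample.replace(new RegExp('\\s+', 'g'), ' ');

console.log(JSON.stringify(firstOnly));  // prints "a b   c"
console.log(JSON.stringify(everyRun));   // prints "a b c"
console.log(JSON.stringify(fromString)); // prints "a b c"
```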
Example:
var str = '  1   2  3 4  ';
function trimReplace(str) {
  // Trim both ends, then collapse each run of whitespace to one space.
  var newStr = str.trim().replace(/\s+/g, ' ');
  console.log(newStr);
}
trimReplace(str);
Try this in your console:
'  1   2  3 4  '.trim().replace(/\s+/g, ' ')
"1 2 3 4"
regex: Kleene operators, which will help you understand the regex used to match multiple spaces
regex: a helpful guide on regex and the /g flag
Google: MDN String.prototype.trim()
Google: MDN String.prototype.replace()
Solution 3
Using awk
echo "  word1   word2  word3 word4  " | awk '{$1=$1}1'
word1 word2 word3 word4
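The $1=$1 assignment is a trick that forces awk to rebuild the record from its fields, joining them with a single space (the default output field separator).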
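You can even use
awk '$1=$1' file
but this skips any line whose first field is 0 or 0.0 (and any blank line), because awk then treats the result of the assignment as false and does not print the line.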
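The field-rebuilding idea behind $1=$1 can be approximated outside awk as well. The sketch below is a JavaScript analogue, not a translation of awk's exact semantics: it splits a line on runs of whitespace, drops the empty fields, and joins the rest with a single space. The name rebuildRecord is hypothetical.

```javascript
// Hypothetical helper that mimics awk's default field splitting and rejoining.
function rebuildRecord(line) {
  return line
    .split(/\s+/)                     // split on runs of whitespace
    .filter((field) => field !== '')  // leading/trailing whitespace yields empty fields
    .join(' ');                       // join with one space, like awk's default OFS
}

console.log(JSON.stringify(rebuildRecord('  word1   word2  word3 word4  ')));
// prints "word1 word2 word3 word4"
console.log(JSON.stringify(rebuildRecord(' 0  1 ')));
// prints "0 1"
```

Nothing here depends on the truthiness of the first field, so a line that starts with 0 still comes through.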
Solution 4
This might work for you (GNU sed):
sed -r 's/((^)\s*(\S))|((\S)\s*($))|(\s)\s*/\2\3\5\6\7/g' file
or simply:
sed -r 's/(^\s*(\S))|((\S)\s*$)|(\s)\s*/\2\4\5/g' file
jkshah
Updated on July 22, 2022
Comments
-
jkshah almost 2 years
I would like to remove all leading and trailing spaces, and also replace multiple spaces within a string with a single space, so that all words in the string are separated by exactly one space.
I can achieve this with the following two regex passes, but I am looking for a single-regex solution.
s/^\s+|\s+$//g
s/\s+/ /g
Sample Input:
"   word1  word2    word3 word4   "
Desired Output:
word1 word2 word3 word4
I would appreciate any help with solving this.
-
jkshah over 10 years
Worked like a charm. Thanks :) Still eager to see other approaches.
-
jkshah over 10 years
Thanks for your response. I am not very familiar with awk, but would like to try this one.
-
jkshah over 10 years
Though it took me some time to understand, I got it working. A different capturing approach, throwing out the unnecessary ones!
-
iruvar over 10 years
The sed guru strikes! ;-) +1
-
mpapec over 10 years
Does sed have a problem with s/^\s+|\s+$|\s+(?=\s)//g ?
-
potong over 10 years
@mpapec The first two alternations are regexps common to sed, whereas the last is not.
-
Laurel almost 8 years
Just a note: while this answer is otherwise good, there is a way to construct a regex without the delimiting slashes. developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/…