How to strip multiple spaces to one using sed?


Solution 1

The use of grep is redundant; sed can do the filtering itself. The problem is the use of *, which also matches zero spaces. With GNU sed, you may use \+ instead:

iostat | sed -n '/hdisk1/s/ \+/ /gp'

Or, with standard sed:

iostat | sed -e '/hdisk/!d' -e 's/ \{2,\}/ /g'

This deletes all lines that do not contain the substring hdisk and replaces every run of two or more spaces with a single space. Alternatively:

iostat | sed -e '/hdisk1/!d' -e 's/   */ /g'
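
To try the two portable forms without an AIX box at hand, here is a minimal sketch that feeds one line copied from the question's iostat output through both. The variable names (sample, interval, star) are made up for illustration.

```shell
# One line copied from the question's iostat output (variable name is illustrative).
sample='hdisk1           0.1       6.3       0.5   63344916  112429672'

# POSIX interval expression: a run of two or more spaces becomes one space.
interval=$(printf '%s\n' "$sample" | sed -e '/hdisk1/!d' -e 's/ \{2,\}/ /g')

# Star only: two literal spaces followed by zero or more further spaces.
star=$(printf '%s\n' "$sample" | sed -e '/hdisk1/!d' -e 's/   */ /g')

printf '%s\n' "$interval" "$star"
```

Both commands should print the same single-spaced line, since each pattern requires at least two spaces before it replaces anything.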

Solution 2

/[ ]*/ matches zero or more spaces, so the empty string between characters matches.

If you're trying to match "one or more spaces", use one of these:

... | sed 's/  */ /g'
... | sed 's/ \{1,\}/ /g'
... | tr -s ' '
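
A quick way to see the difference, sketched with a made-up input string. The exact spacing printed by the zero-or-more version may differ slightly between sed implementations, so no output is claimed for it here.

```shell
input='a    b  c'   # made-up test string, not from the question

# Zero or more spaces: the pattern also matches the empty string between characters.
printf '%s\n' "$input" | sed 's/[ ]*/ /g'

# One or more spaces: only real runs of spaces are replaced.
one_or_more=$(printf '%s\n' "$input" | sed 's/  */ /g')

# tr -s squeezes each run of the listed character down to a single one.
squeezed=$(printf '%s\n' "$input" | tr -s ' ')

printf '%s\n' "$one_or_more" "$squeezed"
```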

Solution 3

Change your * operator to a + (written \+ in GNU sed). As it stands you are matching zero or more of the previous character, which matches at every position, because everything that isn't a space is ... um ... zero instances of a space. You need to match ONE or more. Actually, it would be better to match two or more.

The bracketed character class is also unnecessary for matching one character. You can just use:

s/  \+/ /g

...unless you want to match tabs or other kinds of spaces too, then the character class is a good idea.
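
For the tab case, a minimal sketch assuming a sed that supports POSIX character classes (this has not been checked against the asker's AIX release). The input string and variable names are made up for illustration.

```shell
# Made-up input mixing spaces and a tab (printf turns \t into a real tab).
mixed=$(printf 'hdisk1 \t 0.1    6.3')

# [[:blank:]] is the POSIX class for space and tab; \{1,\} means "one or more".
collapsed=$(printf '%s\n' "$mixed" | sed 's/[[:blank:]]\{1,\}/ /g')

printf '%s\n' "$collapsed"
```

The interval form \{1,\} is used instead of \+ because \+ is a GNU extension.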

Solution 4

You can always match the last occurrence in a sequence of anything like:

s/\(sequence\)*/\1/

And so you're on the right track, but rather than replacing the sequence with a space - replace it with its last occurrence - a single space. That way if a sequence of spaces is matched then the sequence is reduced to a single space, but if the null string is matched then the null string is replaced with itself - and no harm, no foul. So, for example:

sed 's/\( \)*/\1/g' <<\IN
# iostat
System configuration: lcpu=4 drives=8 paths=2 vdisks=0

tty:      tin         tout    avg-cpu: % user % sys % idle % iowait
          0.2         31.8                9.7   4.9   82.9      2.5

Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
hdisk9           0.2      54.2       1.1   1073456960  436765896
hdisk7           0.2      54.1       1.1   1070600212  435678280
hdisk8           0.0       0.0       0.0          0         0
hdisk6           0.0       0.0       0.0          0         0
hdisk1           0.1       6.3       0.5   63344916  112429672
hdisk0           0.1       5.0       0.2   40967838  98574444
cd0              0.0       0.0       0.0          0         0
hdiskpower1      0.2     108.3       2.3   2144057172  872444176

# iostat | grep hdisk1
hdisk1           0.1       6.3       0.5   63345700  112431123

IN

OUTPUT

# iostat
System configuration: lcpu=4 drives=8 paths=2 vdisks=0

tty: tin tout avg-cpu: % user % sys % idle % iowait
 0.2 31.8 9.7 4.9 82.9 2.5

Disks: % tm_act Kbps tps Kb_read Kb_wrtn
hdisk9 0.2 54.2 1.1 1073456960 436765896
hdisk7 0.2 54.1 1.1 1070600212 435678280
hdisk8 0.0 0.0 0.0 0 0
hdisk6 0.0 0.0 0.0 0 0
hdisk1 0.1 6.3 0.5 63344916 112429672
hdisk0 0.1 5.0 0.2 40967838 98574444
cd0 0.0 0.0 0.0 0 0
hdiskpower1 0.2 108.3 2.3 2144057172 872444176

# iostat | grep hdisk1
hdisk1 0.1 6.3 0.5 63345700 112431123

All that said, it is probably far better to avoid regexps completely in this situation and do instead:

tr -s \  <infile
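
Combined with cut, the tr approach also covers the asker's goal of pulling out a single column. A minimal sketch, assuming the Kb_read value is the fifth space-separated field as in the question's sample output; the variable names are illustrative.

```shell
# One line copied from the question's iostat output.
line='hdisk1           0.1       6.3       0.5   63344916  112429672'

# Squeeze runs of spaces, then take the fifth space-delimited field.
kb_read=$(printf '%s\n' "$line" | tr -s ' ' | cut -d ' ' -f 5)

printf '%s\n' "$kb_read"
```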

Solution 5

Notice that you can also do what you attempt, that is

iostat | grep "hdisk1 " | sed -e's/  */ /g' | cut -d" " -f 5

by

iostat | while read disk tma kbps tps re wr; do [ "$disk" = "hdisk1" ] && echo "$re"; done

which might be especially useful if you later attempt to access other fields as well and/or calculate something - like this:

iostat | while read disk tma kbps tps re wr; do [ "$disk" = "hdisk1" ] && echo "$(( re/1024 )) Mb"; done
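
To test the loop without iostat, here is a sketch that substitutes a small stand-in function. The name sample_iostat is hypothetical and its two lines are copied from the question. An if statement replaces the && shorthand so that the loop does not end with a non-zero status when the last line is not the wanted disk.

```shell
# Hypothetical stand-in for iostat: two lines copied from the question's output.
sample_iostat() {
cat <<'EOF'
hdisk1           0.1       6.3       0.5   63344916  112429672
hdisk0           0.1       5.0       0.2   40967838  98574444
EOF
}

# read splits each line on runs of blanks, so no sed or cut is needed.
# The exact comparison also avoids matching names such as hdisk10.
kb_read=$(sample_iostat | while read -r disk tma kbps tps re wr; do
    if [ "$disk" = "hdisk1" ]; then
        echo "$re"
    fi
done)

echo "$kb_read"
```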
Author by

Josh

Updated on September 18, 2022

Comments

  • Josh
    Josh over 1 year

    sed on AIX is not doing what I think it should. I'm trying to replace multiple spaces with a single space in the output of IOSTAT:

    # iostat
    System configuration: lcpu=4 drives=8 paths=2 vdisks=0
    
    tty:      tin         tout    avg-cpu: % user % sys % idle % iowait
              0.2         31.8                9.7   4.9   82.9      2.5
    
    Disks:        % tm_act     Kbps      tps    Kb_read   Kb_wrtn
    hdisk9           0.2      54.2       1.1   1073456960  436765896
    hdisk7           0.2      54.1       1.1   1070600212  435678280
    hdisk8           0.0       0.0       0.0          0         0
    hdisk6           0.0       0.0       0.0          0         0
    hdisk1           0.1       6.3       0.5   63344916  112429672
    hdisk0           0.1       5.0       0.2   40967838  98574444
    cd0              0.0       0.0       0.0          0         0
    hdiskpower1      0.2     108.3       2.3   2144057172  872444176
    
    # iostat | grep hdisk1
    hdisk1           0.1       6.3       0.5   63345700  112431123
    
    #iostat|grep "hdisk1"|sed -e"s/[ ]*/ /g"
     h d i s k 1 0 . 1 6 . 3 0 . 5 6 3 3 4 5 8 8 0 1 1 2 4 3 2 3 5 4
    

    sed should search & replace (s) multiple spaces (/[ ]*/) with a single space (/ /) for the entire group (/g)... but it's not only doing that... it's spacing each character.

    What am I doing wrong? I know it's got to be something simple... AIX 5300-06

    edit: I have another computer that has 10+ hard drives. I'm using this as a parameter to another program for monitoring purposes.

    The problem I ran into was that awk '{print $5}' didn't work, because I'm using $1, etc. in the secondary stage and it gave errors with the print command. I was looking for a grep/sed/cut version. What seems to work is:

    iostat | grep "hdisk1 " | sed -e's/  */ /g' | cut -d" " -f 5
    

    The []s were "0 or more" when I thought they meant "just one". Removing the brackets got it working. Three very good answers really quickly make it hard to choose the "answer".

  • Josh
    Josh over 12 years
    AIX doesn't seem to support +.
  • Josh
    Josh over 12 years
    AIX doesn't seem to support +, but removal of the []'s seems to have done the trick.
  • Josh
    Josh over 12 years
    Ahh... [] makes it "optional". That explains it.
  • Angel Todorov
    Angel Todorov over 12 years
    @WernerCD, no, * makes it "optional". [ ] just makes a list of characters with only one character in it (a space). It is the quantifier * that means "zero or more of the previous thing".
  • Josh
    Josh over 12 years
    I tried using the sed -n version... what happens is I have another computer that has 10+ drives, so it starts matching 1, 10, 11, etc. I tried to add a space, /hdisk1 /, and it gave me a "not recognized function". What seems to work is: iostat | grep "hdisk1 " | sed -e's/  */ /g'
  • tcoolspy
    tcoolspy over 12 years
    @WernerCD: Then try s/ */ /g (that's with three spaces, the comment formatting is collapsing them). The star operator will make the previous character optional, so if you want to match two or more with it, you need to match the first two yourself (two spaces), then add a third space and a star to make the third and following spaces optional.
  • tcoolspy
    tcoolspy over 12 years
    @userunknown: Actually I'm not mixing two things at all, everybody else is :) Replacing a single space with a single space is pointless, you only need to do this action on matches that have at least two sequential spaces. Two blanks and a plus or three blanks and a star are exactly what is needed.
  • Josh
    Josh over 12 years
    Ahh... so to be more accurate, changing it from a single space / */, to a double space is what did it then. I gottcha.
  • Josh
    Josh over 12 years
    Very nice. First version works. My AIX boxes don't seem to like the second one. All three boxes output: "$[ re/1024 ] Mb". The monitoring tool I'm using has conversions for reports so it isn't a "needed" thing for me, but I like it.
  • tcoolspy
    tcoolspy over 12 years
    @userunknown: It's not that big a deal it's just a waste of a little bit of processing time and it throws off things like match counters.
  • rozcietrzewiacz
    rozcietrzewiacz over 12 years
    @enzotib Thanks for correcting the while.
  • rozcietrzewiacz
    rozcietrzewiacz over 12 years
    @WernerCD Ah, this $[ .. ] is probably available in recent versions of bash (maybe zsh too). I updated the answer to a more portable $(( .. )) instead.
  • Josh
    Josh over 12 years
    That did the trick. I'll have to look that up. Snazzy.
  • Wildcard
    Wildcard over 8 years
    +1 for the simplicity of the real answer, iostat | tr -s \
  • Andrejs
    Andrejs over 7 years
    +1 for the simplest tr -s ' ' solution
  • randominstanceOfLivingThing
    randominstanceOfLivingThing over 6 years
    'tr -s \ ' is the same as 'tr -s " "'. Made me realise that a space can be passed as an argument by escaping it with "\". I see that it can be used in shell scripts as well. Cool application.
  • m3nda
    m3nda about 6 years
    For some reason i got order s uncompleted with sed, while tr -s " " worked. I'll not forget it never :-B because is SO handy. The simpler the better.
  • Rakib Fiha
    Rakib Fiha over 3 years
    Wish the OP asked for tr -s ' ' solution. Such a simple and powerful command.
  • Timo
    Timo over 3 years
    You can also use \s for whitespace which then is iostat | sed -n '/hdisk1/s/\s\+/ /gp'
  • Timo
    Timo over 3 years
    iostat | sed -n '/hdisk1/s/  */ /gp' Can you explain what this command does and why you use exactly two spaces together with the asterisk? I know 's/search/replace/', g means global, and p is print. -n suppresses the automatic output, so only the lines printed by p appear.
  • Brad Parks
    Brad Parks over 3 years
    updated it to hopefully fix it!
  • RobbieTheK
    RobbieTheK about 3 years
    how do you pass a file name to this, @brad-parks?
  • Brad Parks
    Brad Parks about 3 years
    @RobbieTheK - you'd have to do something like, cat FILENAME | compress_spaces.sh