How does "gsub" handle spaces?

10,319

Solution 1

You can use lookbehind matching like this:

gsub("(?<=\\s)i+", " ", "akui i ii", perl=TRUE)

Edit: lookbehind is still the way to go, demonstrated here with another example from your original post. Hope this helps.
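As a rough sketch of what the lookbehind changes, the snippet below applies it to the first string from the question and compares it with a pattern that consumes the space. The variable names keep_space and eat_space are only illustrative.

```r
# Lookbehind: the space is asserted but not consumed, so it stays in the result.
keep_space <- gsub("(?<=\\s)[bc]", " ", "ab b cde", perl = TRUE)

# Consuming match: the space is part of the match and gets replaced as well.
eat_space <- gsub("\\s[bc]", " ", "ab b cde", perl = TRUE)

nchar(keep_space)  # 8, the same width as the input
nchar(eat_space)   # 6, each two-character match shrank to one space
```

With the lookbehind, a single-space replacement preserves the original width, because only the letter is replaced.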

Solution 2

[\sb,\sc] is a character class: it matches any single character that is a space, b, a comma, or c, and repeating \s inside the brackets changes nothing. You probably want something like (\sb|\sc), which means "space followed by b, or space followed by c", or \s[bc], which means "space followed by b or c".

s <- "ab b cde"
gsub( "(\\sb|\\sc)",     "  ", s, perl=TRUE )
gsub( "\\s[bc]",         "  ", s, perl=TRUE )
gsub( "[[:space:]][bc]", "  ", s, perl=TRUE )  # No backslashes
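To see the difference concretely, here is a small sketch that runs the original character-class pattern and the alternation side by side. The names class_result and alt_result are made up for this illustration.

```r
s <- "ab b cde"

# Character class: every space, b, comma, or c is replaced, one character at a time.
class_result <- gsub("[\\sb,\\sc]", " ", s, perl = TRUE)

# Alternation: only the two-character sequences " b" and " c" are replaced.
alt_result <- gsub("(\\sb|\\sc)", "  ", s, perl = TRUE)

print(class_result)  # "a", five spaces, "de": the first b is lost
print(alt_result)    # "ab", four spaces, "de"
```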

To remove a run of repeated letters (as in the second example), add a + after the letter so that the pattern matches one or more occurrences of it.

s2 <- "akui i ii"
gsub("\\si+", " ", s2)
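A quick way to check what the + contributes is to run the pattern with and without it. This is only a sketch; perl = TRUE is added here for consistency with the other examples, and the variable names are illustrative.

```r
s2 <- "akui i ii"

# Without +, only one i after each space is matched, so a trailing i survives.
without_plus <- gsub("\\si", " ", s2, perl = TRUE)

# With +, the whole run of i's after each space is matched and removed.
with_plus <- gsub("\\si+", " ", s2, perl = TRUE)

print(without_plus)  # "akui  i"
print(with_plus)     # "akui  "
```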

Solution 3

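A character class after \s is enough. Use a two-space replacement so that each two-character match keeps its width:

    gsub("\\s[bc]", "  ", "ab b cde", perl=TRUE)

This returns "ab", then four spaces, then "de", which is the output you asked for.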

Author: user702432

Updated on June 19, 2022

Comments

  • user702432
    user702432 almost 2 years

    I have a character string "ab b cde", i.e. "ab[space]b[space]cde". I want to replace "space-b" and "space-c" with blank spaces, so that the output string is "ab[space][space][space][space]de". I can't figure out how to get rid of the second "b" without deleting the first one. I have tried:

    gsub("[\\sb,\\sc]", " ", "ab b cde", perl=T)
    

    but this is giving me "a[spaces]de". Any pointers? Thanks.

    Edit: Consider a more complicated problem: I want to convert the string "akui i ii", i.e. "akui[space]i[space]ii", to "akui[spaces]" by removing the "space-i" and "space-ii".

  • user702432
    user702432 about 12 years
    jupp0r... The code doesn't seem to work for "akui i ii" which I included into the original posting. Thanks.
  • user702432
    user702432 about 12 years
    Thanks, Vincent. But it doesn't work for "akui i ii" (Please see my edit to the original post).