Replace specific characters within strings
Solution 1
Use a regular expression with the function gsub():
group <- c("12357e", "12575e", "197e18", "e18947")
group
[1] "12357e" "12575e" "197e18" "e18947"
gsub("e", "", group)
[1] "12357" "12575" "19718" "18947"
What gsub() does here is replace each occurrence of "e" with an empty string "".
See ?regexp or ?gsub for more help.
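As a small sketch of why gsub() rather than sub() is used here: sub() replaces only the first match in each element, while gsub() replaces all of them. The vector x below is a made-up example, not part of the original data.

```r
# Hypothetical vector in which some elements contain more than one "e"
x <- c("e12e", "3e4e5", "678")

# sub() removes only the first "e" in each element
first_only <- sub("e", "", x)

# gsub() removes every "e" in each element
all_matches <- gsub("e", "", x)

print(first_only)
print(all_matches)
```

For the data in this question the difference does not show, because each string contains exactly one "e".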
Solution 2
Regular expressions are your friends:
R> ## also adds missing ')' and sets column name
R> group <- data.frame(group=c("12357e", "12575e", "197e18", "e18947"))
R> group
group
1 12357e
2 12575e
3 197e18
4 e18947
Now use gsub() with the simplest possible replacement, an empty string:
R> group$groupNoE <- gsub("e", "", group$group)
R> group
group groupNoE
1 12357e 12357
2 12575e 12575
3 197e18 19718
4 e18947 18947
R>
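Note that gsub() returns a new vector and leaves its input untouched, so the result has to be assigned somewhere. A minimal sketch, assuming you would rather overwrite the original column than add a new one (the names group_df and unchanged are only illustrative, chosen to avoid clashing with the objects above):

```r
group_df <- data.frame(group = c("12357e", "12575e", "197e18", "e18947"),
                       stringsAsFactors = FALSE)

# Calling gsub() without assigning the result does not change group_df
invisible(gsub("e", "", group_df$group))
unchanged <- group_df$group

# Assigning the result back overwrites the column
group_df$group <- gsub("e", "", group_df$group)
print(group_df)
```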
Solution 3
Summarizing two ways to replace strings:
group <- data.frame(group=c("12357e", "12575e", "197e18", "e18947"))
1) Use gsub()
group$group.no.e <- gsub("e", "", group$group)
2) Use the stringr package
library(stringr)
group$group.no.e <- str_replace_all(group$group, "e", "")
Both will produce the desired output:
group group.no.e
1 12357e 12357
2 12575e 12575
3 197e18 19718
4 e18947 18947
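One caveat when choosing the stringr function: str_replace() replaces only the first match in each string (like sub()), whereas str_replace_all() replaces every match (like gsub()). A minimal sketch, assuming the stringr package is installed; the vector x is a made-up example:

```r
library(stringr)

# Hypothetical strings with more than one "e" each
x <- c("e12e", "3e4e5")

first_only  <- str_replace(x, "e", "")      # only the first "e" is removed
all_matches <- str_replace_all(x, "e", "")  # every "e" is removed

print(first_only)
print(all_matches)
```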
Solution 4
You do not need to create a data frame from a vector of strings if you want to replace some characters in it. Regular expressions are a good choice for this, as already mentioned by @Andrie and @Dirk Eddelbuettel.
Note that if you want to replace characters that are special in regular expressions, such as dots, you must escape them or wrap them in a character class, as in the example below:
ctr_names <- c("Czech.Republic","New.Zealand","Great.Britain")
gsub("[.]", " ", ctr_names)
This will produce:
[1] "Czech Republic" "New Zealand" "Great Britain"
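To see why the dot needs special treatment, the sketch below compares an unescaped ".", which matches any character in a regular expression, with three ways of matching a literal dot. It reuses ctr_names from above; the other variable names are illustrative.

```r
ctr_names <- c("Czech.Republic", "New.Zealand", "Great.Britain")

# Unescaped: "." matches any character, so every character becomes a space
wrong <- gsub(".", " ", ctr_names)

# Three ways to match a literal dot
with_class  <- gsub("[.]", " ", ctr_names)              # character class
with_escape <- gsub("\\.", " ", ctr_names)              # escaped dot
with_fixed  <- gsub(".", " ", ctr_names, fixed = TRUE)  # no regex at all

print(wrong)
print(with_class)
```

All three literal-dot variants give the same result; which one to use is mostly a matter of readability.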
Solution 5
Use the stringi package:
require(stringi)
group<-data.frame(c("12357e", "12575e", "197e18", "e18947"))
stri_replace_all(group[,1], "", fixed="e")
[1] "12357" "12575" "19718" "18947"
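The fixed= argument used above means the pattern is treated as a literal string rather than a regular expression. Base R's gsub() offers the same thing through fixed = TRUE; a minimal sketch, assuming the same four strings held in a plain character vector (x and no_e are illustrative names):

```r
x <- c("12357e", "12575e", "197e18", "e18947")

# fixed = TRUE matches "e" literally instead of as a regular expression
no_e <- gsub("e", "", x, fixed = TRUE)
print(no_e)
```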
Luke
Updated on May 29, 2021
Comments
-
Luke almost 3 years
I would like to remove specific characters from strings within a vector, similar to the Find and Replace feature in Excel.
Here are the data I start with:
group <- data.frame(c("12357e", "12575e", "197e18", "e18947")
I start with just the first column; I want to produce the second column by removing the e's:
group   group.no.e
12357e  12357
12575e  12575
197e18  19718
e18947  18947
-
dickoa almost 12 years
Also... require(stringr); group$groupNoE <- str_replace(group$group, "e", "")
-
Dirk Eddelbuettel almost 12 years
Well, I could snicker that "Those who do not understand base functions are doomed to replace them". Exactly what does stringr gain here, besides increasing the number of underscores in your source file?
-
dickoa almost 12 years
"stringr is a set of simple wrappers that make R's string functions more consistent, simpler and easier to use" from the author of the package. So if what you say is true (many underscores to wrap base functions...) there is no reason for this package to exist (disclaimer: I mainly use base regex functions but I know that they can be difficult for new users...)
-
Joshua Ulrich almost 12 years
@dickoa: str_replace wraps sub, so it will only replace the first occurrence of the pattern. You would need to use str_replace_all if you wanted the same behavior as gsub.
-
Rich Scriven about 8 years
fixed = TRUE would make this faster.
-
glaed over 7 years
@RichScriven could you briefly elaborate why?
-
mm689 over 7 years
fixed=TRUE prevents R from using regular expressions, which allow more flexible pattern matching but take time to compute. If all that's needed is removing a single constant string "e", they aren't necessary.
-
Megatron over 7 years
At the time you had to read the whole page including comments to learn the syntax for stringr, my preferred method, as it was mostly discussed in comments. This solution quickly presents both options, which is why I offered it. My hope was to help other users filter through much like I had to do when I was new to R. I struggled with gsub before finding stringr because it wasn't mentioned in a highly upvoted answer. Again, the objective is not to collect upvotes but try to help new R users out.
-
David Arenburg over 7 years
If you find information in other answers/comments which you find useful and would like to convert to an answer, you could at least provide some attribution to show where you got the information from, or make the answer a Community Wiki instead of just presenting it as your own.
-
Megatron over 7 years
Thanks - will keep in mind for next time. Have never made a community wiki before, so didn't know it was an option.
-
Phil_T over 6 years
Option 2 works great when applied to a column of data in a data frame, without specifying all the values in the column. Obviously option 1 is a repeat, but option 2 works very well, and deserves an up-vote for the added functionality.
-
Matheus Santana about 6 years
Would sub("e", "", group) hold the same result?
-
Kamil S Jaron almost 6 years
You can just escape them, but you also have to escape the escape character because it's inside a quoted string: gsub("\\.", " ", ctr_names)
-
sindri_baldur almost 6 years
would just replace the first e it finds in each element
-
Martin over 2 years
@Andrie can this approach also be used for item by item removal? The situation I have in mind is to remove the 1st string in vector B (specifies what is to be removed) from the 1st string in vector A (what is getting part of itself removed). And the 2nd string in vector B from the 2nd string in vector A and so on. The assumption is that the vectors are of same length. I was able to perform this only by means of hacky commands. Is there a clean way to do this?
-
Catalyst almost 2 years
But the e is still there if we call group again, i.e. it's not removing the e from the group data frame.