How to remove a character (asterisk) from column values in R?

Solution 1

The stringr package has some very handy functions for vectorized string manipulation.

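In the following code I replace the * with ''. Note that * is a regex metacharacter, so it has to be escaped to match it literally. In an R string the escape is written with a double backslash (\\*) instead of the usual single backslash (\*), because R itself consumes one backslash when it parses the string literal. Also note that str_replace only replaces the first match in each string; use str_replace_all if a value can contain more than one asterisk.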

library(stringr) 
LocationID <- c('*Yukon','*Lewis Rich',  '*Kodiak', 'Kodiak', '*Rays')
AWC <- c(333, 485, 76, 666, 54)
df <- data.frame(LocationID, AWC)

df$location_clean <- stringr::str_replace(df$LocationID, '\\*', '')

Resulting in:

LocationID AWC location_clean
1      *Yukon 333          Yukon
2 *Lewis Rich 485     Lewis Rich
3     *Kodiak  76         Kodiak
4      Kodiak 666         Kodiak
5       *Rays  54           Rays
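
As a small sketch of the difference between replacing the first match and replacing every match, the snippet below uses a made-up vector x (not the asker's data) in which one value contains two asterisks. It also shows stringr's fixed() helper, which treats the pattern as a literal string so no escaping is needed.

```r
library(stringr)

# Hypothetical sample values; the second one contains two asterisks
x <- c("*Yukon", "*Lewis* Rich", "Kodiak")

# str_replace() removes only the first * found in each string
first_only <- str_replace(x, "\\*", "")

# str_replace_all() removes every * in each string
all_matches <- str_replace_all(x, "\\*", "")

# fixed("*") matches the asterisk literally, so no backslashes are required
literal <- str_replace_all(x, fixed("*"), "")

print(first_only)
print(all_matches)
print(literal)
```

For data like the question's, where each value has at most one leading asterisk, str_replace and str_replace_all give the same result.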

Solution 2

This can also be achieved using the mutate verb from dplyr (loaded as part of the tidyverse), which in my opinion is more readable. To illustrate, I create a dataset called DT that mimics the LocationID column from the question.

library(tidyverse)
DT <- data.frame('AWC'= c(333, 485, 76, 666, 54), 
                 'LocationID'= c('*Yukon','*Lewis Rich', '*Kodiak', 'Kodiak', '*Rays'))

head(DT)
  AWC  LocationID
1 333      *Yukon
2 485 *Lewis Rich
3  76     *Kodiak
4 666      Kodiak
5  54       *Rays

In what follows, mutate rewrites the column in place and gsub does the desired substitution (of * with ""), which keeps the data-cleaning pipeline easy to follow.

DT <- DT %>% mutate(LocationID = gsub("\\*", "", LocationID))
head(DT)
  AWC LocationID
1 333      Yukon
2 485 Lewis Rich
3  76     Kodiak
4 666     Kodiak
5  54       Rays

Note that \\ is placed before * to escape it, because * is a special character in regular expressions.
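
To make the double backslash less mysterious, here is a minimal base R sketch (the vector x is a made-up sample, not the asker's data). It shows that the string "\\*" really holds just two characters, a backslash and an asterisk, and that a character class [*] is an equivalent way to match a literal asterisk without any backslashes.

```r
# The R string literal "\\*" is parsed into two characters: \ and *
pattern <- "\\*"
print(nchar(pattern))   # 2
cat(pattern, "\n")      # shows what the regex engine receives: \*

# Hypothetical sample values
x <- c("*Yukon", "Kodiak")

# Escaped asterisk: matches a literal *
cleaned <- gsub(pattern, "", x)

# Inside a character class the asterisk is not special, so no escape is needed
cleaned_class <- gsub("[*]", "", x)

print(cleaned)
print(cleaned_class)
```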

Solution 3

Use gsub with the escape sequence \\ (needed because * is a special character in regular expressions) to replace * with the empty string "", thus deleting it.

> so
  AWC   LocationID
1 333       *Yukon
2 485  *Lewis Rich
3  76      *Kodiak
4 666       Kodiak
5  54        *Rays


> so$LocationID=gsub("\\*","",so$LocationID)
> so
  AWC  LocationID
1 333       Yukon
2 485  Lewis Rich
3  76      Kodiak
4 666      Kodiak
5  54        Rays
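
If you would rather avoid regex escaping altogether, one possible variation (a sketch, rebuilding the so data frame from the values shown above) is to pass fixed = TRUE so that gsub treats the pattern as a plain string. The second part uses sub with the ^ anchor, on a made-up vector, to remove only an asterisk at the very start of a value.

```r
# Rebuild the example data frame from the values shown above
so <- data.frame(
  AWC = c(333, 485, 76, 666, 54),
  LocationID = c("*Yukon", "*Lewis Rich", "*Kodiak", "Kodiak", "*Rays"),
  stringsAsFactors = FALSE
)

# fixed = TRUE: the pattern "*" is matched literally, no backslashes needed
so$LocationID <- gsub("*", "", so$LocationID, fixed = TRUE)
print(so)

# Hypothetical values: remove only a leading asterisk, keep any others
leading_only <- sub("^\\*", "", c("*Rays", "Ray*s"))
print(leading_only)
```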
Authored by Juliet R

Updated on June 12, 2022

Comments

  • Juliet R, almost 2 years ago

    So I have a data frame that looks like this, but with 6k rows:

    AWC, LocationID
    333, *Yukon
    485, *Lewis Rich
    76, *Kodiak
    666, Kodiak
    54, *Rays
    

    I would like to remove the asterisks from the LocationID values, if that's possible, and just keep the original name. So *Yukon -> Yukon. If that's not possible, could you help me with a way to rename a column value? I'm new to R.