Replacing specific values in a data frame column using gsub in R

33,203

Solution 1

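Replace the character class with a group and add .* after that group so the rest of the value is matched as well. sub alone is enough here, because each value needs at most one replacement. Note that the column in the question is named Value (capital V), and R is case-sensitive.

df$Value <- sub("^(DEL|INS).*", "", df$Value)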

Inside a character class, each character is treated separately, not as part of a whole string. So [DEL] matches a single character from the given list: D, E, or L.
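To make the difference concrete, here is a minimal sketch that rebuilds the data frame from the question (the column names ID and Value are taken from its printout; stringsAsFactors = FALSE is an assumption so that Value is a character column) and compares the character class with the group:

```r
# Rebuild the example data from the question
df <- data.frame(
  ID    = c("A_001", "A_002", "B_023", "C_891", "D_787", "E_865"),
  Value = c("DEL-1:7:35-8_1", "INS-4l:5_74:d", "0", "2", "8", "DEL-3:65:1s:b"),
  stringsAsFactors = FALSE
)

# Character class: strips only ONE leading character that is D, E or L
class_only <- sub("^[DEL]", "", df$Value)
print(class_only)

# Group with alternation: matches the whole prefix DEL or INS,
# and .* consumes the rest of the string
df$Value <- sub("^(DEL|INS).*", "", df$Value)
print(df)
```

With the character class, "DEL-1:7:35-8_1" only loses its leading D, and "INS-4l:5_74:d" is left untouched because I is not in the class. With the group, the matching values become empty strings while 0, 2 and 8 are kept.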

Solution 2

If the first character is not a digit:

df$Value <- gsub("^\\D.*", "", df$Value)

Or, if every value to delete contains a '-':

df$Value <- gsub(".*-.*", "", df$Value)
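These two patterns are broader than the DEL/INS prefix match, so they may blank out more than intended on other data. The sketch below compares all three on a small vector; the values "-5" and "n/a" are hypothetical additions, not part of the question's data:

```r
# Two values from the question plus hypothetical edge cases ("-5", "n/a")
x <- c("DEL-1:7:35-8_1", "INS-4l:5_74:d", "0", "-5", "n/a")

# Blank any value whose first character is not a digit
by_non_digit <- gsub("^\\D.*", "", x)

# Blank any value that contains a hyphen anywhere
by_hyphen <- gsub(".*-.*", "", x)

# Blank only values that start with DEL or INS
by_prefix <- sub("^(DEL|INS).*", "", x)

print(data.frame(x, by_non_digit, by_hyphen, by_prefix))
```

For the data shown in the question, all three give the same result. They diverge only on values such as a negative number or other text, so the choice depends on what else the column can contain.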
Author by

Carol

Updated on August 18, 2020

Comments

  • Carol
    Carol over 3 years

    I have a data.frame as follows:

    > df
    ID      Value
    A_001   DEL-1:7:35-8_1 
    A_002   INS-4l:5_74:d
    B_023   0 
    C_891   2
    D_787   8
    E_865   DEL-3:65:1s:b
    

    I would like to replace all the values in the column Value that start with DEL or INS with nothing. That is, I would like to get the following output:

    > df
    ID      Value
    A_001   
    A_002   
    B_023   0 
    C_891   2
    D_787   8
    E_865   
    

    I tried to achieve this with gsub in R using the following code, but it didn't work:

    gsub(pattern="(^([DEL|INS]*)",replacement="",df)
    

    Could anyone guide me on how to achieve the desired output?

    Thanks in advance.