Replace all NA with FALSE in selected columns in R
Solution 1
If you want to do the replacement for a subset of variables, you can still use the is.na(*) <-
trick, as follows:
df[c("x1", "x2")][is.na(df[c("x1", "x2")])] <- FALSE
IMO using temporary variables makes the logic easier to follow:
vars.to.replace <- c("x1", "x2")
df2 <- df[vars.to.replace]
df2[is.na(df2)] <- FALSE
df[vars.to.replace] <- df2
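To check the one-line form, here is a minimal sketch on a small data frame. The values are invented for illustration and are not from the original question.

```r
# Toy data, invented for illustration: id should keep its NA, x1/x2 should not.
df <- data.frame(
  id = c(1L, 2L, NA),
  x1 = c(TRUE, NA, TRUE),
  x2 = c(NA, NA, TRUE)
)

vars.to.replace <- c("x1", "x2")

# is.na() on the column subset returns a logical matrix; using it as an
# index assigns FALSE only to the NA cells of those columns.
df[vars.to.replace][is.na(df[vars.to.replace])] <- FALSE

df$x1         # TRUE FALSE TRUE
df$x2         # FALSE FALSE TRUE
is.na(df$id)  # FALSE FALSE TRUE (the NA in id is untouched)
```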
Solution 2
tidyr::replace_na is an excellent function.
df %>%
replace_na(list(x1 = FALSE, x2 = FALSE))
This is such a great quick fix. The only trick is that you make a list of the columns you want to change.
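With many columns, typing the list by hand gets tedious. The sketch below, which assumes tidyr is installed, builds the list from a character vector of column names. The toy data and the names `cols`, `out` and `out2` are introduced here for illustration only.

```r
library(tidyr)  # provides replace_na()

# Toy data, invented for illustration.
df <- data.frame(
  id = c(1L, 2L, NA),
  x1 = c(TRUE, NA, TRUE),
  x2 = c(NA, NA, TRUE)
)

# Only the columns named in the list are touched; id keeps its NA.
out <- replace_na(df, list(x1 = FALSE, x2 = FALSE))

# Same result, with the list built from a vector of column names.
cols <- c("x1", "x2")
out2 <- replace_na(df, setNames(as.list(rep(FALSE, length(cols))), cols))
```

Here `cols` could equally be something like `setdiff(names(df), "id")` to cover every column except the identifier.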
Solution 3
Try this code:
df <- data.frame(
id = c(rep(1:19), NA),
x1 = sample(c(NA, TRUE), 20, replace = TRUE),
x2 = sample(c(NA, TRUE), 20, replace = TRUE)
)
replace(df, is.na(df), FALSE)
Updated with another solution, which leaves the id column untouched:
df2 <- df <- data.frame(
id = c(rep(1:19), NA),
x1 = sample(c(NA, TRUE), 20, replace = TRUE),
x2 = sample(c(NA, TRUE), 20, replace = TRUE)
)
df2[names(df) == "id"] <- FALSE
df2[names(df) != "id"] <- TRUE
replace(df, is.na(df) & df2, FALSE)
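A column-wise alternative in base R, offered as a sketch rather than as part of the original answer: apply replace() to each selected column with lapply() and assign the result back. The toy data and the name `cols` are assumptions made for illustration.

```r
# Toy data, invented for illustration.
df <- data.frame(
  id = c(1L, 2L, NA),
  x1 = c(TRUE, NA, TRUE),
  x2 = c(NA, NA, TRUE)
)

# Every column except id.
cols <- setdiff(names(df), "id")

# Replace NA with FALSE one column at a time; id is never visited.
df[cols] <- lapply(df[cols], function(col) replace(col, is.na(col), FALSE))
```

Because id is never part of the selection, no mask data frame is needed and the NA in id survives.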
Solution 4
You can use the NAToUnknown function in the gdata package:
df[,c('x1', 'x2')] = gdata::NAToUnknown(df[,c('x1', 'x2')], unknown = 'FALSE')
Solution 5
With dplyr you could also do:
df %>% mutate_each(funs(replace(., is.na(.), FALSE)), x1, x2)
It is a bit less readable than just using replace(), but more generic, as it lets you select the columns to be transformed. This solution applies especially if you want to keep NAs in some columns but get rid of NAs in others.
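mutate_each() and funs() have since been deprecated in dplyr. A sketch of the same idea with across(), assuming dplyr 1.0.0 or later is installed (the toy data and the name `out` are invented for illustration):

```r
library(dplyr)

# Toy data, invented for illustration.
df <- data.frame(
  id = c(1L, 2L, NA),
  x1 = c(TRUE, NA, TRUE),
  x2 = c(NA, NA, TRUE)
)

# across() applies the replacement to x1 and x2 only; id is left alone.
out <- df %>%
  mutate(across(c(x1, x2), ~ replace(.x, is.na(.x), FALSE)))
```

With many columns, a selection such as `across(-id, ...)` should cover everything except the identifier without naming each column.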
lokheart
Updated on September 13, 2020
Comments
-
lokheart over 3 years
I have a question similar to this one, but my dataset is a bit bigger: 50 columns, with 1 column as UID and the other columns carrying either TRUE or NA. I want to change all the NA to FALSE, but I don't want to use an explicit loop. Can plyr do the trick? Thanks.
UPDATE #1
Thanks for the quick reply, but what if my dataset is like the one below:
df <- data.frame( id = c(rep(1:19),NA), x1 = sample(c(NA,TRUE), 20, replace = TRUE), x2 = sample(c(NA,TRUE), 20, replace = TRUE) )
I only want x1 and x2 to be processed. How can this be done?
-
Jubbles about 12 years
Excellent function except for one snag: if I want to change unknowns to 0, and I already have some NAs and zeroes in the vector, then I receive the error message
Error in NAToUnknown.default(x = dots[[1L]][[1L]], unknown = dots[[2L]][[1L]], : 'x' already has value “0”
-
tmakino about 11 years
I know this is an old post, but would you explain the first line to me? I get the logic when you break it down using temp variables, but I'd like to understand the one-line form. I thought I was familiar with subsetting, but I don't understand the [][]. I searched "double brackets" but that turned up something different.
-
blakeoft over 9 years
@tmakino You just have to read the double brackets as different subsets from left to right. For example, if x <- 1:10, then x[5:10][1:4] will give you the vector 5 6 7 8. In multiple steps, you could take the first subset and call it y, y <- x[5:10], which is 5 6 7 8 9 10. And then subset that vector, y[1:4], which gives you 5 6 7 8 again.
-
coip about 9 years
You can also use the column positions instead of explicitly naming the columns, which is useful when you have a lot of variables to convert or if they have long names: df2[,14:16][is.na(df2[,14:16])] <- 0, for instance, replaces NA with 0 in columns 14, 15, and 16 of the data frame df2.
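The position-based form can be tried on a small example. This is a sketch with invented data; the name `pos` is introduced here for illustration, and FALSE is used as the replacement to match the original question.

```r
# Toy data, invented for illustration: columns 2 and 3 hold the flags.
df <- data.frame(
  id = c(1L, 2L, NA),
  x1 = c(TRUE, NA, TRUE),
  x2 = c(NA, NA, TRUE)
)

# Column positions to process (x1 and x2).
pos <- 2:3

# Same chained-subset idea as with names, but indexed by position.
df[, pos][is.na(df[, pos])] <- FALSE
```

Positions are convenient for long runs of adjacent columns, but they silently point at the wrong data if the column order changes, so names are the safer choice when the layout is not fixed.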