How to assign NA's using IF statement?

25,151

Solution 1

First, some test data:

set.seed(1); x = dnorm(rnorm(100))/(sample(1:100, 100, replace=TRUE))

Subsetting can be done in the following way:

x[x < .001] = NA
x[x > .1] = NA

Or, you can combine it in one statement:

x[x < .001 | x > .1] = NA
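As a quick check that the two forms agree, here is a minimal sketch on a small hand-made vector. The vector `y` and the names `a` and `b` are made up for this illustration and are not part of the test data above.

```r
# Small hand-made vector (hypothetical values, not the test data above)
y <- c(0.0005, 0.004, 0.05, 0.1, 0.5)

# Two-step version
a <- y
a[a < .001] <- NA
a[a > .1] <- NA

# One-step version
b <- y
b[b < .001 | b > .1] <- NA

identical(a, b)   # compare the two results
is.na(b)          # which positions were blanked out
```

Note that a value exactly equal to .1 is kept, because the test is strictly `> .1`.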

Update: To answer why your code is not working

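Your loop fails as soon as it assigns an NA. Once x[i] is NA, the next condition (for example x[i] > .001 & x[i] <= .01) also evaluates to NA, and if() cannot branch on NA, hence "missing value where TRUE/FALSE needed". So remove the NA assignments from your for loop, but index those values before you run the loop so you can set them to NA afterwards.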
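To see the failure in isolation, here is a minimal sketch with a single value. The names `v`, `cond` and `msg` are hypothetical, and `tryCatch` is used only so the snippet keeps running after the error.

```r
# A single value that falls below the lower cut-off (hypothetical example)
v <- 0.0005
if (v <= .001) v <- NA        # v is now NA

cond <- v > .001 & v <= .01   # comparisons with NA yield NA
is.na(cond)

# if() cannot branch on NA; capture the error message instead of stopping
msg <- tryCatch(
  { if (cond) v <- .01; "no error" },
  error = function(e) conditionMessage(e)
)
msg
```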

temp = which(x < .001 | x > .1) # Index the values you want to set as NA

Remove the following conditions from your for loop:

if (x[i] > .10 & x[i] <= 1)
  x[i] = NA
if (x[i] <= .001)
  x[i] = NA

Run your for loop, and then use temp to set the values to NA that should be NA.

x[temp] = NA
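Put together, the three steps might look like the sketch below. It uses a short hand-made vector, and it condenses the ten bin checks into an inner loop over hypothetical `lower` and `upper` vectors, so it is not your original code, only an illustration of the same idea.

```r
x <- c(0.0005, 0.004, 0.015, 0.055, 0.095, 0.5)  # hypothetical values

# Step 1: remember which positions should become NA
temp <- which(x < .001 | x > .1)

# Step 2: loop over the bins only; no NA is assigned here,
# so every if() condition is TRUE or FALSE
upper <- (1:10) / 100                 # .01, .02, ..., .10
lower <- c(.001, upper[-10])          # .001, .01, ..., .09
for (i in seq_along(x)) {
  for (k in seq_along(upper)) {
    if (x[i] > lower[k] & x[i] <= upper[k]) {
      x[i] <- upper[k]
      break                           # value is binned; stop checking
    }
  }
}

# Step 3: blank out the out-of-range positions
x[temp] <- NA
x
```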

Hope this helps!

Update 2: Two lines

x[x < .001 | x > .1] = NA
out <- ceiling(x*100)/100

Pretty much the same as AKE's suggestion using floor.

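This should give you the same results as your loop, except for a value exactly equal to .001: your loop sets it to NA, while x < .001 keeps it (use x <= .001 if that matters).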
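Here is a minimal sketch of the two lines on a short hand-made vector (the values are hypothetical and chosen to sit away from the bin edges):

```r
x <- c(0.0005, 0.004, 0.015, 0.055, 0.095, 0.5)  # hypothetical values

x[x < .001 | x > .1] <- NA       # out-of-range values become NA
out <- ceiling(x * 100) / 100    # round the rest up to the next .01
out
```

One caveat: x * 100 is computed in floating point, so a value sitting exactly on a bin edge may round into the neighbouring bin. If edge values matter for your data, it is worth checking a few of them by hand.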

Solution 2

Instead of using an explicit for loop, you should try to use a vectorized function, such as the very handy ifelse. Here is how to recode the NAs in your example:

> x <- ifelse(x <= 0.001 | x > 0.1, NA, x)

To recode the other values, you could try some "clever" use of cut:

> x <- cut(x, breaks=seq(0, 0.1, 0.01), labels=FALSE) / 100

though there are likely better (and more transparent) ways. The reason for avoiding explicit for loops in R is that a loop which modifies a vector one element at a time is usually much slower than a vectorized alternative. The R Inferno provides a good discussion of this and other R tricks and tips.
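As a rough illustration of the vectorized route, the sketch below recodes a short hand-made vector with ifelse, repeats the recoding with logical indexing for comparison, and then bins the result with cut. The breaks running from 0 to 0.1 are an assumption of this sketch, chosen so that every value in (0.001, 0.1] falls into some bin, and the names `a`, `b`, `bin` and `binned` are hypothetical.

```r
x <- c(0.0005, 0.004, 0.015, 0.055, 0.095, 0.5)  # hypothetical values

# Recode out-of-range values with ifelse ...
a <- ifelse(x <= 0.001 | x > 0.1, NA, x)

# ... or with logical indexing, for comparison
b <- x
b[b <= 0.001 | b > 0.1] <- NA
identical(a, b)

# Bin with breaks that span the full 0 to 0.1 range;
# labels = FALSE returns the bin number, so dividing by 100
# gives the upper edge of each bin
bin <- cut(a, breaks = seq(0, 0.1, by = 0.01), labels = FALSE)
binned <- bin / 100
binned
```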

Author: mats

Updated on June 24, 2020

Comments

  • mats
    mats about 4 years

    I want to categorize a vector of values between 0 and 1. Values below .001 and values higher than .10 are of no interest, so I want values in these ranges to be NA.

    When I run the code below I get an error:

    Error in if (x[i] > 0.001 & x[i] <= 0.01) x[i] = 0.01 :  missing value where TRUE/FALSE needed
    

    How do I fix my code?

    for (i in 1:length(x))
      {
        if (x[i] <= .001)
          x[i] = NA
        if (x[i] > .001 & x[i] <= .01)
          x[i] = .01
        if (x[i] > .01 & x[i] <= .02)
          x[i] = .02
        if (x[i] > .02 & x[i] <= .03)
          x[i] = .03
        if (x[i] > .03 & x[i] <= .04)
          x[i] = .04
        if (x[i] > .04 & x[i] <= .05)
          x[i] = .05
        if (x[i] > .05 & x[i] <= .06)
          x[i] = .06
        if (x[i] > .06 & x[i] <= .07)
          x[i] = .07
        if (x[i] > .07 & x[i] <= .08)
          x[i] = .08
        if (x[i] > .08 & x[i] <= .09)
          x[i] = .09
        if (x[i] > .09 & x[i] <= .10)
          x[i] = .10
        if (x[i] > .10 & x[i] <= 1)
          x[i] = NA
      }
    
  • Assad Ebrahim
    Assad Ebrahim about 12 years
    With the subset function mentioned by @mrdwab, all you have left to do is bin the continuous values into discrete values: x=floor(100*x+1)/100
  • GSee
    GSee about 12 years
    ifelse is handy and easy to read, but it is quite a bit slower than x[x <= 0.001 | x > 0.1] <- NA; x
  • John
    John about 12 years
    +1 for the two lines... It was pretty much exactly what I was about to post.