R max function ignore NA

r max

Solution 1

It seems that the problem has already been pointed out in the comments. Since some vectors contain only NAs, max(..., na.rm=TRUE) has nothing left to compare and reports -Inf, which I take from the comments you don't like. In this answer I would like to point out one possible way to tackle the issue, namely to build in a control statement (instead of overwriting -Inf after the fact, which is equally valid). For instance,

 my.max <- function(x) if (all(is.na(x))) NA else max(x, na.rm = TRUE)

does the trick. If every (all) element in x is NA, then NA is returned, and the max otherwise. If you want any other value returned, just exchange NA for that value. You can also build this easily into your apply call. E.g.

 maindata$max_pc_age <- apply(maindata[,c(paste("Q2",1:18,sep="_"))], 1, my.max)
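The question also drops the value 9 before taking the maximum, which my.max does not do on its own. A minimal sketch of how the same guard could be combined with that exclusion follows; the helper name max_excluding and its exclude argument are hypothetical and not part of the original answer, and the small df simply mirrors the example data used on this page.

```r
# Hypothetical helper (not from the original answer): remove NAs and a
# sentinel value first, then return NA if nothing is left to compare.
max_excluding <- function(x, exclude = 9) {
  keep <- x[!is.na(x) & x != exclude]
  if (length(keep) == 0) NA_real_ else max(keep)
}

# Small example data in the shape used elsewhere on this page
df <- data.frame(age = c(5, NA, 9), marks = c(-5, NA, 7), story = c(2, 9, NA))

# Row-wise maximum; the second row has no usable values and yields NA
row_max <- apply(df, 1, max_excluding)
row_max
```

Because the check runs on the filtered vector, max is never called on an empty set, so no warning is raised.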

I am still sometimes confused by R's treatment of NA and empty sets. Statements like test <- NA; test==NA will give NA as a result (instead of TRUE, as returned by is.na(test)), which is sometimes rationalized by saying that since the value is missing, how could you know that these two missing values are identical? In this case, however, max returns -Inf because, once na.rm=TRUE has removed the NAs, it is given an empty set, which I think is not at all obvious. In my experience, though, when strange and unexpected results pop up, NAs or empty sets are often involved.
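The two behaviours described above can be reproduced in a few lines. This is only an illustrative sketch; the variable names cmp, all_missing and empty_max are made up for the example.

```r
test <- NA

# Comparing with NA yields NA rather than TRUE
cmp <- test == NA
cmp

# is.na() is the reliable way to test for missingness
is.na(test)

# With na.rm = TRUE every element is dropped, so max() sees an empty set.
# It returns -Inf and emits a warning (suppressed here to keep output clean).
all_missing <- c(NA_real_, NA_real_)
empty_max <- suppressWarnings(max(all_missing, na.rm = TRUE))
empty_max
```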

Solution 2

In cases like the one below (using the df from the data section at the end of this answer):

df[2,2] <- NA
df[1,2] <- -5

apply(df, 1, function(x) max(x[x != 9],na.rm=TRUE))
#[1]    5 -Inf    7
#Warning message:
#In max(x[x != 9], na.rm = TRUE) :
#  no non-missing arguments to max; returning -Inf

You could do:

df1 <- df  
minVal <- min(df1[!is.na(df1)])-1

df1[is.na(df1)|df1==9] <- minVal
val <- do.call(`pmax`, df1)
val[val==minVal] <- NA
val
#[1]  5 NA  7
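A variant of the same pmax idea avoids the placeholder value altogether: turn the 9s into NA and let pmax drop missing values itself. With na.rm = TRUE, pmax returns NA only at positions where every input is NA. A minimal sketch, assuming the same three-column df as in the data section; the name df2 is introduced here only for illustration.

```r
df <- data.frame(age = c(5, NA, 9), marks = c(-5, NA, 7), story = c(2, 9, NA))

# Work on a copy and mark the excluded value 9 as missing
df2 <- df
df2[!is.na(df2) & df2 == 9] <- NA

# Pass the columns plus na.rm = TRUE to pmax for an element-wise (row-wise) max
val <- do.call(pmax, c(df2, list(na.rm = TRUE)))
val
```

This only works when the excluded value can safely be treated like a missing one, which is the case in the question.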

Solution 3

You can use hablar::max_, which returns NA if all values are NA:

apply(df, 1, function(x) hablar::max_(x[x!=9]))
#[1]  5 NA  7

data

df <- structure(list(age = c(5, NA, 9), marks = c(-5, NA, 7), story = c(2, 
9, NA)), row.names = c(NA, -3L), class = "data.frame")

df
#  age marks story
#1   5    -5     2
#2  NA    NA     9
#3   9     7    NA
Author: user2543622

Updated on July 09, 2022

Comments

  • user2543622 almost 2 years

    I have the working code below. When I replicate the same thing on a different data set, I get errors :(

    #max by values
    df <- data.frame(age=c(5,NA,9), marks=c(1,2,7), story=c(2,9,NA))
    df
    
    df$colMax <- apply(df[,1:3], 1, function(x) max(x[x != 9],na.rm=TRUE))
    df
    

    I tried to do the same on a bigger data set and I am getting warnings. Why?

    maindata$max_pc_age <- apply(maindata[,c(paste("Q2",1:18,sep="_"))], 1, function(x) max(x[x != 9],na.rm=TRUE))
    
    
    50: In max(x[x != 9], na.rm = TRUE) :
      no non-missing arguments to max; returning -Inf
    

    In order to understand the problem better, I made the changes below, but I am still getting warnings:

    maindata$max_pc_age <- apply(maindata[,c(paste("Q2",1:18,sep="_"))], 1, function(x) max(x,na.rm=TRUE))
    1: In max(x, na.rm = TRUE) : no non-missing arguments to max; returning -Inf
    
  • AdamO over 6 years
    +1 for pmax/pmin, although better methods could be developed for when only one unlabeled argument is passed, precluding all this do.call business. You can overload it to make na.rm=TRUE the default, or you can say do.call(pmax, c(df1, list(na.rm=TRUE))).