Function to change blanks to NA

Solution 1

You can directly index fields that match a logical criterion. So you can just write:

df[is_empty(df)] = NA

Here is_empty is a placeholder for your comparison, e.g. df == "":

df[df == ""] = NA
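
A minimal sketch of this, assuming a small made-up data frame (the column names and values are hypothetical):

```r
# Hypothetical example data: two character columns containing empty strings
df <- data.frame(
  x = c("a", "", "b"),
  y = c("", "c", "d"),
  stringsAsFactors = FALSE
)

# df == "" yields a logical matrix with the same shape as df
is_blank <- df == ""

# Replace every matching cell with NA
df[is_blank] <- NA

is.na(df$x)  # FALSE  TRUE FALSE
```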

But note that is.null(df) won’t work, and would be weird anyway [1]. I would advise against merging the logic for columns of different types, though! Instead, handle them separately.
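
One way to handle column types separately is sketched below, under the assumption that only character and factor columns can hold blanks. The helper name blank_to_na and the sample data are made up for illustration; this is not the asker's nullToNA function, only a possible replacement for it.

```r
# Hypothetical helper: convert "" to NA column by column, depending on type
blank_to_na <- function(df) {
  for (name in names(df)) {
    col <- df[[name]]
    if (is.factor(col)) {
      # Setting the "" level to NA removes that level and
      # turns the affected entries into NA
      levels(col)[levels(col) == ""] <- NA
    } else if (is.character(col)) {
      # Guard against existing NAs so the index is never NA
      col[!is.na(col) & col == ""] <- NA
    }
    # Numeric and other column types are left untouched
    df[[name]] <- col
  }
  df
}

# Usage on made-up data with one column of each kind
bar <- data.frame(
  n = c(1, 2, 3),
  s = c("a", "", "b"),
  f = factor(c("", "x", "y")),
  stringsAsFactors = FALSE
)
foo <- blank_to_na(bar)
```

Because each column keeps its own type, numeric columns stay numeric and factor columns stay factors.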


[1] You’ll almost never encounter NULL inside a table, since that only works if the underlying vector is a list. You can create matrices and data.frames with this constraint, but then is.null(df) will never be TRUE, because the NULL values are wrapped inside the list.

Solution 2

How about just:

df[apply(df, 2, function(x) x=="")] = NA

Works fine for me, at least on simple examples.
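
To see what the apply() call produces, here is a small sketch on made-up data (column names and values are hypothetical). It assumes character columns only:

```r
# Hypothetical example data with character columns
df <- data.frame(
  x = c("a", "", "b"),
  y = c("", "c", "d"),
  stringsAsFactors = FALSE
)

# apply() coerces df to a matrix and tests each column;
# the per-column results are combined into a logical matrix
blank <- apply(df, 2, function(x) x == "")

dim(blank)                # 3 2, the same shape as df
all(blank == (df == ""))  # TRUE, agrees with the direct comparison

df[blank] <- NA
```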

Solution 3

This worked for me (it replaces cells that contain the literal string 'NULL'):

    df[df == 'NULL'] <- NA
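
A short sketch of the same idea on made-up data, assuming missing values arrived either as the text "NULL" or as empty strings (the column names and values are hypothetical):

```r
# Hypothetical data in which missing values show up as "NULL" or ""
df <- data.frame(
  id   = c("1", "2", "3"),
  city = c("Oslo", "NULL", ""),
  stringsAsFactors = FALSE
)

# This matches the four-character string "NULL", not R's NULL object.
# The two logical matrices are combined with | to catch both markers.
df[df == "NULL" | df == ""] <- NA
```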

Author: Travis Heeter

Full Stack Software Engineer.

Updated on July 13, 2022

Comments

  • Travis Heeter, almost 2 years

    I'm trying to write a function that turns empty strings into NA. A summary of one of my columns looks like this:

          a   b 
     12 210 468 
    

    I'd like to change the 12 empty values to NA. I also have a few other factor columns for which I'd like to change empty values to NA, so I borrowed some stuff from here and there to come up with this:

    # change nulls to NAs
    nullToNA <- function(df){
    
      # split df into numeric & non-numeric columns
      a<-df[,sapply(df, is.numeric), drop = FALSE]
      b<-df[,sapply(df, Negate(is.numeric)), drop = FALSE]
    
      # Change empty strings to NA
      b<-b[lapply(b,function(x) levels(x) <- c(levels(x), NA) ),] # add NA level
      b<-b[lapply(b,function(x) x[x=="",]<- NA),]                 # change Null to NA
    
      # Put the columns back together
      d<-cbind(a,b)
      d[, names(df)]
    }
    

    However, I'm getting this error:

    > foo<-nullToNA(bar)  
    Error in x[x == "", ] <- NA : incorrect number of subscripts on matrix  
    Called from: FUN(X[[i]], ...)
    

    I have tried the answer found here: "Replace all 0 values to NA", but it changes all my columns to numeric values.