How to simply count number of rows with NAs - R


Solution 1

tl;dr: row wise, you'll want sum(!complete.cases(DF)), or, equivalently, sum(apply(DF, 1, anyNA))

There are a number of different ways to look at the number, proportion or position of NA values in a data frame:

Most of these start with the logical matrix returned by is.na(), which is TRUE for every NA and FALSE everywhere else. For the base dataset airquality:

is.na(airquality)

There are 44 NA values in this data set:

sum(is.na(airquality))
# [1] 44

You can look at the total number of NA values per row or column:

head(rowSums(is.na(airquality)))
# [1] 0 0 0 0 2 1
colSums(is.na(airquality))
#   Ozone Solar.R    Wind    Temp   Month     Day 
#      37       7       0       0       0       0 
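
Since rowSums(is.na(...)) gives the NA count for each row, comparing it with zero flags the rows that contain at least one NA. A minimal sketch (the name na_per_row is only for illustration):

```r
# NA count for every row of the built-in airquality data
na_per_row <- rowSums(is.na(airquality))

# number of rows with at least one NA
sum(na_per_row > 0)
# [1] 42
```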

You can also apply anyNA() over rows or columns instead of summing is.na():

# by row
head(apply(airquality, 1, anyNA))
# [1] FALSE FALSE FALSE FALSE  TRUE  TRUE
sum(apply(airquality, 1, anyNA))
# [1] 42


# by column
head(apply(airquality, 2, anyNA))
#   Ozone Solar.R    Wind    Temp   Month     Day 
#    TRUE    TRUE   FALSE   FALSE   FALSE   FALSE
sum(apply(airquality, 2, anyNA))
# [1] 2

complete.cases() can be used, but only row-wise:

sum(!complete.cases(airquality))
# [1] 42
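
The proportion and position mentioned at the top follow from the same logical vector. A short sketch, with incomplete as an illustrative name:

```r
# TRUE for every row of airquality that has at least one NA
incomplete <- !complete.cases(airquality)

# proportion of rows with an NA (the mean of a logical vector is the share of TRUEs)
mean(incomplete)
# [1] 0.2745098

# positions of the first few such rows
head(which(incomplete))
```

Multiply the proportion by 100 if you want a percentage.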

Solution 2

From the example here:

DF <- read.table(text="     col1   col2    col3
 1    23    17      NA
 2    55    NA      NA
 3    24    12      13
 4    34    23      12", header=TRUE)

You can check which rows have at least one NA:

(which_nas <- apply(DF, 1, function(X) any(is.na(X))))
#    1     2     3     4 
# TRUE  TRUE FALSE FALSE 

And then count them, identify them or get the ratio:

## Identify them
which(which_nas)
# 1 2 
# 1 2 

## Count them
length(which(which_nas))
#[1] 2

## Ratio
length(which(which_nas))/nrow(DF)
#[1] 0.5
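
For what it's worth, the count and ratio can also be taken directly from a logical vector, since sum() counts the TRUEs and mean() gives their share. A sketch, assuming DF is rebuilt with data.frame() so the block runs on its own (has_na is an illustrative name):

```r
# same values as the DF above, rebuilt so this runs standalone
DF <- data.frame(col1 = c(23, 55, 24, 34),
                 col2 = c(17, NA, 12, 23),
                 col3 = c(NA, NA, 13, 12))

# TRUE for rows with at least one NA
has_na <- rowSums(is.na(DF)) > 0

## Count them
sum(has_na)
# [1] 2

## Ratio
mean(has_na)
# [1] 0.5
```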
Author by

Chris

Updated on June 19, 2020

Comments

  • Chris
    Chris almost 4 years

    I'm trying to count the rows of the whole df that contain an NA, so that I can compute the % of rows with NA over the total number of rows of the df.

    I have already seen this post: Determine the number of rows with NAs, but it only covers a specific range of columns.

    • G5W
      G5W almost 6 years
      Use the same answer as the post that you cite, but remove the column restriction.
  • Chris
    Chris almost 6 years
    I'm trying to get the number of rows with NA values, as I already did with just NA values.
  • De Novo
    De Novo almost 6 years
    I'm getting there... give me a bit :)
  • De Novo
    De Novo almost 6 years
    There, should be a little more comprehensive than just the one-liner for the specific question
  • Robert Krzyzanowski
    Robert Krzyzanowski almost 6 years
    FYI, you should use apply(is.na(airquality), 1, any) for much better performance results -- type conversion to lists using apply is always slow.
  • De Novo
    De Novo almost 6 years
    @RobertKrzyzanowski In my experience and tests, anyNA was superior to any(is.na()), but it's been a while since I've run them...
  • De Novo
    De Novo almost 6 years
    @RobertKrzyzanowski ahh.. I think I see what you're saying now. I'll test it