How to simply count number of rows with NAs - R
Solution 1
tl;dr: row-wise, you'll want sum(!complete.cases(DF)) or, equivalently, sum(apply(DF, 1, anyNA)).
There are a number of different ways to look at the number, proportion or position of NA values in a data frame.
Most of these start with the logical matrix returned by is.na(), which has TRUE for every NA and FALSE everywhere else. For the base dataset airquality:
is.na(airquality)
There are 44 NA values in this data set:
sum(is.na(airquality))
# [1] 44
You can look at the total number of NA values per row or column:
head(rowSums(is.na(airquality)))
# [1] 0 0 0 0 2 1
colSums(is.na(airquality))
#   Ozone Solar.R    Wind    Temp   Month     Day
#      37       7       0       0       0       0
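The per-row counts can also answer the original question directly, without apply(). The following is a minimal sketch, assuming the built-in airquality data; the variable names are illustrative only and do not come from the answer.

```r
# Logical matrix: TRUE wherever a value is missing
na_matrix <- is.na(airquality)

# Number of missing values in each row
na_per_row <- rowSums(na_matrix)

# A row counts as "having NAs" when it holds at least one missing value
rows_with_na <- na_per_row > 0

sum(rows_with_na)
# [1] 42
```

Because rowSums() works on the whole matrix at once, this avoids calling a function once per row.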
You can also use anyNA(), which does the job of any(is.na()), applied by row or by column:
# by row
head(apply(airquality, 1, anyNA))
# [1] FALSE FALSE FALSE FALSE TRUE TRUE
sum(apply(airquality, 1, anyNA))
# [1] 42
# by column
head(apply(airquality, 2, anyNA))
# Ozone Solar.R Wind Temp Month Day
# TRUE TRUE FALSE FALSE FALSE FALSE
sum(apply(airquality, 2, anyNA))
# [1] 2
complete.cases() can be used, but only row-wise:
sum(!complete.cases(airquality))
# [1] 42
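Since the question asks for the percentage of rows with NA, the count above can be turned into a proportion by taking the mean of the logical vector. A minimal sketch follows; na_row_summary is a hypothetical helper name introduced here for illustration, not something from base R or the answer.

```r
# Hypothetical helper (illustrative only): summarise rows that contain NA
na_row_summary <- function(df) {
  has_na <- !complete.cases(df)      # TRUE for rows with at least one NA
  c(rows_with_na = sum(has_na),      # count of such rows
    total_rows   = nrow(df),         # all rows in the data frame
    percent      = 100 * mean(has_na))  # share of rows with NA, in percent
}

res <- na_row_summary(airquality)
res[["rows_with_na"]]
# [1] 42
res[["total_rows"]]
# [1] 153
round(res[["percent"]], 2)
# [1] 27.45
```

The mean of a logical vector is the fraction of TRUE values, which is why mean(has_na) gives the proportion directly.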
Solution 2
From the example here:
DF <- read.table(text=" col1 col2 col3
1 23 17 NA
2 55 NA NA
3 24 12 13
4 34 23 12", header=TRUE)
You can check which rows have at least one NA:
(which_nas <- apply(DF, 1, function(X) any(is.na(X))))
# 1 2 3 4
# TRUE TRUE FALSE FALSE
And then count them, identify them or get the ratio:
## Identify them
which(which_nas)
# 1 2
# 1 2
## Count them
length(which(which_nas))
#[1] 2
## Ratio
length(which(which_nas))/nrow(DF)
#[1] 0.5
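To look at the affected rows themselves rather than only their positions, the same logical vector can be used as a row index. This is a sketch under the assumption that DF holds the four rows shown above; it is rebuilt here with data.frame() so the block runs on its own.

```r
# Same data as the example above, rebuilt so this block is self-contained
DF <- data.frame(col1 = c(23, 55, 24, 34),
                 col2 = c(17, NA, 12, 23),
                 col3 = c(NA, NA, 13, 12))

# TRUE for each row that contains at least one NA
which_nas <- apply(DF, 1, anyNA)

# Keep only the rows with NA, or only the complete rows
rows_with_na  <- DF[which_nas, ]
complete_rows <- DF[!which_nas, ]

nrow(rows_with_na)
# [1] 2
nrow(complete_rows)
# [1] 2
```

na.omit(DF) would return the same complete rows as DF[!which_nas, ].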
Chris
Updated on June 19, 2020
Comments
-
Chris almost 4 years
I'm trying to compute the number of rows with NA of the whole df as I'm looking to compute the % of rows with NA over the total number of rows of the df.
I have already seen this post: Determine the number of rows with NAs, but it only covers a specific range of columns.
-
G5W almost 6 years
Use the same answer as the post that you cite, but remove the column restriction.
-
Chris almost 6 years
I'm trying to get the number of rows with NA values, as I already did with just NA values.
-
De Novo almost 6 years
I'm getting there... give me a bit :)
-
De Novo almost 6 years
There, that should be a little more comprehensive than just the one-liner for the specific question.
-
Robert Krzyzanowski almost 6 years
FYI, you should use apply(is.na(airquality), 1, any) for much better performance results -- type conversion to lists using apply is always slow.
-
De Novo almost 6 years
@RobertKrzyzanowski In my experience and tests, anyNA was superior to any(is.na()), but it's been a while since I've run them...
-
De Novo almost 6 years
@RobertKrzyzanowski ahh.. I think I see what you're saying now. I'll test it.