Filter rows when all columns are greater than a value
We can create a logical matrix by comparing the entire data frame with 2, then take rowSums
over it and select only those rows whose sum equals the number of columns in df
df[rowSums(df > 2) == ncol(df), ]
# A B C
#2 4 3 5
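To see why this works, it can help to look at the intermediate values. The following is a minimal sketch that assumes df is the three-row data frame from the question; the data.frame() call is a reconstruction of it, and the names above and hits are only illustrative.

```r
# Reconstruction of the example data from the question
df <- data.frame(A = c(1, 4, 2), B = c(3, 3, 1), C = c(5, 5, 2))

# Step 1: element-wise comparison gives a logical matrix
above <- df > 2

# Step 2: count how many columns pass the test in each row
hits <- rowSums(above)

# Step 3: keep only rows where every column passed
result <- df[hits == ncol(df), ]
print(result)
```

Note that a row containing NA gets an NA count, so data with missing values would probably need separate handling.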
A dplyr approach using filter_all and all_vars
library(dplyr)
df %>% filter_all(all_vars(. > 2))
# A B C
#1 4 3 5
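dplyr >= 1.0.0
#1. if_all (added in dplyr 1.0.4)
df %>% filter(if_all(everything(), ~ . > 2))
#2. across (deprecated inside filter() in later dplyr releases; prefer if_all)
df %>% filter(across(everything(), ~ . > 2))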
An apply approach
#Using apply
df[apply(df > 2, 1, all), ]
#Using lapply as shared by @thelatemail
df[Reduce(`&`, lapply(df, `>`, 2)),]
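The base R variants differ only in how they build the logical row index. As a rough check, assuming the question's example data (reconstructed here), the sketch below computes the index three ways and confirms they agree. The keep_* names are illustrative.

```r
# Reconstruction of the example data from the question
df <- data.frame(A = c(1, 4, 2), B = c(3, 3, 1), C = c(5, 5, 2))

# Three ways to build the same logical row index
keep_rowsums <- rowSums(df > 2) == ncol(df)       # count TRUEs per row
keep_apply   <- apply(df > 2, 1, all)             # all() over each row
keep_reduce  <- Reduce(`&`, lapply(df, `>`, 2))   # AND the columns together

# Stop with an error if the approaches disagree
stopifnot(all(keep_rowsums == keep_apply), all(keep_apply == keep_reduce))

df[keep_reduce, ]
```

The Reduce version works column by column and never converts the data frame to a matrix, which is one reason it may be preferable on wide or mixed-type data.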
Author: say.ff
Updated on October 11, 2022
Comments
-
say.ff over 1 year
I have a data frame and I would like to subset the rows where all column values meet my cutoff.
Here is the data frame:
  A B C
1 1 3 5
2 4 3 5
3 2 1 2
What I would like to select is rows where all columns are greater than 2. The second row is what I want to get.
[1] 4 3 5
Here is my code:
subset_data <- df[which(df[,c(1:ncol(df))] > 2),]
But my code is not applied to all columns. Do you have any idea how I can fix this?
-
thelatemail almost 6 years
Functional approach for fun:
dat[Reduce(`&`, lapply(dat, `>`, 2)),]
-
cs95 almost 6 years
Oh, looks like apply is universally disliked by all data scientists (pandas too)
-
Ronak Shah almost 6 years
@coldspeed In R, apply is actually very slow most of the time, especially when dealing with data frames, as it converts the data frame to a matrix.
-
amc almost 3 years
FYI, df[rowSums(df > 2) == ncol(df), ] gave just the values in R 3.6.1. Using the dplyr version gave the desired data.frame output.
-
Ronak Shah almost 3 years
@amc that might be because you have only 1 column in the data frame. Use
df[rowSums(df > 2) == ncol(df), , drop = FALSE]
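The single-column case can be reproduced with a small sketch. The data frame one_col below is hypothetical, made up only to illustrate the difference that drop = FALSE makes.

```r
# Hypothetical single-column data frame
one_col <- data.frame(A = c(1, 4, 2))
keep <- rowSums(one_col > 2) == ncol(one_col)

dropped <- one_col[keep, ]                # simplified to a plain vector
kept    <- one_col[keep, , drop = FALSE]  # stays a data frame

is.data.frame(dropped)  # FALSE
is.data.frame(kept)     # TRUE
```

With two or more columns the result is a data frame either way, so adding drop = FALSE is mainly a safeguard for code that may receive a data frame with a single column.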