Filter rows when all columns are greater than a value

We can create a logical matrix by comparing the entire data frame with 2, then run rowSums over it and keep only the rows whose sum equals the number of columns in df.

df[rowSums(df > 2) == ncol(df), ]

#  A B C
#2 4 3 5
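
To see why this works, here is a minimal sketch that walks through the intermediate values. The data.frame() call is my reconstruction of the table shown in the question, and the names gt2, n_true and result are only illustrative.

```r
# Example data rebuilt from the table in the question (assumed construction)
df <- data.frame(A = c(1, 4, 2), B = c(3, 3, 1), C = c(5, 5, 2))

# Step 1: element-wise comparison returns a logical matrix
gt2 <- df > 2

# Step 2: TRUE counts as 1, so rowSums gives the number of
# columns in each row that passed the test
n_true <- rowSums(gt2)

# Step 3: keep the rows where every column passed
result <- df[n_true == ncol(df), ]
result
```

Only the second row has a count equal to ncol(df), so it is the only one kept.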

A dplyr approach using filter_all and all_vars

library(dplyr) 
df %>% filter_all(all_vars(. > 2))

#  A B C
#1 4 3 5

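dplyr >= 1.0.0 (if_all was added in 1.0.4)

#1. if_all
df %>% filter(if_all(everything(), ~ . > 2))

#2. across (deprecated inside filter() in later dplyr releases; prefer if_all)
df %>% filter(across(everything(), ~ . > 2))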

An apply approach

#Using apply
df[apply(df > 2, 1, all), ]
#Using lapply as shared by @thelatemail
df[Reduce(`&`, lapply(df, `>`, 2)),]
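
The lapply/Reduce version is the least obvious of these, so here is a rough sketch of what happens at each step. It again assumes df is built from the table in the question; per_column and keep are illustrative names.

```r
# Example data rebuilt from the table in the question (assumed construction)
df <- data.frame(A = c(1, 4, 2), B = c(3, 3, 1), C = c(5, 5, 2))

# A data frame is a list of columns, so lapply returns one
# logical vector per column: is each value greater than 2?
per_column <- lapply(df, `>`, 2)

# Fold the vectors together with element-wise AND, so a position
# is TRUE only if that row passed the test in every column
keep <- Reduce(`&`, per_column)

# keep is FALSE TRUE FALSE, so only the second row is selected
df[keep, ]
```

Because this works column by column, it never converts the data frame to a matrix.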

Author: say.ff

Updated on October 11, 2022

Comments

  • say.ff
    say.ff over 1 year

    I have a data frame and I would like to subset the rows where all column values meet my cutoff.

    Here is the data frame:

       A B C
    1  1 3 5
    2  4 3 5
    3  2 1 2
    

    What I would like to select is the rows where all columns are greater than 2. The second row is what I want to get.

    [1] 4 3 5
    

    Here is my code:

     subset_data <- df[which(df[,c(1:ncol(df))] > 2),]
    

    But my code is not applied to all columns. Do you have any idea how I can fix this?

  • thelatemail
    thelatemail almost 6 years
    Functional approach for fun - dat[Reduce(`&`, lapply(dat, `>`, 2)),]
  • cs95
    cs95 almost 6 years
    Oh, looks like apply is universally disliked by all data scientists (pandas too)
  • Ronak Shah
    Ronak Shah almost 6 years
    @coldspeed In R, apply is actually very slow most of the time, especially when dealing with data frames, as it converts the data frame to a matrix.
  • amc
    amc almost 3 years
    FYI, df[rowSums(df > 2) == ncol(df), ] gave just the values in R 3.6.1. Using the dplyr version gave the desired data.frame output.
  • Ronak Shah
    Ronak Shah almost 3 years
    @amc that might be because you have only 1 column in the dataframe. Use df[rowSums(df > 2) == ncol(df), , drop = FALSE]
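
The drop = FALSE behaviour from the last comment can be sketched like this. The one-column data frame one_col is hypothetical and not from the original question.

```r
# Hypothetical one-column data frame (not from the question)
one_col <- data.frame(A = c(1, 4, 2))

# Default: with a single column, the result is simplified to a plain vector
as_vector <- one_col[rowSums(one_col > 2) == ncol(one_col), ]

# drop = FALSE keeps the data frame structure
as_df <- one_col[rowSums(one_col > 2) == ncol(one_col), , drop = FALSE]

is.data.frame(as_vector)  # FALSE
is.data.frame(as_df)      # TRUE
```

Adding drop = FALSE does no harm on multi-column data frames, so it is a reasonable default when the number of columns is not known in advance.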