How to select columns based on criteria in a certain row in R

r subset

10,317

Solution 1

I assume you mean 0.50, since all the values in the "Perc" row are below 50.0.

This might not be the best way, but it works:

# data:
df <- data.frame(Days = c(0.01, 0.02, 0.03, "Perc"), J1 = c(458, 459, 457, 0.99),
                 J2 = c(-165, -163, -160, 0.04), J3 = c(-151, -153, -131, 0.00),
                 J4 = c(-52, -45, -51, 0.52))

# keep Days (the leading TRUE) plus every column whose "Perc" value is <= 0.50
dfc <- subset(df, select = which(c(TRUE, (df[which(df$Days == "Perc"), ] <= 0.50)[2:5])))

dfc
  Days      J2   J3
1 0.01 -165.00 -151
2 0.02 -163.00 -153
3 0.03 -160.00 -131
4 Perc    0.04    0

You can remove the TRUE if you don't want the Days column, change the 0.50 threshold if needed, and widen the 2:5 range if you have more columns. If you prefer, you can also pick the row by its number (1414 in your matrix) instead of matching on "Perc".
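
The variations described above (dropping Days, changing the threshold, handling more columns) can be folded into one small function. This is only a sketch: `select_by_row` and its arguments are hypothetical names, not part of the answer, and it assumes the label column is called Days and that exactly one row matches the label.

```r
# Same example data as above; Days is character because it mixes numbers with "Perc".
df <- data.frame(Days = c(0.01, 0.02, 0.03, "Perc"),
                 J1 = c(458, 459, 457, 0.99),
                 J2 = c(-165, -163, -160, 0.04),
                 J3 = c(-151, -153, -131, 0.00),
                 J4 = c(-52, -45, -51, 0.52))

# Hypothetical helper: keep the columns whose value in the row labelled
# `label` is at or below `threshold`. Assumes a single matching row.
select_by_row <- function(data, label = "Perc", threshold = 0.50, keep_days = TRUE) {
  value_cols <- setdiff(names(data), "Days")               # every column except the labels
  row_vals <- unlist(data[data$Days == label, value_cols]) # named numeric vector
  chosen <- value_cols[!is.na(row_vals) & row_vals <= threshold]
  if (keep_days) chosen <- c("Days", chosen)
  data[, chosen, drop = FALSE]                             # drop = FALSE keeps a data frame
}

dfc <- select_by_row(df)
names(dfc)  # columns kept: Days, J2, J3
```

Selecting by column name rather than by position means the 2:5 range does not have to be adjusted when columns are added.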

Hope this works.

Solution 2

Presumably you meant <= 0.50 and not <= 50, since all "Perc" values are less than 50. Assuming "Perc" is a row name of df, as in your data, you can do:

df[, unlist(df["Perc",]) <= 0.5]
#           J2   J3
# 0.01 -160.00 -151
# 0.02 -163.00 -154
# 0.03 -165.00 -150
# Perc    0.04    0

But this may be safer, since it takes into account any NA values that may appear in "Perc":

u <- unlist(df["Perc",]) <= 0.50
df[, u & !is.na(u)]

Also, you can speed it up if need be by adding use.names = FALSE in unlist(). And finally, if you have a matrix and not a data frame, then you can remove unlist() altogether.
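
For the matrix case, a minimal sketch using the values from the question might look like the following. The NA in the "Perc" row is an assumption added here purely to show the NA check at work (the question's data has 0.52 in that cell), and `m` and `res` are illustrative names.

```r
# The matrix from the question, with J4's "Perc" value replaced by NA for illustration.
m <- matrix(c(458, 459, 457, 0.99,
              -160, -163, -165, 0.04,
              -151, -154, -150, 0.00,
              -52, -46, -51, NA),
            nrow = 4,
            dimnames = list(c("0.01", "0.02", "0.03", "Perc"),
                            c("J1", "J2", "J3", "J4")))

# A matrix row is already a plain vector, so unlist() is not needed.
u <- m["Perc", ] <= 0.50

# Treat an NA comparison as "do not keep"; drop = FALSE keeps the result a matrix.
res <- m[, u & !is.na(u), drop = FALSE]
colnames(res)  # columns kept: J2, J3
```

Writing `m[, which(u), drop = FALSE]` should give the same columns, since which() ignores NA entries.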


Author by

user507

Updated on June 28, 2022

Comments

user507 almost 2 years
I have a matrix of values with both row names and column names, as shown here.

C5.Outliers
```
Days   J1      J2      J3      J4  
0.01   458    -160    -151    -52     
0.02   459    -163    -154    -46    
0.03   457    -165    -150    -51   
Perc   0.99   0.04    0.00    0.52     
```
I would like to create a separate matrix using only the columns for which the value in the row "Perc" is <= 50.0. In this example, I would be extracting columns J2 and J3.

This is the code I tried, which isn't working (the "Perc" row is row #1414 in my matrix): C5.Final <- subset(C5.Outliers, 1414 < .51)