Subset by column index in R - Data.Table vs. dataframe
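data.table evaluates the second argument (j) as an expression, which is why the numeric vectors in the question were returned unchanged. To select columns by position, include with=FALSE in your column subset statement.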
data[, 3:11, with=FALSE]
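As a minimal sketch of the same idea on a self-contained example: the table `dt` below is a made-up stand-in for the question's data (same 11 column names, arbitrary values), and `cols` is a hypothetical helper vector. It assumes a data.table version that supports the `..` prefix in `j`.

```r
library(data.table)

# Hypothetical stand-in for the question's data: 5 rows, 11 integer columns
dt <- as.data.table(matrix(1:55, nrow = 5))
setnames(dt, c("id", "female", "race", "ses", "schtyp", "prog",
               "read", "write", "math", "science", "socst"))

# with = FALSE tells data.table to treat j as column positions,
# not as an expression to evaluate
picked <- dt[, c(1, 7, 8), with = FALSE]
ranged <- dt[, 3:11, with = FALSE]

# Single positions and ranges can be mixed with c()
cols <- c(1, 3, 5, 7:9, 11)
mixed <- dt[, cols, with = FALSE]

# Alternative spelling: the .. prefix looks up `cols` in the calling scope
mixed2 <- dt[, ..cols]

print(names(picked))
print(names(mixed))
```

On a wider table, the same pattern should cover the selection asked about in the question, for example `cols <- c(1, 3, 5, 10:30, 97)`.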
ben_says
Updated on June 26, 2022
-
ben_says almost 2 years
install.packages('data.table')
library(data.table)
data <- read.csv("http://www.ats.ucla.edu/stat/data/hsb2_small.csv")
head(data, 10)
>      id female race ses schtyp prog read write math science socst
>  1:  70      0    4   1      1    1   57    52   41      47    57
>  2: 121      1    4   2      1    3   68    59   53      63    61
>  3:  86      0    4   3      1    1   44    33   54      58    31
>  4: 141      0    4   3      1    3   63    44   47      53    56
>  5: 172      0    4   2      1    2   47    52   57      53    61
>  6: 113      0    4   2      1    2   44    52   51      63    61
>  7:  50      0    3   2      1    1   50    59   42      53    61
>  8:  11      0    1   2      1    2   34    46   45      39    36
>  9:  84      0    4   2      1    1   63    57   54      58    51
> 10:  48      0    3   2      1    2   57    55   52      50    51
and we see it is a data.frame:
class(data)
> [1] "data.frame"
so we can snag specific columns (only showing 10 rows for this page's example...)
data[ , c(1, 7, 8)]
>     id read write
>  1  70   57    52
>  2 121   68    59
>  3  86   44    33
>  4 141   63    44
>  5 172   47    52
>  6 113   44    52
>  7  50   50    59
>  8  11   34    46
>  9  84   63    57
> 10  48   57    55
or a range (helpful if you have many variables)
data[ , 3:11]
>    race ses schtyp prog read write math science socst
>  1    4   1      1    1   57    52   41      47    57
>  2    4   2      1    3   68    59   53      63    61
>  3    4   3      1    1   44    33   54      58    31
>  4    4   3      1    3   63    44   47      53    56
>  5    4   2      1    2   47    52   57      53    61
>  6    4   2      1    2   44    52   51      63    61
>  7    3   2      1    1   50    59   42      53    61
>  8    1   2      1    2   34    46   45      39    36
>  9    4   2      1    1   63    57   54      58    51
> 10    3   2      1    2   57    55   52      50    51
Everything works well until I start using data.table.
setDT(data)
class(data)
> [1] "data.table" "data.frame"
How do I accomplish similar subsetting with data.table? The same code as above yields...
data[ , c(1, 7, 8)]
> [1] 1 7 8
data[ , 3:11]
> [1]  3  4  5  6  7  8  9 10 11
I am aware of dplyr select(), but I seek a solution that doesn't involve typing the column names, and would greatly appreciate a clear method for subsetting a data.table by "column number." I have occasionally used subset(), and even gone so far as constructing a character vector J for use in data[ I, J, by = K]. I must be missing something. Code-masters would consider this trivial, and could easily show a flexible solution allowing one to, for example, select columns 1, 3, 5, 10 through 30, and 97.
-
A5C1D2H2I1M1N2O1R2T1 over 8 years
Add with = FALSE in there.
-