How to convert a matrix to a list of column-vectors in R?
Solution 1
In the interests of skinning the cat, treat the array as a vector as if it had no dim attribute:
split(x, rep(1:ncol(x), each = nrow(x)))
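The grouping vector can also be built with col(), as suggested in the comments on this page. The following is a minimal sketch rather than a definitive recipe; the example matrix and its column names "a" and "b" are assumptions made for illustration and are not part of the original question.

```r
# Example matrix; the column names "a" and "b" are illustrative only.
x <- matrix(1:10, ncol = 2, dimnames = list(NULL, c("a", "b")))

# Explicit grouping vector: one group index per element, in column-major order.
by_rep <- split(x, rep(seq_len(ncol(x)), each = nrow(x)))

# col(x) produces the same indices without spelling out rep().
by_col <- split(x, col(x))

# as.factor = TRUE uses the column names as group labels,
# so the resulting list keeps them.
by_name <- split(x, col(x, as.factor = TRUE))
```

The first two forms name the list elements by column index, while the third carries over the matrix's own column names.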
Solution 2
Gavin's answer is simple and elegant. But if there are many columns, a much faster solution would be:
lapply(seq_len(ncol(x)), function(i) x[,i])
The speed difference is 6x in the example below:
> x <- matrix(1:1e6, 10)
> system.time( as.list(data.frame(x)) )
user system elapsed
1.24 0.00 1.22
> system.time( lapply(seq_len(ncol(x)), function(i) x[,i]) )
user system elapsed
0.2 0.0 0.2
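As noted in the comments, the lapply approach drops column names. If they matter, one possible way to carry them over is sketched below. The helper name matrix_to_col_list is hypothetical (it is not a base R function), and the example column names are made up.

```r
# Hypothetical helper (not part of base R): split a matrix into a list of
# its columns, carrying over column names when the matrix has them.
matrix_to_col_list <- function(x) {
  stopifnot(is.matrix(x))
  cols <- lapply(seq_len(ncol(x)), function(i) x[, i])
  names(cols) <- colnames(x)  # NULL colnames leave the list unnamed
  cols
}

# Illustrative input; the names "a" and "b" are assumptions.
x <- matrix(1:10, ncol = 2, dimnames = list(NULL, c("a", "b")))
str(matrix_to_col_list(x))
```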
Solution 3
data.frames are stored as lists, I believe. Therefore coercion seems best:
as.list(as.data.frame(x))
> as.list(as.data.frame(x))
$V1
[1] 1 2 3 4 5
$V2
[1] 6 7 8 9 10
Benchmarking results are interesting. as.data.frame is faster than data.frame, either because data.frame has to create a whole new object, or because keeping track of the column names is somehow costly (witness the c(unname()) vs c() comparison). The lapply solution provided by @Tommy is the fastest by a wide margin: in the run below its median time is roughly 5 times lower than as.list(as.data.frame(x)) and roughly 12 times lower than as.list(data.frame(x)). The as.data.frame() results can be somewhat improved by coercing manually.
manual.coerce <- function(x) {
x <- as.data.frame(x)
class(x) <- "list"
x
}
library(microbenchmark)
x <- matrix(1:10,ncol=2)
microbenchmark(
tapply(x,rep(1:ncol(x),each=nrow(x)),function(i)i) ,
as.list(data.frame(x)),
as.list(as.data.frame(x)),
lapply(seq_len(ncol(x)), function(i) x[,i]),
c(unname(as.data.frame(x))),
c(data.frame(x)),
manual.coerce(x),
times=1000
)
  expr                                                          min      lq  median      uq     max
1 as.list(as.data.frame(x))                                  176221  183064  186486  190763 2768193
2 as.list(data.frame(x))                                     444827  454237  460225  471346 2854592
3 c(data.frame(x))                                           434562  443117  449960  460226 2895653
4 c(unname(as.data.frame(x)))                                257487  266897  271174  277162 2827218
5 lapply(seq_len(ncol(x)), function(i) x[, i])                28231   35929   36784   37640 1165105
6 manual.coerce(x)                                           160823  167667  171088  176221  457659
7 tapply(x, rep(1:ncol(x), each = nrow(x)), function(i) i)  1020536 1036790 1052188 1080417 3939286
is.list(manual.coerce(x))
[1] TRUE
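Before comparing timings, it can help to confirm that the approaches return the same column values. The following is a rough sketch of such a check; the names ref, candidates, and normalize are introduced here for illustration only, and the example matrix is an assumption.

```r
x <- matrix(1:12, nrow = 3)

# Reference result: plain column indexing.
ref <- lapply(seq_len(ncol(x)), function(i) x[, i])

# Candidate approaches from the answers and comments on this page.
candidates <- list(
  split_rep  = split(x, rep(seq_len(ncol(x)), each = nrow(x))),
  data_frame = as.list(data.frame(x)),
  as_df      = as.list(as.data.frame(x)),
  unclass_df = unclass(as.data.frame(x))
)

# Strip names and leftover attributes (such as row.names) before comparing,
# since the approaches differ in the metadata they attach.
normalize <- function(lst) unname(lapply(lst, as.vector))

agrees <- vapply(candidates, function(res) identical(normalize(res), ref),
                 logical(1))
print(agrees)
```

The check compares values only. The element names (1, X1, V1, and so on) still differ between approaches, which may matter if later code indexes the list by name.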
Solution 4
Converting to a data frame thence to a list seems to work:
> as.list(data.frame(x))
$X1
[1] 1 2 3 4 5
$X2
[1] 6 7 8 9 10
> str(as.list(data.frame(x)))
List of 2
$ X1: int [1:5] 1 2 3 4 5
$ X2: int [1:5] 6 7 8 9 10
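One thing to watch with data.frame() is that it runs column names through make.names() by default. A small sketch, assuming a matrix whose column names are deliberately not syntactically valid (the names are made up for illustration):

```r
# Column names chosen to be syntactically invalid in R (illustrative only).
m <- matrix(1:4, ncol = 2, dimnames = list(NULL, c("a b", "c")))

# Default: names are passed through make.names().
mangled <- names(as.list(data.frame(m)))

# check.names = FALSE keeps the names as given.
kept <- names(as.list(data.frame(m, check.names = FALSE)))

print(mangled)
print(kept)
```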
Solution 5
Using plyr can be really useful for things like this:
library("plyr")
alply(x,2)
$`1`
[1] 1 2 3 4 5
$`2`
[1] 6 7 8 9 10
attr(,"class")
[1] "split" "list"
Joris Meys
Updated on May 13, 2021
Comments
-
Joris Meys about 3 years
Say you want to convert a matrix to a list, where each element of the list contains one column. list() or as.list() obviously won't work, and until now I have used a hack that relies on the behaviour of tapply:
x <- matrix(1:10,ncol=2)
tapply(x,rep(1:ncol(x),each=nrow(x)),function(i)i)
I'm not completely happy with this. Does anybody know a cleaner method I'm overlooking?
(For making a list filled with the rows, the code can obviously be changed to:
tapply(x,rep(1:nrow(x),ncol(x)),function(i)i)
)
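For the row-wise case mentioned at the end of the question, the indexing and split() approaches from the answers and comments can be adapted in the same way. A minimal sketch, using the question's example matrix (the variable names rows_lapply and rows_split are introduced here for illustration):

```r
x <- matrix(1:10, ncol = 2)

# One list element per row, via direct indexing.
rows_lapply <- lapply(seq_len(nrow(x)), function(i) x[i, ])

# Same idea with split(): row(x) labels each element with its row index.
rows_split <- split(x, row(x))

str(rows_lapply)
```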
-
Ari B. Friedman almost 13 years
Beaten by Gavin by 5 seconds. Darn you, "Are you a human" screen? :-)
-
Ari B. Friedman almost 13 years
Interesting. I think this also works by coercion. c(as.data.frame(x)) produces identical behavior to as.list(as.data.frame(x)).
-
Gavin Simpson almost 13 years
Luck of the draw I guess, I was just viewing this after @Joris snuck in ahead of me answering Peter Flom's Q. Also, as.data.frame() loses the names of the data frame, so data.frame() is a little nicer.
-
Dilettant almost 13 years
I think that this is so, because the members of the sample lists / matrix are of the same type, but I am not an expeRt.
-
Marek almost 13 years
This is the core of what tapply does. But it's simpler :). A probably slower but nicer-looking solution would be split(x, col(x)) (and split(x, row(x)) for rows, respectively).
-
Marek almost 13 years
An equivalent of manual.coerce(x) could be unclass(as.data.frame(x)).
-
Ari B. Friedman almost 13 years
Thanks Marek. That's about 6% faster, presumably because I can avoid using a function definition/call.
-
Marek almost 13 years
I checked it. split(x, c(col(x))) will be equally fast. But it looks worse.
-
mdsumner almost 13 years
split(x, col(x)) looks better - implicit coercion to vector is fine . . .
-
Gavin Simpson almost 13 years
+1 Good point about relative efficiency of the various solutions. The best Answer thus far.
-
Joris Meys about 12 years
After much testing, this seems to work the fastest, especially with a lot of rows or columns.
-
alfymbohm almost 11 years
Of course this drops column names, but it doesn't seem they were important in the original question.
-
alfymbohm almost 11 years
Tommy's solution is faster and more compact:
system.time( lapply(seq_len(ncol(x)), function(i) x[,i]) )
user: 1.668 system: 0.016 elapsed: 1.693
-
baptiste over 9 years
or unlist(apply(x, 2, list), recursive = FALSE)
-
Rich Scriven over 9 years
Yep. You should add that as an answer @baptiste.
-
baptiste over 9 years
but that would require scrolling down to the bottom of the page! i'm way too lazy for that
-
Rich Scriven over 9 years
There's an "END" button on my machine... :-)
-
Rich Scriven over 9 years
I think this can probably also be done by creating an empty list and filling it up: y <- vector("list", ncol(x)) and then something along the lines of y[1:2] <- x[,1:2], although it doesn't work that exact way.
-
Simon C. almost 6 years
Indeed, as underlined by @JorisMeys, this solution is the fastest. Benchmark with a 100000x150 matrix: mean time = 2.897655s with this solution and 3.096364s with tapply.
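The "create an empty list and fill it up" idea from Rich Scriven's comment can be made to work with an explicit loop. This is only a sketch of one possible reading of that suggestion; the form y[1:2] <- x[,1:2] fails because it assigns individual matrix elements rather than whole columns.

```r
x <- matrix(1:10, ncol = 2)

# Preallocate a list with one slot per column...
y <- vector("list", ncol(x))

# ...then fill each slot; [[<- stores a whole column as one element.
for (j in seq_along(y)) {
  y[[j]] <- x[, j]
}

str(y)
```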
-
banbh over 5 years
Note that if x has column names then split(x, col(x, as.factor = TRUE)) will preserve the names.
-
skan over 4 years
But I think in order to get the same results you need to do lapply(seq_len(nrow(x)), function(i) x[i,]), and then it is slower.
-
mshaffer over 3 years
Trying to figure this out in a different context, doesn't work: stackoverflow.com/questions/63801018 .... looking for this:
vec2 = castMatrixToSequenceOfLists(vecs);