How to convert a huge list-of-vector to a matrix more efficiently?
Solution 1
This should be equivalent to your current code, only a lot faster:
output <- matrix(unlist(z), ncol = 10, byrow = TRUE)
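A small check of why this works, using a toy list rather than the real data: unlist() concatenates the elements in order, and byrow = TRUE then fills 10 values per row, so each length-110 vector contributes 11 consecutive rows. The list `z_small` and its contents are made up for illustration.

```r
# Toy stand-in for the real list: 3 character vectors of length 110
z_small <- lapply(1:3, function(i) paste0("v", i, "_", 1:110))

# Loop version from the question
loop_out <- NULL
for (i in seq_along(z_small)) {
  loop_out <- rbind(loop_out, matrix(z_small[[i]], ncol = 10, byrow = TRUE))
}

# Single unlist() + matrix() call
fast_out <- matrix(unlist(z_small), ncol = 10, byrow = TRUE)

dim(fast_out)                  # 33 10
identical(loop_out, fast_out)  # TRUE
```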
Solution 2
I think you want
output <- do.call(rbind,lapply(z,matrix,ncol=10,byrow=TRUE))
i.e. combining @BlueMagister's use of do.call(rbind, ...) with an lapply statement to convert the individual list elements into 11*10 matrices ...
Benchmarks (showing @flodel's unlist solution is about 4.5x faster than mine, and 230x faster than the original approach ...)
n <- 1000
z <- replicate(n, matrix(1:110, ncol = 10, byrow = TRUE), simplify = FALSE)
library(rbenchmark)
origfn <- function(z) {
  output <- NULL
  for (i in seq_along(z))
    output <- rbind(output, matrix(z[[i]], ncol = 10, byrow = TRUE))
  output  # return the result; a for loop on its own evaluates to NULL
}
rbindfn <- function(z) do.call(rbind, lapply(z, matrix, ncol = 10, byrow = TRUE))
unlistfn <- function(z) matrix(unlist(z), ncol = 10, byrow = TRUE)
benchmark(origfn(z), rbindfn(z), unlistfn(z), replications = 100)
##          test replications elapsed relative user.self sys.self
## 1   origfn(z)          100  36.467  230.804    34.834    1.540
## 2  rbindfn(z)          100   0.713    4.513     0.708    0.012
## 3 unlistfn(z)          100   0.158    1.000     0.144    0.008
If this scales appropriately (i.e. you don't run into memory problems), the full problem would take about 130*0.2 seconds = 26 seconds on a comparable machine (I did this on a 2-year-old MacBook Pro).
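If rbenchmark is not installed, a rough version of the same comparison can be made with base R's system.time(). This is only a sketch: the list size `n` is an arbitrary choice, the elements are character vectors as in the question rather than the matrices used above, and timings differ by machine, so no output is shown.

```r
n <- 2000
z <- replicate(n, as.character(1:110), simplify = FALSE)

rbindfn  <- function(z) do.call(rbind, lapply(z, matrix, ncol = 10, byrow = TRUE))
unlistfn <- function(z) matrix(unlist(z), ncol = 10, byrow = TRUE)

# Time a single call of each approach
system.time(out_rbind  <- rbindfn(z))
system.time(out_unlist <- unlistfn(z))

# Both approaches should produce the same 10-column matrix
stopifnot(identical(out_rbind, out_unlist))
```

Growing `n` in steps (for example 2000, 4000, 8000) gives a feel for whether the timing really scales linearly before running the full 130,000-element list.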
Solution 3
It would help to have sample information about your output. Recursively using rbind on bigger and bigger things is not recommended. My first guess at something that would help you:
z <- list(1:3,4:6,7:9)
do.call(rbind,z)
See a related question for more efficiency, if needed.
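For reference, do.call(rbind, z) stacks each vector as one row. The sketch below shows what that returns for the toy list, and how the result relates to the 10-column layout asked for in the question. The list `z_long` and the reshaping step at the end are additions for illustration, not part of this answer.

```r
z <- list(1:3, 4:6, 7:9)
m <- do.call(rbind, z)
m
#>      [,1] [,2] [,3]
#> [1,]    1    2    3
#> [2,]    4    5    6
#> [3,]    7    8    9

# With length-110 vectors this gives one 110-column row per list element
z_long <- list(1:110, 111:220)
wide <- do.call(rbind, z_long)
dim(wide)   # 2 110

# Transposing and refilling by row recovers the 10-column layout
tall <- matrix(t(wide), ncol = 10, byrow = TRUE)
identical(tall, matrix(unlist(z_long), ncol = 10, byrow = TRUE))  # TRUE
```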
Solution 4
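You can also use
output <- as.matrix(as.data.frame(z))
Note that this turns each list element into a column, so for the list in the question the result is 110 x 130,000 rather than the 1,430,000 x 10 layout produced by the other solutions. The memory usage is very similar to
output <- matrix(unlist(z), ncol = 10, byrow = TRUE)
which can be verified with mem_change() from library(pryr).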
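A quick check, on a made-up two-element list, of how the shape of this result compares with the unlist() version: as.data.frame() treats each list element as a column, so the values come out in the same order but under different dimensions.

```r
z_small <- list(as.character(1:110), as.character(111:220))

df_out  <- as.matrix(as.data.frame(z_small))
vec_out <- matrix(unlist(z_small), ncol = 10, byrow = TRUE)

dim(df_out)    # 110   2
dim(vec_out)   #  22  10

# Stripped of dimensions and names, df_out holds the values in list order
identical(as.vector(df_out), unlist(z_small))  # TRUE

# Refilling by row gives the same 10-column matrix as the unlist() version
identical(matrix(as.vector(df_out), ncol = 10, byrow = TRUE), vec_out)  # TRUE
```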
user1787675
Updated on July 05, 2022
Comments
-
user1787675 almost 2 years ago:
I have a list of length 130,000 where each element is a character vector of length 110. I would like to convert this list to a matrix with dimension 1,430,000*10. How can I do it more efficiently? My code is:
output = NULL
for (i in 1:length(z)) {
  output = rbind(output, matrix(z[[i]], ncol = 10, byrow = TRUE))
}
-
Ben Bolker over 11 years ago: Bingo. This should be much faster than my solution too, but I couldn't think of it fast enough.
-
user1787675 over 11 years ago: That's magical! It takes about 20 seconds to do this on my one-year-old Toshiba machine, which saves me a lot of time. And your function to show the run time is very interesting too.
-
Joshua Ulrich over 11 years ago: +1, but I'd recommend setting USE.NAMES=FALSE in unlist in order to save time and memory.
-
Johan Larsson over 7 years ago: It should be use.names (i.e. in lowercase).
-
mikey over 4 years ago: Just to clarify, it should be output <- matrix(unlist(z), ncol = 10, byrow = TRUE, use.names=FALSE) to be the most efficient.
-
Felix over 3 years ago: @mikey Almost. It should be: output <- matrix(unlist(z, use.names = FALSE), ncol = 10, byrow = TRUE)
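To see why the argument has to go to unlist() rather than matrix(), here is a small sketch with a made-up named list (`z_named` is hypothetical). When names are present, unlist() builds a name for every element unless use.names = FALSE is given; the values themselves are unchanged.

```r
z_named <- list(a = c(x = "1", y = "2"), b = c(x = "3", y = "4"))

names(unlist(z_named))                     # "a.x" "a.y" "b.x" "b.y"
names(unlist(z_named, use.names = FALSE))  # NULL

# matrix() discards the names either way, so the resulting matrix is the same
m1 <- matrix(unlist(z_named), ncol = 2, byrow = TRUE)
m2 <- matrix(unlist(z_named, use.names = FALSE), ncol = 2, byrow = TRUE)
identical(m1, m2)  # TRUE

# matrix() has no use.names argument, so passing it there is an error
res <- try(matrix(unlist(z_named), ncol = 2, byrow = TRUE, use.names = FALSE),
           silent = TRUE)
inherits(res, "try-error")  # TRUE
```

The saving therefore comes only from skipping the name construction inside unlist(); if the list elements carry no names to begin with, the difference is likely to be small.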