Equivalent to rowMeans() for min()
Solution 1
You could use pmin, but you would have to get each column of your matrix into a separate vector. One way to do that is to convert it to a data.frame and then call pmin via do.call (since data.frames are lists).
system.time(do.call(pmin, as.data.frame(m)))
# user system elapsed
# 0.940 0.000 0.949
system.time(apply(m,1,min))
# user system elapsed
# 16.84 0.00 16.95
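As a quick sanity check of the approach above, here is a minimal sketch on a small made-up matrix. The name `small` and its values are illustrative assumptions, not from the original post. It checks that handing the columns to pmin via do.call gives the same row minima as apply(..., 1, min).

```r
# Small illustrative matrix (hypothetical data, not the benchmark matrix).
small <- matrix(c(3, 1, 4,
                  1, 5, 9,
                  2, 6, 5), nrow = 3, byrow = TRUE)

# A data.frame is a list of columns, so do.call() passes each column
# to pmin() as a separate argument, i.e. pmin(V1, V2, V3).
via_pmin  <- do.call(pmin, as.data.frame(small))
via_apply <- apply(small, 1, min)

# Both approaches should agree on the row-wise minima.
stopifnot(isTRUE(all.equal(via_pmin, via_apply)))
via_pmin
```

The speed difference comes from pmin working on whole columns at once, whereas apply calls min once per row.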
Solution 2
Quite late to the party, but as the author of matrixStats and in case someone spots this, please note that matrixStats::rowMins() is very fast these days, e.g.
library(microbenchmark)
library(Biobase) # rowMin()
library(matrixStats) # rowMins()
options(digits=3)
m <- matrix(rnorm(10000000), ncol=10)
stats <- microbenchmark(
rowMeans(m), ## A benchmark by OP
rowMins(m),
rowMin(m),
do.call(pmin, as.data.frame(m)),
apply(m, MARGIN=1L, FUN=min),
times=10
)
> stats
Unit: milliseconds
expr min lq mean median uq max
rowMeans(m) 77.7 82.7 85.7 84.4 90.3 98.2
rowMins(m) 72.9 74.1 88.0 79.0 90.2 147.4
rowMin(m) 341.1 347.1 395.9 383.4 395.1 607.7
do.call(pmin, as.data.frame(m)) 326.4 357.0 435.4 401.0 437.6 657.9
apply(m, MARGIN = 1L, FUN = min) 3761.9 3963.8 4120.6 4109.8 4198.7 4567.4
Solution 3
If you want to stick to CRAN packages, then both the matrixStats and the fBasics packages have the function rowMins [note the s, which is not in the Biobase function] and a variety of other row and column statistics.
Solution 4
library("sos")
findFn("rowMin")
gets a hit in the Biobase package, from Bioconductor ...
source("http://bioconductor.org/biocLite.R")
biocLite("Biobase")
m <- matrix(rnorm(10000000), ncol=10)
system.time(rowMeans(m))
## user system elapsed
## 0.132 0.148 0.279
system.time(apply(m,1,min))
## user system elapsed
## 11.825 1.688 13.603
library(Biobase)
system.time(rowMin(m))
## user system elapsed
## 0.688 0.172 0.864
Not as fast as rowMeans, but a lot faster than apply(...,1,min).
Solution 5
I've been meaning to try out the new compiler package in R 2.13.0. This essentially follows the post outlined by Dirk here.
library(compiler)
library(rbenchmark)
rowMin <- function(x, ind) apply(x, ind, min)
crowMin <- cmpfun(rowMin)
benchmark(
rowMin(m,1)
, crowMin(m,1)
, columns=c("test", "replications","elapsed","relative")
, order="relative"
, replications=10
)
And the results:
test replications elapsed relative
2 crowMin(m, 1) 10 120.091 1.0000
1 rowMin(m, 1) 10 122.745 1.0221
Anticlimactic to say the least, though it looks like you've gotten some other good options.
johannes
Updated on June 06, 2022
Comments
-
johannes almost 2 years
I have seen this question being asked multiple times on the R mailing list, but still could not find a satisfactory answer.
Suppose I have a matrix m:
m <- matrix(rnorm(10000000), ncol=10)
I can get the mean of each row by:
system.time(rowMeans(m))
# user system elapsed
# 0.100 0.000 0.097
But obtaining the minimum value of each row by
system.time(apply(m,1,min))
# user system elapsed
# 16.157 0.400 17.029
takes more than 100 times as long. Is there a way to speed this up?
-
Chase almost 13 years
I like the use of do.call. I thought of pmin, but didn't think of a slick way to incorporate it. All the cool kids seem to be able to use do.call to achieve their goals... I need to do some reading on this.
-
Joshua Ulrich almost 13 years
do.call comes in handy when you want to be able to create function arguments dynamically (generally when the number of arguments passed via ... isn't known).
-
johannes almost 13 years
Nice answer, thanks! With pmin.int() it was even a tiny bit faster.
-
johannes almost 13 years
Thanks, I wasn't aware of the sos package, and rowMin solves my problem too.
-
johannes almost 13 years
Thanks for your answer. I will have to look deeper into it; that's new terrain for me :)
-
Marek almost 13 years
Hadley has a nice vocabulary of functions that you need to know. pmin is there too.
-
Marek almost 13 years
The compiler is better at optimizing explicit loops. Try e.g.:
rowMin <- function(x) {n <- nrow(x);r <- numeric(n);for (i in 1:n) r[i] <- min(x[i,]);r}
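The suggestion above can be sketched as follows. The names `rowMinLoop` and `crowMinLoop` and the small test matrix are illustrative assumptions, chosen to avoid clashing with Biobase's rowMin. Note that R has enabled just-in-time byte compilation by default since version 3.4.0, so on a recent R the explicit cmpfun() call may make little or no measurable difference.

```r
library(compiler)  # ships with base R

# Explicit-loop row minimum, following the one-liner in the comment above.
rowMinLoop <- function(x) {
  n <- nrow(x)
  r <- numeric(n)
  for (i in seq_len(n)) r[i] <- min(x[i, ])
  r
}

# Byte-compiled version of the same function.
crowMinLoop <- cmpfun(rowMinLoop)

# Small illustrative matrix (hypothetical size, chosen to run quickly).
set.seed(1)
x <- matrix(rnorm(1000), ncol = 10)

# The compiled loop should agree with the apply() baseline.
stopifnot(isTRUE(all.equal(crowMinLoop(x), apply(x, 1, min))))
```

To compare timings, both functions could be passed to benchmark() in the same way as in the answer.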
-
Roman Luštrik almost 13 years
Care to time the do.call solution as well?
-
Marek almost 13 years
And another speed gain by pmin(m[,1], m[,2], m[,3], m[,4], m[,5], m[,6], m[,7], m[,8], m[,9], m[,10]). Joshua, as.data.frame is time consuming.
-
Marek almost 13 years
Some variation of your answer:
do.call(pmin, lapply(seq_len(ncol(m)), function(i) m[,i]))
-
mdsumner almost 13 years
Not speedy for typing though, or general to different inputs :)
-
Marek almost 13 years
I added a more general solution in a comment on Joshua's answer.
-
danas.zuokas almost 12 years
@Marek, how to handle NA values in your solution?
-
Joshua Ulrich almost 12 years
@danas.zuokas:
do.call(pmin, c(na.rm=TRUE, lapply(...)))
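Spelled out in full, the NA-handling idea above might look like the sketch below. The helper name `row_min_na` and the example matrix `m_na` are hypothetical, introduced only for illustration. The columns are extracted with lapply, as in Marek's variation, and na.rm = TRUE is appended to the argument list so that do.call passes it to pmin by name.

```r
# Hypothetical helper: row-wise minimum via pmin(), ignoring NA values.
row_min_na <- function(m) {
  # One plain vector per column, avoiding the as.data.frame() conversion.
  cols <- lapply(seq_len(ncol(m)), function(i) m[, i])
  # c() appends a named element, so pmin() receives na.rm = TRUE.
  do.call(pmin, c(cols, na.rm = TRUE))
}

# Small illustrative matrix with missing values (made-up data).
m_na <- matrix(c( 3, NA,  4,
                 NA,  5,  9,
                  2,  6, NA), nrow = 3, byrow = TRUE)

row_min_na(m_na)
```

Because the argument list is built programmatically, the same helper works for any number of columns.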
-
skan almost 9 years
@HenrikB it would be great if matrixStats rowMins also worked on data.frames (without the need to transform them to a matrix first).
-
HenrikB almost 9 years
@skan, unfortunately it's not obvious that this belongs to matrixStats for various reasons, please see github.com/HenrikBengtsson/matrixStats/issues/18