Apply a function to every row of a matrix or a data frame

Solution 1

You simply use the apply() function:

R> M <- matrix(1:6, nrow=3, byrow=TRUE)
R> M
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6
R> apply(M, 1, function(x) 2*x[1]+x[2])
[1]  4 10 16

This takes a matrix and applies a (silly) function to each row. Extra arguments for the function go after it, as the fourth, fifth, ... arguments to apply().
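As a minimal sketch of passing extra arguments, assuming a hypothetical helper weighted_sum (not part of the original answer), anything placed after the function in the apply() call is forwarded to it:

```r
M <- matrix(1:6, nrow = 3, byrow = TRUE)

# Hypothetical helper: the row arrives as the first argument `x`;
# `w1` and `w2` are extra arguments forwarded by apply().
weighted_sum <- function(x, w1, w2) w1 * x[1] + w2 * x[2]

res <- apply(M, 1, weighted_sum, w1 = 2, w2 = 1)
res
# [1]  4 10 16
```

With w1 = 2 and w2 = 1 this reproduces the result of the anonymous function above.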

Solution 2

Here is a short example of applying a function to each row of a matrix. (The function here normalizes each row so that its entries sum to 1.)

Note: apply() returns each row's result as a column, so the output has to be transposed with t() to get the same layout as the input matrix A.

A <- matrix(c(
  0, 1, 1, 2,
  0, 0, 1, 3,
  0, 0, 1, 3
), nrow = 3, byrow = TRUE)

t(apply(A, 1, function(x) x / sum(x) ))

Result:

     [,1] [,2] [,3] [,4]
[1,]    0 0.25 0.25 0.50
[2,]    0 0.00 0.25 0.75
[3,]    0 0.00 0.25 0.75
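The following sketch shows why the transpose is needed, and also tries an alternative that avoids apply() altogether. The alternative is not from the original answer; it relies on R recycling a vector down the columns of a matrix, so it assumes the vector length equals the number of rows:

```r
A <- matrix(c(
  0, 1, 1, 2,
  0, 0, 1, 3,
  0, 0, 1, 3
), nrow = 3, byrow = TRUE)

# apply() stores each row's result as a column, so a 3 x 4 input
# comes back as 4 x 3 until it is transposed.
raw <- apply(A, 1, function(x) x / sum(x))
dim(raw)
# [1] 4 3

# Dividing by rowSums() recycles the row totals down each column,
# which gives the same row-normalized matrix without t().
normalized <- A / rowSums(A)
all.equal(normalized, t(raw))
# [1] TRUE
```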

Solution 3

If you want to apply common functions such as sum or mean, use rowSums or rowMeans, since they are faster than apply(data, 1, sum). Otherwise, stick with apply(data, 1, fun). You can pass additional arguments after the FUN argument (as Dirk already suggested):

set.seed(1)
m <- matrix(round(runif(20, 1, 5)), ncol=4)
diag(m) <- NA
m
     [,1] [,2] [,3] [,4]
[1,]   NA    5    2    3
[2,]    2   NA    2    4
[3,]    3    4   NA    5
[4,]    5    4    3   NA
[5,]    2    1    4    4

Then you can do something like this:

apply(m, 1, quantile, probs=c(.25,.5, .75), na.rm=TRUE)
    [,1] [,2] [,3] [,4] [,5]
25%  2.5    2  3.5  3.5 1.75
50%  3.0    2  4.0  4.0 3.00
75%  4.0    3  4.5  4.5 4.00
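As a quick check of the rowMeans advice, here is a small sketch (using the same m as above) showing that the specialized function and the apply() call agree, including the na.rm handling:

```r
set.seed(1)
m <- matrix(round(runif(20, 1, 5)), ncol = 4)
diag(m) <- NA

# Both calls compute the per-row mean while skipping NA values;
# rowMeans() is the specialized version.
via_apply    <- apply(m, 1, mean, na.rm = TRUE)
via_rowMeans <- rowMeans(m, na.rm = TRUE)

all.equal(via_apply, via_rowMeans)
# [1] TRUE
```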

Solution 4

apply() does the job well but can be quite slow. sapply and vapply can be useful alternatives, as can dplyr's rowwise. Let's see an example of computing the row-wise product of a data frame.

a = data.frame(t(iris[1:10,1:3]))
vapply(a, prod, 0)
sapply(a, prod)

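Note that assigning the prepared data to a variable before calling vapply/sapply/apply is good practice, as the subsetting and transposing are then not repeated inside the call, which reduces the run time a lot. Let's look at the microbenchmark results: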

library(dplyr)  # provides %>%, rowwise() and summarise()

a = data.frame(t(iris[1:10,1:3]))
b = iris[1:10,1:3]
microbenchmark::microbenchmark(
    # data prepared beforehand
    apply(b, 1, prod),
    vapply(a, prod, 0),
    sapply(a, prod),
    # data prepared inside the timed call
    apply(iris[1:10,1:3], 1, prod),
    vapply(data.frame(t(iris[1:10,1:3])), prod, 0),
    sapply(data.frame(t(iris[1:10,1:3])), prod),
    # dplyr alternative
    b %>% rowwise() %>%
        summarise(p = prod(Sepal.Length,Sepal.Width,Petal.Length))
)

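Have a careful look at how t() is being used: sapply and vapply iterate over the columns of a data frame, so the data is transposed first to turn its rows into columns.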
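A small sketch, using the same iris subset, to confirm that the row-wise apply() and the column-wise vapply() on the transposed data frame produce the same products (only the element names differ):

```r
b <- iris[1:10, 1:3]          # 10 rows, 3 numeric columns
a <- data.frame(t(b))         # transposed: one column per original row

by_row    <- apply(b, 1, prod)              # iterates over rows of b
by_column <- vapply(a, prod, numeric(1))    # iterates over columns of a

# The names differ between the two results, so compare values only.
all.equal(unname(by_row), unname(by_column))
# [1] TRUE
```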

Solution 5

The first step is to define the function object, then apply it. If you want a matrix object that has the same number of rows, you can predefine it and use the object[] form as illustrated (otherwise the returned value will be simplified to a vector):

bvnormdens <- function(x=c(0,0), mu=c(0,0), sigma=c(1,1), rho=0){
     # standardize each coordinate so that mu actually enters the density
     z1 <- (x[1]-mu[1])/sigma[1]
     z2 <- (x[2]-mu[2])/sigma[2]
     exp(-1/(2*(1-rho^2))*(z1^2 + z2^2 - 2*rho*z1*z2)) *
     1/(2*pi*sigma[1]*sigma[2]*sqrt(1-rho^2))
     }
 out <- rbind(c(1,2),c(3,4),c(5,6))

 bvout<-matrix(NA, ncol=1, nrow=3)
 bvout[] <-apply(out, 1, bvnormdens)
 bvout
             [,1]
[1,] 1.306423e-02
[2,] 5.931153e-07
[3,] 9.033134e-15

If you want to use values other than the default parameters, include them as named arguments after the function:

bvout[] <-apply(out, 1, FUN=bvnormdens, mu=c(-1,1), rho=0.6)

apply() can also be used on higher dimensional arrays and the MARGIN argument can be a vector as well as a single integer.
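A brief sketch of both points, using a made-up 2 x 3 x 4 array (the array and variable names are illustrative only):

```r
# A 2 x 3 x 4 array filled with 1..24.
arr <- array(1:24, dim = c(2, 3, 4))

# MARGIN = 3: apply sum() to each of the four 2 x 3 slices.
slice_sums <- apply(arr, 3, sum)
slice_sums
# [1]  21  57  93 129

# MARGIN = c(1, 2): apply sum() across the third dimension,
# keeping a 2 x 3 result.
cell_sums <- apply(arr, c(1, 2), sum)
dim(cell_sums)
# [1] 2 3
```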

Author: Tim

Updated on March 30, 2022

Comments

  • Tim about 2 years

    Suppose I have an n-by-2 matrix and a function that takes a 2-vector as one of its arguments. I would like to apply the function to each row of the matrix and get an n-vector. How can I do this in R?

    For example, I would like to compute the density of a 2D standard Normal distribution on three points:

    bivariate.density <- function(x = c(0, 0), mu = c(0, 0), sigma = c(1, 1), rho = 0){
        exp(-1/(2*(1-rho^2))*(x[1]^2/sigma[1]^2+x[2]^2/sigma[2]^2-2*rho*x[1]*x[2]/(sigma[1]*sigma[2]))) * 1/(2*pi*sigma[1]*sigma[2]*sqrt(1-rho^2))
    }
    
    out <- rbind(c(1, 2), c(3, 4), c(5, 6))
    

    How to apply the function to each row of out?

    How to pass values for the other arguments besides the points to the function in the way you specify?

  • Tim over 13 years
    Thanks! What if the row of the matrix is not the first argument of the function? How do I specify which argument of the function each row of the matrix is assigned to?
  • Dirk Eddelbuettel over 13 years
    Read the help for apply() -- it sweeps by row (when the second arg is 1, else by column), and the current row (or col) is always the first argument. That is how things are defined.
  • Joris Meys over 13 years
    @Tim : if you use an internal R function and the row is not the first arg, do as Dirk did and make your own custom function where row is the first arg.
  • Paul Hiemstra over 12 years
    The plyr package provides a wide range of these apply kinds of functions. It also provides more functionality, including parallel processing.
  • cryptic0 over 6 years
    Can you explain what 1 means in apply(M, 1...)?
  • De Novo about 6 years
    @cryptic0 this answer is late, but for googlers, the second argument in apply is the MARGIN argument. Here it means apply the function to the rows (the first dimension in dim(M)). If it were 2, it would apply the function to the columns.
  • DaSpeeg over 5 years
    It might be more fair to compare the apply family if you used b <- t(iris[1:10, 1:3]) and apply(b, 2, prod).
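
To illustrate the comments above about functions whose data argument is not first, here is a minimal sketch. The function scale_by is hypothetical and only serves to show two ways of routing each row to a non-first argument:

```r
# Hypothetical function whose data argument comes second.
scale_by <- function(factor, x) factor * sum(x)

out <- rbind(c(1, 2), c(3, 4), c(5, 6))

# Option 1: wrap it so the row is the first argument of the wrapper.
wrapped <- apply(out, 1, function(row) scale_by(factor = 10, x = row))

# Option 2: name the other argument; the row is passed unnamed and
# therefore fills the remaining formal, here `x`.
named <- apply(out, 1, scale_by, factor = 10)

wrapped
# [1]  30  70 110
```

The wrapper is the more explicit of the two and keeps working if the function later gains more arguments.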