r - Fast way to loop through matrix and compute and assign a value to each element?

Try

# One Bernoulli draw per element; rbinom() is vectorized over its prob argument
f <- function(prob.mat) 
    matrix(rbinom(length(prob.mat), 1, prob.mat), ncol = ncol(prob.mat))

diCases <- f(Pcases)
diConts <- f(Pconts)
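
To check that each draw lands in the cell its probability came from, here is a minimal sketch. It restates `f` with the sample count passed explicitly and uses a made-up matrix `P` (not from the question) whose entries are all 0 or 1, so the result is deterministic.

```r
# f restated with the number of draws passed explicitly
f <- function(prob.mat)
    matrix(rbinom(length(prob.mat), 1, prob.mat), ncol = ncol(prob.mat))

# Hypothetical 2 x 3 probability matrix with degenerate probabilities (0 or 1)
P <- matrix(c(0, 1, 1, 0, 0, 1), nrow = 2)
B <- f(P)

# rbinom() reads P column by column and matrix() fills column by column,
# so B[i, j] is drawn with probability P[i, j]
stopifnot(all(B == P))
```

With real probabilities strictly between 0 and 1 the output is random, but the same position-by-position correspondence should hold.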
Author by

user3821273

Updated on June 04, 2022

Comments

  • user3821273
    user3821273 almost 2 years

    I have a large matrix of probabilities (call it A), N by 806, where N is typically a number in the thousands.

    Using this matrix of probabilities, I want to create another matrix (call it B), N by 806, that contains only binary values. Each B[i,j] is a single binomial draw (one trial) with success probability A[i,j]. The code I am using is below:

    diCases <- matrix(0, nrow = numcases, ncol = numdis)
    diConts <- matrix(0, nrow = numconts, ncol = numdis)
    
    for(row in 1:nrow(diCases)) {
        print(paste('Generating disease profile for case', row, '...'))
        for(col in 1:ncol(diCases)) {
            pDis <- Pcases[row, col]
            diCases[row, col] <- rbinom(1, 1, pDis)
        }
    }
    
    for(row in 1:nrow(diConts)) {
        print(paste('Generating disease profile for control', row, '...'))
        for(col in 1:ncol(diConts)) {
            pDis <- Pconts[row, col]
            diConts[row, col] <- rbinom(1, 1, pDis)
        }
    }
    

    Basically, I have resorted to using nested for loops, looping through every column in each row and moving on to the next row, assigning a 1 or 0 based on the result of:

    rbinom(1, 1, pDis)
    

    where pDis is the A[i,j] mentioned in the beginning. As you can imagine, this is pretty slow and is the main bottleneck in my code. This block of code is in a simulation that I had planned to run over and over again, ideally in a short period of time.

    Is there a faster way to accomplish this? I looked into the "apply" functions but couldn't really figure out how to make it work for this particular task.

    Thank you all in advance.

  • user3821273
    user3821273 almost 10 years
    Thanks for this. It is definitely an elegant alternative to using nested for loops. However, the time it takes to run seems to be the same. Is there any way to increase the speed?
  • user3821273
    user3821273 almost 10 years
    Calling rbinom directly worked perfectly. Thanks so much. I'm guessing the speed increase is due to vectorization?
  • konvas
    konvas almost 10 years
    It's because rbinom is designed to do this :) It accepts a whole vector of probabilities in one call, and anything else adds overhead that is not needed.
  • Calimo
    Calimo almost 10 years
    It's worth noting that many R functions are designed to do this. Loops are slow; whenever you encounter one, think about how to remove it. Most often you will find a way to do the job without a loop.
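
The speed difference discussed in the comments can be measured with a rough timing sketch like the one below. The matrix `P`, its size, and the helper names `loop_draw`, `apply_draw` and `vec_draw` are assumptions made for illustration. Actual timings depend on the machine, so none are quoted here.

```r
# Sketch: nested loop vs. element-wise apply() vs. one vectorized rbinom() call.
# P is a made-up probability matrix; 200 rows is an arbitrary choice.
set.seed(1)
P <- matrix(runif(200 * 806), nrow = 200, ncol = 806)

# Nested loops, as in the question: one rbinom() call per element
loop_draw <- function(p) {
    out <- matrix(0L, nrow = nrow(p), ncol = ncol(p))
    for (i in seq_len(nrow(p))) {
        for (j in seq_len(ncol(p))) {
            out[i, j] <- rbinom(1, 1, p[i, j])
        }
    }
    out
}

# apply() over both margins hides the loop but still calls rbinom() per element
apply_draw <- function(p) apply(p, c(1, 2), function(x) rbinom(1, 1, x))

# A single call: rbinom() takes the whole probability vector at once
vec_draw <- function(p) {
    matrix(rbinom(length(p), 1, p), nrow = nrow(p), ncol = ncol(p))
}

t_loop  <- system.time(B_loop  <- loop_draw(P))
t_apply <- system.time(B_apply <- apply_draw(P))
t_vec   <- system.time(B_vec   <- vec_draw(P))

# Elapsed seconds for each approach
print(c(loop  = t_loop[["elapsed"]],
        apply = t_apply[["elapsed"]],
        vec   = t_vec[["elapsed"]]))
```

The apply() variant is included because it is a natural first attempt at removing the loops, yet it still makes one R-level function call per element, which is a plausible reason it would not run noticeably faster than the loop.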