Divide each data frame row by vector in R

Solution 1

sweep is useful for these sorts of operations. It is designed for arrays and matrices, so a straightforward approach is to convert your data frame to a matrix, do the operation, and then convert back. For example, with some dummy data, we divide each element in the respective columns of the matrix mat by the corresponding value in the vector vec:

mat <- matrix(1:25, ncol = 5)
vec <- seq(2, by = 2, length.out = 5)

sweep(mat, 2, vec, `/`)

In use we have:

> mat
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    6   11   16   21
[2,]    2    7   12   17   22
[3,]    3    8   13   18   23
[4,]    4    9   14   19   24
[5,]    5   10   15   20   25
> vec
[1]  2  4  6  8 10
> sweep(mat, 2, vec, `/`)
     [,1] [,2]     [,3]  [,4] [,5]
[1,]  0.5 1.50 1.833333 2.000  2.1
[2,]  1.0 1.75 2.000000 2.125  2.2
[3,]  1.5 2.00 2.166667 2.250  2.3
[4,]  2.0 2.25 2.333333 2.375  2.4
[5,]  2.5 2.50 2.500000 2.500  2.5
> mat[,1] / vec[1]
[1] 0.5 1.0 1.5 2.0 2.5

To convert from a data frame use as.matrix(df) or data.matrix(df), and as.data.frame(mat) for the reverse.
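The round trip described above can be sketched as follows. This is a minimal illustration, not part of the original answer: the data frame `df`, its `id` column, and the vector `vec` are made-up names and values. Only the numeric columns are converted, because `as.matrix()` on a data frame with a character column would produce a character matrix.

```r
# Hypothetical data: one label column and two numeric columns
df <- data.frame(id = c("r1", "r2", "r3"),
                 x  = c(2, 4, 6),
                 y  = c(10, 20, 30))

# One divisor per numeric column (names are only used to pick the columns)
vec <- c(x = 2, y = 10)
num_cols <- names(vec)

# Convert the numeric part to a matrix, divide column-wise, convert back
out <- df
out[num_cols] <- as.data.frame(sweep(as.matrix(df[num_cols]), 2, vec, `/`))

print(out)
```

The non-numeric `id` column is left untouched, and the result is still a data frame with the original column order.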

Solution 2

Suppose we have a data frame, df:

> df
  a b   c
1 1 3 100
2 2 4 110

And we want to divide through each row by the same vector, vec:

> vec <- df[1,]
> vec
  a b   c
1 1 3 100

Then we can use mapply as follows:

> mapply('/', df, vec)
     a        b   c
[1,] 1 1.000000 1.0
[2,] 2 1.333333 1.1
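Note that mapply returns a matrix here, not a data frame. A small sketch of two ways to get a data frame back is shown below; it rebuilds the example data so that it runs on its own, and `res_mat`, `res_df`, and `df2` are illustrative names only. The divisor is taken as a plain named vector via `unlist(df[1, ])` rather than as a one-row data frame, which is an assumption made for simplicity.

```r
# Example data matching the values printed above
df  <- data.frame(a = c(1, 2), b = c(3, 4), c = c(100, 110))
vec <- unlist(df[1, ])             # named numeric vector: a = 1, b = 3, c = 100

# Option 1: mapply simplifies to a numeric matrix; convert it afterwards
res_mat <- mapply(`/`, df, vec)
res_df  <- as.data.frame(res_mat)

# Option 2: Map returns a list of columns, which can be assigned back in place
df2 <- df
df2[] <- Map(`/`, df, vec)

print(res_df)
print(df2)
```

Option 2 keeps the data frame's row names and class without a separate conversion step.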

Solution 3

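Dividing each column by its own factor is an element-wise operation, but it can be written as a matrix product with a diagonal matrix of reciprocals:

mat <- matrix(c(4,2,2,6,7,6, 93,73,88,86,58,65, 123,103,96,128,46,57), nrow = 3, byrow = TRUE)

vec <- c(1.0660880, 0.9104053, 0.8642545, 0.9611866, 0.9711406, 1.0560121)

round(mat %*% diag(1 / vec), 2)

       [,1]   [,2]   [,3]   [,4]  [,5]  [,6]
[1,]   3.75   2.20   2.31   6.24  7.21  5.68
[2,]  87.23  80.18 101.82  89.47 59.72 61.55
[3,] 115.38 113.14 111.08 133.17 47.37 53.98

These agree with the expected results in the question, apart from small differences in the last digit. diag(1 / vec) is a 6 x 6 matrix with the reciprocals on its diagonal, so the product scales column j of mat by 1 / vec[j]. Two tempting shortcuts do not work: mat %*% 1/vec parses as (mat %*% 1) / vec and fails with a "non-conformable arguments" error, and mat %o% 1/vec runs but recycles vec down the columns, so most elements are divided by the wrong factor. For more on this, see the many posts on https://stackoverflow.com/search?q=%5Br%5D+multiply+matrix+by+vector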
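Because it is easy to get the recycling direction wrong with expressions like these, a quick cross-check on a tiny matrix can help. The sketch below uses made-up values (`m` and `v` are illustrative names) to compare three ways of dividing each column by its own factor, and shows that plain `m / v` recycles down the columns instead.

```r
m <- matrix(1:6, nrow = 2)   # 2 x 3 matrix: columns (1,2), (3,4), (5,6)
v <- c(1, 2, 4)              # one divisor per column

by_sweep <- sweep(m, 2, v, `/`)   # column-wise division
by_diag  <- m %*% diag(1 / v)     # product with a diagonal matrix of reciprocals
by_t     <- t(t(m) / v)           # transpose so that recycling runs along rows

# All three approaches give the same matrix
stopifnot(isTRUE(all.equal(by_sweep, by_diag)),
          isTRUE(all.equal(by_sweep, by_t)))

# Plain division recycles v down each column, which is a different result
naive <- m / v
print(by_sweep)
print(naive)
```

The double-transpose form works because R stores matrices column by column, so after transposing, the recycled vector lines up with the original columns.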

Solution 4

Just for variety, you could also use mapply:

mx <- structure(list(X131.478.1 = c(4L, 93L, 123L), X131.478.2 = c(2L, 
73L, 103L), X131.NSC.1 = c(2L, 88L, 96L), X131.NSC.2 = c(6L, 
86L, 128L), X166.478.1 = c(7L, 58L, 46L), X166.478.2 = c(6L, 
65L, 57L)), .Names = c("X131.478.1", "X131.478.2", "X131.NSC.1", 
"X131.NSC.2", "X166.478.1", "X166.478.2"), class = "data.frame", row.names = c("1/2-SBSRNA4", 
"A1BG", "A1BG-AS1"))

sf <- structure(list(V1 = c(1.066088, 0.9104053, 0.8642545, 0.9611866, 
0.9711406, 1.0560121)), .Names = "V1", row.names = c("X131.478.1", 
"X131.478.2", "X131.NSC.1", "X131.NSC.2", "X166.478.1", "X166.478.2"
), class = "data.frame")


# Divide each column of mx by its size factor (rounded to 2 decimals for display)
round(mapply(function(x, y) x / y, mx, t(sf)), 2)


     X131.478.1 X131.478.2 X131.NSC.1 X131.NSC.2 X166.478.1 X166.478.2
[1,]       3.75       2.20       2.31       6.24       7.21       5.68
[2,]      87.23      80.18     101.82      89.47      59.72      61.55
[3,]     115.38     113.14     111.08     133.17      47.37      53.98

But for this I think Josh's answer is better... and Gavin's is even better!
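Since the size factors in the question carry the sample names as row names, one possible safeguard is to align them to the columns by name before dividing. The following is a sketch under that assumption, using the values from the question; `factors` and `res` are illustrative names, and the row order of `sf` is shuffled on purpose to show that the lookup by name still lines things up.

```r
# Data from the question
mx <- data.frame(
  X131.478.1 = c(4, 93, 123),
  X131.478.2 = c(2, 73, 103),
  X131.NSC.1 = c(2, 88, 96),
  X131.NSC.2 = c(6, 86, 128),
  X166.478.1 = c(7, 58, 46),
  X166.478.2 = c(6, 65, 57),
  row.names  = c("1/2-SBSRNA4", "A1BG", "A1BG-AS1")
)
sf <- data.frame(
  V1 = c(1.0660880, 0.9104053, 0.8642545, 0.9611866, 0.9711406, 1.0560121),
  row.names = names(mx)
)
sf <- sf[rev(rownames(sf)), , drop = FALSE]   # deliberately reorder the rows

# Look up each column's factor by name, and fail early if one is missing
factors <- sf[names(mx), "V1"]
stopifnot(!anyNA(factors))

res <- as.data.frame(sweep(as.matrix(mx), 2, factors, `/`))
print(round(res, 2))
```

This keeps the gene names as row names and does not depend on the two tables being in the same order.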

Author by

Ramma

Updated on September 19, 2020

Comments

  • Ramma
    Ramma over 3 years

    I'm trying to divide each number within a data frame with 16 columns by a specific number for each column. The numbers are stored as a data frame with 1-16 corresponding to the samples in columns 1-16 of the larger data frame. There is a single number per column, by which I need to divide each number in that column of the larger spreadsheet, and I want to print the output to a final spreadsheet.

    Here's an example of what I'm starting with. The spreadsheet to be divided:

                X131.478.1 X131.478.2 X131.NSC.1 X131.NSC.2 X166.478.1 X166.478.2
    1/2-SBSRNA4          4          2          2          6          7          6
    A1BG                93         73         88         86         58         65
    A1BG-AS1           123        103         96        128         46         57
    

    The numbers to divide the spreadsheet by

    X131.478.1 1.0660880
    X131.478.2 0.9104053
    X131.NSC.1 0.8642545
    X131.NSC.2 0.9611866
    X166.478.1 0.9711406
    X166.478.2 1.0560121
    

    And the expected results, not necessarily rounded as I did here.

                X131.478.1 X131.478.2 X131.NSC.1 X131.NSC.2 X166.478.1 X166.478.2
    1/2-SBSRNA4       3.75       2.19       2.31       6.24       7.20       5.68
    A1BG             87.23      80.17     101.82      89.47      59.72      61.55
    A1BG-AS1        115.37     113.13     111.07     133.16      47.36      53.97
    

    I tried simply dividing the data frames mx2 = mx/sf with mx being the large data set and sf being the data frame of numbers to divide by. That seemed to divide everything by the first number in the sf data set.

    The numbers for division were generated by estimateSizeFactors, part of the DESeq package if that helps.

    Any help would be great. Thanks!

    • smci
      smci about 9 years
      This is nothing but matrix multiplication by a vector; see my answer.
  • oshun
    oshun about 8 years
    Your output does not match the OP's expected output.
  • smci
    smci about 8 years
    @oshun: Crap, you're right, it's element-wise multiplication(/division). I'll have to make the %*% approach work.
  • Bowecho
    Bowecho over 4 years
    This is not applicable to a data frame, which is what the question asked for.
  • Gavin Simpson
    Gavin Simpson over 4 years
    @Bowecho It's pretty trivial to convert to a matrix for these operations, and needed really as doing things like this to data frames is slow.