Index values from a matrix using row, col indices


Solution 1

Almost. The indices need to be passed to "[" as a single two-column matrix:

dat$matval <- mat[ cbind(dat$I, dat$J) ] # should do it.

There is a caveat: this also works for data frames, but they are first coerced to a matrix, and if any column is non-numeric, the entire matrix takes the "lowest common denominator" class (typically character).
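
As a small illustration of the difference (the 4 x 4 matrix and the three index pairs are made up for this sketch, not taken from the question), passing two vectors selects every row/column combination, while a two-column matrix selects one element per row of indices. The last lines show the coercion caveat on a hypothetical data frame with a character column:

```r
mat <- matrix(seq(16), ncol = 4)                         # 4 x 4, values 1..16
dat <- data.frame(I = c(1L, 2L, 4L), J = c(3L, 3L, 1L))  # made-up index pairs

sub  <- mat[dat$I, dat$J]            # 3 x 3 sub-matrix: all combinations of I and J
vals <- mat[cbind(dat$I, dat$J)]     # one value per (I, J) pair: 9 10 4

# Equivalent, using the index columns of the data frame directly
vals2 <- mat[as.matrix(dat[, c("I", "J")])]

# Caveat: matrix-indexing a data frame coerces it to a matrix first,
# so a single character column makes every extracted value character
df <- data.frame(x = 1:2, y = c("a", "b"))
picked <- df[cbind(1:2, c(1L, 1L))]  # values from the numeric column x
is.character(picked)                 # TRUE
```

If dat carries other columns as well, selecting only I and J before as.matrix() keeps the index matrix numeric.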

Solution 2

Using a matrix to index as DWin suggests is of course much cleaner, but for some strange reason doing it manually using 1-D indices is actually slightly faster:

# Huge sample data
mat <- matrix(sin(1:1e7), ncol=1000)
dat <- data.frame(I=sample.int(nrow(mat), 1e7, replace=TRUE),
                  J=sample.int(ncol(mat), 1e7, replace=TRUE))

system.time( x <- mat[cbind(dat$I, dat$J)] )     # 0.51 seconds
system.time( mat[dat$I + (dat$J-1L)*nrow(mat)] ) # 0.44 seconds

The dat$I + (dat$J-1L)*nrow(mat) part turns the 2-D indices into 1-D ones, relying on R storing matrices in column-major order. The 1L specifies an integer instead of a double value, which avoids some coercions.
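
A minimal check of that conversion on a small matrix (the index values here are invented for the sketch) shows that the 1-D indices pick the same elements as the two-column matrix form:

```r
mat <- matrix(seq(16), ncol = 4)   # 4 x 4, column-major: mat[1, 2] is element 5
i <- c(1L, 3L, 4L)                 # row indices (made up)
j <- c(2L, 2L, 4L)                 # column indices (made up)

# Element (i, j) lives at position i + (j - 1) * nrow(mat)
lin <- i + (j - 1L) * nrow(mat)    # 5 7 16
is.integer(lin)                    # TRUE: 1L keeps the arithmetic in integers

identical(mat[lin], mat[cbind(i, j)])   # TRUE
```

One thing to keep in mind with the integer version: R's integer arithmetic overflows to NA (with a warning) beyond roughly 2.1e9, so for very large matrices doing the arithmetic in doubles may be the safer choice.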

...I also tried gsk3's apply-based solution. It's almost 500x slower though:

system.time( apply( dat, 1, function(x,mat) mat[ x[1], x[2] ], mat=mat ) ) # 212 seconds

Solution 3

Here's a one-liner using apply's row-based operations:

> dat <- as.data.frame(matrix(rep(seq(4),4),ncol=2))
> colnames(dat) <- c('I','J')
> dat
  I J
1 1 1
2 2 2
3 3 3
4 4 4
5 1 1
6 2 2
7 3 3
8 4 4
> mat <- matrix(seq(16),ncol=4)
> mat
     [,1] [,2] [,3] [,4]
[1,]    1    5    9   13
[2,]    2    6   10   14
[3,]    3    7   11   15
[4,]    4    8   12   16

> dat$K <- apply( dat, 1, function(x,mat) mat[ x[1], x[2] ], mat=mat )
> dat
  I J  K
1 1 1  1
2 2 2  6
3 3 3 11
4 4 4 16
5 1 1  1
6 2 2  6
7 3 3 11
8 4 4 16
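
One thing worth checking with the apply approach: apply() converts the data frame to a matrix before iterating, so a non-numeric column would turn the row vector x into character and break the lookup. The sketch below (the extra label column is hypothetical) restricts the call to the two index columns and confirms the result matches the two-column matrix index:

```r
mat <- matrix(seq(16), ncol = 4)
dat <- data.frame(I = rep(1:4, 2), J = rep(1:4, 2),
                  label = letters[1:8])    # hypothetical non-numeric column

# Keep only the index columns so the matrix passed to apply() stays numeric
idx <- as.matrix(dat[, c("I", "J")])

k_apply <- apply(idx, 1, function(x) mat[x[1], x[2]])   # row-by-row lookup
k_index <- mat[idx]                                     # vectorised lookup

all(k_apply == k_index)   # TRUE; both give 1 6 11 16 1 6 11 16
```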

Author by

Mike T

Hydrogeologist, numerical modeller and GIS professional. My main programming languages that I use are Python, R, SQL. I dabble with Fortran and C/C++/C# on occasions. Thanks to anyone that has helped me!

Updated on September 06, 2020

Comments

  • Mike T
    Mike T over 3 years

    I have a 2D matrix mat with 500 rows × 335 columns, and a data.frame dat with 120425 rows. The data.frame dat has two columns I and J, which are integers to index the row, column from mat. I would like to add the values from mat to the rows of dat.

    Here is my conceptual fail:

    > dat$matval <- mat[dat$I, dat$J]
    Error: cannot allocate vector of length 1617278737
    

    (I am using R 2.13.1 on Win32). Digging a bit deeper, I see that I'm misusing matrix indexing, as it appears that I'm only getting a sub-matrix of mat, and not a single-dimension array of values as I expected, i.e.:

    > str(mat[dat$I[1:100], dat$J[1:100]])
     int [1:100, 1:100] 20 1 1 1 20 1 1 1 1 1 ...
    

    I was expecting something like int [1:100] 20 1 1 1 20 1 1 1 1 1 .... What is the correct way to index a 2D matrix using indices of row, column to get the values?

    • Ari B. Friedman
      Ari B. Friedman almost 13 years
      +1 for an interesting question (which begs another question: why isn't there an option to change the behavior to something a little more like this when passing the [ operator N vectors for an N-dimensional matrix?)
    • joran
      joran almost 13 years
      Nice question - I edited it very slightly to fix what I think is a typo (datI to dat$I). If this isn't what you meant feel free to undo...
  • Ari B. Friedman
    Ari B. Friedman almost 13 years
    +1 for finding the way that R clearly intended to do things ;-)
  • joran
    joran almost 13 years
    So if I and J are the only columns, is just mat[dat] sufficient? Or do you need to coerce to a matrix?
  • joran
    joran almost 13 years
    Seems coercion is necessary since the data frame is really a list. So you could also do as.matrix(dat).
  • IRTFM
    IRTFM almost 13 years
    @gsk3: Look at the Arguments section for ?"[" under "..." . When an array or matrix is being addressed, the matrix must have the same number of columns as the addressed object has dimensions. There are also some examples on that help page.
  • Chase
    Chase almost 13 years
    What happens if the data.frame contains index values for I and J that are outside the bounds of the matrix? I'm pretty sure it will fail...I think @Tommy's answer will return NAs for that scenario. Just something to keep in mind...
  • Johan Karlsson
    Johan Karlsson about 9 years
    Some comments would be nice and make the answer be more "attractive".
  • Skizo-ozᴉʞS
    Skizo-ozᴉʞS about 9 years
    You can not answer only a bunch of code... Come on... explain a little bit your answer :)
  • Heisenberg
    Heisenberg almost 8 years
    This indexing method is rather obscure and is not covered in "Intro to R" tutorials. I got curious and read the docs, which do cover it.
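
On Chase's question about indices outside the bounds of the matrix, a quick experiment on a small matrix (values chosen for this sketch) suggests the two approaches behave differently: the two-column matrix index raises an error, while the manual 1-D index can silently return a wrong element when only the row index is too large, and gives NA only once the position runs past the end of the matrix.

```r
mat <- matrix(seq(16), ncol = 4)   # 4 x 4
i <- 5L                            # row 5 does not exist
j <- 1L

# Two-column matrix indexing: "subscript out of bounds" error, caught here
res_matrix <- tryCatch(mat[cbind(i, j)], error = function(e) "error")

# Manual 1-D index: position 5 is a valid element, namely mat[1, 2]
lin <- i + (j - 1L) * nrow(mat)
res_linear <- mat[lin]             # 5, silently the wrong element

# Only a position beyond length(mat) yields NA
res_past_end <- mat[17L]           # NA
```

Validating the indices first, for example with stopifnot(all(dat$I >= 1 & dat$I <= nrow(mat))) and the same check for J against ncol(mat), avoids the silent case.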