Index values from a matrix using row, col indices

r indexing matrix r-faq


Solution 1

Almost. The indices need to be passed to "[" as a two-column matrix:

dat$matval <- mat[ cbind(dat$I, dat$J) ] # should do it.

One caveat: this also works when the object being indexed is a data frame, but the data frame is first coerced to a matrix, and if any column is non-numeric the entire matrix is converted to a single common type (typically character).
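
To illustrate both the two-column form and the coercion caveat, here is a minimal sketch on toy data. The objects m, idx and df are made up for this example and are not part of the question.

```r
# Toy 3 x 4 matrix; values are filled column by column
m <- matrix(1:12, nrow = 3)

# Each row of the two-column index matrix is one (row, col) pair
idx <- cbind(c(1, 3, 2), c(1, 2, 4))
vals <- m[idx]   # picks m[1, 1], m[3, 2] and m[2, 4]

# Caveat: indexing a data frame this way goes through as.matrix(),
# so one character column makes every extracted value a character string
df <- data.frame(a = 1:3, b = c("x", "y", "z"), stringsAsFactors = FALSE)
mixed <- df[cbind(c(1, 2), c(1, 2))]
is.character(mixed)   # TRUE, even for the value taken from the numeric column
```

If the data frame holds only numeric columns, the extracted values stay numeric.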

Solution 2

Using a matrix to index as DWin suggests is of course much cleaner, but for some strange reason doing it manually using 1-D indices is actually slightly faster:

# Huge sample data
mat <- matrix(sin(1:1e7), ncol=1000)
dat <- data.frame(I=sample.int(nrow(mat), 1e7, replace=TRUE),
                  J=sample.int(ncol(mat), 1e7, replace=TRUE))

system.time( x <- mat[cbind(dat$I, dat$J)] )     # 0.51 seconds
system.time( mat[dat$I + (dat$J-1L)*nrow(mat)] ) # 0.44 seconds

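The dat$I + (dat$J-1L)*nrow(mat) part turns the 2-D indices into 1-D ones: R stores a matrix column by column, so element (i, j) sits at position i + (j-1)*nrow(mat) of the underlying vector. Writing 1L specifies an integer rather than a double, which avoids some coercions.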
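
The index arithmetic can be checked by hand on a small matrix. This is only a sketch; the names m, i, j and lin are assumptions made for the example.

```r
# Small matrix so the arithmetic can be verified by hand
m <- matrix(1:12, nrow = 3)   # 3 rows, 4 columns, stored column by column
i <- c(1L, 3L, 2L)            # row indices
j <- c(1L, 2L, 4L)            # column indices

# Element (i, j) sits at position i + (j - 1) * nrow(m) in the underlying vector
lin <- i + (j - 1L) * nrow(m)

# Both forms extract the same elements
same <- identical(m[lin], m[cbind(i, j)])

# 1L keeps the arithmetic in integers; a plain 1 promotes the result to double
int_type <- typeof(i + (j - 1L) * nrow(m))
dbl_type <- typeof(i + (j - 1) * nrow(m))
```

Here lin works out to 1, 6 and 11, which are the positions of m[1, 1], m[3, 2] and m[2, 4].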

...I also tried gsk3's apply-based solution. It's almost 500x slower though:

system.time( apply( dat, 1, function(x,mat) mat[ x[1], x[2] ], mat=mat ) ) # 212 seconds

Solution 3

Here's a one-liner using apply's row-based operations:

> dat <- as.data.frame(matrix(rep(seq(4),4),ncol=2))
> colnames(dat) <- c('I','J')
> dat
  I J
1 1 1
2 2 2
3 3 3
4 4 4
5 1 1
6 2 2
7 3 3
8 4 4
> mat <- matrix(seq(16),ncol=4)
> mat
     [,1] [,2] [,3] [,4]
[1,]    1    5    9   13
[2,]    2    6   10   14
[3,]    3    7   11   15
[4,]    4    8   12   16

> dat$K <- apply( dat, 1, function(x,mat) mat[ x[1], x[2] ], mat=mat )
> dat
  I J  K
1 1 1  1
2 2 2  6
3 3 3 11
4 4 4 16
5 1 1  1
6 2 2  6
7 3 3 11
8 4 4 16


Mike T

Updated on September 06, 2020

Comments

Mike T over 3 years
I have a 2D matrix mat with 500 rows × 335 columns, and a data.frame dat with 120425 rows. The data.frame dat has two columns I and J, which are integers to index the row, column from mat. I would like to add the values from mat to the rows of dat.

Here is my conceptual fail:
```
> dat$matval <- mat[dat$I, dat$J]
Error: cannot allocate vector of length 1617278737
```
(I am using R 2.13.1 on Win32). Digging a bit deeper, I see that I'm misusing matrix indexing, as it appears that I'm only getting a sub-matrix of mat, and not a single-dimension array of values as I expected, i.e.:
```
> str(mat[dat$I[1:100], dat$J[1:100]])
 int [1:100, 1:100] 20 1 1 1 20 1 1 1 1 1 ...
```
I was expecting something like int [1:100] 20 1 1 1 20 1 1 1 1 1 .... What is the correct way to index a 2D matrix using indices of row, column to get the values?
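
The difference between the two behaviours can be reproduced on a toy matrix. In this sketch the names m, i and j are invented for illustration, and diag() is used only to show the relationship, not as a recommended approach.

```r
m <- matrix(1:12, nrow = 3)
i <- c(1, 3, 2)
j <- c(1, 2, 4)

# Two separate index vectors select every combination of rows i and columns j,
# which gives a length(i) x length(j) sub-matrix
sub <- m[i, j]
sub_dim <- dim(sub)

# The values wanted here are only the paired elements (i[k], j[k]),
# which are the diagonal of that sub-matrix
wanted <- m[cbind(i, j)]
on_diagonal <- identical(diag(sub), wanted)
```

This is also why the full-size call fails: with 120425 indices in each vector, the sub-matrix would need 120425 x 120425 cells.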
- Ari B. Friedman almost 13 years
  
  +1 for an interesting question (which begs another question: why isn't there an option to change the behavior to something a little more like this when passing the [ operator N vectors for an N-dimensional matrix?)
- joran almost 13 years
  
  Nice question - I edited it very slightly to fix what I think is a typo (datI to dat$I). If this isn't what you meant feel free to undo...
Ari B. Friedman almost 13 years

+1 for finding the way that R clearly intended to do things ;-)
joran almost 13 years

So if I and J are the only columns, is just mat[dat] sufficient? Or do you need to coerce to a matrix?
joran almost 13 years

Seems coercion is necessary since the data frame is really a list. So you could also do as.matrix(dat).
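
A quick sketch of the as.matrix() route, assuming a data frame whose only columns are the integer indices I and J (the objects m and d are toy stand-ins for mat and dat):

```r
m <- matrix(1:16, nrow = 4)
d <- data.frame(I = c(1L, 4L, 2L), J = c(3L, 4L, 1L))

# as.matrix() turns the two integer columns into a two-column index matrix
via_as_matrix <- m[as.matrix(d)]

# Same result as building the index matrix by hand
via_cbind <- m[cbind(d$I, d$J)]
```

If the data frame has other columns as well, selecting the two index columns first, for example as.matrix(d[c("I", "J")]), keeps the index matrix at two columns.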
IRTFM almost 13 years

@gsk3: Look at the Arguments section for ?"[" under "..." . When an array or matrix is being addressed, the matrix must have the same number of columns as the addressed object has dimensions. There are also some examples on that help page.
Chase almost 13 years

What happens if the data.frame contains index values for I and J that are outside the bounds of the matrix? I'm pretty sure it will fail...I think @Tommy's answer will return NAs for that scenario. Just something to keep in mind...
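
A small sketch of how the two approaches behave with out-of-range indices, on a toy 4 x 4 matrix (the names m, res, past_end and wrapped are assumptions for this example):

```r
m <- matrix(1:16, nrow = 4)

# Matrix form: an out-of-range (row, col) pair raises an error
res <- tryCatch(m[cbind(5, 1)], error = function(e) "error")

# 1-D form, column too large: the position is past the end, so the result is NA
past_end <- m[1 + (5 - 1) * nrow(m)]

# 1-D form, row too large: the position spills into the next column,
# silently returning m[1, 2] instead of failing
wrapped <- m[5 + (1 - 1) * nrow(m)]
```

So the 1-D version does not always return NA: a row index that is too large can quietly pick up a value from a neighbouring column, which suggests validating I and J against nrow() and ncol() first.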
Johan Karlsson about 9 years

Some comments would be nice and would make the answer more "attractive".
Skizo-ozᴉʞS about 9 years

You cannot answer with only a bunch of code... Come on... explain your answer a little bit :)
Heisenberg almost 8 years

This indexing method is rather obscure and is not covered in "Intro to R" tutorials. I got curious and read the docs, which do cover it.