Match dataframe rows according to two variables (Indexing)

10,165

Solution 1

Sounds like you need to merge the two data-frames. Try this:

merge(rec, book, by = c('MRN', 'COURSE'))

and do read the help for merge (by doing ?merge at the R console) for more options on how to merge these.

Solution 2

I've created a simple example that may help you. In my case i wanted to paste the 'value' column from df1 in each row of df2, according to variables x1 and x2:

df1 <- read.table(textConnection("
x1 x2 value
1 2 12
1 3 56
2 1 35
2 2 68
"),header=T)

df2 <- read.table(textConnection("
test x1 x2
1 1 2
2 1 3
3 2 1
4 2 2
5 1 2
6 1 3
7 2 1
"),header=T)

library(sqldf)
sqldf("select df2.*, df1.value from df2 join df1 using(x1,x2)")

  test x1 x2 value
1    1  1  2    12
2    2  1  3    56
3    3  2  1    35
4    4  2  2    68
5    5  1  2    12
6    6  1  3    56
7    7  2  1    35
Share:
10,165
Admin
Author by

Admin

Updated on June 05, 2022

Comments

  • Admin
    Admin almost 2 years

    I am essentially trying to get disorganized data into long form for linear modeling.

    I have 2 data.frames "rec" and "book"

    Each row in "book" needs to be pasted onto the end of several of the rows of "rec" according to two variables in the row: "MRN" and "COURSE" which match.

    I have tried the following and variations thereon to no avail:

    i=1
    newlist=list()
    colnames(newlist)=colnames(book)
    for ( i in 1:dim(rec)[1]) {
        mrn=as.numeric(as.vector(rec$MRN[i]));
        course=as.character(rec$COURSE[i]);
    get.vector<-as.vector(((as.numeric(as.vector(book$MRN))==mrn) & (as.character(book$COURSE)==course)))
    newlist[i]<-book[get.vector,] 
    i=i+1;
    }
    

    If anyone has any suggestions on

    1)getting this to work 2) making it more elegant (or perhaps just less clumsy)

    If I have been unclear in any way I beg your pardons.

    I do understand I haven't combined any data above, I think if I can generate a long-format data.frame I can combine them all on my own

  • G. Grothendieck
    G. Grothendieck about 13 years
    The last statement can be shortened to sqldf("select * from df2, df1 using(x1, x2)") .
  • Admin
    Admin about 13 years
    m1<-c("I", "am" ,"a" ,"prize" ,"idiot"); m2<-rbind("I", "am" ,"a" ,"prize" ,"idiot"); merge(m1, m2). Thank you all for your patience!!!