Calculate first difference by group in R


Solution 1

One way, using base R. Because by() returns its results in group order, this assumes the rows of df are already sorted by group:

df$diff <- unlist(by(df$score, list(df$group), function(i) c(NA, diff(i))))

or, with ave(), which writes each group's result back to that group's own rows:

df$diff <- ave(df$score, df$group, FUN = function(i) c(NA, diff(i)))
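To see why row order matters, here is a minimal sketch using the question's scores with the rows deliberately shuffled so the two groups are interleaved. The helper name first_diff and the shuffled ordering are assumptions made for illustration; the split()/lapply() line merely stands in for any approach that concatenates per-group results.

```r
# Sample scores from the question, shuffled so the groups are interleaved
df <- data.frame(
  score = c(10, 14, 30, 20, 6),
  group = c(1001, 1005, 1001, 1005, 1005)
)

# First difference within a group; NA marks the group's first row
first_diff <- function(x) c(NA, diff(x))

# ave() puts each group's result back on that group's original rows
df$diff <- ave(df$score, df$group, FUN = first_diff)

# Concatenating per-group results yields group order, not row order
group_order <- unlist(lapply(split(df$score, df$group), first_diff),
                      use.names = FALSE)

print(df$diff)      # values: NA NA 20 6 -14
print(group_order)  # values: NA 20 NA 6 -14
```

On unsorted data only the ave() result lines up with the rows of df, so sorting by group first is the safer habit when using a concatenating approach.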


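Or use data.table, which will be more efficient for larger data frames:

library(data.table)
dt <- as.data.table(df)   # convert the data frame to a data.table
setkey(dt, group)         # sort by group and mark it as the key
dt[, diff := c(NA, diff(score)), by = group]   # add the column by reference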

Solution 2

Another approach using dplyr:

library(dplyr)

score <- c(10,30,14,20,6)
group <- c(rep(1001,2),rep(1005,3))
df <- data.frame(score,group)

df %>%
  group_by(group) %>%
  mutate(first_diff = score - lag(score))
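The lag() call shifts the scores down by one position so that each row can be compared with the previous row in its group. A rough base-R sketch of that idea on the scores of group 1005 follows; lag1 is a hypothetical helper written for illustration, not dplyr's implementation.

```r
score <- c(14, 20, 6)  # scores for group 1005 in the example data

# Hypothetical helper: shift a vector down by one position, padding with NA
lag1 <- function(x) c(NA, head(x, -1))

lagged   <- lag1(score)          # values: NA 14 20
via_lag  <- score - lagged       # values: NA 6 -14
via_diff <- c(NA, diff(score))   # values: NA 6 -14

print(via_lag)
print(via_diff)
```

Subtracting the shifted vector gives the same values as padding diff() with a leading NA, which is why the dplyr result matches the base R solutions.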

Solution 3

Although not exactly what you are looking for, ddply from the plyr package can be used to calculate the differences by group. The result has one fewer row per group and no leading NA:

library(plyr)
out <- ddply(df, .(group), summarize, d1 = diff(score, 1))
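The reason this differs from the requested output is that diff() returns one fewer value than it is given, so a per-group summary has fewer rows than the original data. A small base-R sketch of the row counts on the example data, using tapply() as a stand-in for the grouping step (an assumption for illustration, not what ddply does internally):

```r
df <- data.frame(
  score = c(10, 30, 14, 20, 6),
  group = c(rep(1001, 2), rep(1005, 3))
)

# Unpadded differences per group, returned as a list keyed by group value
per_group <- tapply(df$score, df$group, diff)

n_rows  <- nrow(df)                 # 5 rows in the original data
n_diffs <- sum(lengths(per_group))  # 3 differences: one value lost per group

print(per_group[["1005"]])  # values: 6 -14
print(c(n_rows, n_diffs))   # values: 5 3
```

Padding each group's result with a leading NA, as the other solutions do, restores one value per original row.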
Author: Richard

Updated on July 16, 2022

Comments

  • Richard, almost 2 years

    I was wondering if someone could help me calculate the first difference of a score by group. I know it should be a simple process, but for some reason I'm having trouble doing it.

    Here's an example data frame:

    score <- c(10, 30, 14, 20, 6)
    group <- c(rep(1001, 2), rep(1005, 3))
    df <- data.frame(score, group)
    
    > df 
      score group
    1    10  1001
    2    30  1001
    3    14  1005
    4    20  1005
    5     6  1005
    

    And here's the output I'm looking for:

    1   NA
    2   20
    3   NA  
    4    6
    5  -14
    

    Thanks in advance.