Add an index (or counter) to a dataframe by group in R
A dplyr solution is quite simple:
library(dplyr)
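# row_number(ProjectID) ranks the rows within each group; tied values keep
# their order of appearance, so each group is numbered 1..n
df %>%
  group_by(ProjectID) %>%
  mutate(counter = row_number(ProjectID))
#  ProjectID Dist counter
#1         1    x       1
#2         1    y       2
#3         2    z       1
#4         2    x       2
#5         2    h       3
#6         1    k       3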
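To try this end to end, here is a minimal sketch. The construction of `df` is an assumption, rebuilt from the output printed above. It uses `row_number()` with no argument, which numbers the rows of each group in their current order, and adds a base R equivalent built on `ave()` and `seq_along()` for comparison.

```r
library(dplyr)

# Example data reconstructed from the output shown above (an assumption)
df <- data.frame(
  ProjectID = c(1, 1, 2, 2, 2, 1),
  Dist = c("x", "y", "z", "x", "h", "k")
)

# dplyr: number the rows within each ProjectID group, in order of appearance
result <- df %>%
  group_by(ProjectID) %>%
  mutate(counter = row_number()) %>%
  ungroup()

# Base R: apply seq_along() to each group's row positions via ave()
base_counter <- ave(seq_along(df$ProjectID), df$ProjectID, FUN = seq_along)
```

Both approaches keep the original row order, so the counter can be added as a column without re-sorting.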
Author by
sjgknight
Academic in Sydney, research interactions between technology and learning. I write horrible R... ( http://xkcd.com/1513/ )
Updated on June 15, 2022

Comments
- sjgknight almost 2 years
I have a df like

ProjectID Dist
1         x
1         y
2         z
2         x
2         h
3         k
....      ....

I want to add a third column such that we have an incrementing counter for each ProjectID:

ProjectID Dist counter
1         x    1
1         y    2
2         z    1
2         x    2
2         h    3
1         k    3
....      ....
I've had a look at seq, rank, and a couple of other bits, particularly looking to see if I could use ddply to help:

df$counter <- ddply(df, .(projectID), function(x).....? )
I think I could adapt this answer, "How to create a counter/numeration by group?", but would prefer something using ddply (I can't find an equivalent of cumsum, but I think the same principle applies here: "Create ascending series of integers by group in Pandas"). That would let me index occurrences in a list (and, e.g., merge on this).
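For the ddply route asked about here, one possible sketch follows. It assumes the plyr package is installed and that the data matches the desired output above (both are assumptions, not part of the original question). `ddply()` splits the data frame by `ProjectID`, applies `transform()` to each piece, and binds the pieces back together.

```r
library(plyr)

# Example data reconstructed from the desired output above (an assumption)
df <- data.frame(
  ProjectID = c(1, 1, 2, 2, 2, 1),
  Dist = c("x", "y", "z", "x", "h", "k")
)

# Within each ProjectID piece, seq_along() yields 1..n
counted <- ddply(df, .(ProjectID), transform, counter = seq_along(ProjectID))
```

Note that `ddply()` returns a whole data frame sorted by the grouping variable rather than a single column in the original row order, so assigning its result to `df$counter` would not work as written in the attempt above.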
- akrun about 9 years
mutate(counter=row_number()) should do it.
- sjgknight about 9 years
This is probably a stupid question... what does %>% do? (And, slightly tangentially, is there a way to effectively search [google] for that type of code?)
- jalapic about 9 years
%>% is a pipe or chain operator. It works like this: mydata %>% do_something_with_it %>% do_something_else. It simply enables you to chain functions together.
- sjgknight about 9 years
Ah great, very interesting - thanks!
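As a small illustration of the pipe described in the comments, the sketch below compares a nested call with its piped form. The vector `x` and the choice of functions are illustrative assumptions. `%>%` comes from the magrittr package, which dplyr re-exports.

```r
library(magrittr)  # provides %>%

x <- c(1, 4, 9)

# Nested call: read from the inside out
nested <- round(mean(sqrt(x)), 1)

# Piped: the left-hand side becomes the first argument of the next function
piped <- x %>% sqrt() %>% mean() %>% round(1)

identical(nested, piped)
```

The two forms compute the same value; the piped version simply reads left to right in the order the steps are applied.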