Count number of rows per group and add result to original data frame


Solution 1

Using data.table:

library(data.table)
dt = as.data.table(df)

# or coerce to data.table by reference:
# setDT(df)

dt[ , count := .N, by = .(name, type)]

For an alternative that works with data.table versions before 1.8.2, see the edit history.


Using dplyr:

library(dplyr)
df %>%
  group_by(name, type) %>%
  mutate(count = n())

Or simply:

add_count(df, name, type)

Using plyr:

# .() is a plyr function, so it needs the plyr:: prefix too when plyr is not attached
plyr::ddply(df, plyr::.(name, type), transform, count = length(num))

Solution 2

You can use ave:

df$count <- ave(df$num, df[,c("name","type")], FUN=length)
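ave() splits its first argument by the grouping variables, applies FUN to each piece, and returns a vector of the same length as the input, in the original row order. That is why the result can be assigned straight to a new column. A small self-contained sketch, assuming the sample data from the question; passing a row index instead of df$num is a variation on the line above, so the count does not depend on the values or type of any particular column:

```r
# Sample data from the question
df <- data.frame(name = c("black", "black", "black", "red", "red"),
                 type = c("chair", "chair", "sofa", "sofa", "plate"),
                 num  = c(4, 5, 12, 4, 3))

# Count the rows in each name/type group; the row index is only a carrier,
# since length() ignores the values themselves
df$count <- ave(seq_len(nrow(df)), df$name, df$type, FUN = length)

df$count
# 2 2 1 1 1
```

The rows of df stay in their original order, unlike with the merge-based approaches.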

Solution 3

You can do this with ddply from plyr (load it first with library(plyr)):

> ddply(df,.(name,type),transform,count = NROW(piece))
   name  type num count
1 black chair   4     2
2 black chair   5     2
3 black  sofa  12     1
4   red plate   3     1
5   red  sofa   4     1

or perhaps more intuitively,

> ddply(df,.(name,type),transform,count = length(num))
   name  type num count
1 black chair   4     2
2 black chair   5     2
3 black  sofa  12     1
4   red plate   3     1
5   red  sofa   4     1

Solution 4

This should do the job:

df_agg <- aggregate(num~name+type,df,FUN=NROW)
names(df_agg)[3] <- "count"
df <- merge(df,df_agg,by=c('name','type'),all.x=TRUE)
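One thing to watch with the aggregate and merge approach: merge() sorts its result by the key columns by default, so the rows may come back in a different order than in df. Below is a minimal sketch of one way to restore the original order, assuming the sample df from the question. The helper column row_id and the result name merged are hypothetical names introduced for illustration, not part of the answer above.

```r
# Sample data from the question
df <- data.frame(name = c("black", "black", "black", "red", "red"),
                 type = c("chair", "chair", "sofa", "sofa", "plate"),
                 num  = c(4, 5, 12, 4, 3))

# Remember the original row positions (row_id is a hypothetical helper column)
df$row_id <- seq_len(nrow(df))

# Count rows per name/type group, as in the answer above
df_agg <- aggregate(num ~ name + type, df, FUN = NROW)
names(df_agg)[3] <- "count"

# merge() sorts by the key columns, so put the rows back in their original order
merged <- merge(df, df_agg, by = c("name", "type"), all.x = TRUE)
merged <- merged[order(merged$row_id), ]
merged$row_id <- NULL
rownames(merged) <- NULL
```

After this, merged has the columns name, type, num and count, with rows in the same order as the input.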

Solution 5

The base R function aggregate will obtain the counts with a one-liner, but adding those counts back to the original data.frame seems to take a bit of processing.

df <- data.frame(name=c('black','black','black','red','red'),
                 type=c('chair','chair','sofa','sofa','plate'),
                 num=c(4,5,12,4,3))
df
#    name  type num
# 1 black chair   4
# 2 black chair   5
# 3 black  sofa  12
# 4   red  sofa   4
# 5   red plate   3

rows.per.group <- aggregate(rep(1, nrow(df)),
                            by = list(df$name, df$type), sum)
rows.per.group
#   Group.1 Group.2 x
# 1   black   chair 2
# 2     red   plate 1
# 3   black    sofa 1
# 4     red    sofa 1

my.summary <- do.call(data.frame, rows.per.group)
colnames(my.summary) <- c(colnames(df)[1:2], 'rows.per.group')
my.data <- merge(df, my.summary, by = c(colnames(df)[1:2]))
my.data
#    name  type num rows.per.group
# 1 black chair   4              2
# 2 black chair   5              2
# 3 black  sofa  12              1
# 4   red plate   3              1
# 5   red  sofa   4              1
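If the merge step feels heavy, another base R route is to build the contingency table once and look up each row's count in it. This is an alternative technique, not the aggregate approach shown above; the sketch assumes the sample data from the question and relies on indexing a table with a two-column character matrix of dimnames.

```r
# Sample data from the question
df <- data.frame(name = c("black", "black", "black", "red", "red"),
                 type = c("chair", "chair", "sofa", "sofa", "plate"),
                 num  = c(4, 5, 12, 4, 3))

# Two-way table of counts, with names as rows and types as columns
counts <- table(df$name, df$type)

# Look up each row's (name, type) cell; as.character() guards against factors
idx <- cbind(as.character(df$name), as.character(df$type))
df$count <- as.vector(counts[idx])

df$count
# 2 2 1 1 1
```

Because no merge is involved, the rows of df keep their original order.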
Author: Uri Laserson

Updated on July 08, 2022

Question from Uri Laserson:

    Say I have a data.frame object:

    df <- data.frame(name=c('black','black','black','red','red'),
                     type=c('chair','chair','sofa','sofa','plate'),
                     num=c(4,5,12,4,3))
    

    Now I want to count the number of rows (observations) for each combination of name and type. This can be done like so:

    table(df[ , c("name","type")])
    

    or possibly also with plyr (though I am not sure how).

    However, how do I get the results incorporated into the original data frame, so that the result looks like this?

    df
    #    name  type num count
    # 1 black chair   4     2
    # 2 black chair   5     2
    # 3 black  sofa  12     1
    # 4   red  sofa   4     1
    # 5   red plate   3     1
    

    where count now stores the results from the aggregation.

    A solution with plyr could be interesting to learn as well, though I would like to see how this is done with base R.