Count number of occurrences for each unique value

Solution 1

Perhaps table is what you are after?

dummyData = rep(c(1,2, 2, 2), 25)

table(dummyData)
# dummyData
#  1  2 
# 25 75

## or another presentation of the same data
as.data.frame(table(dummyData))
#    dummyData Freq
#  1         1   25
#  2         2   75
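
Two follow-ups that often come up with table(): by default it leaves NA values out of the counts, and its result is ordered by value rather than by frequency. A minimal sketch, assuming a small made-up vector x (not part of the original question), showing the useNA argument and sort():

```r
# Hypothetical example vector containing a missing value
x <- c("a", "b", "b", NA, "b", "a")

# Default behaviour: NA values are left out of the counts
counts <- table(x)

# useNA = "ifany" adds an NA entry when any missing values are present
counts_with_na <- table(x, useNA = "ifany")

# sort() works on table objects; decreasing = TRUE puts the most frequent value first
sorted_counts <- sort(counts, decreasing = TRUE)

print(counts)
print(counts_with_na)
print(sorted_counts)
```

For large tables, head(sorted_counts) is a convenient way to look at only the most common entries.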

Solution 2

If you have multiple factors (that is, a data frame with several grouping columns), you can use the dplyr package to count the rows in each combination of factor levels:

library("dplyr")
data %>% group_by(factor1, factor2) %>% summarize(count=n())

It uses the pipe operator %>% to chain function calls on the data frame data.
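
A self-contained sketch of that pipeline, assuming a hypothetical data frame data with columns factor1 and factor2 (placeholder names, not a real dataset) and dplyr 1.0.0 or later for the .groups argument. It also shows count(), a shorter dplyr equivalent for this case:

```r
library(dplyr)

# Hypothetical data frame; the column names mirror the snippet above
data <- data.frame(
  factor1 = c("a", "a", "b", "b", "b"),
  factor2 = c("x", "y", "x", "x", "y")
)

# Number of rows in each (factor1, factor2) combination
by_group <- data %>%
  group_by(factor1, factor2) %>%
  summarize(count = n(), .groups = "drop")

# count() groups and tallies in one step; its count column is named n
by_count <- data %>% count(factor1, factor2)

print(by_group)
print(by_count)
```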

Solution 3

A one-line approach using aggregate, with v defined as in the question:

> v <- rep(c(1, 2, 2, 2), 25)
> aggregate(data.frame(count = v), list(value = v), length)

  value count
1     1    25
2     2    75

Solution 4

The table() function is a good way to go, as Chase suggested. If you are analyzing a large dataset, an alternative is to use the special symbol .N from the data.table package.

Make sure the data.table package is installed:

install.packages("data.table")

Code:

# Import the data.table package
library(data.table)

# Generate a data table object, which draws a number 10^7 times  
# from 1 to 10 with replacement
DT <- data.table(x = sample(1:10, 1e7, TRUE))

# Count the frequency of each distinct value of x
DT[, .N, by = x]
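
A smaller sketch of the same idiom, assuming a hypothetical six-row table (DT_small is an illustrative name) so the result is easy to check by hand. .N is the number of rows in each group, and a chained [order(-N)] sorts the result by descending frequency:

```r
library(data.table)

# Hypothetical small table: 3 appears three times, 1 twice, 2 once
DT_small <- data.table(x = c(3, 1, 3, 2, 3, 1))

# Count rows per distinct value of x, then sort by descending count
freq <- DT_small[, .N, by = x][order(-N)]

print(freq)
```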

Solution 5

length(unique(df$col)) is the simplest way I can see to count the number of distinct values.
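
A minimal sketch, using the vector v from the question, that contrasts the number of distinct values with the number of occurrences of each value. The vapply() line is one possible base-R formulation, not the only one:

```r
# Vector from the question
v <- rep(c(1, 2, 2, 2), 25)

# Number of distinct values
n_distinct_values <- length(unique(v))

# Occurrences of each distinct value, named by the value itself
occurrences <- vapply(unique(v), function(u) sum(v == u), integer(1))
names(occurrences) <- unique(v)

print(n_distinct_values)
print(occurrences)
```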


Author: gakera

Updated on September 23, 2020

Comments

  • gakera
    gakera almost 4 years

    Let's say I have:

    v = rep(c(1,2, 2, 2), 25)
    

    Now, I want to count the number of times each unique value appears. unique(v) returns what the unique values are, but not how many times each one occurs.

    > unique(v)
    [1] 1 2
    

    I want something that gives me

    length(v[v==1])
    [1] 25
    length(v[v==2])
    [1] 75
    

    but as a more general one-liner :) Something close (but not quite) like this:

    #<doesn't work right> length(v[v==unique(v)])
    
  • gakera
    gakera over 13 years
    Ah, yes, I can use this, with some slight modification: t(as.data.frame(table(v))[,2]) is exactly what I need, thank you
  • Brian Diggs
    Brian Diggs about 11 years
    or ddply(data_frame, .(v), count). Also worth making it explicit that you need a library("plyr") call to make ddply work.
  • Museful
    Museful almost 11 years
    I used to do this awkwardly with hist. table seems quite a bit slower than hist. I wonder why. Can anyone confirm?
  • Torvon
    Torvon over 9 years
    Chase, any chance to order by frequency? I have the exact same problem, but my table has roughly 20000 entries and I'd like to know how frequent the most common entries are.
  • Chase
    Chase over 9 years
    @Torvon - sure, just use order() on the results. i.e. x <- as.data.frame(table(dummyData)); x[order(x$Freq, decreasing = TRUE), ]
  • Gregor Thomas
    Gregor Thomas almost 9 years
    Seems strange to use transform instead of mutate when using plyr.
  • Deep North
    Deep North over 6 years
    This method is not good: it only suits small data sets with many repeated values, and it does not suit large amounts of continuous data with few duplicate records.
  • Peter
    Peter over 5 years
    To count the number of levels you may also use lapply(DF, function(x) length(table(x)))
  • gakera
    gakera almost 4 years
    R has probably evolved a lot in the last 10 years, since I asked this question.
  • David
    David almost 4 years
    Alternatively, and a bit shorter: data %>% count(factor1, factor2)
  • Martin
    Martin over 3 years
    One-liner indeed instead of using unique() + something else. Wonderful!
  • dsg38
    dsg38 over 2 years
    NB: This doesn't include the NA values
  • vonjd
    vonjd about 2 years
    aggregate is underappreciated!